NYC is amazing! I get it now. The food is cheap and delicious, the city runs 24/7, and the streets are walkable with decent public transpo… amazing!

Obsessing over the e-ink theme
Recently, I’ve been enjoying “e-ink” themes for some of my websites. For example, the website version of this microblog tries to emulate this set-up. It is fake e-ink, as I don’t really use these sites on an actual e-ink reader. I just enjoy these high-contrast, pure black-and-white themes with a “retro aesthetic,” as they look timeless and classy.
For example, Obsidian has this gorgeous e-ink mode that I’ve been using ever since:

I’ve also been itching to try out this nice Sacred Computer theme, as it embodies the same style and vibe I’m hoping to achieve.
Where I'm getting AI News nowadays
Although I still maintain Twitter and Bluesky accounts to follow researchers, I avoid opening these websites as they can get a bit toxic on both sides of the political spectrum. For the past year, I’ve found the following resources helpful for keeping up to date with AI research:
- AI News is really good. Every night they send you a summary of topics and discussions in various social media sites, Discord servers, and subreddits. A good way to get an overview of all things AI.
- Scholar Inbox is perfect for academics. You curate your list of research topics and they send you recommendations for new arXiv papers as they’re released every day. This is really nice for getting the “first scoop” on preprints before they are announced on social media.
- Interconnects is like Stratechery for NLP, written by someone who’s actually in the trenches. It offers a lot of insightful opinions and a nice synthesis of the field.
- r/LocalLlama is probably the best AI/NLP subreddit right now. Most of the comments are informative and seem to come from actual practitioners. I think it’s a good way to get a general vibe of the field.
On building my personal LLM benchmark
Over the past week, I’ve experimented with building a personal LLM benchmark that tests models against my specific tastes and requirements. I’d definitely encourage anyone to make their own suite, but it’s very involved. I’ll probably write about my experience soon in an actual blog post…
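To give a flavor, a single test case in such a suite might look something like the sketch below. This is a hypothetical illustration, not my actual harness; the `TestCase` shape and the checker are made up for this example:

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical shape for one test case: a prompt plus a checker that
# encodes a personal preference (here: matplotlib's object-oriented API).
@dataclass
class TestCase:
    prompt: str
    check: Callable[[str], bool]

cases = [
    TestCase(
        prompt="Write matplotlib code that plots a sine wave.",
        check=lambda response: "plt.subplots(" in response,
    ),
]

def score(generate: Callable[[str], str]) -> float:
    """Fraction of test cases a model's generate function passes."""
    return sum(case.check(generate(case.prompt)) for case in cases) / len(cases)
```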
Anyway, upon investigating my test cases, I learned that most of the things I care about are a matter of taste and style. For example, I prefer matplotlib code to be written a certain way (using `fig, ax = plt.subplots(...)` rather than calling `plt` directly) and concise responses over verbose ones. I wonder if there’s a way we can incorporate these personal preferences during finetuning (assuming we have the resources to do so) with zero-to-few human annotation effort.
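Concretely, here’s a toy example of the style I mean (the sine-wave plot itself is arbitrary):

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

# Preferred: explicit figure/axes objects, easier to compose and reuse
fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(x, np.sin(x), label="sin(x)")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()
fig.savefig("sine.png")

# Not preferred: implicit global state via the pyplot interface
# plt.plot(x, np.sin(x))
# plt.xlabel("x")
# plt.savefig("sine.png")
```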
This reminds me of a slide from John Schulman’s talk on open problems in High-Quality Human Feedback (we actually took a stab at this problem in a previous work):

Again, this is something I want to write more about in the future. Still organizing my thoughts on this. And by the way, seems like Qwen/Qwen2.5-14B-Instruct-1M is the best model in my personal benchmark so far when accounting for performance and cost-efficiency.
An observation: it seems like tooling for LLM inference diverges between research/academia and industry in some ways. For the past year, I’ve been using a lot of vLLM for inference and (recently) curator for data generation, mostly for research. But I’ve seen a lot of my colleagues in industry use outlines, LangGraph, and PydanticAI.
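For reference, offline batch inference with vLLM looks roughly like this. A minimal sketch; the model name is just the one from my benchmark above, and any Hugging Face model id works:

```python
from vllm import LLM, SamplingParams

# Load the model (swap in whatever you're evaluating)
llm = LLM(model="Qwen/Qwen2.5-14B-Instruct-1M")
params = SamplingParams(temperature=0.7, max_tokens=256)

# Generate over a batch of prompts
outputs = llm.generate(["Summarize this week in AI research:"], params)
print(outputs[0].outputs[0].text)
```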
Kingdom Come Deliverance II (KCD 2) is so immersive that it might as well be my Game of the Year (I know, it’s too early to say). The potion-making tutorial is so good, it feels like I’m doing it for real!

Blacksmithing is also relaxing! I made so many horseshoes and axes on my first day :)

This is definitely a “game you play during the holidays,” so I can’t wait for summer to sink hours into this gem. I’m still early in the story, but I already recommend this to everyone!
TIL: arXiv-ready LaTeX files from Overleaf
A persistent problem I come across is that the LaTeX files I download from Overleaf and upload to arXiv often have source or compatibility errors. Today, I learned that instead of downloading the zip archive to get the LaTeX source, I should use the “Submit” button.
Don’t click the archive button:

Instead, go to your Overleaf project > Submit > Submit your papers to arXiv:

This creates an optimized zip file that arXiv can compile without issues!
On Filipino NLP
Over the holidays, I’ve been thinking a lot about what it means to do Filipino NLP now that we’re in the age of LLMs. Multilingual LLMs are getting better, and core NLP tasks such as NER or sentiment analysis are now streamlined by models like GPT-4.
I’ve decided to bet on post-training and adaptation. I believe that this unlocks several opportunities for resource-constrained and small Filipino NLP research communities to contribute to a larger whole. Here’s an excerpt from my blog post:
While I still believe in building artisanal Filipino NLP resources, I now see that we need to simultaneously support the development of multilingual LLMs by creating high-quality Filipino datasets and benchmarks. This way, we can actively push for the inclusion of Philippine languages in the next generation of multilingual LLMs, rather than just waiting for improvements to happen on their own.