Blog
Short notes, half-baked research thoughts, and occasional write-ups when something finally works (or fails in an interesting way).
-
Detecting Hallucinations via Hyperbolic Geometry: An Internal Representation Approach
Language models hallucinate. They state falsehoods with confidence, fabricate citations, and produce plausible-sounding nonsense. The sta...
-
What Makes a Good RLHF Task? Lessons from Training Data Research
Since November 2024, I’ve been working at Abundant AI designing training data for RLHF (Reinforcement Learning from Human Feedback) that ...
-
Beyond Accuracy: Evaluating Chain-of-Thought Reasoning in Production
Chain-of-Thought (CoT) prompting has become the de facto standard for complex reasoning tasks with LLMs. But there’s a dirty secret: a co...
-
Machine Unlearning: Making Models Forget Without Breaking Everything Else
Imagine you’ve trained a language model on millions of documents, and then discover that some training data contains: Private informat...