Blog
Short notes, half-baked research thoughts, and occasional write-ups when something finally works (or fails in an interesting way).
-
Detecting Hallucinations via Hyperbolic Geometry: An Internal Representation Approach
Language models hallucinate. They state falsehoods with confidence, fabricate citations, and produce plausible-sounding nonsense. The sta...
-
What Makes a Good RLHF Task? Lessons from Training Data Research
Since November 2024, I’ve been working at Abundant AI designing training data for RLHF (Reinforcement Learning from Human Feedback) that ...
-
Beyond Accuracy: Evaluating Chain-of-Thought Reasoning in Production
Chain-of-Thought (CoT) prompting has become the de facto standard for complex reasoning tasks with LLMs. But there’s a dirty secret: a co...
-
Machine Unlearning: Making Models Forget Without Breaking Everything Else
Imagine you’ve trained a language model on millions of documents, and then discover that some training data contains: Private informat...