Arnav Raj
Dual Degree (B.Tech + M.Tech) Computer Science & Engineering, IIT Delhi
Fourth-year dual degree student focused on AI safety and LLM reliability. Currently at Google DeepMind (SFT and evaluation data for internal Gemini models) and Abundant AI (YC-24). Two solo-authored submissions to ICML 2026 workshops, on RLHF reward-model calibration and delay-aware RLHF; a solo-authored paper on probing LLM reasoning with hyperbolic geometry, accepted at the ICLR 2026 Workshop (GRaM); co-author of KG-MuLQA, accepted at ACL 2026. Interested in building evaluation and interpretability tooling that makes models more robust and trustworthy.
Google DeepMind (IC)
Authoring supervised fine-tuning and evaluation data for internal Gemini models on expert-level machine-learning and deep-learning reasoning tasks. Designing structured rubrics for correctness, reasoning depth, and pedagogical clarity that feed directly into Gemini's training-data quality pipeline.
Abundant AI (YC-24)
Building GPU-backed evaluation infrastructure for long-running research agents, enabling reinforcement-learning-driven data curation on tasks that require deep-learning compute rather than CPU-only sandboxes. Previously designed adversarial ML tasks that exposed systematic failures in frontier LLMs; datasets adopted by three of the top five global AI labs.
Georgia Institute of Technology Financial Services Innovation Lab
Co-developed KG-MuLQA (accepted at ACL 2026), a framework for generating multi-hop knowledge-graph-grounded questions. Built the end-to-end pipeline producing 20,139 QA pairs across 170 financial documents.
Harvard University Edge Computing Lab
Built a LangChain benchmarking framework for evaluating LLM-generated RTL hardware designs across GPT-4 and Llama. Implemented end-to-end validation (syntax, testbench, PPA analysis) and compared prompting strategies for hardware code generation.
PEBS: Per-rater Empirical-Bayes Shrinkage for RLHF Reward-Model Calibration
Solo-authored. Introduced a lightweight statistical layer that calibrates RLHF reward models to each individual human annotator rather than collapsing thousands of raters into a single average. Improves alignment to genuine human preferences on standard pluralistic-alignment benchmarks while leaving the underlying reward model unchanged.
Retroactive Advantage Correction: Closed-Form V-Trace Bias Correction for Delay-Aware RLHF
Solo-authored. Proposed a correction primitive for production RLHF pipelines where reward signals arrive late, such as slow code verifiers, large judge ensembles, and queued human review. Allows policy training to continue using delayed feedback instead of discarding it, with a closed-form proof of unbiasedness and substantial improvement over standard wait-or-drop strategies.
KG-MuLQA: Multi-hop Question Answering over Knowledge Graphs for Long-Context Evaluation
Benchmark for evaluating multi-hop reasoning in long-context language models, built from knowledge graphs over real-world financial documents. Designed and implemented the end-to-end pipeline for question generation, answer synthesis, and evaluation across frontier LLMs.
Hyperbolic Geometry of Reasoning: Probing LLM Hidden States
Solo-authored. Interpretability study of how chain-of-thought reasoning models internally encode hierarchical structure. Showed that hyperbolic geometric probes recover this structure across the entire network while standard Euclidean probes fail in the deepest layers, evidence that reasoning models compress hierarchy into the representations responsible for the final answer.
LLM Code Agent Evaluation Framework
Comprehensive evaluation suite for LLM code generation inspired by SWE-bench and MLE-bench, achieving 73% task completion with iterative self-correction.
RL Agent for Code Optimization
PPO-based reinforcement learning agent that optimizes code performance through iterative refinement, achieving 35% runtime reduction and 20% memory improvement.
Hangman AI: Transformer-Based Game Solver
Transformer-driven Hangman solver achieving >60% success using character-level modeling, morphological augmentation, and multi-strategy guess selection.
Graph Neural Network for User Personality Prediction
Graph neural network over a bipartite user–product interaction graph, used to infer user personality traits with high accuracy.
Data Search & Retrieval Engine
Neural-augmented retrieval system combining classical inverted indexing with dense vector search for hybrid relevance scoring across semi-structured technical documents.
Context-Aware Spelling Correction
Noisy-channel + smoothed N-gram language modeling system achieving 88% accuracy on context-dependent spelling errors.
TotalRecall: Cognitive Health App
Cross-platform Flutter app with AI-assisted memory support workflows for Alzheimer’s patients, backed by Firebase services.
Advanced Analytic Tool
A modular analytical platform that ingests heterogeneous datasets (CSV/JSON/SQL streams) and provides extensible pipelines for preprocessing, feature engineering, and rapid experimentation with classical ML and lightweight deep models.
AI Player for Havannah Board Game
Monte Carlo Tree Search (MCTS) agent with UCB exploration achieving >80% win rate over strong RAVE-only baselines.
SDN-based Intelligent Network Controller
High-throughput OpenFlow controller with proactive L2 learning, shortest-path routing, and loop prevention for complex topologies.
Application-Layer Reliable Transport & Congestion Control
Implemented a reliable transport protocol over UDP plus Reno & CUBIC congestion control variants with detailed performance benchmarking.
OS Kernel Enhancements in xv6
Implemented a page swapping subsystem in the xv6 teaching OS, including victim selection, swap slot management, and page fault handling.
I'm a fourth-year dual degree student at IIT Delhi (B.Tech + M.Tech in Computer Science) working at the intersection of AI safety, LLM evaluation, and interpretability. My research focuses on understanding how language models reason and where they fail, particularly through geometric and probing-based approaches.
At Google DeepMind, I author supervised fine-tuning and evaluation data for internal Gemini models, designing rubrics that feed directly into the training-data quality pipeline. At Abundant AI, I build GPU-backed evaluation infrastructure for long-running research agents, and previously designed adversarial ML benchmarks that are now part of the training pipeline for three of the world's top five AI labs. My solo-authored paper on probing LLM hidden states with hyperbolic geometry was accepted at the ICLR 2026 Workshop (GRaM). Two further solo-authored papers, PEBS on per-rater calibration of RLHF reward models and Retroactive Advantage Correction on delay-aware RLHF, are under review at ICML 2026 workshops. I co-authored KG-MuLQA, a long-context multi-hop evaluation framework, accepted at ACL 2026.
Outside of research, I co-founded the AI Safety Club at IIT Delhi and completed alignment training through BlueDot Impact and ARENA. I enjoy working on problems where evaluation methodology, model behavior, and safety considerations intersect.
Indian Institute of Technology Delhi
NK Security Scholar
Merit-based scholarship awarded to the top 30 students at IIT Delhi for academic and technical excellence
Smart India Hackathon
National Top 5 Finalist in both 2023 and 2024 editions (India's largest student innovation competition)
JEE Advanced 2022
All India Rank 1,158 out of 1,000,000+ candidates (top 0.1%)
KVPY SX Fellowship 2021
National science fellowship awarded by the Government of India and IISc Bangalore
Codeforces Expert
1700+ rating in competitive programming
National Science Olympiads
Top 250 in Astronomy and top 300 in Chemistry in India
Founding Member & Technical Lead / AI Safety Club, IIT Delhi
Co-founded student research group on AI alignment, interpretability, and evaluation. Led reading groups on mechanistic interpretability (TransformerLens) and safety frameworks. Completed BlueDot Impact AI safety training and ARENA curriculum.
Technical Consultant / STEM AI Hackathon 2026 (AI-Collab Hack)
Providing technical mentorship to 20+ teams building AI agents for STEM education at a hackathon jointly organized by IIT Delhi, Imperial College London, and Microsoft Garage.
Senior Editor / Tech Ambit (Pan-IIT Magazine)
Led 15-member editorial team across 23 IITs; curated and edited 30+ technical articles on AI and systems research.
Mess Secretary / Zanskar Hostel, IIT Delhi
Elected by 400+ residents; managed operations team of 13. Awarded Best Mess Secretary for digitalization initiatives.