Arnav Raj

Dual Degree (B.Tech + M.Tech) Computer Science & Engineering, IIT Delhi

Fourth‑year dual degree student focused on AI safety and LLM reliability. Currently contributing to RLHF training data at Abundant AI for top global AI labs. Actively exploring reinforcement learning while drawing on experience with operating systems, networks, and performance‑minded backend work. Submitted an ICLR 2026 workshop paper on hallucination detection in hyperbolic space. I like designing retrieval‑augmented generation pipelines, building observability and evaluation tooling that turns model behavior into measurable signals, and tuning latency/quality trade‑offs in large‑scale inference.

AI Safety • RLHF • RAG • LLM Evaluation • Model Interpretability • RL (Exploring)

Open to Summer 2026 internships and collaborative projects in AI & RL.

Education

  • Indian Institute of Technology Delhi
    B.Tech and M.Tech in Computer Science & Engineering • 2022 – 2027
  • Mess Secretary, Zanskar Hostel
    Leadership & Operations • Jun 2024 – May 2025
  • Senior Editor, Tech Ambit (Pan-IIT Magazine)
    Editorial & Tech Strategy • 2023 – 2025

Honors & Awards

  • Founding member of AI Safety Club, IIT Delhi
  • JEE Advanced All India Rank 1158 (1M+ candidates)
  • 2× Smart India Hackathon National Top 5 Finalist
  • KVPY SX Fellowship (Govt. of India & IISc Bangalore)
  • Best Mess Secretary, IIT Delhi
  • National Science Olympiads: Top 250 Astronomy, Top 300 Chemistry
  • Codeforces Expert (1700+ Rating)

Experience

  • Harvard University • Research Intern – Edge Computing Lab • May 2024 – Dec 2024 • Remote (Cambridge, MA)
    • Built LangChain benchmarking framework for RTL code generation across GPT‑4 and Llama models.
    • Implemented end‑to‑end validation pipeline: syntax checking → testbench validation → PPA analysis with automated re‑prompting for failing designs.
    • Compared Chain‑of‑Thought, zero‑shot, and few‑shot prompting strategies across graded design complexity.
    • Tracked accuracy and latency metrics across different prompt engineering approaches.
  • Georgia Institute of Technology • Research Intern – FSI Lab • May 2024 – Jun 2025 • Remote (Atlanta, GA)
    • Co‑developed KG‑MuLQA framework for generating multi‑hop knowledge‑graph questions (ACL 2025 ARR submission).
    • Created dataset of 20,139 long‑context multi‑hop QA pairs for structured reasoning evaluation.
    • Built scalable LLM benchmarking pipeline with auto‑chunking, batched generation, and multi‑chunk answer synthesis across 170 credit agreements.
    • Designed evaluation infrastructure for long‑context understanding in financial documents.

Quick Links