Context-Aware Spelling Correction

Implemented a probabilistic spelling-correction pipeline that uses surrounding context to disambiguate candidates, rather than relying on edit-distance heuristics alone.

  • Language Models: Trained unigram through 5-gram models; evaluated perplexity across smoothing variants (Kneser‑Ney, linear interpolation, Add‑k, Good‑Turing, Stupid Backoff).
  • Noisy Channel: Generated candidates via Damerau‑Levenshtein edits plus phonetic approximations; ranked them by P(word | context) · P(error | word).
  • Smoothing Study: Comparative analysis showed modified Kneser‑Ney performing best on sparse tail distributions.
  • Evaluation: 88% accuracy on a curated, context‑sensitive confusion set (homophones, morphological variants).
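The smoothing comparison can be illustrated in miniature: an unsmoothed MLE bigram model assigns zero probability to any unseen bigram, so its held-out perplexity diverges, while even simple Add-k stays finite. This is a self-contained toy sketch (the corpus and the Add-k stand-in are illustrative assumptions; the project's actual models go up to 5-grams with Kneser-Ney):

```python
# Toy perplexity comparison: MLE vs. add-k smoothed bigram model.
# TRAIN/HELDOUT are made-up sentences, not the project's data.
import math
from collections import Counter
from itertools import chain

TRAIN = [
    ["<s>", "the", "dog", "ate", "the", "food", "</s>"],
    ["<s>", "they", "went", "there", "after", "work", "</s>"],
]
HELDOUT = ["<s>", "the", "dog", "went", "there", "</s>"]

vocab = set(chain.from_iterable(TRAIN))
uni = Counter(chain.from_iterable(TRAIN))
bi = Counter((s[i], s[i + 1]) for s in TRAIN for i in range(len(s) - 1))

def prob(w, prev, k):
    # k = 0 gives the unsmoothed MLE estimate; k > 0 is add-k smoothing
    return (bi[(prev, w)] + k) / (uni[prev] + k * len(vocab))

def perplexity(sent, k):
    logp = 0.0
    for prev, w in zip(sent, sent[1:]):
        p = prob(w, prev, k)
        if p == 0.0:
            return math.inf  # MLE assigns zero mass to unseen bigrams
        logp += math.log(p)
    return math.exp(-logp / (len(sent) - 1))
```

The held-out sentence contains the unseen bigram ("dog", "went"), so `perplexity(HELDOUT, 0)` is infinite while `perplexity(HELDOUT, 0.5)` is finite; the same sparsity problem, at the 4- and 5-gram tail, is what the Kneser-Ney variants address.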
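The noisy-channel ranking step can be sketched as follows. Everything here is a simplified stand-in: a toy corpus, an add-k bigram prior in place of the full Kneser-Ney 5-gram model, and a constant-mass channel model instead of learned error probabilities (and no phonetic candidates):

```python
# Noisy-channel ranker sketch: Damerau-Levenshtein candidates scored by
# P(word | context) * P(error | word). Corpus and constants are illustrative.
from collections import Counter
from itertools import chain

CORPUS = [
    ["the", "dog", "ate", "the", "food"],
    ["they", "went", "there", "after", "work"],
    ["their", "dog", "went", "there", "too"],
]
VOCAB = set(chain.from_iterable(CORPUS))
UNI = Counter(chain.from_iterable(CORPUS))
BI = Counter((s[i], s[i + 1]) for s in CORPUS for i in range(len(s) - 1))

def dl_distance(a, b):
    """Restricted Damerau-Levenshtein: edits plus adjacent transpositions."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[len(a)][len(b)]

def prior(word, prev, k=0.5):
    """Add-k smoothed bigram P(word | prev) -- stand-in for the 5-gram LM."""
    return (BI[(prev, word)] + k) / (UNI[prev] + k * len(VOCAB))

def channel(typed, true):
    """Toy P(typed | true): favor keeping the word as typed (an assumption)."""
    return 0.9 if typed == true else 0.1

def correct(typed, prev):
    cands = {w for w in VOCAB if dl_distance(typed, w) <= 1} or {typed}
    return max(cands, key=lambda w: prior(w, prev) * channel(typed, w))
```

With this corpus, `correct("ther", "went")` resolves to "there" while `correct("ther", "ate")` resolves to "the": the distance-1 candidate set {the, there, their, they} is identical in both calls, so the context prior alone breaks the tie.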

Results: Context-aware ranking outperformed both naive word-frequency and pure edit-distance baselines.

Tech: Python, NLTK.

View on GitHub →

Error analysis logs each false correction alongside its context window, guiding adjustments to smoothing and candidate generation.