# TR Quality Pipeline: KenLM Perplexity + Slur/PII Filter + Educational-Value

> Source: https://sukruyusufkaya.com/en/learn/fine-tuning-cookbook/ftc-tr-quality-pipeline-kenlm-pii
> Updated: 2026-05-14T14:42:56.127Z
> Category: Fine-Tuning Cookbook (Model-by-Model)
> Module: Part IX — Turkish-First & Localization Engineering
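
**TLDR:** Turning a raw Turkish (TR) corpus into quality fine-tuning data takes four stages: a KenLM 5-gram Turkish perplexity filter (catches gibberish and machine-translation artifacts), a Turkish slur filter, Turkish PII detection (TC ID number, phone, email), and an educational-value scorer adapted from FineWeb. The pipeline cleans a 100 GB Turkish corpus in 4 hours on an RTX 4090.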
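
To make the PII stage concrete, here is a minimal sketch of how TC ID numbers, phone numbers, and emails might be detected and masked. It is not the cookbook's implementation: the regular expressions, the placeholder tokens (`<TC_ID>`, `<PHONE>`, `<EMAIL>`), and the function names `is_valid_tc_id` and `mask_pii` are assumptions made for illustration. The only non-assumed part is the TC ID checksum, which is the publicly known rule for the 10th and 11th digits.

```python
import re

# Illustrative patterns (assumptions, not the article's actual rules).
TC_ID_RE = re.compile(r"(?<!\d)[1-9]\d{10}(?!\d)")
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Turkish mobile numbers: optional +90 or 0 prefix, then 5xx xxx xx xx.
PHONE_RE = re.compile(
    r"(?<!\d)(?:\+90[\s-]?|0)?5\d{2}[\s-]?\d{3}[\s-]?\d{2}[\s-]?\d{2}(?!\d)"
)


def is_valid_tc_id(candidate: str) -> bool:
    """Check the two checksum digits of an 11-digit TC ID number."""
    if len(candidate) != 11 or not (candidate.isascii() and candidate.isdigit()):
        return False
    if candidate[0] == "0":
        return False
    d = [int(ch) for ch in candidate]
    odd_sum = d[0] + d[2] + d[4] + d[6] + d[8]   # digits 1, 3, 5, 7, 9
    even_sum = d[1] + d[3] + d[5] + d[7]         # digits 2, 4, 6, 8
    if (odd_sum * 7 - even_sum) % 10 != d[9]:    # 10th digit rule
        return False
    return sum(d[:10]) % 10 == d[10]             # 11th digit rule


def mask_pii(text: str) -> str:
    """Replace emails, checksum-valid TC IDs, and mobile numbers with tokens."""
    def _mask_tc(match: re.Match) -> str:
        # Only mask 11-digit runs that pass the checksum, to limit false positives.
        return "<TC_ID>" if is_valid_tc_id(match.group()) else match.group()

    text = EMAIL_RE.sub("<EMAIL>", text)
    text = TC_ID_RE.sub(_mask_tc, text)
    text = PHONE_RE.sub("<PHONE>", text)
    return text


if __name__ == "__main__":
    sample = "TC 10000000146, tel 0532 123 45 67, mail ali@example.com"
    print(mask_pii(sample))
```

Validating the checksum before masking keeps arbitrary 11-digit numbers (order IDs, timestamps) in the corpus, at the cost of missing mistyped IDs. Whether to mask or drop the whole document on a hit is a policy choice this sketch leaves open.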

