# Streaming ASR: faster-whisper + distil-whisper — Real-Time Latency Budget < 200ms

> Source: https://sukruyusufkaya.com/en/learn/fine-tuning-cookbook/ftc-streaming-asr-faster-whisper-distil
> Updated: 2026-05-14T14:42:54.946Z
> Category: Fine-Tuning Cookbook (Model-by-Model)
> Module: Part VII — Speech & Audio Fine-Tuning
**TLDR:** Whisper is fast offline (batch) but not optimized for streaming. Solution: **faster-whisper** (CTranslate2 + INT8), **distil-whisper** (50% layers reduced student). Latency budget < 200 ms first-token, 70× real-time. Turkish streaming setup on RTX 4090: chunking, VAD, partial hypotheses.

