Voice Activity Detection

A core timing task that determines which parts of an audio signal contain speech.

Voice activity detection (VAD) acts as a front gate for diarization and automatic speech recognition (ASR) systems. Finding where speech starts and ends is critical for both efficiency and accuracy, because downstream components then process only the audio that contains speech. Poor VAD decisions can mistake silence for speech or miss short utterances. VAD also plays a central role in the latency and stability of real-time speech systems.
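To make the start/end decision concrete, here is a minimal sketch of one classical approach: thresholding short-time frame energy, with a small "hangover" so brief dips do not split an utterance. This is an illustration only, not how any particular product implements VAD. The function names, the 20 ms frame size, the 0.01 energy threshold, and the hangover length are all assumptions chosen for the example, and production systems often rely on trained models instead.

```python
import math


def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def detect_speech(samples, sample_rate, frame_ms=20, threshold=0.01,
                  hangover_frames=2):
    """Return a list of (start_sec, end_sec) segments judged to be speech.

    A frame counts as speech when its energy is at or above `threshold`
    (an illustrative value, not a recommended setting). A segment is only
    closed after more than `hangover_frames` consecutive quiet frames.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    energies = frame_energies(samples, frame_len)
    segments = []
    start = None   # index of the first frame of the open segment
    quiet = 0      # consecutive quiet frames seen inside the open segment

    def to_seconds(frame_index):
        return frame_index * frame_len / sample_rate

    for idx, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = idx
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet > hangover_frames:
                end = idx - quiet + 1  # first quiet frame ends the segment
                segments.append((to_seconds(start), to_seconds(end)))
                start, quiet = None, 0

    if start is not None:  # audio ended while a segment was still open
        segments.append((to_seconds(start), to_seconds(len(energies) - quiet)))
    return segments


# Synthetic test signal at 1 kHz: 0.1 s silence, 0.2 s tone, 0.1 s silence.
tone = [0.5 * math.sin(2 * math.pi * 100 * n / 1000) for n in range(200)]
signal = [0.0] * 100 + tone + [0.0] * 100
print(detect_speech(signal, sample_rate=1000))  # [(0.1, 0.3)]
```

The parameters expose the trade-offs described above: a lower threshold or longer hangover catches more short utterances but lets more silence through, while larger frames smooth the decision at the cost of added latency.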