Skip to content
Technical GlossarySpeech, Voice and Audio AI

Mask-Based Speech Enhancement

An approach that predicts masks over time-frequency representations to preserve speech components while suppressing noise.

Mask-based speech enhancement is a powerful framework widely used in modern speech enhancement systems. The model attempts to determine which parts of a spectrogram belong to speech and which to noise. This can lead to significant quality improvements, especially for ASR preprocessing and in low-SNR environments.