Wav2Vec 2.0 Pretraining

A self-supervised approach that learns strong speech representations from unlabeled audio, improving automatic speech recognition (ASR) and other downstream speech tasks.

Wav2Vec 2.0 pretraining made it possible to learn high-quality speech representations from large volumes of unlabeled audio: the model masks spans of latent speech representations and is trained with a contrastive objective to identify the correct quantized latent among distractors. This is especially valuable in languages and domains where annotation costs are high, because strong ASR performance can then be achieved with relatively little labeled data during fine-tuning. It is one of the methods that fundamentally changed self-supervised learning in speech.
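As a minimal sketch of the pretrain-then-fine-tune workflow described above, the snippet below loads a pretrained wav2vec 2.0 encoder and extracts contextual representations from raw audio. The Hugging Face transformers library and the facebook/wav2vec2-base checkpoint are assumptions for illustration; any compatible pretrained checkpoint would be used the same way, and fine-tuning for ASR would add a task head (e.g., CTC) on top of these representations.

```python
# Sketch: extracting self-supervised wav2vec 2.0 representations.
# Assumes the Hugging Face transformers library and the public
# facebook/wav2vec2-base checkpoint (illustrative choices, not
# prescribed by this entry).
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")
model = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")

# One second of random 16 kHz audio stands in for a real recording.
waveform = torch.randn(16000)

inputs = extractor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Contextualized speech representations, shape (batch, frames, hidden_size);
# a fine-tuning stage would train a small labeled head on top of these.
print(outputs.last_hidden_state.shape)
```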