KV Cache

A mechanism that stores the key and value tensors computed for previous tokens, avoiding repeated attention work during autoregressive generation.

The KV cache is one of the most fundamental optimizations in LLM inference. In autoregressive generation, each new token attends to all previous tokens; caching the key and value projections of those tokens avoids recomputing them at every step, which yields major speed gains in long generations. However, cache memory grows linearly with context length, so careful resource management is required.
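
To make the mechanism concrete, here is a minimal sketch of a single-head KV cache in NumPy. The names (`KVCache`, `decode_step`) and the single-head, unbatched setup are illustrative assumptions, not any particular library's API; real implementations are multi-head, batched, and preallocate cache memory.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

class KVCache:
    """Append-only store of per-token key/value vectors (single head, illustrative)."""
    def __init__(self, d_head):
        self.keys = np.empty((0, d_head))
        self.values = np.empty((0, d_head))

    def append(self, k, v):
        # Cache grows by one row per generated token: memory is O(sequence length).
        self.keys = np.vstack([self.keys, k[None, :]])
        self.values = np.vstack([self.values, v[None, :]])

def decode_step(x, W_q, W_k, W_v, cache):
    """One autoregressive step: project only the new token's embedding,
    then attend over the cached keys/values instead of re-projecting
    every previous token."""
    q, k, v = x @ W_q, x @ W_k, x @ W_v
    cache.append(k, v)
    scores = cache.keys @ q / np.sqrt(q.shape[-1])
    return softmax(scores) @ cache.values  # attention output for the new token

# Usage: generate five steps, reusing the cache across steps.
rng = np.random.default_rng(0)
d = 8
W_q, W_k, W_v = (rng.standard_normal((d, d)) for _ in range(3))
cache = KVCache(d)
for token_embedding in rng.standard_normal((5, d)):
    out = decode_step(token_embedding, W_q, W_k, W_v, cache)
```

Without the cache, step t would recompute key/value projections for all t previous tokens; with it, each step projects only one token, trading memory for compute.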