Multimodal Transformer

In One Line

A model design that processes different data types such as text, images, audio, or video within a shared attention architecture.

A multimodal Transformer aims to learn relationships across different modalities inside a shared representation space. By combining contextual signals from multiple data types, it enables richer reasoning and generation. It plays a central role in multimodal agent systems and the broader vision of unified foundation models.
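As a rough illustration of the "shared representation space" idea, the sketch below projects text tokens and image patches into one common dimension, concatenates them into a single sequence, and runs one self-attention pass over the mix. Everything here is an assumption made for illustration: the feature sizes, the random weights, the single attention head, and the helper names (`linear`, `shared_self_attention`) are hypothetical and not taken from any specific model. Real systems add positional and modality embeddings, multiple heads and layers, and trained weights.

```python
import math
import random


def linear(x, w):
    """Project vector x (length d_in) with weight matrix w (d_in x d_out)."""
    return [sum(xi * w[i][j] for i, xi in enumerate(x)) for j in range(len(w[0]))]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rng, rows, cols):
    """Small random weights; a stand-in for trained parameters."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def shared_self_attention(tokens, wq, wk, wv):
    """Single-head scaled dot-product self-attention over one mixed sequence."""
    q = [linear(t, wq) for t in tokens]
    k = [linear(t, wk) for t in tokens]
    v = [linear(t, wv) for t in tokens]
    scale = math.sqrt(len(q[0]))
    outputs, weights = [], []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / scale for kj in k]
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(w[j] * v[j][c] for j in range(len(v))) for c in range(len(v[0]))]
        )
    return outputs, weights


rng = random.Random(0)
D_TEXT, D_IMAGE, D_MODEL = 6, 10, 8  # hypothetical feature sizes

# Toy inputs: 3 text-token features and 4 image-patch features.
text_feats = [[rng.gauss(0, 1) for _ in range(D_TEXT)] for _ in range(3)]
image_feats = [[rng.gauss(0, 1) for _ in range(D_IMAGE)] for _ in range(4)]

# Modality-specific projections map both inputs into the same D_MODEL space.
w_text = random_matrix(rng, D_TEXT, D_MODEL)
w_image = random_matrix(rng, D_IMAGE, D_MODEL)
tokens = [linear(t, w_text) for t in text_feats] + [
    linear(p, w_image) for p in image_feats
]

# One attention pass over the concatenated sequence: every token can attend
# to tokens of either modality.
wq, wk, wv = (random_matrix(rng, D_MODEL, D_MODEL) for _ in range(3))
outputs, weights = shared_self_attention(tokens, wq, wk, wv)

# Share of the first text token's attention that lands on image patches.
cross_modal_mass = sum(weights[0][3:])
print(len(outputs), len(outputs[0]), round(cross_modal_mass, 3))
```

Because the text and image tokens sit in one sequence, each row of `weights` spreads attention across both modalities, which is the mechanism by which cross-modal relationships can be learned once the weights are trained.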

You Might Also Like

Explore these concepts to continue your artificial intelligence journey.

Abstention

The ability of a model to avoid fabricating certainty and instead decline or express uncertainty when it is not confident.

Adapters

A parameter-efficient approach that inserts small modules into the base model to enable task adaptation.

Additive Attention

An early attention approach that compares query and context representations through a learnable combination function.