# Mixture-of-Experts Transformer

> Source: https://sukruyusufkaya.com/en/glossary/mixture-of-experts-transformer
> Updated: 2026-05-13T21:09:23.824Z
> Type: glossary
> Category: derin-ogrenme (deep learning)

**TLDR:** A Transformer approach that improves scaling efficiency by activating selected expert subnetworks rather than the full model on every input.

Mixture-of-Experts Transformer architectures increase model capacity without requiring all parameters to be active for every input. A learned routing mechanism decides which expert subnetworks should process each incoming token, so the compute spent per token stays roughly constant even as the total parameter count grows. This creates a new balance between computational efficiency and model scale, and in large-scale systems it embodies the idea of efficient specialization: different experts can learn to handle different kinds of inputs.
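
The sketch below illustrates the idea with a minimal top-k routed MoE feed-forward layer in PyTorch. It is not a specific published implementation; the class and parameter names (`MoEFeedForward`, `num_experts`, `top_k`) are illustrative assumptions, and production systems add details such as load-balancing losses and capacity limits that are omitted here.

```python
# Minimal sketch of a Mixture-of-Experts feed-forward layer with top-k routing.
# Assumes PyTorch; all names here are illustrative, not from the source text.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MoEFeedForward(nn.Module):
    def __init__(self, d_model: int, d_hidden: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # Router: produces one score per expert for each token.
        self.router = nn.Linear(d_model, num_experts)
        # Each expert is an independent position-wise feed-forward network.
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(d_model, d_hidden),
                nn.GELU(),
                nn.Linear(d_hidden, d_model),
            )
            for _ in range(num_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) -> flatten to (tokens, d_model)
        tokens = x.reshape(-1, x.shape[-1])
        logits = self.router(tokens)                        # (tokens, num_experts)
        weights, indices = logits.topk(self.top_k, dim=-1)  # keep only the top-k experts
        weights = F.softmax(weights, dim=-1)                # renormalize over selected experts

        out = torch.zeros_like(tokens)
        for slot in range(self.top_k):
            expert_ids = indices[:, slot]
            gate = weights[:, slot].unsqueeze(-1)
            for e, expert in enumerate(self.experts):
                mask = expert_ids == e
                if mask.any():
                    # Only tokens routed to expert e are processed by it.
                    out[mask] += gate[mask] * expert(tokens[mask])
        return out.reshape(x.shape)


# Usage: this layer replaces the dense feed-forward block inside a Transformer layer.
layer = MoEFeedForward(d_model=64, d_hidden=256, num_experts=4, top_k=2)
y = layer(torch.randn(2, 10, 64))  # each token activates only 2 of the 4 experts
```

The key property the sketch demonstrates is sparse activation: each token runs through only `top_k` of the `num_experts` feed-forward networks, so total parameters can grow with the number of experts while per-token compute stays close to that of a single dense feed-forward block.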