# Tensor Parallelism

> Source: https://sukruyusufkaya.com/en/glossary/tensor-parallelism
> Updated: 2026-05-13T19:59:50.407Z
> Type: glossary
> Category: uretken-yapay-zeka-ve-llm

**TLDR:** A technique that scales training and inference by splitting large model computations across devices within each layer.

Tensor parallelism is one of the fundamental distributed-computing techniques for running models that do not fit into the memory of a single device. The large matrix multiplications inside each layer are sharded across multiple GPUs, with each device computing a partial result that is then combined through collective communication such as an all-reduce. This not only makes it possible to run the model at all, but also shapes performance design in large-scale inference serving.
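The idea of splitting a layer's matrix math across devices can be sketched on a single CPU. The example below is a minimal simulation, not a real multi-GPU implementation: it shards an MLP block's first weight matrix by columns and its second by rows across two simulated "devices", so each device computes an independent partial output and a single sum (standing in for an all-reduce) recovers the full result. All shapes and names are illustrative assumptions.

```python
import numpy as np

# Simulated tensor parallelism for one MLP block (sketch, CPU-only).
rng = np.random.default_rng(0)
d_model, d_ff, n_dev = 8, 16, 2

x = rng.standard_normal((4, d_model))      # batch of activations
W1 = rng.standard_normal((d_model, d_ff))  # first linear layer
W2 = rng.standard_normal((d_ff, d_model))  # second linear layer

# Reference: the full, unsharded computation.
ref = np.maximum(x @ W1, 0) @ W2

# Column-parallel split of W1: each "device" holds d_ff / n_dev output columns.
W1_shards = np.split(W1, n_dev, axis=1)
# Row-parallel split of W2: each device holds the matching input rows.
W2_shards = np.split(W2, n_dev, axis=0)

# Each device computes its partial result independently; no communication is
# needed between the two matmuls because the ReLU is elementwise.
partials = [np.maximum(x @ w1, 0) @ w2 for w1, w2 in zip(W1_shards, W2_shards)]

# One all-reduce (here, a plain sum) combines the partial outputs.
out = np.sum(partials, axis=0)
assert np.allclose(out, ref)
```

Pairing a column-parallel layer with a row-parallel one is what keeps communication down to a single collective per block, which is why this layout is the standard choice for transformer MLP and attention layers.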