# Vision Transformer Features

> Source: https://sukruyusufkaya.com/en/glossary/vision-transformer-features
> Updated: 2026-05-23T14:16:42.670Z
> Type: glossary
> Category: bilgisayarli-goru

**TLDR:** A modern visual feature structure that splits images into patch tokens and learns representations through global attention.

Vision Transformer (ViT) features are one of the strongest examples of representation learning outside convolutional neural networks (CNNs). The image is split into fixed-size patches, and each patch is then processed as a token, much like a word in a text sequence. Because self-attention lets every patch token attend to every other one, this approach is especially strong at learning global contextual relations. In recent years, ViT features have become a powerful and increasingly standard representation family for classification, segmentation, and multimodal systems.
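The patch-and-attend idea can be illustrated with a minimal NumPy sketch. It is not a full Vision Transformer: the random matrices stand in for learned weights, a single attention head is used, and position embeddings and the class token are left out. The names `patchify`, `self_attention`, `w_embed`, and the sizes chosen (a 32×32 image, 8×8 patches, 64-dimensional tokens) are assumptions made for illustration only.

```python
import numpy as np


def patchify(image, patch_size):
    """Split an (H, W, C) image into flattened, non-overlapping patches."""
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    gh, gw = h // patch_size, w // patch_size
    # (gh, p, gw, p, c) -> (gh, gw, p, p, c) -> (gh * gw, p * p * c)
    patches = image.reshape(gh, patch_size, gw, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(gh * gw, patch_size * patch_size * c)


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product attention over all tokens."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax per row
    return weights @ v, weights


rng = np.random.default_rng(0)

# Hypothetical input: one 32x32 RGB image with values in [0, 1).
image = rng.random((32, 32, 3))

# 1) Fixed-size patches: a 4x4 grid of 8x8x3 patches, each flattened to 192 values.
patches = patchify(image, patch_size=8)

# 2) Linear projection turns each patch into a token (random stand-in weights).
embed_dim = 64
w_embed = rng.normal(scale=0.02, size=(patches.shape[1], embed_dim))
tokens = patches @ w_embed

# 3) Global attention: every patch token attends to every other patch token.
w_q, w_k, w_v = (rng.normal(scale=0.1, size=(embed_dim, embed_dim)) for _ in range(3))
features, weights = self_attention(tokens, w_q, w_k, w_v)

print(patches.shape, features.shape, weights.shape)
```

Each row of `weights` spans all 16 patches, which is what lets a single layer relate distant regions of the image. A CNN layer, by contrast, only sees a local neighborhood and needs depth to build up the same reach.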