Embedding Models: OpenAI · Voyage · Cohere · BGE

Author: Şükrü Yusuf KAYA

Choosing an embedding model: the trade-offs between quality, language coverage, vector dimension, and price, with the five best-scoring models on the MTEB benchmark.


8 min read

5/11/2026

Intermediate

Embedding Models (2026)

Cloud (API)

OpenAI

text-embedding-3-large (3072 dim)
text-embedding-3-small (1536 dim, lower cost)
Multilingual ✓

Voyage

voyage-3 (1024 dim)
Usually the highest MTEB score among the models compared here
voyage-3-lite is the cheaper alternative

Cohere

embed-multilingual-v3 (1024 dim)
100+ languages, performs well on Turkish

Open Source

BGE (BAAI)

bge-m3 (1024 dim)
Multilingual, free to use
Can be self-hosted
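To make the self-hosting option concrete, here is a minimal sketch of embedding text locally. It assumes the `sentence-transformers` package is installed and that the weights are pulled from the Hugging Face repository `BAAI/bge-m3`; the helper names `embed_local` and `similarity_matrix` are illustrative, not part of any library.

```python
import numpy as np


def embed_local(texts, model_name="BAAI/bge-m3"):
    """Embed texts on your own hardware (assumed setup, see lead-in).

    The model weights are downloaded on the first call, so the import
    is kept inside the function to keep the module cheap to load.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    # normalize_embeddings=True returns unit-length vectors.
    vectors = model.encode(texts, normalize_embeddings=True)
    return np.asarray(vectors, dtype=np.float32)


def similarity_matrix(unit_vectors):
    """Pairwise cosine similarity; for unit vectors this is a dot product."""
    unit_vectors = np.asarray(unit_vectors)
    return unit_vectors @ unit_vectors.T


# Usage (downloads the model on first run):
#   vecs = embed_local(["İstanbul'da hava bugün yağmurlu",
#                       "Ankara'da yarın güneşli olacak"])
#   print(vecs.shape)
#   print(similarity_matrix(vecs).round(3))
```

Because the vectors come back normalized, no per-pair norm computation is needed afterwards, which keeps similarity search a plain matrix product.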

E5 (Microsoft)

multilingual-e5-large
Open weights

Mistral Embed

mistral-embed (1024 dim)
Available through an API (not open weights), Europe-focused
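The dimension figures above translate directly into index size. The sketch below estimates raw vector storage for the models listed so far, assuming float32 values (4 bytes each) and ignoring index overhead and metadata; the function name `index_size_gib` and the one-million-vector corpus are illustrative assumptions.

```python
# Dimensions as listed in this article.
DIMS = {
    "text-embedding-3-large": 3072,
    "text-embedding-3-small": 1536,
    "voyage-3": 1024,
    "embed-multilingual-v3": 1024,
    "bge-m3": 1024,
    "mistral-embed": 1024,
}


def index_size_gib(num_vectors, dim, bytes_per_value=4):
    """Raw storage in GiB for num_vectors vectors of the given dimension."""
    return num_vectors * dim * bytes_per_value / 2**30


if __name__ == "__main__":
    corpus = 1_000_000  # hypothetical number of chunks
    for name, dim in DIMS.items():
        print(f"{name:<24} {dim:>5} dim  {index_size_gib(corpus, dim):6.2f} GiB")
```

At one million chunks, a 3072-dim model needs roughly three times the storage of a 1024-dim one, which is the size side of the quality/size/price trade-off.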

Turkish Performance

text

MTEB Turkish subset (approximate scores):
- Voyage 3:                  72.4
- Cohere multilingual v3:    71.1
- OpenAI text-embedding-3-l: 70.8
- BGE-M3:                    69.5
- multilingual-e5-large:     68.2
 
My general recommendation: Voyage 3 (cloud) or BGE-M3 (self-hosted).

Turkish MTEB results
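The recommendation follows from picking the top scorer within each deployment style. A small sketch of that selection, using the approximate scores above and taking the cloud/self-host labels from the section each model is listed under (the `best` helper is illustrative):

```python
# (model, deployment, approximate score) as reported in this article.
SCORES = [
    ("Voyage 3", "cloud", 72.4),
    ("Cohere multilingual v3", "cloud", 71.1),
    ("OpenAI text-embedding-3-l", "cloud", 70.8),
    ("BGE-M3", "self-host", 69.5),
    ("multilingual-e5-large", "self-host", 68.2),
]


def best(deployment):
    """Return the highest-scoring (model, deployment, score) row for a deployment style."""
    rows = [row for row in SCORES if row[1] == deployment]
    return max(rows, key=lambda row: row[2])


if __name__ == "__main__":
    cloud, local = best("cloud"), best("self-host")
    print(f"cloud:     {cloud[0]} ({cloud[2]})")
    print(f"self-host: {local[0]} ({local[2]})")
    print(f"gap:       {cloud[2] - local[2]:.1f} points")
```

The gap between the two picks is a few points on approximate scores, so hosting constraints and cost can reasonably outweigh it.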

python

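# Comparing embeddings: a semantic similarity check
import os

import numpy as np
from openai import OpenAI

# The API key is read from the OPENAI_API_KEY environment variable.
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# Two weather sentences and one unrelated sentence, in Turkish.
texts = [
    "İstanbul'da hava bugün yağmurlu",      # "It is rainy in Istanbul today"
    "Ankara'da yarın güneşli olacak",       # "It will be sunny in Ankara tomorrow"
    "Python programlama dili çok popüler",  # "The Python programming language is very popular"
]

r = client.embeddings.create(
    model="text-embedding-3-small",
    input=texts,
)

# One vector per input text, in input order.
embeddings = [np.array(d.embedding) for d in r.data]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(f"Istanbul-Ankara weather: {cosine(embeddings[0], embeddings[1]):.3f}")
print(f"Istanbul-Python:         {cosine(embeddings[0], embeddings[2]):.3f}")

# Expected: the two weather sentences score high; weather vs. Python scores low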

Semantic similarity test with embeddings.
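The same cosine measure extends from comparing two sentences to ranking a whole set of documents against a query, which is the retrieval step in a RAG pipeline. Below is a minimal offline sketch; the `top_k` helper and the toy 2-dimensional vectors are illustrative stand-ins for real embeddings, and all vectors are assumed to be non-zero.

```python
import numpy as np


def top_k(query, docs, k=2):
    """Return (index, cosine similarity) for the k documents closest to the query."""
    docs = np.asarray(docs, dtype=float)
    query = np.asarray(query, dtype=float)
    # Cosine similarity of every document row against the query in one step.
    sims = docs @ query / (np.linalg.norm(docs, axis=1) * np.linalg.norm(query))
    order = np.argsort(-sims)[:k]  # highest similarity first
    return [(int(i), float(sims[i])) for i in order]


if __name__ == "__main__":
    # Toy vectors: the first two point in nearly the same direction as the query.
    doc_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
    for idx, score in top_k([1.0, 0.0], doc_vectors, k=2):
        print(f"doc {idx}: {score:.3f}")
```

With real embeddings, `doc_vectors` would be the rows returned by the embeddings API for the document chunks and `query` the vector for the user's question.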
