Yi-1.5 / InternLM2.5 / Aya Expanse: Underdog Comparative TR-MMLU

Llama, Qwen, and Gemma are popular, but they are not the only options. Which of Yi-1.5 (01.AI), InternLM2.5 (Shanghai AI Lab), and Aya Expanse (Cohere) shines in Turkish? A same-recipe comparison on an RTX 4090.

Şükrü Yusuf KAYA

28 min read

5/14/2026

Advanced


1. 4-Model Comparison Table

Model	Vocab	Pre-training data	TR-MMLU (base)	License
Yi-1.5 6B/9B/34B	64,000	3.6T (CN+EN heavy)	25.4 / 28.7 / 38.2	Apache 2.0
InternLM2.5 7B/20B	92,544	2T multilingual	30.1 / 35.6	Apache 2.0
Aya Expanse 8B/32B	256,000	200K hours synthetic (101 lang)	42.3 / 47.1	CC-BY-NC (research)
Llama 3.1 8B (ref)	128,256	15T multilingual	32.4	Llama license
Qwen 2.5 7B (ref)	151,936	18T multilingual	38.1	Apache 2.0

Aya Expanse 8B scores 42.3 on TR-MMLU (!), ahead of the popular reference models (Llama 3.1 8B at 32.4, Qwen 2.5 7B at 38.1). But:

License: CC-BY-NC, so commercial use is prohibited
Cohere Research License
Cannot be used in production, research only

Decision matrix:

Commercial + TR → Qwen 2.5 7B (38.1)
Research + TR → Aya Expanse 8B (42.3)
Math/Code → Phi-4 (English) or Qwen 2.5 Coder
Edge → SmolLM3 1.7B
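
The first two rows of the decision matrix can be reproduced from the table with a small filter. The following is a minimal sketch, not part of the original recipe: `ModelRow`, `pick_model`, and the `commercial_ok` flag are hypothetical names, the flag is a simplified reading of the license column, and the 9B size cap is an assumption chosen so that only models of comparable size compete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRow:
    name: str
    params_b: float      # parameter count in billions
    tr_mmlu: float       # base-model TR-MMLU score copied from the table
    commercial_ok: bool  # assumption: simplified reading of the license column


# Rows for the 6B-9B variants listed in the comparison table.
ROWS = [
    ModelRow("Yi-1.5 6B", 6, 25.4, True),
    ModelRow("Yi-1.5 9B", 9, 28.7, True),
    ModelRow("InternLM2.5 7B", 7, 30.1, True),
    ModelRow("Aya Expanse 8B", 8, 42.3, False),  # CC-BY-NC: research only
    ModelRow("Llama 3.1 8B", 8, 32.4, True),     # Llama license treated as commercial-capable
    ModelRow("Qwen 2.5 7B", 7, 38.1, True),
]


def pick_model(rows, commercial: bool, max_params_b: float = 9.0) -> ModelRow:
    """Return the highest-scoring model that fits the license and size limits."""
    candidates = [
        r for r in rows
        if r.params_b <= max_params_b and (r.commercial_ok or not commercial)
    ]
    if not candidates:
        raise ValueError("no model satisfies the constraints")
    return max(candidates, key=lambda r: r.tr_mmlu)


if __name__ == "__main__":
    print("Commercial + TR:", pick_model(ROWS, commercial=True).name)
    print("Research + TR:  ", pick_model(ROWS, commercial=False).name)
```

With the table's numbers, the commercial branch selects Qwen 2.5 7B and the research branch selects Aya Expanse 8B. License terms should still be read in full before deployment; a boolean flag cannot capture their conditions.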

2. Aya Expanse: Cohere's 101-Language Specialist

Aya Expanse 8B (Cohere, November 2024):

256K vocab (on par with Gemma)
101 languages in pre-training + SFT
Aya dataset family (Cohere Aya Initiative, community translations)
Turkish quality is specifically high (Turkish is 2.3% of the data, a large share: an even split across 101 languages would give each about 1%)

Recipe: Aya Expanse 8B + custom TR domain SFT, covered in detail in Part IX of the cookbook.

✅ Deliverable

1) Fine-tune the 4 models on the same 1,000-example TR Alpaca subset.
2) Measure TR-MMLU + MT-Bench-TR and produce a table.
3) Next lesson: 3.11, Comparative Lab: Same Recipe 10 Models.
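
A same-recipe comparison only holds if every model trains on exactly the same examples in the same format. The sketch below shows one way to pin the 1,000-example subset with a fixed seed and render Alpaca-style prompts. The function names, the seed value, and the prompt scaffolding are assumptions for illustration; the lesson does not specify them, and the dataset is assumed to be a list of dicts with `instruction`, `input`, and `output` keys.

```python
import random

# Assumption: the commonly used Alpaca layout; the lesson does not fix a template.
PROMPT_WITH_INPUT = (
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)
PROMPT_NO_INPUT = "### Instruction:\n{instruction}\n\n### Response:\n"


def select_subset(examples, k=1000, seed=42):
    """Pick the same k examples on every run so all models see identical data."""
    if len(examples) < k:
        raise ValueError(f"need at least {k} examples, got {len(examples)}")
    rng = random.Random(seed)  # local RNG, leaves the global random state alone
    indices = sorted(rng.sample(range(len(examples)), k))
    return [examples[i] for i in indices]


def format_example(ex):
    """Render one Alpaca-style record into a prompt/completion pair."""
    user_input = ex.get("input", "")
    if user_input:
        prompt = PROMPT_WITH_INPUT.format(
            instruction=ex["instruction"], input=user_input
        )
    else:
        prompt = PROMPT_NO_INPUT.format(instruction=ex["instruction"])
    return {"prompt": prompt, "completion": ex["output"]}


if __name__ == "__main__":
    # Toy stand-in for the TR Alpaca data; replace with the real dataset.
    toy = [
        {"instruction": f"Soru {i}", "input": "", "output": f"Cevap {i}"}
        for i in range(5000)
    ]
    subset = [format_example(ex) for ex in select_subset(toy)]
    print(len(subset), "examples; first prompt:")
    print(subset[0]["prompt"])
```

Saving the selected indices alongside the results table makes it possible to confirm later that all four runs used the same subset.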
