DeepSeek-Coder-V2 16B / 236B: MoE Code Model + Multi-File Context
DeepSeek-Coder-V2 (DeepSeek, 2024) is an MoE-architecture code model (16B / 236B) and one of the strongest open code LLMs under Apache 2.0. It supports 338 programming languages, a 128K context window, and multi-file repo understanding. The 16B Lite (2.4B active) can be QLoRA-tuned on an RTX 4090; the 236B variant is cloud-only.
Şükrü Yusuf KAYA
24 min read
1. DeepSeek-Coder-V2 Specs
| Model | Total Params | Active Params | Context | HumanEval | License |
|---|---|---|---|---|---|
| DeepSeek-Coder-V2-Lite 16B | 16B | 2.4B | 128K | 90.2% | Apache 2.0 |
| DeepSeek-Coder-V2 236B | 236B | 21B | 128K | 96.3% | Apache 2.0 |
| DeepSeek-Coder-V2-Lite-Instruct | 16B | 2.4B | 128K | 92.1% | Apache 2.0 |
The Lite 16B's advantage: only 2.4B active parameters, so you pay roughly 7B-class compute for 16B-class quality. QLoRA fits comfortably on an RTX 4090; a loading sketch follows below.
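A minimal loading sketch under these assumptions: Hugging Face transformers + bitsandbytes + peft are installed, the public deepseek-ai/DeepSeek-Coder-V2-Lite-Base checkpoint is used, and the LoRA hyperparameters (r, alpha, dropout) are illustrative defaults, not values from this lesson:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "deepseek-ai/DeepSeek-Coder-V2-Lite-Base"

# NF4 4-bit quantization: 16B weights shrink to roughly 8-9 GB, leaving
# headroom on a 24 GB RTX 4090 for LoRA adapters, optimizer state,
# and activations.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb,
    device_map="auto",
    trust_remote_code=True,  # DeepSeek-V2 architecture may need custom code
)
model = prepare_model_for_kbit_training(model)

# "all-linear" is architecture-agnostic; to adapt only the attention
# projections, list their exact names from model.named_modules() instead
# (under DeepSeek's MLA attention they differ from the usual
# q_proj/k_proj/v_proj naming).
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()
```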
✅ Deliverables
1) Load DeepSeek-Coder-V2-Lite 16B on an RTX 4090 (see the QLoRA sketch above). 2) Run the HumanEval benchmark (see the sketch below). 3) Next lesson: 8.4, StarCoder 2 + CodeLlama.
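A sketch of the HumanEval step, assuming OpenAI's human-eval package (pip install human-eval) and the quantized model and tokenizer from the sketch above; the generation settings are illustrative:

```python
from human_eval.data import read_problems, write_jsonl

problems = read_problems()  # 164 HumanEval tasks

def complete(prompt: str) -> str:
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=256, do_sample=False)
    # Keep only the newly generated tokens, not the echoed prompt.
    new_tokens = out[0][inputs["input_ids"].shape[1]:]
    return tok.decode(new_tokens, skip_special_tokens=True)

samples = [
    {"task_id": tid, "completion": complete(problems[tid]["prompt"])}
    for tid in problems
]
write_jsonl("samples.jsonl", samples)
# Score pass@1 from the shell:
#   evaluate_functional_correctness samples.jsonl
```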