# DeepSeek-V3 / R1 (671B, 37B Active): Shared Expert + Fine-Grained Routing — Where to LoRA?

> Source: https://sukruyusufkaya.com/en/learn/fine-tuning-cookbook/ftc-deepseek-v3-r1-moe-shared-expert
> Updated: 2026-05-14T14:42:53.384Z
> Category: Fine-Tuning Cookbook (Model-by-Model)
> Module: Part V — MoE Internals & Fine-Tuning

**TLDR:** DeepSeek-V3 (671B parameters, 37B active per token) is the best open example of a modern MoE: one shared expert that carries common knowledge for every token, plus 256 fine-grained routed experts. DeepSeek-R1 uses the same architecture, trained further with RL for reasoning. Fine-tuning it is impossible on an RTX 4090; this cookbook's cloud recipe uses 16×H100 with NDR InfiniBand, ZeRO-Infinity offloading, and expert parallelism.
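
Below is a minimal PyTorch sketch of the shared-expert-plus-routed-experts layout named in the TLDR: one always-active shared expert, plus many small routed experts from which a top-k subset is selected per token. The dimensions, expert counts, and the plain softmax router are illustrative assumptions, not DeepSeek's implementation; V3 itself uses 256 routed experts with top-8 routing and sigmoid affinity scores with a bias-based load-balancing term.

```python
# Sketch of a shared-expert + fine-grained routed-expert MoE layer.
# Sizes and the softmax router are illustrative, not DeepSeek-V3's actual values.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ExpertFFN(nn.Module):
    """SwiGLU feed-forward block, used for both the shared and the routed experts."""

    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.gate_proj = nn.Linear(d_model, d_ff, bias=False)
        self.up_proj = nn.Linear(d_model, d_ff, bias=False)
        self.down_proj = nn.Linear(d_ff, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down_proj(F.silu(self.gate_proj(x)) * self.up_proj(x))


class SharedPlusRoutedMoE(nn.Module):
    """Shared expert sees every token; routed experts specialize via top-k routing."""

    def __init__(self, d_model: int = 512, d_expert: int = 128,
                 n_routed: int = 16, top_k: int = 2):
        super().__init__()
        self.shared_expert = ExpertFFN(d_model, d_expert)        # always active
        self.experts = nn.ModuleList(
            [ExpertFFN(d_model, d_expert) for _ in range(n_routed)]
        )
        self.router = nn.Linear(d_model, n_routed, bias=False)   # token -> expert affinity
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (n_tokens, d_model); a real model works on (batch, seq, d_model)
        scores = F.softmax(self.router(x), dim=-1)                # (n_tokens, n_routed)
        weights, idx = scores.topk(self.top_k, dim=-1)            # top-k experts per token
        weights = weights / weights.sum(dim=-1, keepdim=True)     # renormalize gate weights

        routed = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                             # tokens sent to expert e
                if mask.any():
                    routed[mask] += weights[mask, k:k + 1] * expert(x[mask])
        # dense shared path (every token) + sparse routed path (top-k per token)
        return self.shared_expert(x) + routed


if __name__ == "__main__":
    layer = SharedPlusRoutedMoE()
    tokens = torch.randn(8, 512)
    print(layer(tokens).shape)  # torch.Size([8, 512])
```

At DeepSeek-V3 scale the Python loop over experts is replaced by grouped GEMMs, and the experts themselves are sharded across GPUs, which is why the cookbook's 16×H100 recipe pairs ZeRO-Infinity with expert parallelism.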


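The title's question, where to attach LoRA adapters, is easiest to see in configuration form. The sketch below shows one common choice for a shared-expert MoE (not necessarily the recipe this entry settles on): adapt the attention projections and the always-active shared expert, and leave the 256 routed experts and the router frozen. The module names are assumptions based on the Hugging Face DeepSeek-V3 modeling code; verify them against `model.named_modules()` for the checkpoint you load, and note that loading the full 671B model itself requires the multi-node setup described in this entry.

```python
# Hedged sketch: LoRA on attention + shared expert only, routed experts frozen.
# Module names are assumed from the Hugging Face DeepSeek-V3 implementation.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-V3",        # DeepSeek-R1 shares the same architecture
    trust_remote_code=True,
    torch_dtype="auto",
)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    target_modules=[
        # MLA attention projections (assumed module names)
        "q_a_proj", "q_b_proj",
        "kv_a_proj_with_mqa", "kv_b_proj",
        "o_proj",
        # Shared expert only: the multi-part suffix keeps the routed experts
        # (mlp.experts.<i>.*) out of the adapter.
        "shared_experts.gate_proj",
        "shared_experts.up_proj",
        "shared_experts.down_proj",
    ],
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # sanity check: a tiny fraction of 671B
```

Targeting the shared expert rather than the routed experts keeps the trainable-parameter count small and ensures every token passes through an adapted FFN path, since the shared expert is active for all tokens by construction.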