# Long-CoT Stability: Repetition Collapse + Think-Loop Mitigation

> Source: https://sukruyusufkaya.com/en/learn/fine-tuning-cookbook/ftc-long-cot-stability-repetition-loop
> Updated: 2026-06-26T01:36:29.185Z
> Category: Fine-Tuning Cookbook (Model-by-Model)
> Module: Part XII — Reasoning Model FT (R1-style)

**TLDR:** The most common failure mode in reasoning models is the **think-loop**: the model keeps re-deriving the same thought without making progress. It shows up as repetition collapse and length explosion (thinking length growing from 8K to 30K tokens). Mitigations: an entropy bonus, a repetition penalty during training, `max_think_tokens` enforcement, reward shaping (a length penalty), and early-stopping heuristics.
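
Three of the mitigations listed above (`max_think_tokens` enforcement, an early-stopping heuristic, and a length penalty in the reward) can be illustrated with a minimal, framework-free sketch. Everything here is an assumption for illustration rather than the cookbook's reference implementation: the function names (`repeated_ngram_fraction`, `should_stop_thinking`, `shaped_reward`), the n-gram repetition measure, and all default thresholds are hypothetical and would need tuning per model and tokenizer.

```python
from collections import Counter


def repeated_ngram_fraction(tokens, n=4):
    """Fraction of n-grams in `tokens` that repeat an n-gram seen elsewhere in it.

    0.0 means every n-gram is unique; values near 1.0 suggest a loop.
    """
    total = len(tokens) - n + 1
    if total <= 0:
        return 0.0
    counts = Counter(tuple(tokens[i:i + n]) for i in range(total))
    repeats = sum(c - 1 for c in counts.values())
    return repeats / total


def should_stop_thinking(tokens, max_think_tokens=8192, window=256, n=4,
                         threshold=0.5):
    """Hypothetical early-stopping check for the thinking segment.

    Stops when the hard token budget is reached, or when the most recent
    `window` tokens are dominated by repeated n-grams.
    """
    if len(tokens) >= max_think_tokens:
        return True  # hard cap: max_think_tokens enforcement
    if len(tokens) < window:
        return False  # too little context to judge repetition
    return repeated_ngram_fraction(tokens[-window:], n) >= threshold


def shaped_reward(task_reward, think_len, target_len=8192,
                  penalty_per_token=1e-4):
    """Assumed linear length penalty: subtract a cost for tokens past the target."""
    overflow = max(0, think_len - target_len)
    return task_reward - penalty_per_token * overflow


if __name__ == "__main__":
    looping = [1, 2, 3, 4] * 50       # a 4-token cycle repeated 50 times
    healthy = list(range(200))        # no repeated n-grams at all
    print(should_stop_thinking(looping, window=64))
    print(should_stop_thinking(healthy, window=64))
    print(shaped_reward(1.0, think_len=18192))
```

In a decoding loop, `should_stop_thinking` would be called on the token IDs generated so far inside the thinking segment; during RL training, `shaped_reward` would wrap the task reward before the policy update. A linear penalty past a target length is only one possible shaping choice.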