# Reward Hacking Diagnostics: Gaming Detection, Length Bias, Sycophancy Probe

> Source: https://sukruyusufkaya.com/en/learn/fine-tuning-cookbook/ftc-reward-hacking-diagnostics
> Updated: 2026-05-14T14:42:58.701Z
> Category: Fine-Tuning Cookbook (Model-by-Model)
> Module: Part XI — Alignment & Preference Optimization

**TLDR:** Models 'hack' reward functions: they earn high reward through an unintended path instead of the behavior the reward was meant to encourage. Common forms are length bias (longer answers score higher), sycophancy (excessive agreement with the user), format gaming, and repetition. Detection relies on ablation, holdout probes, and qualitative review. Includes lessons from Anthropic's 'reward over-optimization' report.
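
As a first, rough check for the length bias mentioned above, one option is to measure how strongly reward tracks response length on a sample of scored outputs. The sketch below is a minimal illustration under stated assumptions: the function names `pearson` and `length_bias_score`, the whitespace tokenization, and the toy data are all hypothetical and do not come from any particular library or from the original article.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # No variation in one variable: no measurable linear relationship.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def length_bias_score(responses, rewards):
    """Correlation between response length and reward.

    Length is counted in whitespace-separated tokens (an assumption;
    a real pipeline would more likely use the model's tokenizer).
    """
    lengths = [len(r.split()) for r in responses]
    return pearson(lengths, rewards)


# Toy data, invented purely for illustration.
responses = [
    "Yes.",
    "Yes, that is right.",
    "Yes, that is right, and here is a much longer restatement of it.",
]
rewards = [0.2, 0.5, 0.9]

score = length_bias_score(responses, rewards)
print(f"length/reward correlation: {score:.2f}")
```

A correlation near 1 is only a signal worth investigating, not proof of reward hacking, since longer answers can be better on their merits. It is most informative when compared against the same statistic on a holdout set or on human quality ratings.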

