# DPO Implementation From Scratch: One-Page Code Without TRL Source

> Source: https://sukruyusufkaya.com/en/learn/fine-tuning-cookbook/ftc-dpo-implementation-from-scratch
> Updated: 2026-05-14T14:42:57.991Z
> Category: Fine-Tuning Cookbook (Model-by-Model)
> Module: Part XI — Alignment & Preference Optimization

**TLDR:** Instead of relying on TRL's `DPOTrainer`, implement the DPO loss yourself: compute per-sequence log-probabilities, handle the frozen reference model, apply the loss formula, and backpropagate gradients. It comes to roughly 80 lines of PyTorch, and writing it by hand shows you exactly where things can go wrong.
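As a warm-up before the full PyTorch version, here is a minimal framework-free sketch of the core DPO loss formula for a single preference pair. The function name and the example log-probability values are illustrative, not from the article; the inputs are assumed to be the summed token log-probabilities of each response under the policy and the frozen reference model:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair.

    Each argument is the sum of token log-probabilities of a full
    response under either the trainable policy or the frozen
    reference model. (Illustrative sketch, not the article's code.)
    """
    # Implicit reward margin: how much more the policy prefers the
    # chosen response over the rejected one, relative to the reference.
    logits = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(logits))
    return -math.log(1.0 / (1.0 + math.exp(-logits)))

# If the policy matches the reference exactly, the margin is 0 and
# the loss is -log(0.5) = log(2) ≈ 0.6931 -- a useful sanity check
# for step 0 of training, when policy and reference start identical.
print(round(dpo_loss(-10.0, -12.0, -10.0, -12.0), 4))  # → 0.6931
```

In the real PyTorch implementation the same arithmetic is done on batched tensors with `torch.nn.functional.logsigmoid` for numerical stability, and only the policy's log-probabilities carry gradients.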

