# Reinforcement Learning from Human Feedback

> Source: https://sukruyusufkaya.com/en/glossary/reinforcement-learning-from-human-feedback
> Updated: 2026-05-13T20:59:21.400Z
> Type: glossary
> Category: uretken-yapay-zeka-ve-llm
**TLDR:** An alignment approach that uses reward signals to make model outputs more consistent with human preferences.

RLHF helps large language models move beyond merely generating plausible text toward producing responses that are genuinely useful and acceptable. Human preferences are translated, directly or indirectly, into a reward signal, and the model is then optimized against that signal. RLHF played a major role in making modern LLM behavior more user-friendly.
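
As a rough illustration of how human preferences become a reward signal, the sketch below trains a toy reward model on (chosen, rejected) response pairs with a Bradley-Terry style pairwise loss. This is only the reward-modeling step of RLHF; a separate RL stage (e.g. PPO) would then fine-tune the LLM against the learned reward. All names (`RewardModel`, the random stand-in features, hyperparameters) are illustrative assumptions, not details from the source.

```python
# Minimal sketch of RLHF reward modeling, assuming a PyTorch setup.
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    """Toy scalar reward head over a fixed-size feature vector.

    In a real system, features would come from a pretrained LLM's hidden states.
    """
    def __init__(self, hidden_size: int = 16):
        super().__init__()
        self.score = nn.Linear(hidden_size, 1)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.score(features).squeeze(-1)  # one scalar reward per example

def preference_loss(reward_chosen: torch.Tensor, reward_rejected: torch.Tensor) -> torch.Tensor:
    # Bradley-Terry pairwise loss: push the preferred response's reward above
    # the rejected one's, turning human rankings into a trainable reward signal.
    return -torch.nn.functional.logsigmoid(reward_chosen - reward_rejected).mean()

if __name__ == "__main__":
    torch.manual_seed(0)
    model = RewardModel()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

    # Stand-in features for (chosen, rejected) response pairs labeled by annotators.
    chosen_feats = torch.randn(8, 16)
    rejected_feats = torch.randn(8, 16)

    for step in range(100):
        loss = preference_loss(model(chosen_feats), model(rejected_feats))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print(f"final preference loss: {loss.item():.4f}")
```

After a reward model like this is trained, the policy (the LLM itself) is typically updated so that its responses score higher under the reward model, while a KL penalty keeps it close to the original model.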