Preference Optimization
An alignment approach that makes model output more useful by optimizing against human or system preference signals.
Preference optimization aims for a response that is not only correct but also more useful, safer, and more appropriately presented. The preference signal can come from human-labeled preference pairs, from a learned reward model, or from direct preference optimization (DPO) methods that train on preference pairs without a separate reward model. It is one of the central concepts in modern LLM alignment, and it matters most for user experience and for generating safe behavior.
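As an illustration of the direct approach, the following is a minimal sketch of the DPO loss for a single preference pair, written in plain Python. It assumes the summed log-probabilities of the preferred ("chosen") and dispreferred ("rejected") responses have already been computed under both the model being trained and a frozen reference model. The function names, argument names, and the default beta are illustrative assumptions, not any library's API.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; controls how far the policy may drift from the reference
) -> float:
    """DPO loss for one (chosen, rejected) pair of responses to the same prompt."""
    # How much more (or less) likely each response is under the policy than under the reference.
    chosen_shift = policy_chosen_logp - ref_chosen_logp
    rejected_shift = policy_rejected_logp - ref_rejected_logp
    # The loss is -log(sigmoid(margin)), which equals softplus(-margin).
    margin = beta * (chosen_shift - rejected_shift)
    return softplus(-margin)


# Hypothetical log-probabilities, for illustration only.
baseline = dpo_loss(-5.0, -7.0, -5.0, -7.0)   # policy identical to reference
improved = dpo_loss(-4.0, -8.0, -5.0, -7.0)   # policy favors the chosen response more
print(baseline, improved)
```

When the policy matches the reference, the margin is zero and the loss equals log 2. The loss falls as the policy raises the chosen response's probability relative to the rejected one, measured against the reference. In practice the loss would be averaged over a batch and minimized with a gradient-based optimizer.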
