Back to full roadmap
topiccore
Hallucination Defense
Force citation, ground, verify — three-layer defense + post-hoc fact-checking.
3 hours3 resources2 prereqs
Layer 1 — Prompt: "Stay only within <sources>, say 'I don't know' if unsure, cite a source ID for each claim" Layer 2 — Retrieval grounding: tight RAG context, shrink the model's 'guess' space Layer 3 — Post-hoc verification: after final answer, second LLM call checks "Is this consistent with <sources>?"
Citation mode — Anthropic Claude natively supports citations field: each claim names its source.
Hallucination detection libraries: TruLens, Patronus AI, RAGAS.