Topic · Advanced

Reflexion / Self-Critique Loop

Agent critiques its own output → writes lessons to memory → doesn't repeat the same mistake.

3 hours · 2 resources · 1 prerequisite

Reflexion (Shinn et al., 2023): ReAct + episodic self-reflection.

3 layers:

Actor — attempt the task with ReAct
Evaluator — score the result (LLM-as-judge or rule-based)
Self-Reflection — on failure, ask "why did I fail?" and extract a lesson
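The three layers above can be sketched as a small loop. This is a minimal illustration, not the paper's implementation: the names `reflexion_loop`, `toy_actor`, `toy_evaluator`, and `toy_reflect` are hypothetical, and the toy functions stand in for what would normally be LLM calls.

```python
from typing import Callable, List, Tuple


def reflexion_loop(
    task: str,
    actor: Callable[[str, List[str]], str],
    evaluator: Callable[[str, str], bool],
    reflect: Callable[[str, str], str],
    max_trials: int = 3,
) -> Tuple[str, List[str]]:
    """Run actor -> evaluator -> self-reflection until success or max_trials."""
    lessons: List[str] = []  # episodic memory of verbal lessons
    attempt = ""
    for _ in range(max_trials):
        attempt = actor(task, lessons)        # Actor: try the task, seeing past lessons
        if evaluator(task, attempt):          # Evaluator: pass/fail score
            break
        lessons.append(reflect(task, attempt))  # Self-Reflection: extract a lesson
    return attempt, lessons


# Toy stand-ins (assumptions for illustration only; real ones would call a model).
def toy_actor(task: str, lessons: List[str]) -> str:
    # Guesses wrong unless a stored lesson tells it to add.
    return "4" if any("add" in lesson for lesson in lessons) else "22"


def toy_evaluator(task: str, attempt: str) -> bool:
    # Rule-based check against a known answer.
    return attempt == "4"


def toy_reflect(task: str, attempt: str) -> str:
    return f"I answered {attempt!r} by concatenating; next time add the numbers."


answer, lessons = reflexion_loop("What is 2 + 2?", toy_actor, toy_evaluator, toy_reflect)
print(answer, len(lessons))
```

In this toy run the first attempt fails, one lesson is stored, and the second attempt succeeds. The `max_trials` cap is a design choice of the sketch to keep the loop from running indefinitely.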

The lesson is written to memory, and the next attempt starts with the past failures as a prompt prefix. The model can then reason: "Last time I did X and it didn't work, so this time I'll try Y."
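One way to picture the "past failures as prefix" step is a helper that prepends stored lessons to the task prompt. This is a sketch under assumptions: the function name `build_prompt`, the prompt wording, and the `max_lessons` cap (keeping only the most recent lessons so the prefix stays short) are all illustrative choices, not taken from the reference implementation.

```python
from typing import List


def build_prompt(task: str, lessons: List[str], max_lessons: int = 3) -> str:
    """Prefix the task with the most recent lessons from failed attempts."""
    # Keep only the newest lessons; guard against max_lessons == 0,
    # because lessons[-0:] would return the whole list.
    recent = lessons[-max_lessons:] if max_lessons > 0 else []
    lines: List[str] = []
    if recent:
        lines.append("Lessons from earlier failed attempts:")
        lines.extend(f"- {lesson}" for lesson in recent)
        lines.append("")  # blank line between the prefix and the task
    lines.append(f"Task: {task}")
    return "\n".join(lines)


prompt = build_prompt(
    "Find the capital of Australia.",
    ["I guessed Sydney without searching; verify with a lookup first."],
)
print(prompt)
```

With no lessons stored, the helper returns just the task line, so the first attempt sees an unmodified prompt.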

The paper reports roughly a 20% improvement over the baseline on HotpotQA. The approach is especially strong on coding tasks.

Prerequisites

ReAct Loop Deep Dive

Every line of the Thought → Action → Observation cycle. Ancestor of every modern agent.


Resources (2)

Paper (1)

Reflexion: Language Agents with Verbal Reinforcement Learning · en · free

GitHub (1)

Reflexion reference implementation · en · free

Related steps

Episodic Memory (Past Experiences)

Plan-and-Execute Pattern

Loop Termination Strategies