Back to full roadmap
topiccore
ReAct Loop Deep Dive
Every line of the Thought → Action → Observation cycle. Ancestor of every modern agent.
4 hours2 resources1 prereqs
ReAct (Yao et al., 2022) = Reasoning + Acting. Each turn the model:
Thought:— analyze state, what needs to happen?Action:— which tool, what parameters?Observation:— tool result (appended to model input)- Goal reached? No → back to 1.
In modern APIs: thought/action/observation aren't explicitly tagged — function calling is native. The mental model is identical: model invokes a tool, you execute, return result, model thinks again.
Implementation skeleton (Python):
while not done and i < MAX_ITER:
response = llm.create(messages, tools=tools)
if response.stop_reason == "end_turn":
done = True
elif response.stop_reason == "tool_use":
results = [execute(call) for call in response.tool_calls]
messages.append(assistant_msg)
messages.append(tool_results_msg)
i += 1
Best practices: force the model to write one thinking sentence BEFORE every tool call ("never call a tool without explaining first"). Improves trajectory readability and debugging.
What you'll gain
You can write a working ReAct agent from scratch in 50 lines of code.