Skip to content
Back to full roadmap
topiccore

ReAct Loop Deep Dive

Every line of the Thought → Action → Observation cycle. Ancestor of every modern agent.

4 hours2 resources1 prereqs

ReAct (Yao et al., 2022) = Reasoning + Acting. Each turn the model:

  1. Thought: — analyze state, what needs to happen?
  2. Action: — which tool, what parameters?
  3. Observation: — tool result (appended to model input)
  4. Goal reached? No → back to 1.

In modern APIs: thought/action/observation aren't explicitly tagged — function calling is native. The mental model is identical: model invokes a tool, you execute, return result, model thinks again.

Implementation skeleton (Python):

while not done and i < MAX_ITER:
    response = llm.create(messages, tools=tools)
    if response.stop_reason == "end_turn":
        done = True
    elif response.stop_reason == "tool_use":
        results = [execute(call) for call in response.tool_calls]
        messages.append(assistant_msg)
        messages.append(tool_results_msg)
    i += 1

Best practices: force the model to write one thinking sentence BEFORE every tool call ("never call a tool without explaining first"). Improves trajectory readability and debugging.

What you'll gain

You can write a working ReAct agent from scratch in 50 lines of code.

Prerequisites

Resources(2)