Patterns & reasoning

What is ReAct (Reasoning + Acting)?

Also called: ReAct, reason and act

Updated June 24, 2026
Quick Definition

ReAct, short for reasoning and acting, is an agent pattern in which a model alternates between thinking about a problem and taking actions against its environment. Instead of reasoning to an answer in one shot, the model produces a thought, takes an action such as a tool call, observes the result, and uses that observation to inform its next thought — repeating until it reaches a conclusion.

Why ReAct matters

A model reasoning on its own can only work from what it already knows. It cannot check a fact, read a file, or query an API mid-thought, so it is prone to confident but ungrounded answers and has no way to recover when an assumption turns out to be wrong. ReAct addresses this by giving the reasoning process a way to touch the outside world between steps.

By interleaving thought with action, the model grounds each decision in fresh evidence rather than in its prior alone. An observation can correct a mistaken assumption before it compounds, and the visible chain of thought-action-observation makes the agent’s behavior easier to follow and debug than a single opaque answer. The cost is that each cycle is another model call and another action, so a ReAct agent trades some speed and predictability for grounding and adaptability.

How it works

A ReAct step repeats a small sequence until the task is done:

  1. Thought — the model reasons in natural language about the current state and what to do next.
  2. Action — it issues a concrete action, typically a tool or function call with arguments.
  3. Observation — the action’s result is returned and appended to the context.
  4. Repeat — the model reads the new observation and produces its next thought, continuing until it decides it can answer.

In early implementations this structure was elicited entirely through prompting. In modern systems the action and observation steps are handled by native function calling, while the thought step is the model’s reasoning between calls. Either way, the pattern is the same loop of reason, act, observe.

ReAct vs. plan-and-execute

The two differ in when decisions are made. ReAct is incremental — it commits to one action, sees the result, and only then decides the next, which suits open-ended tasks where the right path is not knowable in advance. Plan-and-execute commits to a full plan first and then runs it, trading adaptability for predictability and fewer model calls. Many systems blend the two: a plan sets the overall direction while a ReAct-style loop handles each step’s execution.

In practice

ReAct is the reasoning shape that most tool-using agents follow, and running it dependably means capturing every thought, action, and observation. A durable, observable runtime persists each step server-side, so the interleaved trace is inspectable and an interrupted run resumes rather than restarting. ReAct is one realization of the agent loop, it depends on tool use for its actions, and it sits alongside planning as a strategy for deciding what to do. For the loop in context, see the agent concepts.

Frequently asked questions

What is the difference between ReAct and chain-of-thought?

Chain-of-thought produces reasoning only, with no actions during the process. ReAct interleaves that reasoning with real actions and observations, so the model can gather information from the environment and revise its approach rather than reasoning in a closed loop.

Is ReAct still relevant with modern tool-calling models?

Yes. Native tool calling implements the act-and-observe half of ReAct directly, and the interleaving of reasoning with tool results is still the underlying pattern. ReAct now describes the shape of most tool-using agents rather than a separate prompting trick.

How does ReAct compare to plan-and-execute?

ReAct decides one action at a time and reacts to each observation, which adapts well to uncertainty. Plan-and-execute drafts a full plan first and then carries it out, which is more predictable but slower to recover when an early step fails.

See also in the docs

Related terms