The agent loop is the control cycle at the heart of an AI agent: the model reasons about the current state, chooses an action, the action is executed, its result is fed back as a new observation, and the cycle repeats. The loop continues until the agent produces a final answer or a stop condition halts it, which is what lets an agent take a variable number of steps rather than a fixed sequence.
Why the agent loop matters
A traditional program runs a path its author wrote in advance. An agent instead determines its own path at runtime, and the agent loop is the mechanism that makes that possible. Each pass through the loop gives the model a chance to read what just happened and decide what to do next, so the same agent can take three steps on an easy request and fifteen on a hard one.
This adaptability is also where the operational risk lives. Because the loop decides its own length, it can fail to converge — repeating a tool call, oscillating between two actions, or chasing a goal it cannot reach. Each iteration costs a model call and may trigger real side effects, so a loop that does not terminate cleanly is both expensive and difficult to reason about. Much of the engineering around agents is about keeping the loop bounded and observable without removing the flexibility that motivated it.
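As an illustration of the failure modes named above, the following minimal sketch flags a repeated tool call or an oscillation between two actions. The function name `detect_non_convergence` and the representation of an action as a `(tool, argument)` tuple are assumptions made for this example, not part of any particular agent framework.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: one action is a (tool_name, argument) pair.
ActionKey = Tuple[str, str]

def detect_non_convergence(actions: List[ActionKey]) -> Optional[str]:
    """Return a label for a simple non-convergence pattern, or None."""
    # Same action twice in a row: the agent is repeating a tool call.
    if len(actions) >= 2 and actions[-1] == actions[-2]:
        return "repeat"
    # Pattern A, B, A, B: the agent is oscillating between two actions.
    if len(actions) >= 4:
        a, b, c, d = actions[-4:]
        if a == c and b == d and a != b:
            return "oscillation"
    return None
```

A check like this only catches exact repeats; a goal the agent cannot reach usually still needs a step cap or budget to halt the run.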
How it works
A single pass through the loop generally follows the same shape:
1. Observe: the agent assembles the current context: the goal, the history so far, and the most recent tool result or message.
2. Reason: the model decides on the next action, whether that is calling a tool, handing off to another agent, or answering directly.
3. Act: the chosen action runs, producing a result such as a tool output or an error.
4. Append: the result is added to the running context as a new observation.
5. Check the stop condition: if the agent has produced a final answer, a termination rule has fired, or the step cap is reached, the loop exits; otherwise it returns to step one.
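The pass described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `Action` type, the `run_agent_loop` function, and the scripted `decide` callable that stands in for a real model call are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

@dataclass
class Action:
    kind: str           # "tool" or "final" (hypothetical labels)
    name: str = ""      # tool name when kind == "tool"
    argument: str = ""  # tool input, or the final answer text

Step = Tuple[str, str, str]  # (tool name, argument, result)

def run_agent_loop(
    goal: str,
    decide: Callable[[dict], Action],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 10,
) -> Tuple[Optional[str], List[Step]]:
    history: List[Step] = []
    for _ in range(max_steps):
        # Observe: assemble the goal and the history so far.
        context = {"goal": goal, "history": list(history)}
        # Reason: the model (here a stand-in callable) picks the next action.
        action = decide(context)
        # Check the stop condition: a final answer ends the loop.
        if action.kind == "final":
            return action.argument, history
        # Act: run the tool; an error becomes a result, not a crash.
        try:
            result = tools[action.name](action.argument)
        except Exception as exc:
            result = f"error: {exc}"
        # Append: the result becomes a new observation.
        history.append((action.name, action.argument, result))
    # Step cap reached without a final answer.
    return None, history

def scripted_decide(context: dict) -> Action:
    """Stand-in for a model: call one tool, then answer."""
    if not context["history"]:
        return Action("tool", "add", "2 3")
    last_result = context["history"][-1][2]
    return Action("final", argument=f"The sum is {last_result}")

tools = {"add": lambda arg: str(sum(int(x) for x in arg.split()))}
answer, history = run_agent_loop("add 2 and 3", scripted_decide, tools)
```

In this sketch the stop check runs right after the reasoning step, because a final answer is itself the model's chosen action; the step cap is enforced by the `for` loop.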
The stop condition is essential. In addition to a natural final answer, agents are typically given a maximum number of iterations and may use termination rules such as a token budget or a matched phrase, so a run that fails to converge is halted rather than allowed to continue.
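The termination rules mentioned here can be combined into a single check that runs once per iteration. The sketch below is illustrative only: the function name `stop_reason`, the default limits, and the `"TERMINATE"` phrase are assumptions chosen for the example.

```python
from typing import Optional

def stop_reason(
    step: int,
    tokens_used: int,
    last_output: str,
    max_steps: int = 10,          # assumed iteration cap
    token_budget: int = 4000,     # assumed token budget
    stop_phrase: str = "TERMINATE",  # assumed matched phrase
) -> Optional[str]:
    """Return why the loop should halt, or None to keep going."""
    if stop_phrase in last_output:
        return "matched_phrase"
    if tokens_used >= token_budget:
        return "token_budget"
    if step >= max_steps:
        return "max_steps"
    return None
```

Returning the reason rather than a bare boolean makes it possible to record why a run ended, which helps distinguish a natural finish from a run that was cut off.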
In practice
Running the loop reliably means recording each iteration and surviving interruptions between them. A durable, observable runtime persists every step server-side and exposes the loop as a trace, so a stuck or runaway run is visible and a crash mid-loop resumes from the last completed step rather than restarting. This is the cycle that ReAct structures, that tool use drives, and that tracing makes inspectable. For the loop as a primitive, see the agent concepts.
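One way to picture persistence and resumption is a loop that writes each completed step to an append-only log and, on restart, reloads that log before continuing. The sketch below uses a local JSON Lines file as a stand-in for server-side storage; the function names, the record fields, and the `decide` and `execute` callables are hypothetical.

```python
import json
import os
from typing import Callable, List, Optional

def load_completed_steps(path: str) -> List[dict]:
    """Read the steps persisted so far (empty if no log exists yet)."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

def run_resumable(
    path: str,
    decide: Callable[[List[dict]], Optional[str]],
    execute: Callable[[str], str],
    max_steps: int = 10,
) -> List[dict]:
    # Resume from the last completed step instead of restarting.
    steps = load_completed_steps(path)
    while len(steps) < max_steps:
        action = decide(steps)  # None means the agent is finished
        if action is None:
            break
        result = execute(action)
        record = {"step": len(steps), "action": action, "result": result}
        # Persist the step only after it completes, so a crash during
        # execute() leaves the log ending at the last finished step.
        with open(path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
        steps.append(record)
    return steps
```

The same log doubles as a simple trace: reading it back shows every action and result in order. Note that a step interrupted mid-execution is run again on resume, so actions with side effects would need to be safe to retry.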