Agentic AI Glossary

Plain-language definitions of the concepts behind production AI agents — how they reason and act, and what it takes to run them durably, observably, and under human control.

Foundations

Durability, observability & control

Approval Gate
An approval gate is a checkpoint where an agent pauses for human sign-off before a sensitive action runs, holding state until someone responds.

Crash Recovery
Crash recovery lets an agent resume from its last completed step after a process dies, restoring durable state instead of restarting from scratch.

Durable Execution
Durable execution persists an agent's state server-side so a run survives crashes and restarts, resuming from the last completed step instead of starting over.

Evaluation (Evals)
Agent evaluation measures how well an agent performs by scoring its runs against criteria such as task success, tool-call accuracy, and faithfulness.

Guardrails
Guardrails are validation checks on an agent's inputs and outputs that enforce rules at runtime, blocking or correcting unsafe behavior.

Human-in-the-Loop (HITL)
Human-in-the-loop pauses an agent so a person can review, approve, or correct an action before the run continues on high-stakes steps.

Observability
Agent observability is the ability to see what an agent did and why, capturing its steps, tool calls, inputs, outputs, and decisions for debugging.

Orchestration
Agent orchestration is the layer that coordinates an agent's steps, tool calls, and sub-agents, deciding what runs when and recovering from failures.

State Management
Agent state management is how a run's execution position and data are stored and updated, so the agent always knows where it is and what it has done.

Tracing & Spans
Agent tracing records each step of a run as nested, timed spans, producing a structured trace that shows what an agent did, in what order, and why.
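Several of the entries above (durable execution, crash recovery, state management, and approval gates) fit together in one small loop. The following is a minimal sketch under stated assumptions, not a reference implementation: `run_durably`, `_save`, the step tuples, and the `approve` callback are all hypothetical names, and a local JSON file stands in for the server-side store a production system would use.

```python
import json
import os
import tempfile


def _save(path, state):
    # Write to a temp file, then rename, so a crash mid-write
    # cannot leave a half-written checkpoint behind.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def run_durably(steps, state_path, approve=None):
    """Run steps in order, checkpointing after each completed step.

    steps: list of (name, fn, needs_approval); fn takes and returns a dict.
    approve: callable(name, data) -> bool, standing in for a human reviewer.
    Returns ("done", data) or ("waiting_approval", step_name).
    """
    # Crash recovery: resume from the last checkpoint if one exists.
    if os.path.exists(state_path):
        with open(state_path) as f:
            state = json.load(f)
    else:
        state = {"next_step": 0, "data": {}}

    for i in range(state["next_step"], len(steps)):
        name, fn, needs_approval = steps[i]
        # Approval gate: hold state and stop until someone signs off.
        if needs_approval and (approve is None or not approve(name, state["data"])):
            _save(state_path, state)
            return ("waiting_approval", name)
        state["data"] = fn(state["data"])
        state["next_step"] = i + 1
        _save(state_path, state)  # durable state after every step
    return ("done", state["data"])


# Usage: the first call pauses at the gated step; the second call
# (simulating a restart with approval granted) resumes without redoing "draft".
calls = []

def draft(data):
    calls.append("draft")
    return {**data, "draft": "refund request"}

def send(data):
    calls.append("send")
    return {**data, "sent": True}

steps = [("draft", draft, False), ("send", send, True)]
path = os.path.join(tempfile.mkdtemp(), "run.json")

first = run_durably(steps, path)
second = run_durably(steps, path, approve=lambda name, data: True)
```

The design choice worth noting is that the checkpoint records the index of the next step rather than the step's output alone, which is what lets a restarted process skip completed work. Real systems also need to handle steps that crash partway through, which this sketch does not attempt.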

Patterns & reasoning