Glossary definition

Agent Ops / Standard term

Checkpoint

A checkpoint is a saved snapshot of an agent run's progress, recorded after each completed step, so a run that crashes can resume from where it stopped instead of starting over.

After every step finishes, the agent writes down which steps are done and what they produced, then continues. Say you ask an agent to process two hundred files one at a time. It crashes just after finishing file one hundred and twenty. With checkpoints, the next run reads the saved progress, skips the files it already handled, and picks up at file one hundred and twenty-one. Without them, the run begins again at file one and repeats all the finished work.
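
The file-by-file example can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names (load_checkpoint, save_checkpoint, run) and the choice of a JSON file as the checkpoint store are assumptions made for the example.

```python
import json
import os
import tempfile
from pathlib import Path


def load_checkpoint(path):
    """Return the set of finished item names, or an empty set on a first run."""
    path = Path(path)
    if not path.exists():
        return set()
    return set(json.loads(path.read_text()))


def save_checkpoint(path, done):
    """Write progress to a temp file, then swap it in so a crash mid-write
    cannot leave a half-written checkpoint behind."""
    path = Path(path)
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent))
    with os.fdopen(fd, "w") as tmp:
        json.dump(sorted(done), tmp)
    os.replace(tmp_name, str(path))


def run(items, checkpoint_path, process):
    """Process each item once, saving progress after every completed step."""
    done = load_checkpoint(checkpoint_path)
    for item in items:
        if item in done:
            continue  # finished in an earlier run, so skip it
        process(item)
        done.add(item)
        save_checkpoint(checkpoint_path, done)
```

Calling run a second time with the same checkpoint path skips everything already recorded. The checkpoint is saved only after process returns, so an item that was in progress when the crash happened is not marked as done and gets retried on the next run.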

Builder example

Checkpoints decide whether a long job survives an interruption or wastes hours redoing settled work. Picture an agent that sends a welcome message to every name on a list, then loses its connection halfway through. If it saves progress after each send, the resumed run skips the people it already reached and avoids messaging anyone twice. Ask your AI assistant to save progress after each step and confirm a resumed run continues from the last saved point rather than repeating completed actions.
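
One way to confirm that a resumed run does not repeat completed actions is to simulate the dropped connection and count deliveries. The sketch below is hypothetical throughout: ConnectionLost, send_welcomes, and simulate_interrupted_run are names invented for the example, and the checkpoint is kept in memory where a real run would persist it.

```python
from collections import Counter


class ConnectionLost(Exception):
    """Stands in for a dropped network connection (hypothetical)."""


def send_welcomes(names, sent, send):
    """Send to each name not yet in `sent`, recording progress after each send."""
    for name in names:
        if name in sent:
            continue  # already reached in an earlier run
        send(name)
        sent.add(name)  # the checkpoint; a real run would save this to disk


def simulate_interrupted_run(names, fail_at):
    """Fail once at index `fail_at`, resume, and return per-name delivery counts."""
    deliveries = Counter()
    sent = set()
    state = {"failed": False}

    def send(name):
        if name == names[fail_at] and not state["failed"]:
            state["failed"] = True
            raise ConnectionLost(name)
        deliveries[name] += 1

    try:
        send_welcomes(names, sent, send)
    except ConnectionLost:
        pass  # the first run stops here
    send_welcomes(names, sent, send)  # the resumed run
    return deliveries
```

If every count in the returned Counter is exactly one, the resumed run skipped the people already reached. One caveat: this sketch fails before the message goes out. If a crash lands after a send succeeds but before the progress is saved, that one message can still be repeated on resume.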

Common confusion: A checkpoint records how far a run got so it can resume; a backup copies your data so you can restore it later. What separates them is purpose: a checkpoint tracks step-by-step run progress to avoid repeating work, while a backup preserves the underlying records themselves.