Glossary definition

Agent Ops / Standard term

Rate limit

A rate limit is the cap a service puts on how many requests it will accept in a window of time. An agent that fires faster than the cap gets its extra requests refused until the window resets.

Say your agent has 200 contacts to email and the mail service allows 60 sends per minute. If the agent tries to push all 200 at once, the service accepts the first 60 and rejects the remaining 140 with a too-many-requests response (HTTP 429 on most web APIs). The fix is to pace the loop: send a batch of 60, wait for the window to clear, then send the next batch. At that pace the 200 emails go out in four batches over a little more than three minutes, and every contact is reached instead of most sends bouncing back as errors.
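The pacing described in the email example can be sketched in a few lines of Python. This is a minimal illustration, not any particular mail service's client: `send_in_batches`, the `send` callback, and the default `limit` and `window_seconds` values are hypothetical names chosen to match the 60-per-minute example, and the sketch assumes a simple fixed window.

```python
import time
from typing import Callable, Sequence


def send_in_batches(
    contacts: Sequence[str],
    send: Callable[[str], None],
    limit: int = 60,                # assumed cap: sends allowed per window
    window_seconds: float = 60.0,   # assumed window length
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send to every contact, never more than `limit` per window."""
    sent = 0
    for start in range(0, len(contacts), limit):
        if start > 0:
            # Wait out the window before starting the next batch.
            sleep(window_seconds)
        for contact in contacts[start:start + limit]:
            send(contact)
            sent += 1
    return sent
```

With 200 contacts and the defaults, the loop sends batches of 60, 60, 60, and 20 with three waits between them. Waiting a full window after each batch is deliberately conservative; a service that uses a sliding window or a token bucket may allow tighter pacing.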

Builder example

Rate limits decide whether a long-running agent finishes or stalls partway through. A summarization agent looping over a thousand documents, or a posting agent working through a backlog, will hit a service cap and start collecting rejections if it never slows down. Tell your AI assistant to read the limit the service publishes, process work in paced batches, and treat a too-many-requests response as a cue to wait and retry rather than a hard failure, so a busy run completes instead of dropping items.
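Treating a too-many-requests response as a cue to wait and retry might look like the sketch below. It assumes a hypothetical `request` callable that returns a status code plus an optional wait hint in seconds (many HTTP services send such a hint in a `Retry-After` header); `call_with_retry`, `max_attempts`, and `base_delay` are illustrative names, and the doubling backoff is one common choice rather than a requirement of any service.

```python
import time
from typing import Callable, Optional, Tuple

TOO_MANY_REQUESTS = 429  # HTTP status code for "Too Many Requests"


def call_with_retry(
    request: Callable[[], Tuple[int, Optional[float]]],
    max_attempts: int = 5,     # assumed retry budget
    base_delay: float = 1.0,   # assumed first backoff delay, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `request` until it is no longer rate limited; return its status."""
    for attempt in range(max_attempts):
        status, retry_after = request()
        if status != TOO_MANY_REQUESTS:
            return status
        if attempt < max_attempts - 1:
            # Prefer the service's own hint; otherwise double the delay each time.
            delay = retry_after if retry_after is not None else base_delay * 2 ** attempt
            sleep(delay)
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

Capping the number of attempts keeps a persistently limited run from waiting forever, and the final error lets the agent log the item for a later pass instead of silently dropping it.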

Common confusion: A rate limit caps how fast you send requests, and it is separate from a usage quota or a spending cap. The rate limit controls how many requests you may send within a short window and resets quickly; a quota controls how much you may consume over a longer period, such as a billing cycle. Hitting the rate limit pauses the agent for seconds or minutes; exhausting a quota stops it until the next cycle.