Glossary definition

Agents / Industry term

Computer use / browser use

An AI capability where the model sees a screenshot of a screen and controls it with mouse clicks, keyboard input, and navigation, the same way a person would.

Computer use lets AI automate software that has no API: filling out legacy web forms, navigating enterprise dashboards, clicking through multi-step approval workflows, or extracting data from screens designed only for human readers. Most AI tool use works through structured API requests; computer use instead operates directly on the visual interface.
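The screenshot-then-act cycle described above can be sketched as a simple loop. This is a minimal illustration, not any vendor's API: `Action`, `FakeScreen`, `run_loop`, and `scripted_model` are hypothetical names, and the "model" is a scripted stub standing in for a real vision model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One step the model asks for (hypothetical schema)."""
    kind: str          # "click", "type", "navigate", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


class FakeScreen:
    """Stand-in for a real display; it only records what was done to it."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def screenshot(self) -> bytes:
        # A real implementation would capture pixels; here we return a label.
        return f"frame-{len(self.log)}".encode()

    def apply(self, action: Action) -> None:
        if action.kind == "click":
            self.log.append(f"click({action.x},{action.y})")
        elif action.kind == "type":
            self.log.append(f"type({action.text!r})")
        elif action.kind == "navigate":
            self.log.append(f"navigate({action.text!r})")
        else:
            raise ValueError(f"unknown action kind: {action.kind}")


def run_loop(screen: FakeScreen,
             choose_action: Callable[[bytes], Action],
             max_steps: int = 10) -> int:
    """Observe, decide, act; stop on "done" or after max_steps. Returns steps taken."""
    for step in range(max_steps):
        image = screen.screenshot()      # 1. the model "sees" the screen
        action = choose_action(image)    # 2. the model picks the next input
        if action.kind == "done":
            return step
        screen.apply(action)             # 3. the input is sent to the screen
    return max_steps


def scripted_model(actions: List[Action]) -> Callable[[bytes], Action]:
    """Stub model that replays a fixed script, then reports "done"."""
    remaining = iter(actions)

    def choose(image: bytes) -> Action:
        return next(remaining, Action("done"))

    return choose


screen = FakeScreen()
run_loop(screen, scripted_model([Action("click", 10, 20), Action("type", text="hello")]))
print(screen.log)
```

The `max_steps` cap is a design choice worth keeping in any real loop: because the model reacts to whatever is on screen, an unexpected dialog could otherwise keep it clicking indefinitely.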

Builder example

Computer use unlocks automation for any software your team can see on screen, which is especially valuable for legacy systems, internal tools without APIs, and workflows spanning multiple applications. The tradeoff: anything on screen (ads, pop-ups, phishing emails, manipulated web content) can influence what the model sees and does, so it needs tighter guardrails than a structured API integration.
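One way to picture such guardrails is a check that runs before every proposed action. The sketch below is an assumption-laden illustration: the host allowlist, the keyword list, and the names `ProposedAction` and `review` are all hypothetical, and the keyword match is deliberately crude.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical policy values; a real deployment would define its own.
ALLOWED_HOSTS = {"intranet.example.com"}
SENSITIVE_WORDS = ("pay", "delete", "approve", "submit")


@dataclass
class ProposedAction:
    kind: str      # "navigate", "click", or "type"
    target: str    # URL for navigate, visible element label otherwise


def review(action: ProposedAction) -> str:
    """Return "allow", "confirm" (ask a human first), or "block"."""
    if action.kind == "navigate":
        host = urlparse(action.target).hostname or ""
        return "allow" if host in ALLOWED_HOSTS else "block"
    if action.kind == "click":
        label = action.target.lower()
        # Crude substring match: clicks that look irreversible need sign-off.
        if any(word in label for word in SENSITIVE_WORDS):
            return "confirm"
        return "allow"
    if action.kind == "type":
        return "allow"
    return "block"  # unknown action kinds are rejected by default


print(review(ProposedAction("navigate", "https://evil.example.net/login")))
print(review(ProposedAction("click", "Approve invoice")))
```

Blocking unknown action kinds by default reflects the point above: the model's input is untrusted screen content, so the policy should fail closed.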

Common confusion: Computer use looks like a universal automation solution. In practice, it is slower, more brittle, and more vulnerable to manipulation than API-based tool use. Screens change layout, pop-ups interrupt flows, and the model can misread visual elements. Use APIs when they exist; reserve computer use for when they do not.
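The "API first, computer use as fallback" rule can be expressed as a small dispatcher. This is only a sketch: `dispatch`, the handler registry, and the task names are hypothetical, and the fallback is a stub rather than a real screen-driving agent.

```python
from typing import Callable, Dict


def dispatch(task: str,
             payload: dict,
             api_handlers: Dict[str, Callable[[dict], str]],
             ui_fallback: Callable[[str, dict], str]) -> str:
    """Use a structured API handler when one is registered; otherwise fall back."""
    handler = api_handlers.get(task)
    if handler is not None:
        return handler(payload)           # fast, structured path
    return ui_fallback(task, payload)     # slower, screen-driven path


# Hypothetical registry: only invoice creation has an API in this example.
api_handlers = {"create_invoice": lambda p: f"api:create_invoice:{p['id']}"}


def ui_fallback(task: str, payload: dict) -> str:
    # Stub standing in for a computer-use run against the visual interface.
    return f"ui:{task}:{payload['id']}"


print(dispatch("create_invoice", {"id": 7}, api_handlers, ui_fallback))
print(dispatch("legacy_approval", {"id": 7}, api_handlers, ui_fallback))
```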