Skip to main content

Computer Use

An AI's ability to control a computer's screen, mouse, and keyboard.

Computer use is a tool capability where an LLM is given screenshots of a desktop or browser and returns actions (click at coordinates, type text, scroll, take screenshot). Anthropic shipped this as a Claude API feature in 2024 and it now powers a wave of browser-using agents — Perplexity Comet, OpenAI Operator, ChatGPT Atlas, Browser Use.

The key insight: instead of building API integrations to every service, you let the AI use the same UI a human would. This is enormously powerful (works on every website, no vendor cooperation needed) and enormously fragile (UI changes break agents, captchas defeat them, latency stacks up).

For operators, computer-use agents are best evaluated on transactional tasks (book this flight, fill this form) where success is binary and verifiable. Open-ended research is harder to grade.

Coming soon

Get the weekly digest

New tools, reviews, and prompts every Friday.