Imagine controlling your Mac with just a few lines of natural language. That dream is now a reality, thanks to Claude's new Computer Use tool. Whether you're automating tedious UI workflows, simulating user input, or creating demos that interact with macOS interfaces, Claude’s Computer Use tool offers a powerful and surprisingly intuitive solution.
In this article, we’ll walk through what this feature is and how to use it, then break down the inner workings of the tool’s core. If you're a developer looking to automate repetitive tasks, or just someone who wants to control apps hands-free, this guide will help you get started.

What is Claude's Computer Use?
Computer Use is a beta tool released by Anthropic that allows Claude, acting as an AI agent, to operate a computer through its keyboard, mouse, and screen. In the macOS setup described in this article, that interaction is carried out programmatically by macOS command-line utilities under the hood.
Claude, using this tool, can:
- Simulate typing or pressing specific keys
- Move the mouse cursor to a location
- Perform left, right, or double clicks
- Take screenshots of the current screen
- Get the cursor’s position
All these actions are exposed through an API-like interface and wrapped in a Python-based tool that Anthropic agents can call.
Why Automate macOS with Claude?
Traditional macOS automation tools like AppleScript or Automator can be powerful but tend to be brittle, application-specific, or limited in scope. With Claude’s Computer Use API, you’re no longer constrained by those rules. You can interact with the system as a whole — navigating apps, clicking, typing, dragging, and even interpreting the screen visually — just as a human would.
Claude acts like a smart co-pilot, interpreting what’s on your screen and executing tasks in real time using natural language instructions and low-level system commands.
What You’ll Need
To begin, make sure you have the following:
- A Mac running macOS 12 (Monterey) or later
- Python 3.8+ installed
- Homebrew (the macOS package manager)
- A terminal application like Terminal.app or iTerm2
- Access to the Claude Computer Use API and your API key
You’ll also be using a command-line utility called cliclick for low-level interaction like keyboard typing and mouse control.
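Before going further, it can help to confirm that the command-line tools are actually on your PATH. The following is a minimal sketch, assuming the tool names that appear in this article (cliclick, screencapture, and ImageMagick's convert); missing_tools is a hypothetical helper and not part of the quickstart.

```python
import shutil
from typing import Iterable, List

# Tool names taken from this article; adjust the list to match your setup.
REQUIRED_TOOLS = ("cliclick", "screencapture", "convert")


def missing_tools(tools: Iterable[str] = REQUIRED_TOOLS) -> List[str]:
    """Return the names of tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools found.")
```

An empty result means every listed tool was found; anything else is the list of names still to install.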
Setting Up Your macOS Environment
Before Claude can control your Mac, you need to grant the terminal accessibility permissions:
- Open System Settings
- Go to Privacy & Security → Accessibility
- Enable control for the terminal application you’re using
Without these permissions, the automation won't work.
How It Works: Claude + cliclick + Python
The system is built on three key layers:
- Claude’s Computer Use API – Handles screen interpretation, decides what actions to take.
- cliclick – A command-line tool that simulates mouse movement, clicks, and keyboard input.
- Python Bridge (computer.py) – Connects Claude’s commands to cliclick and your macOS system.
The Claude API interprets visual information (like what apps are open or where buttons are located) and issues high-level commands. These commands are then executed on your Mac through cliclick, orchestrated by the Python layer.
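To make the hand-off between the layers concrete, here is a rough sketch of how a Python bridge might translate an action requested by Claude into a cliclick command line. The to_cliclick_argv helper and the exact mapping are assumptions for illustration; the modified computer.py may do this differently.

```python
from typing import List, Optional, Tuple

# Hypothetical mapping from tool actions to cliclick commands.
# "." tells cliclick to act at the current cursor position.
CLICK_COMMANDS = {
    "left_click": "c:.",
    "right_click": "rc:.",
    "double_click": "dc:.",
}


def to_cliclick_argv(
    action: str, coordinate: Optional[Tuple[int, int]] = None
) -> List[str]:
    """Build the argv list for one action (illustrative, not exhaustive)."""
    if action == "mouse_move":
        if coordinate is None:
            raise ValueError("coordinate is required for mouse_move")
        x, y = coordinate
        return ["cliclick", f"m:{x},{y}"]
    if action in CLICK_COMMANDS:
        return ["cliclick", CLICK_COMMANDS[action]]
    if action == "cursor_position":
        # "p" asks cliclick to print the current mouse position.
        return ["cliclick", "p"]
    raise ValueError(f"unsupported action: {action}")
```

Returning an argv list instead of a shell string keeps the translation step easy to test without moving the real cursor.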
Installing the Tools
Follow these steps to install and run the automation setup:
1. Install cliclick
brew install cliclick
2. Clone the Quickstart Repository
git clone https://github.com/anthropics/anthropic-quickstarts.git
cd anthropic-quickstarts/computer-use-demo
3. Replace the Core Script
Replace the existing computer.py file with the modified version provided in the Automating macOS using Claude Computer Use guide.
4. Run the Setup Script
./setup.sh
This script creates a Python virtual environment and installs dependencies.
5. Activate the Environment
source .venv/bin/activate
6. Set Your Environment Variables
Replace the placeholders with your actual data.
export ANTHROPIC_API_KEY=sk-xxxxxx
export WIDTH=1512 # Your screen width
export HEIGHT=982 # Your screen height
On macOS 12, you can find your resolution under Apple Menu > About This Mac > Displays; on macOS 13 and later, check System Settings > Displays instead.
7. Start the Streamlit App
python -m streamlit run computer_use_demo/streamlit.py
A local browser will open up where you can start issuing commands to Claude.
Automating Real-World Tasks on macOS
Now that everything is up and running, let’s look at what you can do.
1. Launching Applications
Ask Claude to “Open Safari” or “Launch Spotify.” Claude will visually identify the icons or menu entries and simulate the necessary clicks and keystrokes.

2. Typing Text in Apps
You can ask Claude to open Notes and type a message. This is useful for creating automated logs or daily journals.
3. Navigating Menus and Windows
Claude can simulate keyboard shortcuts, click through menus, or drag windows to specific positions. This is great for creating multi-step workflows like exporting files or setting up your workspace.
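As a rough illustration of the keyboard-shortcut case, the sketch below turns a shortcut string such as cmd+space into cliclick commands. It assumes cliclick's kd:/ku: (modifier key down/up), kp: (named key press), and t: (type text) commands; shortcut_to_cliclick and the two key sets are hypothetical and cover only a handful of keys.

```python
from typing import List

# Modifier names accepted by cliclick's kd:/ku: commands.
MODIFIERS = {"cmd", "ctrl", "alt", "shift", "fn"}
# A few named keys for kp:; cliclick supports more than are listed here.
NAMED_KEYS = {"space", "return", "tab", "esc", "delete"}


def shortcut_to_cliclick(shortcut: str) -> List[str]:
    """Translate e.g. 'cmd+space' into a cliclick argv list."""
    *mods, key = [part.strip().lower() for part in shortcut.split("+")]
    unknown = [m for m in mods if m not in MODIFIERS]
    if unknown:
        raise ValueError(f"unknown modifier(s): {unknown}")
    # Named keys go through kp:, ordinary characters are typed with t:.
    press = f"kp:{key}" if key in NAMED_KEYS else f"t:{key}"
    args = ["cliclick"]
    if mods:
        args.append("kd:" + ",".join(mods))  # hold the modifiers down
    args.append(press)
    if mods:
        args.append("ku:" + ",".join(mods))  # release them again
    return args
```

Releasing the modifiers in the same invocation matters: if the key-up step is skipped, the modifier can stay logically pressed and affect later input.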
Fascinated by Computer Use? Let's Dive Deeper:
The computer.py script acts as a middleware that handles:
- Translating screen coordinates based on resolution
- Executing mouse and keyboard actions with precise timing
- Capturing and encoding screenshots for visual confirmation
Each command issued by Claude (e.g., left_click, mouse_move, type) is validated, parsed, and then handed off to cliclick.
Example: Telling Claude to Open Safari
Once set up, you can prompt Claude with something like:
"Please open Safari, go to apple.com, and take a screenshot."
Under the hood, Claude will:
- Use cliclick to press Cmd+Space
- Type "Safari"
- Press Enter
- Wait for the browser to load
- Type "apple.com"
- Press Enter
- Use screenshot() to capture the screen
All these steps are abstracted away in natural language.
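The keyboard steps listed above could be expressed as a short sequence of command invocations. This is a sketch and not what Claude literally emits: the wait times are guesses, safari_steps and run_steps are hypothetical helpers, the screenshot path is arbitrary, and the sequence only works if Spotlight is bound to Cmd+Space and Safari's address bar has focus after launch.

```python
import subprocess
from typing import List


def safari_steps(url: str = "apple.com") -> List[List[str]]:
    """Return one argv list per step of the Safari example."""
    return [
        ["cliclick", "kd:cmd", "kp:space", "ku:cmd"],  # open Spotlight
        ["cliclick", "w:500", "t:Safari"],             # wait 500 ms, type the app name
        ["cliclick", "kp:return"],                     # launch Safari
        ["cliclick", "w:2000", f"t:{url}"],            # wait for the window, type the URL
        ["cliclick", "kp:return"],                     # load the page
        ["screencapture", "-x", "/tmp/safari.png"],    # capture without the shutter sound
    ]


def run_steps(steps: List[List[str]]) -> None:
    """Execute each step in order; requires macOS with cliclick installed."""
    for argv in steps:
        subprocess.run(argv, check=True)
```

In the real tool Claude takes a screenshot between steps and decides what to do next, so fixed waits like these are only a stand-in for that feedback.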
The tool also supports feedback loops, such as returning the current mouse position or a screenshot of the screen, so Claude can "see" what happened and respond intelligently. Here are a few things Claude's Computer Use can do for you:
- Content Creation: Automate opening Photoshop, loading a template, and exporting a design.
- Meetings: Open Zoom, join meetings, and mute/unmute using simple prompts.
- Coding: Open your IDE, load a project, and compile — all triggered by a natural language instruction.
- System Cleanup: Open Finder, go to Downloads, and delete old files.
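The feedback loop mentioned before this list depends on reading state back from the system. As a sketch, assuming that running cliclick p prints the cursor position as x,y on a single line, the output could be parsed as follows (parse_cursor_position is a hypothetical helper):

```python
from typing import Tuple


def parse_cursor_position(output: str) -> Tuple[int, int]:
    """Parse 'x,y' text (the assumed output of `cliclick p`) into integers."""
    parts = output.strip().split(",")
    if len(parts) != 2:
        raise ValueError(f"unexpected cursor output: {output!r}")
    x, y = (int(part) for part in parts)
    return x, y
```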
How Claude's Computer Use Works Under the Hood
At the core of this feature is the computer.py file, a tool implementation that exposes an API-like interface to an AI agent.
Let’s dissect the major components of computer.py.
1. Tool Configuration and Setup
class ComputerTool(BaseAnthropicTool):
    name: Literal["computer"] = "computer"
    api_type: Literal["computer_20241022"] = "computer_20241022"
This class sets the name and API type of the tool. It inherits from BaseAnthropicTool, which standardizes how tools communicate with Claude.
The constructor loads screen width, height, and display number from environment variables. This ensures that mouse coordinate mapping works correctly on high-resolution displays.
self.width = int(os.getenv("WIDTH") or 0)
self.height = int(os.getenv("HEIGHT") or 0)
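Because int(os.getenv("WIDTH") or 0) silently falls back to 0 when a variable is unset, a stricter variant may be easier to debug. Here is a sketch under the assumption that both values must be positive integers; read_screen_size is a hypothetical helper and not part of computer.py.

```python
import os
from typing import Mapping, Tuple


def read_screen_size(env: Mapping[str, str] = os.environ) -> Tuple[int, int]:
    """Read WIDTH and HEIGHT, failing loudly instead of defaulting to 0."""
    values = []
    for name in ("WIDTH", "HEIGHT"):
        raw = env.get(name, "")
        # isdigit() rejects empty strings, signs, and decimals.
        if not raw.isdigit() or int(raw) <= 0:
            raise ValueError(f"{name} must be a positive integer, got {raw!r}")
        values.append(int(raw))
    return values[0], values[1]
```

Passing the environment in as a mapping makes the function easy to exercise with a plain dict, for example read_screen_size({"WIDTH": "1512", "HEIGHT": "982"}).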
2. Executing Actions
The tool handles various actions such as mouse_move, type, key, and screenshot. Each action triggers a different shell command:
if action == "mouse_move":
    return await self.shell(f"cliclick m:{x},{y}")
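The self.shell helper itself is not shown in this article. A minimal sketch of what such a helper might look like, built on asyncio subprocesses, follows; run_shell and its timeout default are assumptions, and the real method also deals with screenshots.

```python
import asyncio
from typing import Tuple


async def run_shell(command: str, timeout: float = 10.0) -> Tuple[int, str, str]:
    """Run a shell command and return (exit code, stdout, stderr)."""
    process = await asyncio.create_subprocess_shell(
        command,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        stdout, stderr = await asyncio.wait_for(process.communicate(), timeout)
    except asyncio.TimeoutError:
        process.kill()  # do not leave a hung command running
        await process.wait()
        raise
    return process.returncode or 0, stdout.decode(), stderr.decode()
```

On a Mac with cliclick installed, asyncio.run(run_shell("cliclick m:400,300")) would move the cursor; any other shell command can be used to try the helper out.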
Typing is handled by breaking input text into chunks and simulating keystrokes:
for chunk in chunks(text, TYPING_GROUP_SIZE):
    cmd = f"cliclick t:'{chunk}'"
    results.append(await self.shell(cmd, take_screenshot=False))
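This mimics a user typing in short bursts rather than pasting the whole string at once. Note that take_screenshot=False suppresses the screenshot for each chunk, so a screenshot, if one is wanted, is taken once after the loop rather than after every chunk.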
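The chunks helper is not shown in this article, and wrapping the text in single quotes breaks as soon as the text itself contains an apostrophe. A sketch of both pieces, assuming a group size of 50 (the real constant may differ) and using shlex.quote for shell-safe quoting; typing_commands is a hypothetical helper.

```python
import shlex
from typing import Iterator, List

TYPING_GROUP_SIZE = 50  # assumed value; the real constant may differ


def chunks(text: str, size: int) -> Iterator[str]:
    """Yield consecutive slices of `text`, each at most `size` characters."""
    for start in range(0, len(text), size):
        yield text[start:start + size]


def typing_commands(text: str, size: int = TYPING_GROUP_SIZE) -> List[str]:
    """Build one shell-safe cliclick command per chunk."""
    # shlex.quote escapes quotes and spaces that would otherwise break
    # the shell command (for example the apostrophe in "don't").
    return [f"cliclick {shlex.quote('t:' + chunk)}" for chunk in chunks(text, size)]
```

Quoting the whole t:... argument, rather than only the text after the colon, keeps the command intact for any input the shell would otherwise interpret.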
3. Screenshot Functionality
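The screenshot() function takes a screenshot using screencapture, resizes it using ImageMagick’s convert, and returns it encoded in base64. ImageMagick is not bundled with macOS, so it has to be installed separately (for example with brew install imagemagick):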
screenshot_cmd = f"{self._display_prefix}screencapture {path}"
await self.shell(f"convert {path} -resize {x}x{y}! {path}")
This allows Claude to "see" what’s happening on screen before or after performing actions.
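The capture, resize, and encode steps can be sketched as two small helpers. This is an illustration under assumptions and not the code from computer.py: screenshot_commands and encode_image are hypothetical names, and the -x flag (which silences the shutter sound) is an addition that the article's command does not use.

```python
import base64
import shlex
from pathlib import Path
from typing import List


def screenshot_commands(path: str, width: int, height: int) -> List[str]:
    """Return the capture and resize shell commands for one screenshot."""
    quoted = shlex.quote(path)
    return [
        f"screencapture -x {quoted}",  # -x suppresses the shutter sound
        # The trailing "!" forces the exact size, ignoring aspect ratio.
        f"convert {quoted} -resize {width}x{height}! {quoted}",
    ]


def encode_image(path: str) -> str:
    """Read an image file and return its contents as base64 text."""
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")
```

The base64 string is what gets sent back to Claude as the image payload, which is how the model ends up looking at the same pixels you do.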
4. Coordinate Scaling
Not all screens have the same resolution. The scale_coordinates() method adjusts coordinates so that interactions remain consistent across displays:
x_scaling_factor = target_dimension["width"] / self.width
y_scaling_factor = target_dimension["height"] / self.height
This ensures that when the AI says "click at (400, 300)", it lands in the right spot, regardless of the actual screen size.
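To see the arithmetic end to end, here is a standalone sketch of the scaling in both directions. The function signature and the from_api flag are assumptions made for illustration; only the two scaling-factor formulas come from the snippet above.

```python
from typing import Tuple


def scale_coordinates(
    x: int,
    y: int,
    screen: Tuple[int, int],
    target: Tuple[int, int],
    from_api: bool,
) -> Tuple[int, int]:
    """Convert between API-space and screen-space coordinates."""
    x_factor = target[0] / screen[0]
    y_factor = target[1] / screen[1]
    if from_api:
        # Claude's coordinates refer to the downscaled screenshot,
        # so divide to get back to real screen pixels.
        return round(x / x_factor), round(y / y_factor)
    # Real screen pixels are shrunk to match the screenshot Claude sees.
    return round(x * x_factor), round(y * y_factor)
```

With a 2560x1600 screen and a 1280x800 target, both factors are 0.5, so an API click at (400, 300) maps to (800, 600) on the real screen, and a cursor at (800, 600) is reported back to Claude as (400, 300).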
5. Error Handling and Validation
Throughout the code, errors like missing text or invalid coordinates are caught early with helpful messages:
if text is None:
    raise ToolError(f"text is required for {action}")
This safeguards the tool and ensures predictable behavior when Claude interacts with a system.
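The same pattern extends naturally to coordinates. A sketch of what such a check might look like, with a stand-in ToolError class so the example runs on its own; validate_coordinate and its bounds check are assumptions and not the exact code in computer.py.

```python
from typing import Optional, Sequence, Tuple


class ToolError(Exception):
    """Stand-in for the quickstart's ToolError so this sketch runs alone."""


def validate_coordinate(
    action: str, coordinate: Optional[Sequence[int]], width: int, height: int
) -> Tuple[int, int]:
    """Check that a coordinate is an in-bounds pair of non-negative ints."""
    if coordinate is None:
        raise ToolError(f"coordinate is required for {action}")
    if len(coordinate) != 2:
        raise ToolError(f"{coordinate} must be a pair of numbers")
    if not all(isinstance(v, int) and v >= 0 for v in coordinate):
        raise ToolError(f"{coordinate} must contain non-negative ints")
    x, y = coordinate
    if x > width or y > height:
        raise ToolError(f"({x}, {y}) is outside the {width}x{height} screen")
    return x, y
```

Raising a dedicated error type lets the calling code turn the message into a tool result, so Claude can read what went wrong and retry with corrected input.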
Final Thoughts
Claude’s Computer Use API offers a futuristic approach to automation — less scripting, more intelligence. By interpreting screen visuals and responding like a human assistant, Claude brings powerful automation to any macOS user without requiring deep technical skills.
With just Python, a few tools, and your API key, you can build workflows that adapt to your habits and preferences — giving you more time to focus on what really matters.