Ultimate Guide to Browser Use Cloud: AI-Powered Browser Automation for Developers

Unlock AI-powered browser automation with Browser Use Cloud. Learn how to create, monitor, and manage intelligent web agents via API, compare LLM costs, and integrate into Python or TypeScript projects. Discover how Apidog fits into modern API workflows.

Mark Ponomarev

30 January 2026

AI-driven browser automation opens up practical workflows for API developers, QA engineers, and backend teams. This guide shows how to use Browser Use Cloud to automate data extraction, streamline web app testing, and build monitoring agents, with live previews, transparent per-step costs, and worked examples.

💡 Looking for an API testing tool that generates beautiful API documentation and boosts team productivity? Try Apidog—an all-in-one platform that can replace Postman at a more affordable price.

What Is Browser Use Cloud?

Browser Use Cloud is a developer-centric platform for creating and managing intelligent browser automation agents through a simple API. Imagine having a programmable fleet of virtual assistants that perform web tasks—like scraping, UI testing, or form submissions—using natural language instructions.

How Does It Work?


Getting Started: Obtaining Your API Key

To interact with the Browser Use Cloud API, you'll need an API key tied to your subscription.

Note: An active subscription is required. Retrieve your API key from the billing page: cloud.browser-use.com/billing.

Security Tip:
Store your API key securely as an environment variable. Do not expose it in client-side code or commit it to version control.

export BROWSER_USE_API_KEY="your_api_key_here"
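
Once the variable is exported, application code can read it at startup instead of embedding the key. The following is a minimal Python sketch, assuming the variable name from the export line above; the helper name auth_headers is hypothetical and not part of any Browser Use SDK.

```python
import os


def auth_headers(env_var: str = "BROWSER_USE_API_KEY") -> dict[str, str]:
    """Build request headers from the API key stored in the environment."""
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        # Fail early rather than sending an unauthenticated request.
        raise RuntimeError(f"{env_var} is not set")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Failing at startup when the variable is missing tends to be easier to diagnose than an authentication error returned later by the API.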

Transparent and Flexible Pricing Model

Browser Use Cloud uses a straightforward pay-as-you-go approach, ideal for both small projects and enterprise automation.

Cost Breakdown

LLM Pricing Table

Model                            Cost per Step
GPT-4o                           $0.03
GPT-4.1                          $0.03
Claude 3.7 Sonnet (2025-02-19)   $0.03
GPT-4o mini                      $0.01
GPT-4.1 mini                     $0.01
Gemini 2.0 Flash                 $0.01
Gemini 2.0 Flash Lite            $0.01
Llama 4 Maverick                 $0.01

Example: Calculating Task Cost

Suppose your automation involves logging in, navigating, and extracting data in about 15 steps using GPT-4o. At $0.03 per step, the LLM step charges come to 15 × $0.03 = $0.45.

This clarity lets you estimate and control your automation spend.
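
The same estimate can be scripted for other models and step counts. This is a rough sketch that uses only the per-step prices from the pricing table above; the function name estimate_step_cost is hypothetical, and it covers only the per-step LLM charge, so any other charges on your bill are not modeled.

```python
from decimal import Decimal

# Per-step prices in USD, copied from the pricing table above.
COST_PER_STEP = {
    "GPT-4o": Decimal("0.03"),
    "GPT-4.1": Decimal("0.03"),
    "Claude 3.7 Sonnet (2025-02-19)": Decimal("0.03"),
    "GPT-4o mini": Decimal("0.01"),
    "GPT-4.1 mini": Decimal("0.01"),
    "Gemini 2.0 Flash": Decimal("0.01"),
    "Gemini 2.0 Flash Lite": Decimal("0.01"),
    "Llama 4 Maverick": Decimal("0.01"),
}


def estimate_step_cost(model: str, steps: int) -> Decimal:
    """Return the per-step LLM charge for a task of the given length."""
    if steps < 0:
        raise ValueError("steps must be non-negative")
    if model not in COST_PER_STEP:
        raise KeyError(f"no price listed for model: {model}")
    # Decimal avoids the rounding noise that float arithmetic adds to currency.
    return COST_PER_STEP[model] * steps


if __name__ == "__main__":
    print(estimate_step_cost("GPT-4o", 15))       # 0.45
    print(estimate_step_cost("GPT-4o mini", 15))  # 0.15
```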


Step-by-Step: Creating Your First Browser Agent

Let’s launch a simple automation—searching for "Browser Use" on Google—using the API.

curl -X POST https://api.browser-use.com/api/v1/run-task \
  -H "Authorization: Bearer $BROWSER_USE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "task": "Go to google.com and search for Browser Use"
  }'

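Command Breakdown:

-X POST sends the request as an HTTP POST to the run-task endpoint.
-H "Authorization: Bearer $BROWSER_USE_API_KEY" authenticates the request with the key stored in your environment variable.
-H "Content-Type: application/json" declares that the request body is JSON.
-d '{...}' supplies the JSON body; the task field holds the natural-language instruction for the agent.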

Understanding the Response

A successful request returns something like:

{
  "task_id": "ts_2a9b4e7c-1d0f-4g8h-9i1j-k2l3m4n5o6p7",
  "status": "running",
  "live_url": "https://previews.browser-use.com/ts_2a9b4e7c-1d0f-4g8h-9i1j-k2l3m4n5o6p7"
}
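
For reference, the curl call above can be reproduced with Python's standard library alone. This is a sketch under the assumption that the endpoint, headers, and response fields are exactly those shown in this guide; the function names are illustrative, not an official client.

```python
import json
import urllib.request

API_BASE = "https://api.browser-use.com/api/v1"


def build_run_task_request(task: str, api_key: str) -> urllib.request.Request:
    """Prepare the same POST request that the curl example sends."""
    body = json.dumps({"task": task}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/run-task",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def run_task(task: str, api_key: str, timeout: float = 30.0) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_run_task_request(task, api_key)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A caller would then read result["task_id"] and result["live_url"] from the returned dictionary, matching the response shape shown above.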

Live Previews: Debug, Monitor, and Intervene

The live_url feature stands out for hands-on development and debugging. Unlike static logs or screenshots, it gives you a fully interactive remote session that you can open directly or embed in your own page.

Embed Example:

<!DOCTYPE html>
<html>
<head>
  <title>Agent Live Preview</title>
  <style>
    body, html { margin: 0; padding: 0; height: 100%; overflow: hidden; }
    iframe { width: 100%; height: 100%; border: none; }
  </style>
</head>
<body>
  <iframe src="YOUR_LIVE_URL_HERE"></iframe>
</body>
</html>

Replace YOUR_LIVE_URL_HERE with your returned live_url.
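
If you generate this page from code, the substitution is worth doing with escaping rather than plain string concatenation. The sketch below fills the same markup using Python's standard library; render_preview_page is a hypothetical helper, and the https-only check is an assumption based on the example live_url shown earlier.

```python
import html
from string import Template

# Same markup as the embed example, with the iframe source as a placeholder.
PAGE_TEMPLATE = Template("""<!DOCTYPE html>
<html>
<head>
  <title>Agent Live Preview</title>
  <style>
    body, html { margin: 0; padding: 0; height: 100%; overflow: hidden; }
    iframe { width: 100%; height: 100%; border: none; }
  </style>
</head>
<body>
  <iframe src="$live_url"></iframe>
</body>
</html>
""")


def render_preview_page(live_url: str) -> str:
    """Return the preview page with live_url filled in and HTML-escaped."""
    if not live_url.startswith("https://"):
        # Assumption: live URLs are always served over HTTPS.
        raise ValueError("expected an https:// live_url")
    return PAGE_TEMPLATE.substitute(live_url=html.escape(live_url, quote=True))
```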


Managing Task Lifecycle via API

Automation often needs control—pause, resume, or stop agents as workflows change.

Pausing a Task

curl -X POST https://api.browser-use.com/api/v1/pause-task \
  -H "Authorization: Bearer $BROWSER_USE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "task_id": "YOUR_TASK_ID_HERE"
  }'

Resuming a Task

curl -X POST https://api.browser-use.com/api/v1/resume-task \
  -H "Authorization: Bearer $BROWSER_USE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "task_id": "YOUR_TASK_ID_HERE"
  }'

Stopping a Task

To terminate and clean up a task:

curl -X POST https://api.browser-use.com/api/v1/stop-task \
  -H "Authorization: Bearer $BROWSER_USE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "task_id": "YOUR_TASK_ID_HERE"
  }'

Note: Stopped tasks cannot be resumed; browser sessions and resources are released.
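
Because the three lifecycle calls differ only in the endpoint name, they can share one helper. The following is a minimal sketch, assuming the pause-task, resume-task, and stop-task endpoints and the task_id body shown above; the helper names are hypothetical.

```python
import json
import urllib.request

API_BASE = "https://api.browser-use.com/api/v1"

# Maps a lifecycle action to the endpoint used in the curl examples above.
LIFECYCLE_ENDPOINTS = {
    "pause": "pause-task",
    "resume": "resume-task",
    "stop": "stop-task",
}


def build_lifecycle_request(action: str, task_id: str, api_key: str) -> urllib.request.Request:
    """Prepare a pause, resume, or stop request for the given task."""
    if action not in LIFECYCLE_ENDPOINTS:
        raise ValueError(f"unknown action: {action}")
    body = json.dumps({"task_id": task_id}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/{LIFECYCLE_ENDPOINTS[action]}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def send_lifecycle_request(action: str, task_id: str, api_key: str, timeout: float = 30.0) -> bytes:
    """Send the request and return the raw response body."""
    request = build_lifecycle_request(action, task_id, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Since a stopped task cannot be resumed, callers may want to treat "stop" as a terminal action in their own bookkeeping.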


Advanced Task Customization

For more control, you can specify parameters such as model selection or complex instructions in your API call.

Selecting an LLM Model

To use, for example, Claude 3.7 Sonnet:

curl -X POST https://api.browser-use.com/api/v1/run-task \
  -H "Authorization: Bearer $BROWSER_USE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "task": "Go to reddit.com/r/programming and find the top post of the day.",
    "model": "claude-3.7-sonnet-20250219"
  }'

If omitted, a cost-effective default is chosen (typically GPT-4o mini).
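
In client code, one way to respect that default is to leave the model key out of the request body unless the caller asks for a specific model. A small sketch, assuming the task and model field names from the example above; build_task_payload is a hypothetical helper.

```python
import json
from typing import Optional


def build_task_payload(task: str, model: Optional[str] = None) -> str:
    """Serialize the run-task body, adding "model" only when one is requested."""
    payload = {"task": task}
    if model is not None:
        # Omitting the key entirely lets the service apply its own default.
        payload["model"] = model
    return json.dumps(payload)


if __name__ == "__main__":
    print(build_task_payload("Go to reddit.com/r/programming and find the top post of the day."))
    print(build_task_payload(
        "Go to reddit.com/r/programming and find the top post of the day.",
        model="claude-3.7-sonnet-20250219",
    ))
```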


Building Your Own API Client

While curl is great for quick tests, production use demands robust, type-safe clients. Browser Use Cloud provides an OpenAPI specification for easy client code generation.

Python: Async Client Generation

  1. Install generator:
    pipx install openapi-python-client --include-deps
    
  2. Generate client:
    openapi-python-client generate --url https://api.browser-use.com/openapi.json
    
  3. Install and use:
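    import asyncio
    import os

    # NOTE: the package, module, and method names below depend on what the
    # generator emits for the published OpenAPI spec; adjust them to match
    # your generated client.
    from browser_use_api import Client
    from browser_use_api.models import RunTaskRequest

    async def main():
        # Read the key from the environment instead of hard-coding it.
        api_key = os.environ["BROWSER_USE_API_KEY"]

        client = Client(base_url="https://api.browser-use.com/api/v1")
        request = RunTaskRequest(task="Go to ycombinator.com and list the top 3 companies.")

        # Create the task; the response carries the task ID and live preview URL.
        response = await client.run_task.api_v1_run_task_post(
            client=client,
            json_body=request,
            headers={"Authorization": f"Bearer {api_key}"}
        )

        if response:
            print(f"Task created with ID: {response.task_id}")
            print(f"Live URL: {response.live_url}")

    if __name__ == "__main__":
        asyncio.run(main())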
    

TypeScript/JavaScript: Typed API Integration

  1. Install and generate types:
    npm install -D openapi-typescript
    npx openapi-typescript https://api.browser-use.com/openapi.json -o src/browser-use-api.ts
    
  2. Use with your HTTP client:
    import { paths } from './src/browser-use-api';
    
    const API_URL = "https://api.browser-use.com/api/v1";
    
    type RunTaskRequest = paths["/run-task"]["post"]["requestBody"]["content"]["application/json"];
    type RunTaskResponse = paths["/run-task"]["post"]["responses"]["200"]["content"]["application/json"];
    
    async function createTask(task: string, apiKey: string): Promise<RunTaskResponse> {
      const body: RunTaskRequest = { task };
    
      const response = await fetch(`${API_URL}/run-task`, {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
          'Authorization': `Bearer ${apiKey}`,
        },
        body: JSON.stringify(body),
      });
    
      if (!response.ok) {
        throw new Error(`API request failed with status ${response.status}`);
      }
    
      return response.json() as Promise<RunTaskResponse>;
    }
    
    async function run() {
      const apiKey = process.env.BROWSER_USE_API_KEY;
      if (!apiKey) {
        throw new Error("API key not found in environment variables.");
      }
    
      try {
        const result = await createTask("Find the current weather in New York City.", apiKey);
        console.log("Task created:", result);
      } catch (error) {
        console.error("Failed to create task:", error);
      }
    }
    
    run();
    

💡 Want an API testing platform that generates beautiful documentation and streamlines workflow? Explore Apidog, designed for API-focused teams who want seamless collaboration and maximum productivity. Apidog also offers a more affordable alternative to Postman.
