How to Control Your Browser with AI Using Browser Use, Ollama, and DeepSeek: A Beginner’s Guide

Learn to control your browser with AI using Browser Use, Ollama, and DeepSeek in this beginner’s guide. Automate web tasks like flight searches locally!

Ashley Goolam

22 April 2025

Want your AI to take the wheel and surf the web for you—booking flights, scraping data, or even filling out forms? With Browser Use, Ollama, and DeepSeek, you can create a local AI agent that controls your browser like a pro. This open-source trio delivers privacy-focused automation without pricey subscriptions. In this beginner’s guide, I’ll be walking you through setting up Browser Use with Ollama and DeepSeek to automate web tasks. Ready to make your browser an AI-powered sidekick? Let’s dive in!

But before we jump into Browser Use, a quick shoutout to Apidog—an awesome tool for API lovers! It simplifies designing, testing, and documenting APIs, perfect for tweaking your AI agent’s integrations. Check it out at apidog.com—it’s a coder’s dream!

What is Browser Use with Ollama and DeepSeek?

Browser Use is an open-source Python library that lets AI agents control web browsers, automating tasks like searching, clicking links, or submitting forms. Combined with Ollama, a platform for running large language models (LLMs) locally, and DeepSeek, a powerful open-source reasoning model, you get a free, private setup that rivals premium tools like OpenAI’s Operator. Browser Use uses Playwright to interact with browsers (Chrome, Firefox, etc.), while DeepSeek’s smarts handle complex instructions. Why’s it awesome? It’s local, customizable, and lets your AI tackle tasks like finding flights on Kayak, drafting Google Docs, and much more. Let’s build it!

Setting Up Your Environment: The Basics

Before we unleash Browser Use, let’s get your system ready with the tools you’ll need. This setup is beginner-friendly, with each step explained so you know exactly what’s happening.

Step 1: Prerequisites

Ensure you have the following installed: Python 3.11 or later, Git (to clone the repository), and, optionally, VS Code as your editor.

Step 2: Create a Project Folder

Let’s keep your project organized by creating a dedicated folder:

mkdir browser-use-agent
cd browser-use-agent

This folder will hold all your Browser Use files, and cd moves you into it so you’re ready for the next steps.

Step 3: Clone the Repository

Grab the Browser Use source code from GitHub:

git clone https://github.com/browser-use/browser-use.git
cd browser-use

The git clone command downloads the latest Browser Use code, and cd browser-use puts you inside the project directory where the magic happens.

Step 4: Set Up a Virtual Environment

To avoid conflicts with other Python projects, create a virtual environment:

python -m venv venv

Activate it by running source venv/bin/activate on macOS or Linux, or venv\Scripts\activate on Windows.

You’ll see (venv) in your terminal, meaning you’re now in a clean Python environment. This keeps Browser Use’s dependencies isolated, preventing version clashes.
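If you’re not sure whether the activation worked, Python itself can tell you. Here’s a minimal sketch that reports whether you’re inside a virtual environment and on a new enough Python. The helper names in_virtualenv and meets_min_version are my own, and the 3.11 minimum is taken from the Playwright troubleshooting note in this guide:

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the venv folder while
    # sys.base_prefix still points at the system-wide Python install.
    return sys.prefix != sys.base_prefix


def meets_min_version(minimum=(3, 11), current=None) -> bool:
    """Return True when `current` (default: this interpreter) is >= `minimum`."""
    if current is None:
        current = sys.version_info[:3]
    return tuple(current) >= tuple(minimum)


if __name__ == "__main__":
    print("Inside a virtual environment:", in_virtualenv())
    print("Python 3.11 or newer:", meets_min_version())
```

Save it under any file name you like and run it with python from the same terminal. If the first line prints False, activate the environment again before installing anything.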

Step 5: Open in VS Code

Launch your project in Visual Studio Code for easy coding:

code .

VS Code will open the browser-use folder, ready for you to create and run scripts. If you don’t have VS Code, install it from their official website or use another editor, but VS Code’s Python integration is super handy.

Installing Ollama and DeepSeek

Now, let’s set up Ollama to run DeepSeek locally, giving your Browser Use agent a brain that’s both powerful and private. Each step is crucial, so I’ll break it down clearly.

Step 1: Install Ollama

Head to ollama.com and download the installer for your OS (Mac, Windows, or Linux). Run the installer and follow the prompts—it’s a quick “next, next, finish” deal. Verify it’s working:

ollama --version

You should see a version number printed. If the command fails, ensure Ollama is added to your system’s PATH (check the installer’s instructions). Ollama acts as the server that hosts DeepSeek, connecting it to Browser Use.
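If you’d rather check the PATH from Python, the standard library can do the lookup for you. This is a small sketch, not an official Ollama tool: find_executable and ollama_version are hypothetical helper names, and the only command it runs is the same ollama --version shown above.

```python
import shutil
import subprocess


def find_executable(name: str):
    """Return the full path of `name` if it is on PATH, else None."""
    return shutil.which(name)


def ollama_version():
    """Return the output of `ollama --version`, or None if Ollama is not on PATH."""
    path = find_executable("ollama")
    if path is None:
        return None
    completed = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    # Some tools print version info to stderr, so fall back to it if stdout is empty.
    return completed.stdout.strip() or completed.stderr.strip()


if __name__ == "__main__":
    version = ollama_version()
    print(version if version else "ollama was not found on PATH")
```

A "not found" message here means the installer didn’t put Ollama on your PATH, or the terminal was opened before the install finished. Opening a fresh terminal often fixes the latter.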

Step 2: Download DeepSeek

We’ll use the deepseek/seed model, a 7B-parameter LLM optimized for reasoning and perfect for our needs:

ollama pull deepseek/seed

This downloads the model, which is about 12GB, so it might take a few minutes depending on your internet speed. If the model is too large, or you don’t have a GPU, try a smaller model such as qwen2.5:7b, which is roughly 4.7GB (the qwen2.5:14b tag is closer to 9GB). Once the download finishes, check that the model is installed:

ollama list

Look for deepseek/seed:latest in the list. This model will power your Browser Use agent, handling tasks like searching for Boston’s weather with ease.
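You can run the same check from Python, which comes in handy later if you want your script to fail early with a clear message. The sketch below assumes Ollama’s local REST API is listening on its default address, http://localhost:11434, and that the /api/tags endpoint returns the installed models as JSON. The function names are my own.

```python
import json
import urllib.request


def parse_model_names(payload: bytes) -> list:
    """Extract model names from the JSON body returned by /api/tags."""
    data = json.loads(payload)
    return [entry["name"] for entry in data.get("models", [])]


def installed_models(base_url: str = "http://localhost:11434", timeout: float = 5.0):
    """Return installed model names, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return parse_model_names(resp.read())
    except OSError:
        # URLError, refused connections, and timeouts are all OSError subclasses.
        return None


if __name__ == "__main__":
    models = installed_models()
    if models is None:
        print("Could not reach Ollama; is the server running?")
    else:
        print("deepseek/seed:latest installed:", "deepseek/seed:latest" in models)
```

If you pulled a different model, swap the name in the last line for the one ollama list shows.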

Installing Browser Use

With your environment ready, let’s install Browser Use and its dependencies to enable browser automation. This is where your project starts coming together!

Step 1: Install Browser Use and Dependencies

In your activated virtual environment (inside the browser-use folder), install Browser Use with its development dependencies:

pip install ".[dev]"

This command installs Browser Use from the cloned repo, including extra tools for development. The "[dev]" part ensures you get testing and debugging goodies, which are helpful for beginners.

Step 2: Install LangChain and Ollama

Add the packages needed to connect Browser Use to DeepSeek:

pip install langchain langchain-ollama

langchain provides the framework for LLM interactions, and langchain-ollama is the specific connector for Ollama’s models, making it easy to integrate DeepSeek.

Step 3: Install Playwright

Get Playwright, the engine that lets Browser Use control browsers:

playwright install

This downloads the browser binaries (e.g., Chromium) that Browser Use uses to navigate the web. If you hit issues, ensure you’re using Python 3.11+ or run playwright install-deps to install extra system dependencies.

Configuring Browser Use with Ollama and DeepSeek

Let’s get Browser Use ready to work with Ollama’s DeepSeek model by starting the Ollama server. This step is short and sweet, as our script will handle the rest of the connection!

Start Ollama Server: Ensure Ollama is running to serve the DeepSeek model. In a separate terminal (outside your virtual environment), run:

ollama serve

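This starts the Ollama server at http://localhost:11434, allowing Browser Use to communicate with DeepSeek. Keep this terminal open during your project, as it’s the bridge between your agent and the LLM. If it’s not running, your script will fail, so double-check! If the command reports that the address is already in use, Ollama is most likely already running in the background (the desktop app starts it for you), and you can skip this step.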
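For a quick way to confirm that something is listening on that port, here’s a minimal sketch using only the standard library. The function name port_is_open is my own, and a True result only tells you that a program accepts connections on port 11434, not that it is definitely Ollama.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 11434, timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if port_is_open():
        print("Something is listening on localhost:11434")
    else:
        print("Nothing is listening on localhost:11434; start it with: ollama serve")
```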

Building Your Browser Use Agent

Now for the fun part—building an AI agent that controls your browser with Browser Use! We’ll create a script to make DeepSeek use Google to find the weather in Boston, Massachusetts, and run it in VS Code. Each step is detailed to ensure you nail it.

1. Create a File Called test.py: In VS Code, with your browser-use project open, create a new file named test.py in the browser-use folder. Paste this code:

import asyncio

from browser_use import Agent
from langchain_ollama import ChatOllama


# Task: use Google to find the weather in Boston, Massachusetts.
async def run_search():
    agent = Agent(
        task="Use Google to find the weather in Boston, Massachusetts",
        llm=ChatOllama(
            model="deepseek/seed",  # must match a name shown by `ollama list`
            num_ctx=32000,  # context window, in tokens
        ),
        max_actions_per_step=3,  # cap on browser actions per step
        tool_call_in_content=False,
    )
    # Stop after 15 steps even if the task is not finished.
    result = await agent.run(max_steps=15)
    return result


async def main():
    result = await run_search()
    print("\n\n", result)


if __name__ == "__main__":
    asyncio.run(main())

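This script sets up a Browser Use agent that takes the task as a plain-English instruction, uses the DeepSeek model served by Ollama (with a 32,000-token context window) to decide what to do, performs at most three browser actions per step, stops after 15 steps, and prints the final result.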

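2. Select the Python Interpreter in VS Code: To run the script, VS Code needs to use the Python interpreter from your project’s virtual environment. Open the Command Palette (Ctrl+Shift+P, or Cmd+Shift+P on macOS), run “Python: Select Interpreter,” and pick the entry that points to your venv folder.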

3. Run the Code: With test.py open, click the “Run” button in VS Code (the top-right triangle) or use the terminal (inside the browser-use folder with the virtual environment active):

python test.py

Your Browser Use agent will launch a browser, go to Google, search for “weather in Boston, Massachusetts,” and extract the result.

When I ran this, it printed something like “The current temperature in Boston, MA, is 26°F.” If it doesn’t work, ensure Ollama’s server is running (ollama serve) and port 11434 is open. If it stalls, check ~/.ollama/logs for errors (that’s the default log location on macOS; other systems store logs elsewhere).

Prompt Engineering for Better Results

To get the best from Browser Use, craft precise prompts.

Adding “sort by price” to my flight prompt saved me cash—prompts are key!
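One way to keep prompts precise is to assemble them from explicit parts so nothing gets forgotten. The helper below is purely illustrative: build_task is a hypothetical function of my own, not part of Browser Use, and the flight route is made-up example data. The string it returns is what you would pass as the task argument to Agent.

```python
def build_task(goal: str, site: str = "", constraints=(), output: str = "") -> str:
    """Assemble a specific task string from its parts."""
    parts = []
    if site:
        parts.append(f"Go to {site}.")
    # Normalize trailing periods so each part ends with exactly one.
    parts.append(goal.rstrip(".") + ".")
    for constraint in constraints:
        parts.append(constraint.rstrip(".") + ".")
    if output:
        parts.append(f"Return {output}.")
    return " ".join(parts)


if __name__ == "__main__":
    print(build_task(
        goal="Find one-way flights from Boston to Chicago",
        site="kayak.com",
        constraints=["Sort by price"],
        output="the three cheapest options with airline and price",
    ))
```

Spelling out the site, the constraints, and the output you expect gives the model less room to wander, which matters even more with smaller local models.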

Why Browser Use, Ollama, and DeepSeek Rock

This setup shines because it’s free and open-source, runs entirely on your own machine, and keeps your data private.

It’s a budget-friendly alternative to premium AI agents, with full control.

Pro Tips for Browser Use Success

Wrapping Up: Your Browser Use Adventure Begins

Congrats—you’ve built an AI agent that controls your browser with Browser Use, Ollama, and DeepSeek! From booking flights to automating web tasks, you’re ready to let AI do the heavy lifting. Try scraping job listings or automating emails next—the sky’s the limit. Visit the Browser Use GitHub for more examples, and join the AI hype. And don't forget to swing by apidog.com for that API polish.
