The rise of open-source large language models (LLMs) has made it easier than ever to create AI-driven tools that rival proprietary solutions like OpenAI’s ChatGPT Operator. Among these open-source models, DeepSeek R1 stands out for its robust reasoning capabilities, free accessibility, and adaptability. By combining DeepSeek R1 with tools like Browser Use, you can build a powerful, fully open-source alternative to ChatGPT Operator without spending hundreds of dollars on premium subscriptions.
This article will guide you through the process of setting up DeepSeek R1 and Browser Use to create an AI agent capable of performing complex tasks, including web automation, reasoning, and natural language interactions.
Whether you're a beginner or an experienced developer, this step-by-step guide will help you get started.
What Is ChatGPT Operator and Why Do You Need an Open Source Alternative?
ChatGPT Operator is a premium AI agent from OpenAI that uses its own browser to carry out tasks for you, such as navigating websites, filling in forms, and working through multi-step problems.
At launch, access required the ChatGPT Pro plan at $200 per month, making it less accessible for individuals, small businesses, or organizations with limited budgets.
4. Booking a one-way flight from Zurich to Vienna using the Booking integration
This one required a bit of back and forth, with ChatGPT Operator pinging me and asking for my flight preference and having me take control of entering payment details pic.twitter.com/XZiqUsQgVh
Rowan Cheung (@rowancheung), January 23, 2025
In the video above, ChatGPT Operator books a plane ticket.
Why You Need an Open Source Alternative
While ChatGPT Operator is powerful, it has several limitations that make an open source alternative appealing:
- Cost: The $200/month subscription fee can be prohibitive for many users.
- Data Privacy: Using proprietary APIs requires sending data to external servers, which may not comply with privacy policies or regulatory requirements.
- Limited Customization: Proprietary solutions often restrict fine-tuning or task-specific optimizations, limiting their adaptability for specialized use cases.
By opting for open-source tools like DeepSeek R1 and Browser Use, you can overcome these challenges and unlock several benefits:
- Cost Savings: Both DeepSeek R1 and Browser Use are free to download and open source, so running them locally eliminates subscription fees (the hosted DeepSeek API, by contrast, is billed by usage).
- Full Control: Hosting the tools locally or on your own server ensures complete data privacy and security.
- Customizability: You can fine-tune the model for specific tasks, integrate it with other tools, and modify the system to meet your unique requirements.
An open source approach not only reduces dependency on proprietary platforms but also empowers you to build a solution tailored to your needs while maintaining control over costs and data.
You have to take a look at Apidog, the All-in-One API Testing tool that runs you through the whole cycle, from API design down to API Documentation, and turbo-charge your development team's productivity!
Key Components: DeepSeek R1 and Browser Use
DeepSeek R1
DeepSeek R1 is an open-source LLM optimized for reasoning tasks. It excels in chain-of-thought problem solving, coding assistance, and natural language understanding. Alongside the full model, distilled variants are available in multiple sizes (e.g., 1.5B and 7B parameters), making it adaptable to different hardware capabilities.
Browser Use
Browser Use is an open-source tool that enables AI agents to perform browser-based tasks such as web scraping, form filling, and automated navigation. It provides a user-friendly interface and can be integrated with LLMs like DeepSeek R1 for enhanced functionality.
Step 1: Setting Up Your Environment
Hardware Requirements
- For smaller versions of DeepSeek R1 (e.g., 1.5B parameters), a CPU or mid-range GPU (8GB VRAM) is sufficient.
- Larger versions require high-end GPUs (e.g., NVIDIA A100 or RTX 4090), and the largest variants may need several GPUs, quantization, or CPU offloading.
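As a rough way to sanity-check these hardware guidelines, the memory needed to hold a model's weights is approximately the parameter count multiplied by the bytes per parameter. The sketch below is a back-of-the-envelope estimate only. The function name and the 20% overhead factor (for activations and cache) are assumptions made for illustration, not figures published by DeepSeek.

```python
def estimate_vram_gb(params_billion, bits_per_param=4, overhead=1.2):
    """Rough memory estimate in GB: weight size times an assumed overhead factor."""
    bytes_per_param = bits_per_param / 8
    # 1e9 parameters * bytes per parameter / 1e9 bytes per GB
    weights_gb = params_billion * bytes_per_param
    return round(weights_gb * overhead, 1)


if __name__ == "__main__":
    for size in (1.5, 7, 70):
        four_bit = estimate_vram_gb(size)
        sixteen_bit = estimate_vram_gb(size, bits_per_param=16)
        print(f"{size}B: ~{four_bit} GB at 4-bit, ~{sixteen_bit} GB at 16-bit")
```

Under these assumptions a 4-bit 70B model needs on the order of 42 GB, which is more than the 24 GB available on a single RTX 4090, so plan for multiple GPUs or CPU offloading at that size.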
Operating System
- Linux or macOS is recommended for ease of setup. Windows users can use WSL (Windows Subsystem for Linux).
Python Environment
Create a Python virtual environment to isolate dependencies:
python -m venv venv
source venv/bin/activate # On Linux/macOS
# On Windows:
# venv\Scripts\activate
Install the required libraries, including the openai package used by the API example in Step 2:
pip install torch torchvision transformers sentencepiece openai
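To confirm the install worked inside the active virtual environment, a small check like the following can help. It is a minimal sketch using only the standard library. The helper name `missing_packages` is made up for this example, and the package list simply mirrors the libraries this guide relies on.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of `names` that cannot be imported in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # openai is included because the API example in Step 2 imports it.
    required = ["torch", "torchvision", "transformers", "sentencepiece", "openai"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```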
Step 2: Run DeepSeek with API or Locally with Ollama
DeepSeek API Usage
To interact with the DeepSeek API, follow these steps:
Obtain an API Key:
- Register on the DeepSeek platform and generate an API key from the "API Keys" section. Save this key securely as it will not be shown again.
Make Your First API Call:
The DeepSeek API is compatible with OpenAI's API format, making it easy to integrate with existing OpenAI SDKs or software. Below is an example of a Python implementation:
from openai import OpenAI

client = OpenAI(api_key="<Your_DeepSeek_API_Key>", base_url="https://api.deepseek.com")
response = client.chat.completions.create(
    model="deepseek-reasoner",  # Use 'deepseek-reasoner' for DeepSeek-R1
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum entanglement."},
    ],
    stream=False,  # Set to True if you want streaming responses
)
print(response.choices[0].message.content)
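Since the key should be stored securely, it is usually better to read it from the environment than to paste it into source code. The helper below is one possible sketch. The variable name `DEEPSEEK_API_KEY` and the function `load_api_key` are assumptions chosen for this example, not names required by DeepSeek or the OpenAI SDK.

```python
import os


def load_api_key(env=None, var_name="DEEPSEEK_API_KEY"):
    """Read the API key from an environment mapping, failing loudly if absent."""
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return key


if __name__ == "__main__":
    try:
        load_api_key()
        print("API key found in the environment.")
    except RuntimeError as exc:
        print(exc)
```

The client from the example above could then be created with `OpenAI(api_key=load_api_key(), base_url="https://api.deepseek.com")`.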
cURL Example:
If you prefer using cURL, here’s how you can make a request:
curl https://api.deepseek.com/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <Your_DeepSeek_API_Key>" \
  -d '{
        "model": "deepseek-reasoner",
        "messages": [
          {"role": "system", "content": "You are a helpful assistant."},
          {"role": "user", "content": "What is the capital of France?"}
        ],
        "stream": false
      }'
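If you would rather avoid both cURL and the OpenAI SDK, the same request can be assembled with Python's standard library. This is a minimal sketch that mirrors the cURL call above. The function names `build_chat_request` and `send` are illustrative, and the response shape read in `send` assumes the OpenAI-compatible format described earlier.

```python
import json
import urllib.request


def build_chat_request(api_key, prompt, model="deepseek-reasoner",
                       base_url="https://api.deepseek.com"):
    """Build (but do not send) a POST request mirroring the cURL call."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        "stream": False,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Keeping request construction separate from sending makes it easy to inspect the URL, headers, and body before any network call is made.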
Model Selection:
- Specify model="deepseek-reasoner" for DeepSeek-R1.
- Use model="deepseek-chat" for general-purpose chat tasks.
The base_url can also be set to https://api.deepseek.com/v1 for OpenAI-compatible configurations, though the /v1 path has no relationship with model versions.
Running DeepSeek Locally with Ollama
Ollama simplifies running large language models like DeepSeek-R1 on your local machine. Here’s how to correctly set up and use it:
Install Ollama:
- Download and install Ollama from its official website.
Pull the Desired Model:
Use the following commands to download specific versions of DeepSeek-R1:
# For the 7B model (default):
ollama pull deepseek-r1:7b
# For a smaller 1.5B model:
ollama pull deepseek-r1:1.5b
# For larger models like 70B:
ollama pull deepseek-r1:70b
Run the Model Locally:
Once downloaded, run the model using:
ollama run deepseek-r1:7b
This will start an interactive session where you can interact with the model directly.
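R1-style models typically emit their chain of thought before the final answer, wrapped in `<think>...</think>` tags in the raw output (depending on your Ollama version, the reasoning may instead be separated for you). If you capture the raw text and only want the final answer, a small helper like this sketch can split the two. The function name `split_reasoning` is an assumption for this example.

```python
import re

# Non-greedy match so multiple <think> blocks are handled separately.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text):
    """Return (reasoning, answer) from raw model output containing <think> tags."""
    reasoning = "\n".join(part.strip() for part in THINK_BLOCK.findall(text))
    answer = THINK_BLOCK.sub("", text).strip()
    return reasoning, answer


if __name__ == "__main__":
    raw = "<think>2 + 2 is basic arithmetic.</think>\nThe answer is 4."
    reasoning, answer = split_reasoning(raw)
    print("Reasoning:", reasoning)
    print("Answer:", answer)
```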
Model Variants:
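DeepSeek offers several distilled versions based on Qwen and Llama architectures, optimized for different use cases. Tag names in the Ollama library can change, so confirm the exact tags on the model's Ollama page before running these: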
DeepSeek-R1-Distill-Qwen-7B:
ollama run deepseek-r1:7b-qwen-distill
DeepSeek-R1-Distill-Llama-70B:
ollama run deepseek-r1:70b-llama-distill
Hardware Considerations:
- Smaller models like 1.5B or 7B can run on consumer-grade GPUs or even CPUs.
- Larger models (e.g., 70B) require high-end GPUs with significant VRAM (e.g., NVIDIA A100). A single 24 GB card such as the RTX 4090 generally cannot hold a 70B model without CPU offloading.
Interactive Chat via API:
Ollama provides an API to integrate locally running models into your applications:
curl http://localhost:11434/api/chat -d '{
  "model": "deepseek-r1:7b",
  "messages": [
    {"role": "user", "content": "Write a short poem about the stars."}
  ]
}'
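By default, Ollama's /api/chat endpoint streams its reply as newline-delimited JSON objects, each carrying a fragment of the message plus a `done` flag. The sketch below shows one way to reassemble the fragments in Python using only the standard library. The function names are illustrative, and `chat` assumes an Ollama server listening on its default address, http://localhost:11434.

```python
import json
import urllib.request


def collect_stream(lines):
    """Join the content fragments from Ollama's newline-delimited JSON chunks."""
    parts = []
    for line in lines:
        if isinstance(line, bytes):
            line = line.decode("utf-8")
        line = line.strip()
        if not line:
            continue
        chunk = json.loads(line)
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):
            break
    return "".join(parts)


def chat(prompt, model="deepseek-r1:7b", host="http://localhost:11434"):
    """Send one user message to a local Ollama server and return the full reply."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    request = urllib.request.Request(
        host + "/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    # The response object yields one bytes line per streamed JSON chunk.
    with urllib.request.urlopen(request) as response:
        return collect_stream(response)
```

With the server running, `print(chat("Write a short poem about the stars."))` would print the assembled reply.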
Step 3: Installing Browser Use
Browser Use enables your AI agent to interact with web browsers. Follow these steps:
Installation
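Clone the Browser Use WebUI repository from GitHub (the webui.py script and its requirements.txt live in the separate web-ui repository rather than in the core browser-use library):
git clone https://github.com/browser-use/web-ui.git
cd web-ui
pip install -r requirements.txt
# Download the browsers that Playwright drives
playwright install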
Configuration
Set up the Browser Use WebUI:
python webui.py
Open the WebUI in your browser to configure agent settings. You can specify:
- The LLM model (e.g., DeepSeek R1)
- Browser settings (e.g., window size)
Step 4: Combining DeepSeek R1 and Browser Use
To create a functional AI agent that integrates both tools:
Agent Configuration
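Modify the agent settings in Browser Use to connect it with DeepSeek R1. The values below are illustrative and assume the model is served locally by Ollama on its default port (11434), as set up in Step 2. Adjust the model tag and URL to match your own setup:
{
  "model": "deepseek-r1:7b",
  "base_url": "http://localhost:11434",
  "browser_settings": {
    "window_height": 1080,
    "window_width": 1920,
    "keep_browser_open": true
  }
}
Running the Agent
Start both DeepSeek R1 and Browser Use:
# Start the Ollama server so DeepSeek R1 is reachable over HTTP
# (skip this if the Ollama app is already running in the background)
ollama serve
# Start Browser Use WebUI
python webui.py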
Once both services are running, the agent can perform tasks such as filling forms, scraping data, or navigating websites autonomously.
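Before handing the agent a task, it can help to confirm that both services are actually listening. The following is a minimal sketch using plain TCP connection checks. The function names are made up for this example, and the hosts and ports are whatever your own setup uses.

```python
import socket
import time


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_services(services, retries=10, delay=1.0):
    """Poll each (name, host, port) entry; return the names that never came up."""
    pending = list(services)
    for _ in range(retries):
        pending = [s for s in pending if not is_port_open(s[1], s[2])]
        if not pending:
            break
        time.sleep(delay)
    return [name for name, _, _ in pending]
```

For example, `wait_for_services([("model server", "127.0.0.1", 11434)])` returns an empty list once something is listening on that port, and the list of unreachable service names otherwise.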
Step 5: Prompt Engineering for Better Results
To optimize the performance of your AI agent, use prompt engineering techniques. For example:
General Prompt Template
<instructions>
You are an AI assistant tasked with automating web tasks using Browser Use.
Follow these steps:
1. Navigate to [website].
2. Perform [specific task].
3. Return results in a structured format.
</instructions>
<example>
Navigate to https://example.com and extract all hyperlinks.
</example>
This structure ensures clarity and improves task execution accuracy.
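If you reuse the template across many tasks, filling it in programmatically keeps the wording consistent. The sketch below is one way to do that. `PROMPT_TEMPLATE` copies the instruction block above with its two bracketed slots turned into placeholders, and `build_prompt` is a hypothetical helper name.

```python
PROMPT_TEMPLATE = """<instructions>
You are an AI assistant tasked with automating web tasks using Browser Use.
Follow these steps:
1. Navigate to {website}.
2. Perform {task}.
3. Return results in a structured format.
</instructions>"""


def build_prompt(website, task):
    """Fill the general template with a concrete website and task."""
    if not website.strip() or not task.strip():
        raise ValueError("Both website and task must be non-empty.")
    return PROMPT_TEMPLATE.format(website=website.strip(), task=task.strip())


if __name__ == "__main__":
    print(build_prompt("https://example.com",
                       "an extraction of all hyperlinks on the page"))
```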
Here are some demos that you can try out by running the following from a checkout of the core browser-use repository:
uv pip install gradio
python examples/gradio_demo.py
Example 1.
Prompt: Write a letter in Google Docs to my Papa, thanking him for everything, and save the document as a PDF.
Example 2.
Prompt: Find flights on kayak.com from Zurich to Beijing from 25.12.2024 to 02.02.2025.
Example 3.
Prompt: Read my CV & find ML jobs, save them to a file, and then start applying for them in new tabs, if you need help, ask me.
Conclusion
By combining DeepSeek R1 with Browser Use, you can build a fully functional ChatGPT Operator alternative that is free, open source, and highly customizable. This setup not only saves costs but also gives you full control over data privacy and system behavior.
Whether you're automating web tasks, building conversational agents, or experimenting with more advanced AI features, this guide gives you a working starting point. Embrace the power of open source and create your own intelligent assistant today!