How to Recreate OpenAI Deep Research, But Open Source

In the rapidly evolving field of artificial intelligence, open-source initiatives are gaining momentum, offering accessible alternatives to proprietary models. A notable example is the "Open Deep Research" project, an open-source alternative to proprietary AI research tools.

Emmanuel Mumba

15 June 2025

Deep research in artificial intelligence is not a single monolithic model; rather, it is an iterative workflow that involves searching, reading, and reasoning until an answer is found. OpenAI’s proprietary systems, such as those powering ChatGPT or GPT-4, use complex pipelines that continuously refine responses. Now imagine building a similar system using open-source tools. This article explains how to recreate a Deep Research system using the jina-ai/node-DeepResearch project. We will break down the code, detail each component, and show you how to set up and extend the system.

1. Overview and Purpose

DeepResearch is built around a simple yet powerful idea:

Keep searching and reading webpages until finding the answer (or exceeding the token budget).

The system takes a query (for example, “who is bigger? cohere, jina ai, voyage?”) and enters a loop. At each step, the agent (an intelligent module) decides on an action. It might search for new keywords, read the contents of URLs, reflect by generating follow-up questions, or provide an answer if it is certain. This iterative cycle continues until the answer is definitive or the token budget (a proxy for computational resources) is exceeded.
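The cycle described above can be sketched in TypeScript. In this sketch, the stub decideNextStep stands in for the language model, and all names and thresholds are illustrative rather than the project's actual API:

```typescript
// Illustrative sketch of the search–read–reflect–answer cycle.
// The action names mirror the article; the stub logic is hypothetical.
type Action = "search" | "visit" | "reflect" | "answer";

interface StepResult {
  action: Action;
  answer?: string;
  isDefinitive?: boolean;
}

// Stub "agent" that answers on the third step, to show the control flow.
function decideNextStep(step: number): StepResult {
  if (step < 3) return { action: step === 1 ? "search" : "visit" };
  return { action: "answer", answer: "42", isDefinitive: true };
}

function deepResearch(tokenBudget: number, tokensPerStep: number): string {
  let tokensUsed = 0;
  let step = 0;
  while (tokensUsed < tokenBudget) {
    step += 1;
    tokensUsed += tokensPerStep;
    const result = decideNextStep(step);
    if (result.action === "answer" && result.isDefinitive) {
      return result.answer!;
    }
    // Otherwise: keep searching, reading, or reflecting.
  }
  return "budget exhausted"; // in the real system this triggers Beast Mode
}
```

The essential point is the loop invariant: every iteration either spends budget on a new action or terminates with a definitive answer.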

Installation and Setup

Before diving into the code, you need to install the required dependencies and set your API keys. The project uses Gemini for language modeling, Brave or DuckDuckGo for web search, and the Jina Reader for fetching webpage content. Here’s how you set up the project:

export GEMINI_API_KEY=...  # for Gemini API, ask Han
export JINA_API_KEY=jina_...  # free Jina API key, get from https://jina.ai/reader
export BRAVE_API_KEY=...  # optional; if not provided, it defaults to DuckDuckGo search

git clone https://github.com/jina-ai/node-DeepResearch.git
cd node-DeepResearch
npm install

The README also provides example invocations for running the system with different queries, from simple arithmetic to open-ended research questions.

In addition to a command-line interface, the project also includes a web server API that exposes endpoints for submitting queries and streaming progress updates.


2. Architecture and Key Components

Let’s break down the major components of the system by exploring the core files:

2.1 agent.ts – The Core Logic

The agent.ts file is the heart of the system. It implements the logic for the “deep research” cycle: generating prompts, deciding on actions, and iterating through search, read, reflect, and answer steps.

Key Elements in agent.ts:

Imports and Setup:

The file begins by importing the various tools and libraries the agent depends on, from the search and reading utilities to the token and action trackers.

Sleep Function:

async function sleep(ms: number) {
  const seconds = Math.ceil(ms / 1000);
  console.log(`Waiting ${seconds}s...`);
  return new Promise(resolve => setTimeout(resolve, ms));
}

This helper function is used to delay operations—useful to avoid rate-limiting when calling external APIs.
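For instance, a caller can pause between successive external requests. The runQueries function below is a hypothetical illustration (the helper is repeated so the snippet is self-contained, and the "API call" is a placeholder):

```typescript
// Pausing between successive external calls to stay under a rate limit.
async function sleep(ms: number) {
  const seconds = Math.ceil(ms / 1000);
  console.log(`Waiting ${seconds}s...`);
  return new Promise(resolve => setTimeout(resolve, ms));
}

async function runQueries(queries: string[]): Promise<string[]> {
  const results: string[] = [];
  for (const q of queries) {
    results.push(`result for ${q}`); // placeholder for a real API call
    await sleep(10); // short delay between requests
  }
  return results;
}
```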

Schema Generation:

The getSchema function defines the JSON schema for the agent’s response. It dynamically builds a schema whose properties cover the chosen action, the agent’s reasoning, and action-specific fields such as the search query or the final answer.

By enforcing a strict JSON schema, the agent’s output remains consistent and machine-readable.
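The shape of such a schema can be sketched as follows. The property names are taken from the progress event shown later in this article; the real schema in the repository may differ:

```typescript
// Illustrative subset of the kind of JSON schema getSchema builds.
// Property names follow the progress event shown later in the article;
// the actual schema in the repository may differ.
function getSchemaSketch(allowAnswer: boolean) {
  const actions = ["search", "visit", "reflect"];
  if (allowAnswer) actions.push("answer");
  return {
    type: "object",
    properties: {
      action: { type: "string", enum: actions },
      thoughts: { type: "string" },
      searchQuery: { type: "string" },
      answer: { type: "string" },
    },
    required: ["action", "thoughts"],
  };
}
```

Building the schema dynamically lets the system restrict which actions the model may choose at a given step, for example withholding "answer" until enough evidence has been gathered.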

Prompt Generation:

The getPrompt function creates a detailed prompt that is sent to the language model. It aggregates several sections: the current question, previous context, gathered knowledge, and any unsuccessful attempts.

This layered prompt guides the generative AI model to “think” step-by-step and select one action at a time.
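A hypothetical sketch of this layered assembly is shown below. The section labels are illustrative, not the project's actual wording:

```typescript
// Hypothetical sketch of assembling a layered prompt from the sections
// the article describes: question, context, knowledge, failed attempts.
function buildPrompt(
  question: string,
  context: string[],
  knowledge: string[],
  badAttempts: string[]
): string {
  const sections = [
    `## Question\n${question}`,
    context.length ? `## Context\n${context.join("\n")}` : "",
    knowledge.length ? `## Knowledge\n${knowledge.join("\n")}` : "",
    badAttempts.length ? `## Failed attempts\n${badAttempts.join("\n")}` : "",
    "Choose exactly one action: search, visit, reflect, or answer.",
  ];
  // Drop empty sections so the prompt stays compact early in the run.
  return sections.filter(Boolean).join("\n\n");
}
```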

Main Loop in getResponse:

The function getResponse is the core of the agent’s iterative loop. It sets up the initial context: the question, the token budget, and the trackers that monitor progress.

Inside a while loop, the agent generates a prompt, asks the model to choose the next action, and executes it: searching, reading URLs, reflecting, or answering.

If the loop runs out of budget or too many bad attempts occur, the system enters “Beast Mode,” where a final, aggressive attempt to answer is made.
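The trigger condition can be sketched as a simple predicate. The parameter names and thresholds here are illustrative, not taken from the repository:

```typescript
// Illustrative fallback check: enter "Beast Mode" when the token budget
// is exhausted or too many failed answers have accumulated.
function shouldEnterBeastMode(
  tokensUsed: number,
  tokenBudget: number,
  badAttempts: number,
  maxBadAttempts: number
): boolean {
  return tokensUsed >= tokenBudget || badAttempts >= maxBadAttempts;
}
```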

Context Storage:

The storeContext function writes the current prompt and various memory states (context, queries, questions, and gathered knowledge) to files. This archival process aids debugging and allows for further analysis of the decision-making process.

Final Execution:

The main() function at the end of agent.ts uses the command-line argument (the query), invokes getResponse, and prints the final answer along with a summary of token usage.


2.2 config.ts – Configuring the Environment

The config.ts file is where the environment and model configurations are defined: API keys, search-provider selection, and model settings are centralized in one place.

This configuration file makes it easy to change settings and adapt the system to different environments or model behaviors.
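A minimal configuration module in this spirit might read the same environment variables used in the setup section above; the field names and fallback logic here are assumptions, not the repository's actual code:

```typescript
// Illustrative configuration object: read API keys from the environment
// and pick a search provider, falling back to DuckDuckGo when no Brave
// key is present (as described in the setup section).
const configSketch = {
  geminiApiKey: process.env.GEMINI_API_KEY ?? "",
  jinaApiKey: process.env.JINA_API_KEY ?? "",
  braveApiKey: process.env.BRAVE_API_KEY, // optional
  searchProvider: process.env.BRAVE_API_KEY ? "brave" : "duckduckgo",
};
```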


2.3 server.ts – The Web Server API

To allow users to interact with DeepResearch via HTTP requests, the system includes a simple Express-based server in server.ts. This file sets up endpoints that handle query submissions and stream progress updates in real time.

Key Points in server.ts:

Express Setup:

The server uses Express and CORS to support cross-origin requests. It listens on port 3000 (or a port specified in the environment).

Query Endpoint (POST /api/v1/query):
Accepts a research query and returns a request ID identifying the running task.

Streaming Endpoint (GET /api/v1/stream/:requestId):
Streams progress updates for the given request in real time, so clients can follow the agent’s reasoning step by step.

Task Storage and Retrieval:

The server writes task results to the file system (under a tasks directory) and provides an endpoint (GET /api/v1/task/:requestId) to retrieve a stored result.

This web server component makes the research agent accessible over HTTP, enabling both interactive experiments and integration into larger systems.


2.4 test-duck.ts – Testing HTTP Requests

The file test-duck.ts is a standalone script that uses Axios to send an HTTP GET request to an external API (in this case, jsonplaceholder.typicode.com) as a test. Although its primary function is to verify that HTTP requests work correctly (including setting proper headers and handling errors), it also serves as an example of how external requests are handled within the system. In a more complex setup, similar patterns are used when querying search APIs like DuckDuckGo or Brave.


2.5 types.ts – Defining Consistent Data Structures

The types.ts file defines all the custom types used across the project:

Action Types:
These include the various actions the agent can perform: searching for keywords, visiting (reading) URLs, reflecting with follow-up questions, and answering.

Response Types:
The file defines structured responses for search results, URL reading, evaluation, error analysis, and more. This helps maintain consistency and ensures that every module interprets the data in the same way.

Schema Types:
The JSON schema definitions ensure that responses generated by the language model strictly adhere to the expected format. This is crucial for downstream processing.

Tracker Context:
Custom types for the token and action trackers are also defined, which are used to monitor the state of the conversation and the research process.
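As a sketch of how such types keep modules consistent, here is a small discriminated union in the spirit of types.ts; the exact fields in the repository may differ:

```typescript
// Illustrative discriminated union for agent steps. The `action` field
// is the discriminant, so each variant carries only its own parameters.
interface SearchAction {
  action: "search";
  searchQuery: string;
}

interface AnswerAction {
  action: "answer";
  answer: string;
  references: string[];
}

type AgentStep = SearchAction | AnswerAction;

function describeStep(step: AgentStep): string {
  // TypeScript narrows the type inside each branch automatically.
  return step.action === "search"
    ? `searching for "${step.searchQuery}"`
    : `answering with ${step.references.length} reference(s)`;
}
```

Because every module shares these definitions, a malformed step (say, a search without a searchQuery) is rejected at compile time rather than at runtime.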


3. The Iterative Deep Research Process

The overall system follows a methodical, iterative process that mimics how a human researcher might work:

Initialization:
The process begins with the original question, which is added to a “gaps” list (i.e., the unknowns that need to be filled).

Prompt Generation:
The agent builds a prompt using the current question, previous context, gathered knowledge, and even unsuccessful attempts. This prompt is then sent to the generative AI model.

Action Selection:
Based on the model’s output, the agent selects one of several actions: searching for new keywords, reading the contents of URLs, reflecting by generating follow-up questions, or providing an answer.

Context Update:
Each step updates the internal trackers (token usage and action state) and archives the current state to files. This ensures transparency and allows for debugging or later review.

Evaluation and Looping:
When an answer is proposed, an evaluation step checks whether it is definitive. If not, the system stores the failed attempt details and adjusts its strategy. The cycle continues until a satisfactory answer is found or the token budget is exhausted.

Beast Mode:
If normal steps fail to yield a definitive answer within the constraints, the system enters “Beast Mode.” In this mode, the generative AI is forced to produce an answer based on the accumulated context—even if it means making an educated guess.


4. Real-Time Progress and Feedback

An integral feature of the DeepResearch system is its real-time feedback mechanism. Through the web server’s streaming endpoint, clients receive a live stream of events describing each step: the action chosen, the agent’s reasoning, and cumulative token usage.

For example, a progress event might look like this:

data: {
  "type": "progress",
  "trackers": {
    "tokenUsage": 74950,
    "tokenBreakdown": {
      "agent": 64631,
      "read": 10319
    },
    "actionState": {
      "action": "search",
      "thoughts": "The text mentions several investors in Jina AI but doesn’t specify ownership percentages. A direct search is needed.",
      "URLTargets": [],
      "answer": "",
      "questionsToAnswer": [],
      "references": [],
      "searchQuery": "Jina AI investor ownership percentages"
    },
    "step": 7,
    "badAttempts": 0,
    "gaps": []
  }
}

This detailed progress reporting allows developers to see how the agent’s reasoning evolves over time, providing insights into both successes and areas needing improvement.
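Events like the one above follow the server-sent-events wire format: a `data:` line terminated by a blank line. A minimal formatter for such frames, independent of the project's actual implementation, looks like this:

```typescript
// Serialize a progress payload into a server-sent-events (SSE) frame.
// SSE frames are "data: <payload>" lines followed by a blank line.
function toSseFrame(payload: object): string {
  return `data: ${JSON.stringify(payload)}\n\n`;
}
```

A streaming endpoint simply writes one such frame per progress update to a response with the `text/event-stream` content type.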


5. Extending and Customizing DeepResearch

The open-source nature of this project means you can adapt the system for your needs. Here are some ideas for extending DeepResearch:

Custom Search Providers:
You might integrate additional search providers or customize the query rewriting process for domain-specific searches.

Enhanced Reading Modules:
If you require more detailed text processing, you can integrate alternative NLP models or adjust the Jina Reader component to handle new content types.

Improved Evaluation:
The evaluator module currently checks if an answer is definitive. You could expand this to incorporate more nuanced metrics, such as sentiment analysis or fact-checking algorithms.

User Interface:
While the current system uses a command-line interface and a simple web server for streaming events, you could build a full-fledged web or mobile interface for interactive research sessions.

Scalability Enhancements:
The current implementation runs as a single-node service. For production use, consider containerizing the application and deploying it using Kubernetes or another orchestration platform to handle high traffic and distributed processing.


6. Security, Performance, and Best Practices

When deploying an AI-driven system like DeepResearch, there are a few additional considerations:

API Key Management:
Ensure that your API keys (for Gemini, Jina, and Brave) are securely stored and never hardcoded in your source code. Environment variables and secure vaults are recommended.

Rate Limiting:
The built-in sleep function helps avoid rate limiting by delaying successive requests. However, consider implementing additional rate-limiting mechanisms at the server or API gateway level.

Data Validation:
Strictly validate input queries and responses. The JSON schema defined in the agent helps, but you should also validate incoming HTTP requests to prevent malicious inputs.

Error Handling:
Robust error handling (as seen in the server code and test-duck.ts) is critical. This ensures that unexpected API failures or malformed responses do not crash the system.

Resource Monitoring:
Tracking token usage is essential. The TokenTracker and ActionTracker classes provide insights into resource consumption. Monitoring these metrics can help in fine-tuning the system’s performance and avoiding excessive usage.
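A minimal tracker in the spirit of the TokenTracker class might aggregate usage per component; this sketch is illustrative, not the project's implementation:

```typescript
// Minimal token tracker: record usage per component and check a budget.
class TokenTrackerSketch {
  private usage = new Map<string, number>();

  track(component: string, tokens: number): void {
    this.usage.set(component, (this.usage.get(component) ?? 0) + tokens);
  }

  total(): number {
    let sum = 0;
    for (const n of this.usage.values()) sum += n;
    return sum;
  }

  overBudget(budget: number): boolean {
    return this.total() >= budget;
  }
}
```

With the agent/read breakdown from the progress event shown earlier (64,631 and 10,319 tokens), such a tracker would report a total of 74,950.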


7. Conclusion

The DeepResearch project by Jina AI exemplifies how complex, iterative research processes can be built using open-source tools. By integrating search engines, generative AI models, and intelligent reasoning loops, the system continuously refines its answer until it is certain—or until resource limits are reached.

In this article, we explored how to recreate OpenAI Deep Research using an open-source approach: an agent loop that searches, reads, reflects, and answers; configuration and type definitions that keep its output consistent; and a web server that streams the agent’s progress in real time.

By making these advanced research techniques available as open-source, projects like DeepResearch democratize access to cutting-edge AI methods. Whether you’re a researcher, developer, or enterprise looking to integrate deep research capabilities into your workflows, this project serves as both an inspiration and a practical foundation for building your own solution.

The iterative design—combining search, reading, reflection, and answering in a continuous loop—ensures that even ambiguous or complex queries are handled with multiple layers of scrutiny. And with a detailed architecture that tracks token usage and provides live feedback, you gain deep insights into the reasoning process behind each answer.

If you are eager to experiment, clone the repository, set up your environment as described, and run queries ranging from simple arithmetic to multifaceted research questions. With a little customization, you can tailor the system to new domains and even enhance its reasoning capabilities. Open-source projects like this pave the way for community-driven innovation in AI research.


By following this detailed breakdown and analysis, you can recreate and extend the ideas behind OpenAI’s Deep Research in a fully open-source manner. Whether you’re looking to build on the existing codebase or integrate similar methodologies into your projects, the roadmap is clear: iterate, refine, and push the boundaries of automated research.
