What is Pyspur?

Pyspur is an open-source platform designed to accelerate the development of AI agents by providing a visual, node-based environment. It enables engineers to build, debug, and deploy complex AI workflows by connecting modular components on a drag-and-drop canvas.
The core problem Pyspur solves is the lack of transparency and the slow iteration cycle common in AI development. It tackles "prompt hell" and "workflow blindspots" by allowing developers to inspect the inputs and outputs of every step in their agent's logic in real-time. The platform includes built-in support for advanced patterns like Retrieval-Augmented Generation (RAG), allows for human-in-the-loop breakpoints, and can deploy any workflow as a production-ready API with a single click. Ultimately, Pyspur helps engineers build more reliable and debuggable AI systems faster.
Let's get started!
1. Environment Setup
Choose the setup option that best suits your objective. For local development and experimentation, the pip installation is sufficient. For scalable or production systems, the Docker-based setup is recommended, as it provides a reproducible, containerized environment with a dedicated PostgreSQL instance.
Option A: Local pip Installation
Prerequisites: Python 3.11+
Install from PyPI:
pip install pyspur
Initialize Project Directory: This command creates a project scaffold, including a .env file for configuration.
pyspur init my-pyspur-project && cd my-pyspur-project
Launch Server: The --sqlite flag directs Pyspur to use a local SQLite database, removing the dependency on a separate database server.
pyspur serve --sqlite
Access UI: Navigate to http://localhost:6080 in your browser.
Option B: Docker Setup
Prerequisites: Docker Engine
Execute Setup Script: This command downloads and runs a shell script that clones the Pyspur repository, configures the docker-compose.dev.yml file, and launches the application stack (frontend, backend, database).
curl -fsSL https://raw.githubusercontent.com/PySpur-com/pyspur/main/start_pyspur_docker.sh | bash -s pyspur-project
Access UI: Navigate to http://localhost:6080.
2. Let's Build a Workflow with Pyspur
Instead of building from scratch, we will load and analyze an existing Pyspur template. This approach provides a realistic look at a non-trivial workflow.
Load the Template:
- On the Pyspur dashboard, click "New Spur".
- In the modal that appears, select the "Templates" tab.
- Choose the "Joke Generator" template. The canvas will populate with a pre-built workflow.
Workflow Analysis:
This workflow is designed to generate a joke and then refine it. It uses a BestOfNNode, an advanced component that runs an LLM prompt N times, uses another LLM call to rate the N outputs, and selects the best one.
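Conceptually, the Best-of-N pattern boils down to "sample N times, score each sample, keep the winner." The sketch below illustrates the idea in plain Python; it is not Pyspur's internal code, and call_llm and rate_candidate are hypothetical stand-ins for the node's two LLM calls.
import random

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM client call.
    return f"candidate joke for: {prompt[:40]}"

def rate_candidate(candidate: str, rating_prompt: str) -> float:
    # Stand-in for the rating LLM call; a real node would parse a
    # 0-10 score out of the model's reply.
    return random.uniform(0, 10)

def best_of_n(task_prompt: str, rating_prompt: str, n: int = 10) -> str:
    # Generate N candidates from the same prompt.
    candidates = [call_llm(task_prompt) for _ in range(n)]
    # Rate each candidate with a second LLM call.
    scores = [rate_candidate(c, rating_prompt) for c in candidates]
    # Keep the highest-rated candidate.
    return max(zip(candidates, scores), key=lambda pair: pair[1])[0]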
Let's break down the key nodes as defined in joke_generator.json:
input_node (InputNode): This node defines the workflow's entry point.
- config.output_schema: { "topic": "string", "audience": "string" }. This specifies that the workflow requires two string inputs: the joke's topic and its intended audience.
JokeDrafter (BestOfNNode): This is the first stage of joke creation.
- config.system_message: "You are a stand-up comedian who uses dark humor like Ricky Gervais or Jimmy Carr..."
- config.user_message: "Your audience is: {{input_node.audience}}\nThe topic should be about {{input_node.topic}}". This Jinja2 template dynamically inserts the data from the input_node (see the rendering sketch after this list).
- config.samples: 10. This instructs the node to generate 10 joke drafts.
- config.rating_prompt: "Rate the following joke on a scale from 0 to 10...". After generating 10 jokes, this prompt is used to have an LLM rate each one. The node then selects the highest-rated joke.
- config.output_schema: { "initial_joke": "string" }. The node's output is a single string: the best joke from the 10 samples.
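To make the templating concrete, here is a minimal rendering sketch using the jinja2 library directly; Pyspur performs the equivalent substitution internally, and the sample inputs are just illustrations:
from jinja2 import Template

# The same template string used in config.user_message above.
user_message = Template(
    "Your audience is: {{input_node.audience}}\n"
    "The topic should be about {{input_node.topic}}"
)

# Jinja2 resolves dotted access against dict keys, so the workflow's
# input payload can be passed in as a plain dict.
print(user_message.render(input_node={"topic": "AI assistants", "audience": "Developers"}))
# Output:
# Your audience is: Developers
# The topic should be about AI assistants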
JokeRefiner (BestOfNNode): This node takes the drafted joke and improves it.
- config.system_message: "Your goal is to refine a joke to make it more vulgar and concise..."
- config.user_message: {{JokeDrafter.initial_joke}}. Crucially, this node's input is the output of the JokeDrafter node.
- config.samples: 3. It generates 3 refined versions of the initial joke.
- It also uses a rating prompt to select the best of the three refinements.
- config.output_schema: { "final_joke": "string" }
Links: The links array in the JSON defines the data flow: input_node -> JokeDrafter -> JokeRefiner.
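Expressed as Python data, the shape of that array is roughly the following. The field names here are illustrative assumptions; check joke_generator.json for the exact schema.
# Illustrative shape only; field names may differ in joke_generator.json.
links = [
    {"source": "input_node", "target": "JokeDrafter"},
    {"source": "JokeDrafter", "target": "JokeRefiner"},
]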
Execution and Inspection:
- In the test panel on the right, use the pre-populated test input or create your own (e.g., Topic: "AI assistants", Audience: "Developers").
- Click Run.
- As the workflow executes, click on the JokeDrafter node. You can inspect its run_data, which will show all 10 generated samples and their corresponding ratings, giving you a transparent view into the agent's "thought process."
- The final, refined joke will be available in the output of the JokeRefiner node.
3. Implementing a RAG Pipeline
While not part of the joke generator, Retrieval-Augmented Generation (RAG) is a critical Pyspur capability. Here is the technical process for adding knowledge to an agent:
- Document Ingestion (Collection): Navigate to the RAG section. When you create a "Document Collection" and upload a file (e.g., a PDF), Pyspur initiates a backend process that parses the document into text, segments it into configurable chunks based on token length, and stores these chunks with source metadata in its database.
- Vectorization (Index): Creating a "Vector Index" from a collection triggers another process. Pyspur iterates through each text chunk, makes an API call to a specified embedding model (e.g., OpenAI's text-embedding-ada-002) to get a vector representation, and upserts these vectors into a configured vector database (e.g., ChromaDB, PGVector).
- Retrieval (Workflow Node): In a workflow, the Retriever Node is configured to point to a specific Vector Index. At runtime, its input query is embedded using the same model, and a semantic search (approximate nearest neighbor) is performed against the vector database to fetch the most relevant text chunks. These chunks are then passed as context to a downstream LLM.
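The retrieval step reduces to "embed the query, rank stored chunks by similarity." Here is a minimal, self-contained sketch of that idea. The toy character-frequency embedding stands in for a real embedding model, and the in-memory list stands in for the vector database; a production pipeline would use neither.
import math

def embed(text: str) -> list[float]:
    # Toy embedding: character-frequency vector. A real pipeline would
    # call an embedding model such as text-embedding-ada-002 instead.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# "Vector index": each chunk stored alongside its embedding.
chunks = ["Pyspur supports RAG pipelines.", "Docker provides containers."]
index = [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(query: str, top_k: int = 1) -> list[str]:
    # Embed the query with the same model, then rank by similarity.
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]

print(retrieve("How does Pyspur handle retrieval augmented generation?"))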
4. Deployment as a Production API
When your workflow is finalized, you can expose it as a secure HTTP endpoint.
Initiate Deployment: Click the "Deploy" button in the top navigation bar.
Select API Call Type:
- Blocking (Synchronous): For fast-executing workflows. The client receives the result in the same HTTP response (a minimal client sketch follows the non-blocking example below).
- Endpoint: POST /api/wf/{workflow_id}/run/?run_type=blocking
- Non-Blocking (Asynchronous): For long-running workflows (like our joke generator with 10+ LLM calls). The client immediately receives a run_id and must poll a separate endpoint for the result. This prevents client timeouts.
- Start Endpoint: POST /api/wf/{workflow_id}/start_run/?run_type=non_blocking
- Status Endpoint: GET /api/runs/{run_id}/status/
Integrate with Your Application:
The deployment modal generates client code. The body of the POST request must be a JSON object where the initial_inputs key contains an object whose keys match the title of your input nodes.
Example Python Client (for the Joke Generator):
import requests
import json
import time

PYSPUR_HOST = "http://localhost:6080"
WORKFLOW_ID = "your_workflow_id_here"  # Get this from the deploy modal

# The keys in this dict must match the 'output_schema' of the input_node
payload = {
    "initial_inputs": {
        "input_node": {
            "topic": "Python decorators",
            "audience": "Senior Software Engineers"
        }
    }
}

# 1. Start the non-blocking run
start_url = f"{PYSPUR_HOST}/api/wf/{WORKFLOW_ID}/start_run/?run_type=non_blocking"
start_resp = requests.post(start_url, json=payload)
run_id = start_resp.json()['id']
print(f"Workflow started with run_id: {run_id}")

# 2. Poll for the result
status_url = f"{PYSPUR_HOST}/api/runs/{run_id}/status/"
while True:
    status_resp = requests.get(status_url)
    data = status_resp.json()
    status = data.get("status")
    print(f"Current status: {status}")
    if status in ["COMPLETED", "FAILED"]:
        print("Final Output:")
        print(json.dumps(data.get("outputs"), indent=2))
        break
    time.sleep(2)
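For fast workflows, the blocking variant is simpler: one request, one response. A minimal sketch, reusing PYSPUR_HOST, WORKFLOW_ID, and payload from the example above (the exact response shape may differ; check the deploy modal's generated code):
# Blocking call: the HTTP response carries the final result directly.
run_url = f"{PYSPUR_HOST}/api/wf/{WORKFLOW_ID}/run/?run_type=blocking"
resp = requests.post(run_url, json=payload)
print(json.dumps(resp.json(), indent=2))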
Conclusion
Pyspur provides a robust, transparent, and technically grounded environment for AI agent development. By abstracting individual operations into modular nodes and providing a visual canvas, it allows engineers to focus on the high-level logic of their agents. The platform's true power lies in its deep inspectability, letting you trace data flow and debug each component's I/O, and its seamless transition from visual prototype to production-ready API. The move from simple single-call agents to complex, multi-stage workflows using patterns like Best-of-N or RAG is not just possible, but intuitive. By building with Pyspur, you are not just connecting boxes; you are engineering reliable, debuggable, and scalable AI systems.