How to Use the ERNIE 5.1 API?

Step-by-step guide to calling Baidu's ERNIE 5.1 via the Qianfan API: keys, curl, Python, Node.js, streaming, tool calls, and testing with Apidog.

Ashley Innocent

14 May 2026

ERNIE 5.1 shipped on May 9, 2026, and within a week the Qianfan API was live for it. If you want to call the model from your own code, route tool calls through it, or wire it into an agent loop with Apidog, this guide walks the full path: account, key, request body, streaming, tool use, error handling.

We’ll keep it practical. By the end you’ll have working curl, Python, and Node snippets, plus a request collection you can drop into Apidog.

If you have not read the ERNIE 5.1 launch breakdown yet, skim it first; it covers benchmarks and trade-offs versus DeepSeek V4 and Kimi K2.6. This post is the implementation companion.

Step 1: Get a Qianfan API key

ERNIE 5.1 is served through Baidu Intelligent Cloud’s Qianfan platform. There is no separate “ERNIE API”; everything routes through Qianfan.

  1. Go to cloud.baidu.com and create or sign in to a Baidu Intelligent Cloud account. International developers can use email signup; some enterprise features still need a mainland phone number.
  2. Open the Qianfan console at console.bce.baidu.com/qianfan.
  3. Under API Key Management (API Key 管理), click Create API Key. Pick the workspace and grant access to the chat-completions service.
  4. Copy the key. It looks like bce-v3/ALTAK-xxxx/xxxx. Store it in an env var, not in source.
export QIANFAN_API_KEY="bce-v3/ALTAK-xxxx/xxxx"

Two things to know up front. First, the new v2 endpoint uses a single Bearer token; the older v1 OAuth access_token flow is being deprecated and you should not build new code on it. Second, ERNIE 5.1 is a paid model from day one. Top up a small balance (¥10 is enough to test) before your first request.
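A minimal sketch of loading the key and building the v2 request headers, assuming the QIANFAN_API_KEY variable exported above. The helper name qianfan_headers and the prefix check are illustrative and not part of any Baidu SDK.

```python
import os
from typing import Mapping, Optional


def qianfan_headers(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build headers for the v2 endpoint from QIANFAN_API_KEY.

    Hypothetical helper: it fails fast when the key is missing, so a bad
    deploy surfaces at startup rather than as a 401 on the first request.
    """
    env = os.environ if env is None else env
    key = env.get("QIANFAN_API_KEY", "").strip()
    if not key:
        raise RuntimeError("QIANFAN_API_KEY is not set")
    if not key.startswith("bce-v3/"):
        # Assumption: keys keep the bce-v3/... shape shown in Step 1.
        raise RuntimeError("QIANFAN_API_KEY does not look like a bce-v3 key")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Passing a plain dict instead of os.environ keeps the helper easy to unit test.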

Step 2: Hit the OpenAI-compatible endpoint with curl

Qianfan exposes an OpenAI-compatible chat-completions endpoint, so anything in your stack that already speaks OpenAI’s format will work with a base-URL swap and a model-ID change.

Base URL: https://qianfan.baidubce.com/v2
Model ID: ernie-5.1 (also: ernie-5.1-preview for early-access features)

Minimum viable request:

curl https://qianfan.baidubce.com/v2/chat/completions \
  -H "Authorization: Bearer $QIANFAN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ernie-5.1",
    "messages": [
      {"role": "system", "content": "You are a senior API designer."},
      {"role": "user", "content": "Sketch a REST schema for a GitHub-style PR review API. Be concise."}
    ],
    "temperature": 0.3
  }'

You get back a standard OpenAI-shaped response:

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1746780000,
  "model": "ernie-5.1",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "..." },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 42,
    "completion_tokens": 318,
    "total_tokens": 360
  }
}

If you see 401 Unauthorized, your key is wrong or expired. If you see 403, the key is valid but the model is not enabled on this workspace; go back to the console and add ERNIE 5.1 to the workspace’s allowed models.
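If you are handling the raw JSON yourself rather than going through an SDK, the fields that matter can be pulled out as in the sketch below. It assumes only the response shape shown above; summarize_completion is a hypothetical name.

```python
import json


def summarize_completion(raw: str) -> dict:
    """Extract reply text, finish reason, and token usage from a response body."""
    payload = json.loads(raw)
    choice = payload["choices"][0]
    usage = payload.get("usage", {})
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice.get("finish_reason"),
        "total_tokens": usage.get("total_tokens"),
    }
```

Keeping finish_reason next to the text makes truncated replies easy to spot later.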

Step 3: Call ERNIE 5.1 from Python

Because the endpoint is OpenAI-compatible, the official openai Python SDK works as-is. Point it at Qianfan.

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["QIANFAN_API_KEY"],
    base_url="https://qianfan.baidubce.com/v2",
)

response = client.chat.completions.create(
    model="ernie-5.1",
    messages=[
        {"role": "system", "content": "You explain APIs in plain English."},
        {"role": "user", "content": "Why would I use server-sent events over WebSockets for a chat UI?"},
    ],
    temperature=0.4,
)

print(response.choices[0].message.content)
print(f"\nTokens used: {response.usage.total_tokens}")

If you already have wrappers around the OpenAI SDK in your codebase, swapping ERNIE 5.1 in for A/B testing comes down to three settings: base URL, API key, and model ID. The same trick works for DeepSeek’s API and most other Chinese model providers.

Step 4: Stream tokens for chat-style UIs

For any user-facing chat, you want streaming. Set stream: true and consume server-sent events.

stream = client.chat.completions.create(
    model="ernie-5.1",
    messages=[{"role": "user", "content": "Write a haiku about API versioning."}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)

Curl equivalent for debugging:

curl https://qianfan.baidubce.com/v2/chat/completions \
  -H "Authorization: Bearer $QIANFAN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ernie-5.1",
    "stream": true,
    "messages": [{"role": "user", "content": "Stream a 3-sentence joke."}]
  }' \
  --no-buffer

The stream format is identical to OpenAI’s: data: {...} lines terminated by data: [DONE].
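If you consume the stream without the SDK, the parsing can be sketched as below. It assumes the OpenAI-style chunk shape (choices[0].delta.content) described above; iter_sse_text is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator


def iter_sse_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield content deltas from `data:` lines until the [DONE] sentinel."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue  # some chunks may carry only metadata
        delta = choices[0].get("delta", {}).get("content")
        if delta:
            yield delta
```

Feed it any iterable of decoded lines, for example the line iterator of an HTTP response, and join the yielded pieces to rebuild the full reply.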

Step 5: Use ERNIE 5.1 with tools (the agentic part)

This is where ERNIE 5.1 earns its launch headline. The model scored above DeepSeek-V4-Pro on τ³-bench and SpreadsheetBench-Verified, which suggests tool calling holds up beyond demos. Still, test it against your own tools before shipping.

Same schema as OpenAI function calling:

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name, e.g. Singapore"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="ernie-5.1",
    messages=[{"role": "user", "content": "What's the weather in Tokyo right now?"}],
    tools=tools,
    tool_choice="auto",
)

tool_calls = response.choices[0].message.tool_calls
if tool_calls:
    call = tool_calls[0]
    print(f"Model wants to call: {call.function.name}({call.function.arguments})")

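After your code runs the actual tool, append the assistant message that contained the tool_calls to the conversation, then append each result as a tool role message carrying the matching tool_call_id, and call the model again. The loop terminates when finish_reason == "stop" and tool_calls is empty.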
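One turn of that loop can be sketched as follows, working on plain dicts in the OpenAI wire format. The function name apply_tool_calls and the registry mapping are illustrative assumptions; the message fields (tool_calls, tool_call_id) follow the OpenAI function-calling schema that the article says Qianfan mirrors.

```python
import json
from typing import Callable, Dict, List


def apply_tool_calls(
    messages: List[dict],
    assistant_message: dict,
    registry: Dict[str, Callable[..., object]],
) -> List[dict]:
    """Return a new message list with the assistant turn and one tool result per call."""
    out = list(messages)
    # The assistant message that requested the tools is appended first.
    out.append(assistant_message)
    for call in assistant_message.get("tool_calls") or []:
        fn = registry[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])
        result = fn(**args)
        out.append({
            "role": "tool",
            "tool_call_id": call["id"],  # ties the result to its request
            "content": json.dumps(result),
        })
    return out
```

Send the returned list back to the model and repeat until the reply carries no tool_calls. Capping the number of rounds is a sensible guard against a model that keeps requesting tools.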

One gotcha: ERNIE 5.1 occasionally returns tool arguments as a stringified JSON inside a code fence rather than as a clean JSON string. Parse defensively with json.loads() wrapped in try/except, and if it fails, strip ```json fences before retrying.
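A defensive parser along those lines might look like this. It is a sketch: parse_tool_arguments is a hypothetical name, and the regex only handles a single fence wrapped around the whole payload.

```python
import json
import re

# Matches an opening fence (optionally tagged "json") or a closing fence.
_FENCE = re.compile(r"^\s*`{3}(?:json)?\s*|\s*`{3}\s*$", re.IGNORECASE)


def parse_tool_arguments(raw: str) -> dict:
    """Parse tool-call arguments, tolerating a surrounding code fence."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        # Second attempt: strip the fence and parse what is left.
        parsed = json.loads(_FENCE.sub("", raw))
    if not isinstance(parsed, dict):
        raise ValueError("tool arguments must decode to a JSON object")
    return parsed
```

If the second attempt also fails, the JSONDecodeError propagates, so the caller can decide whether to re-prompt the model or abort the loop.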

Step 6: Call ERNIE 5.1 from Node.js

Drop-in for any Node project using openai v5+. The snippet uses top-level await, so run it as an ES module:

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.QIANFAN_API_KEY,
  baseURL: "https://qianfan.baidubce.com/v2",
});

const completion = await client.chat.completions.create({
  model: "ernie-5.1",
  messages: [
    { role: "user", content: "Return a JSON object with 3 API design tips." },
  ],
  response_format: { type: "json_object" },
});

console.log(completion.choices[0].message.content);

response_format: { type: "json_object" } works and is reliable. Strict JSON schemas (json_schema) are still being rolled out on Qianfan; verify the response shape in code rather than trusting the constraint.
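Verifying the shape in code can be as small as the sketch below, written in Python to match the other examples in this guide. The "tips" key is an assumption: the prompt would have to ask the model for that key explicitly, and parse_tips is a hypothetical helper.

```python
import json


def parse_tips(content: str, expected: int = 3) -> list:
    """Check that the reply is a JSON object holding `expected` string tips."""
    data = json.loads(content)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    tips = data.get("tips")  # assumed key, not guaranteed by the API
    if not isinstance(tips, list) or len(tips) != expected:
        raise ValueError(f"expected a 'tips' list of length {expected}")
    if not all(isinstance(t, str) and t.strip() for t in tips):
        raise ValueError("every tip must be a non-empty string")
    return tips
```

On a ValueError, a common fallback is to re-send the request once with the validation message appended to the prompt.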

Step 7: Test and compare with Apidog

If you are deciding between ERNIE 5.1, DeepSeek V4, and Kimi K2.6, do not do it from the terminal. Use Apidog to build a single workspace with one folder per provider, identical request bodies, and saved environments per API key.

The 60-second setup:

  1. Open Apidog and create a new project called “LLM bake-off.”
  2. Add an environment with QIANFAN_API_KEY, DEEPSEEK_API_KEY, MOONSHOT_API_KEY as variables.
  3. Create three requests pointing at each provider’s base URL with model set to ernie-5.1, deepseek-chat, and kimi-k2-6 respectively.
  4. Pin the same messages array on all three. Use Apidog’s “Run” feature to fire them in parallel and diff outputs.
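If you want to generate the three request bodies before pasting them into Apidog, a small script can guarantee they differ only in the model ID. This is a sketch: the provider labels and the function name are illustrative, while the model IDs and variable names are the ones listed in the steps above.

```python
import copy

# Model IDs and environment variable names as listed in the setup steps.
PROVIDERS = {
    "qianfan": {"model": "ernie-5.1", "key_env": "QIANFAN_API_KEY"},
    "deepseek": {"model": "deepseek-chat", "key_env": "DEEPSEEK_API_KEY"},
    "moonshot": {"model": "kimi-k2-6", "key_env": "MOONSHOT_API_KEY"},
}


def build_bakeoff_bodies(messages: list, temperature: float = 0.3) -> dict:
    """Return one request body per provider, identical except for the model."""
    return {
        name: {
            "model": cfg["model"],
            "messages": copy.deepcopy(messages),  # no shared mutable state
            "temperature": temperature,
        }
        for name, cfg in PROVIDERS.items()
    }
```

Serialize each body with json.dumps and paste it into the matching request, so any difference in output comes from the model and not from the prompt.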

The free tier handles this comfortably. Apidog saves the request history per environment, so you can come back next week and re-run the exact same eval against a new model version. Beats babysitting curl in a tmux pane.

For more on multi-provider testing, see Test local LLMs as APIs and our GLM 5.1 API guide.

Pricing, rate limits, and quotas

Public Qianfan pricing for ERNIE 5.1 was not in the release post; check the live console rate card before quoting numbers internally. Three practical tips while you wait:

Error handling that will save you

The errors you will hit in practice, in rough order of frequency:

Status | Meaning | Fix
401 | Bearer token wrong or expired | Regenerate from console
403 | Model not enabled on this workspace | Add ERNIE 5.1 in console
429 | Rate limit hit | Backoff + retry with jitter
400 (invalid messages) | Wrong message-role ordering | Ensure user/assistant alternation
500/502 | Qianfan-side blip | Retry once; if it persists, check status page
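The 400 case is cheap to catch before the request leaves your machine. The check below is a sketch under one plausible reading of the alternation rule (optional system prompt first, then a user turn, no two user or assistant messages in a row); Qianfan's exact validation may differ, and check_message_order is a hypothetical helper.

```python
def check_message_order(messages: list) -> list:
    """Return a list of ordering problems; an empty list means none were found."""
    problems = []
    prev_role = None
    seen_turn = False
    for i, msg in enumerate(messages):
        role = msg.get("role")
        if role == "system":
            if i != 0:
                problems.append(f"message {i}: system prompt is not first")
            continue
        if not seen_turn and role != "user":
            problems.append(f"message {i}: first turn is '{role}', expected 'user'")
        # Tool messages may legitimately sit between two assistant messages.
        if role in ("user", "assistant") and role == prev_role:
            problems.append(f"message {i}: two '{role}' messages in a row")
        seen_turn = True
        prev_role = role
    return problems
```

Run it in tests or in a debug build; logging the returned list is usually faster than decoding a 400 response body.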

Wrap every call in retry-with-exponential-backoff capped at 3 attempts, retrying only 429 and 5xx responses; other 4xx errors will fail the same way on every attempt. For production, log request_id from response headers; Baidu support needs it to debug your case.
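Pulling the request ID out of the headers can be sketched as below. The header names in candidates are assumptions, not documented Qianfan names; check a real response and adjust the tuple to whatever header you actually see.

```python
from typing import Mapping, Optional, Sequence


def extract_request_id(
    headers: Mapping[str, str],
    # Assumed header names; verify them against a live response.
    candidates: Sequence[str] = ("x-request-id", "x-bce-request-id", "request-id"),
) -> Optional[str]:
    """Return the first matching request-ID header, compared case-insensitively."""
    lowered = {k.lower(): v for k, v in headers.items()}
    for name in candidates:
        if name in lowered:
            return lowered[name]
    return None
```

With the openai Python SDK, response headers are available through client.chat.completions.with_raw_response.create(...), which returns an object exposing headers and a parse() method for the usual completion.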

A minimal production-shaped wrapper

If you want to drop ERNIE 5.1 into a real app today, here is the smallest wrapper that is not embarrassing:

import os, time, random
from openai import OpenAI, RateLimitError, APIError

client = OpenAI(
    api_key=os.environ["QIANFAN_API_KEY"],
    base_url="https://qianfan.baidubce.com/v2",
)

def chat(messages, *, model="ernie-5.1", temperature=0.3, max_retries=3):
    for attempt in range(max_retries):
        last_attempt = attempt == max_retries - 1
        try:
            return client.chat.completions.create(
                model=model,
                messages=messages,
                temperature=temperature,
            )
        except RateLimitError:
            # 429: exponential backoff with jitter; no sleep after the final try
            if last_attempt:
                break
            time.sleep((2 ** attempt) + random.random())
        except APIError as e:
            # Connection errors carry no status_code, so read it defensively
            status = getattr(e, "status_code", None)
            if status is not None and status >= 500 and not last_attempt:
                time.sleep(1 + attempt)
                continue
            raise
    raise RuntimeError("ERNIE 5.1 retries exhausted")

That handles the 80% case. For tool-loops and streaming, build on top.

Frequently asked questions

Is the ERNIE 5.1 API free? No. Qianfan is pay-as-you-go. There is no permanent free tier; new accounts sometimes get trial credits. For free experimentation use the ernie.baidu.com chat UI or look at free LLM options.

Can I run ERNIE 5.1 locally? No. There are no public weights. If on-prem is a hard requirement, look at how to run DeepSeek V4 locally or the best local LLMs in 2026 instead.

Does the OpenAI SDK work without changes? Yes, with base_url set to https://qianfan.baidubce.com/v2 and api_key set to your Qianfan key. The model field takes Qianfan model IDs, not OpenAI ones. Function calling, streaming, and response_format: json_object all work. Strict json_schema validation is still rolling out.

How does ERNIE 5.1 handle Chinese vs English prompts? Both are first-class. The Arena Search score of 1,223 came from a mixed-language voter pool. For technical English tasks (code, API design), it is competitive with the closed frontier; for Chinese creative writing it is best-in-class among Chinese models.

What is the max output length? Not officially published. In practice, single-turn responses cap around 8K tokens before the model wraps up. For long-form generation, chunk and continue.
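One way to chunk long-form generation is to request one section per call and stitch the results together. The sketch below assumes you supply call_model, a hypothetical callable that takes a messages list and returns the reply text; the prompt wording is illustrative.

```python
from typing import Callable, List


def generate_in_sections(
    call_model: Callable[[List[dict]], str],
    task: str,
    section_titles: List[str],
) -> str:
    """Generate a long document one section per request and join the parts."""
    parts: List[str] = []
    for index, title in enumerate(section_titles):
        # Tell the model what already exists so sections do not overlap.
        done = ", ".join(section_titles[:index]) or "none"
        messages = [
            {"role": "system", "content": f"You are writing one long document. Task: {task}"},
            {"role": "user", "content": f"Sections already written: {done}. Write only the section titled '{title}'."},
        ]
        parts.append(call_model(messages).strip())
    return "\n\n".join(parts)
```

With the OpenAI SDK client from Step 3, call_model could be a lambda that calls client.chat.completions.create with model="ernie-5.1" and returns choices[0].message.content.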

Building an agent on ERNIE 5.1? Download Apidog and use the OpenAI-compatible request collection to mock, test, and document the Qianfan endpoint alongside the rest of your services.
