How to Use the Grok 4.3 API?

Complete developer guide to xAI's Grok 4.3 API. Endpoints, pricing ($1.25/$2.50 per 1M), 1M-token context, native video input, reasoning effort, function calling, OpenAI compatibility, and Apidog testing.

Ashley Innocent

8 May 2026

xAI rolled out Grok 4.3 in stages: beta on April 17, 2026, API access on April 30, and full general availability on May 6. The pitch is direct: a 1,000,000-token context window, native video input for the first time on the Grok line, always-on reasoning, and a price cut of roughly 40% against Grok 4.20. Eight legacy Grok models retire on May 15, so anyone running on grok-3 or grok-4 series should plan a migration this week.

This guide covers how to call Grok 4.3 from your code: endpoint shape, authentication, the OpenAI-compatible base URL, the reasoning effort parameter, video input, function calling, and a working test setup in Apidog.

For the voice side of the same release, see How to use Grok Voice for free. For the head-to-head against OpenAI’s flagship voice model, see Grok Voice vs GPT-Realtime.

TL;DR

Grok 4.3 is generally available as of May 6, 2026: a 1,000,000-token context window, native video input, always-on reasoning, and a roughly 40% price cut against Grok 4.20, all served on an OpenAI-compatible endpoint at https://api.x.ai/v1. Eight legacy Grok models retire on May 15, so migrate by swapping the model string before then.

What changed in Grok 4.3

The headline upgrades, in order of impact for most teams:

  1. A 1,000,000-token context window.
  2. Native video input, a first for the Grok line.
  3. Always-on reasoning, tunable with the reasoning_effort parameter.
  4. A roughly 40% price cut against Grok 4.20: $1.25/1M input, $2.50/1M output.

The Intelligence Index of 53 (Artificial Analysis) puts Grok 4.3 above the 35 average for its price tier, and tenth of the 146 models tracked.

Prerequisites

Before the first request, line up four things:

  1. An xAI account with API access enabled.
  2. An API key from the xAI Console.
  3. The OpenAI Python or Node SDK, or plain curl.
  4. Apidog, if you want to run the comparison workflow at the end of this guide.

Export the key once:

export XAI_API_KEY="xai-..."

Endpoint and authentication

Grok 4.3 ships on the OpenAI-compatible Chat Completions surface, with xAI’s base URL.

POST https://api.x.ai/v1/chat/completions

Auth is a bearer token. Headers are standard:

Authorization: Bearer $XAI_API_KEY
Content-Type: application/json

OpenAI compatibility means you can drop in the OpenAI Python or Node SDK and change only the base_url. That is the path of least resistance for most teams migrating from gpt-4 or gpt-5.

import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["XAI_API_KEY"],
    base_url="https://api.x.ai/v1",
)

response = client.chat.completions.create(
    model="grok-4.3",
    messages=[
        {"role": "user", "content": "Summarize the trade-offs of GraphQL vs REST in three bullets."}
    ],
    reasoning_effort="medium",
)

print(response.choices[0].message.content)

If you prefer the xAI SDK, the call shape is the same; the only change is the import.

Request parameters

The full parameter map for Grok 4.3:

| Parameter | Type | Values | Notes |
|---|---|---|---|
| model | string | grok-4.3 | Required. |
| messages | array | OpenAI message shape | Required. Supports role: system / user / assistant. |
| reasoning_effort | string | low, medium, high | Optional. Default: medium. Higher levels increase latency and output tokens. |
| max_tokens | int | 1–32768 | Caps output. |
| temperature | float | 0.0–2.0 | Default 1.0. |
| top_p | float | 0.0–1.0 | Nucleus sampling. |
| stream | bool | true / false | Server-sent events when true. |
| tools | array | OpenAI tool shape | Function calling. |
| tool_choice | string / object | auto, none, or a specific tool | Standard OpenAI semantics. |
| response_format | object | { type: "json_object" } | Structured output. |
| seed | int | any | For reproducibility at temperature 0. |

A working curl request:

curl https://api.x.ai/v1/chat/completions \
  -H "Authorization: Bearer $XAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "grok-4.3",
    "messages": [
      {"role": "system", "content": "You are a senior backend engineer."},
      {"role": "user", "content": "Review this query plan and flag the bottleneck."}
    ],
    "reasoning_effort": "high"
  }'

The response carries the standard OpenAI shape: choices[].message.content, plus a usage object with prompt_tokens, completion_tokens, reasoning_tokens, and total_tokens broken out.
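A minimal sketch of pulling those fields out of a parsed response body. The dict below stands in for `json.loads(raw_response)`; the placement of `reasoning_tokens` inside `usage` follows the description above and should be verified against a live response:

```python
# Read the content and the usage breakdown from a Chat Completions response.
def summarize_usage(body: dict) -> str:
    usage = body["usage"]
    return (
        f"prompt={usage['prompt_tokens']} "
        f"completion={usage['completion_tokens']} "
        f"reasoning={usage.get('reasoning_tokens', 0)} "
        f"total={usage['total_tokens']}"
    )

# Stand-in for a parsed API response.
body = {
    "choices": [{"message": {"content": "Index scan on orders is the bottleneck."}}],
    "usage": {"prompt_tokens": 42, "completion_tokens": 180,
              "reasoning_tokens": 96, "total_tokens": 318},
}

print(body["choices"][0]["message"]["content"])
print(summarize_usage(body))
```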

Reasoning effort

Three levels, with concrete guidance:

  1. low: fastest and cheapest; fine for extraction, formatting, and short factual answers.
  2. medium: the default; a sensible balance for most production calls.
  3. high: the most latency and output tokens; reserve it for multi-step analysis and accuracy-sensitive work.

Always-on reasoning means even low does some thinking; that is what drives the factual-accuracy gain over Grok 4.20. Don’t expect to save money by avoiding reasoning altogether; it is baked in.
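To compare the levels on your own prompts, a small sketch that builds three otherwise-identical request payloads differing only in reasoning_effort — payload construction only, no network, so you can feed each variant to whichever client you use:

```python
# Build one request payload per reasoning_effort level for a
# side-by-side latency and reasoning_tokens comparison.
def effort_variants(prompt: str) -> list[dict]:
    return [
        {
            "model": "grok-4.3",
            "messages": [{"role": "user", "content": prompt}],
            "reasoning_effort": effort,
        }
        for effort in ("low", "medium", "high")
    ]

variants = effort_variants("Review this query plan and flag the bottleneck.")
for v in variants:
    print(v["reasoning_effort"])
```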

Function calling

Standard OpenAI shape works directly. Declare tools, the model emits a tool_calls array on the assistant message, you execute, you reply with a tool role message:

tools = [{
    "type": "function",
    "function": {
        "name": "lookup_user",
        "description": "Look up a user by ID.",
        "parameters": {
            "type": "object",
            "properties": {"user_id": {"type": "string"}},
            "required": ["user_id"],
        },
    },
}]

response = client.chat.completions.create(
    model="grok-4.3",
    messages=[{"role": "user", "content": "Find user u_42 and tell me their last login."}],
    tools=tools,
    reasoning_effort="medium",
)

tool_calls = response.choices[0].message.tool_calls

The 300 Elo gain on GDPval-AA shows up here in practice: Grok 4.3 picks better tools, makes fewer redundant calls, and recovers from a tool error without spinning. If you are testing tool flows, MCP server testing in Apidog covers the replay setup we use internally.
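The execute-and-reply half of the loop can be sketched as below. It operates on plain dicts rather than SDK objects, and the local `lookup_user` implementation is hypothetical — swap in your real handlers:

```python
import json

# Hypothetical local implementation of the lookup_user tool declared above.
def lookup_user(user_id: str) -> dict:
    return {"user_id": user_id, "last_login": "2026-05-07T09:14:00Z"}

HANDLERS = {"lookup_user": lookup_user}

# Execute each tool call and build the tool-role messages to send back.
def run_tool_calls(tool_calls: list[dict]) -> list[dict]:
    replies = []
    for call in tool_calls:
        handler = HANDLERS[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])
        result = handler(**args)
        replies.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(result),
        })
    return replies

# A fake tool_calls array in the shape the assistant message carries.
fake_calls = [{"id": "call_1",
               "function": {"name": "lookup_user",
                            "arguments": '{"user_id": "u_42"}'}}]
print(run_tool_calls(fake_calls)[0]["content"])
```

Append the returned tool messages to the conversation and call the API again to get the final answer.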

Video input

Grok 4.3 is the first Grok model with native video input. Pass a video URL inside a content block:

response = client.chat.completions.create(
    model="grok-4.3",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe what happens in this clip and flag any anomalies."},
            {"type": "video_url", "video_url": {"url": "https://example.com/clip.mp4"}},
        ],
    }],
)

Video tokens count against the input meter. Long clips burn context fast; downsample or trim before you send if cost matters. The model reasons over frames natively, so you don’t need to extract keyframes manually.

1M-token context

The 1M context window is a real production tool, not a benchmark trophy. Common patterns:

  1. Whole-repository code review in a single call, no chunking.
  2. Long-document or multi-document analysis without retrieval plumbing.
  3. Large, stable system prompts served from the prompt cache.

Cached input at $0.20/1M makes this affordable. A 400k-token system prompt that you keep stable burns $0.08 per cached call instead of $0.50 fresh.
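The arithmetic generalizes into a quick estimator using the rates quoted in this guide ($1.25/1M fresh input, $0.20/1M cached input, $2.50/1M output):

```python
# Per-call cost estimate at the Grok 4.3 rates quoted in this guide (USD per 1M tokens).
RATES = {"input": 1.25, "cached_input": 0.20, "output": 2.50}

def call_cost(input_tokens: int, output_tokens: int, cached: bool = False) -> float:
    in_rate = RATES["cached_input"] if cached else RATES["input"]
    return (input_tokens * in_rate + output_tokens * RATES["output"]) / 1_000_000

# The 400k-token stable system prompt from above, output not counted:
print(f"cached: ${call_cost(400_000, 0, cached=True):.2f}")
print(f"fresh:  ${call_cost(400_000, 0, cached=False):.2f}")
```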

Migration from legacy Grok models

Eight legacy Grok models retire on May 15, 2026, 12:00 PM PT. If you are running on any of them, swap the model string to grok-4.3 before the cutoff. Most calls work without further change because the request shape is unchanged.

Two things to watch:

  1. Always-on reasoning means responses now carry reasoning_tokens in usage and slightly higher first-token latency than the non-reasoning legacy models; rebaseline any latency SLOs after the swap.
  2. After the cutoff, calls to retired models fail with a 410, so remove any fallback logic that assumes the old model strings still resolve.

For the full price comparison across the OpenAI line, see GPT-5.5 pricing; for the head-to-head reasoning models, see How to use the GPT-5.5 API.
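If you want a guard in code, a minimal sketch of the swap. The prefix check is a heuristic based on the grok-3 / grok-4 series named above, not an official list of the eight retired models:

```python
# Rewrite legacy Grok model strings to grok-4.3 ahead of the May 15 cutoff.
def migrate_model(model: str) -> str:
    if model == "grok-4.3":
        return model
    # Heuristic: anything in the grok-3 / grok-4 series retires.
    if model.startswith(("grok-3", "grok-4")):
        return "grok-4.3"
    return model

print(migrate_model("grok-4"))      # legacy, rewritten
print(migrate_model("grok-4.3"))    # already current, unchanged
```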

Testing in Apidog

The fastest way to validate Grok 4.3 against your own use case:

  1. Create an Apidog environment with XAI_API_KEY and BASE_URL = https://api.x.ai/v1.
  2. Save a request collection with three variants: low, medium, high reasoning. Same prompt, different effort.
  3. Run all three. Compare the response, the latency, and the usage.reasoning_tokens count side by side.
  4. Add a fourth variant pointing at OpenAI’s base URL to compare Grok 4.3 against GPT-5.5 on identical input. Same SDK, different model and base URL.

Download Apidog to run the comparison. The collection ports cleanly when you swap providers, which is the point. For broader API testing strategy, see API testing tool for QA engineers.

Rate limits

Tier limits on the xAI Console run from a baseline of a few thousand requests per minute on Tier 1 to multi-hundred-thousand on enterprise tiers. Concrete numbers shift; check the console dashboard. The 159 tokens/second throughput xAI advertises is per-stream output speed, not aggregate; concurrent requests scale linearly within tier caps.

If you hit rate limits, the API returns a 429 with a Retry-After header. Standard exponential backoff handles it.

FAQ

Is Grok 4.3 OpenAI-compatible end to end?
For Chat Completions, yes. Drop in the OpenAI SDK, change the base_url, change the model. Function calling, structured output, and streaming all work identically.

Does it support the Responses API?
The xAI surface is Chat Completions today. The Responses API is OpenAI-only.

What is the actual context limit in practice?
1,000,000 tokens. Long inputs cost real money even at $1.25/1M; cache aggressively if your prompt is stable.

How does always-on reasoning affect latency?
First-token latency is slightly higher than non-reasoning models, but Grok 4.3 streams output at ~159 tokens/second, so end-to-end response time is competitive. The trade-off is worth it on accuracy-sensitive workloads.

Can I use Grok 4.3 with Grok Voice?
Yes. The voice agent (grok-voice-think-fast-1.0) calls Grok 4.3 under the hood when it reasons. You can also call Grok 4.3 directly from a voice loop you build on top of TTS and STT primitives.

What happens to my old Grok 3 / Grok 4 calls after May 15?
They will fail with a 410 (model retired). Migrate before the cutoff.

Does Grok 4.3 support image input?
Yes, alongside the new video input. Pass an image URL in a content block, same shape as OpenAI.

Wrapping up

Grok 4.3 is the most aggressive price-performance move xAI has shipped. The 40% cut, the 1M context, the always-on reasoning, and the native video together make it a serious daily driver for most agent workloads. The OpenAI compatibility means migration is a base-URL change, not a rewrite.

The fastest validation path: script three reasoning variants in Apidog, drop in your real prompts, measure latency and reasoning tokens. Migrate before May 15.

Explore more

How to Use Grok Voice for Free: Console Setup, Voice Cloning, and Real-Time Voice Agents

Grok Voice ships free on the xAI Console. Full guide: TTS, STT, voice agent over WebSocket, custom voice cloning in under 2 minutes, code examples, and Apidog test setup.

8 May 2026

What Is GPT-Realtime-2 and How to Use the GPT-Realtime-2 API

OpenAI's GPT-Realtime-2 brings GPT-5-class reasoning to speech-to-speech voice agents. Specs, pricing, WebSocket setup, SIP, MCP, image input, voices, and a working Apidog test workflow.

8 May 2026

How to Access and Use GPT-5.5 Instant: ChatGPT + API Guide

Learn how to use GPT-5.5 Instant in ChatGPT for free or call it via the OpenAI API at $5/$30 per million tokens. Limits, pricing, code samples.

6 May 2026
