OpenAI’s GPT-5.5 ships with a paid API: $5 per million input tokens, $30 per million output. For a side project, a hackathon build, or a free public app, that bill stops the work before it starts. There is one structural workaround: Puter.js exposes the entire OpenAI catalog (GPT-5.5, GPT-5.5 Pro, every GPT-5.x variant, GPT-Image-2, DALL-E, OpenAI TTS) without an OpenAI key, and bills the end user instead of you. For the developer, the surface is free and unlimited.
TL;DR
- Puter.js gives developers free, unlimited access to the full OpenAI model catalog with no API key, no OpenAI account, no server.
- Supported text models include gpt-5.5, gpt-5.5-pro, gpt-5.4, gpt-5, gpt-5-mini, o1, o3, gpt-4.1, gpt-4o, plus every chat and codex variant.
- Image: gpt-image-2, gpt-image-1.5, dall-e-3. TTS: gpt-4o-mini-tts, tts-1, tts-1-hd.
- One <script> tag, one function call (puter.ai.chat), and you are talking to GPT-5.5.
- Streaming, function calling, vision input, image generation, and text-to-speech all work in the browser.
- The end user covers their usage from a Puter account; you pay zero, forever.
- Use Apidog to benchmark the same prompt against Puter and the official OpenAI API for migration planning.
How “free unlimited” works
Puter.js flips the LLM billing model. Instead of you holding the OpenAI key and eating every token cost, your end user signs in to Puter (free account) and the call charges against their balance. New Puter accounts get starter credit; users top up if they want more.
For the developer, three things follow:
- No OpenAI account, no key in your repo. No leak risk, no rotation, no project-scoped key management.
- No usage cap on your side. Every user runs against their own account, so your “limit” scales linearly with your user base.
- No billing exposure. You never see a Stripe invoice from OpenAI; you do not need to negotiate enterprise terms.
The trade-off: this is browser-first. A backend Node script cannot use Puter without a logged-in user session. For backend use, the official OpenAI API is still the right path.
Step 1: Install
One CDN tag, no build step:
<script src="https://js.puter.com/v2/"></script>
That is the entire installation. Or for a bundled app:
npm install @heyputer/puter.js
import { puter } from '@heyputer/puter.js';
The CDN version works in any HTML file, hackathon prototype, static site, or browser extension. The NPM version gives you tree-shaking and TypeScript types.
Step 2: Pick a model
Puter exposes the full GPT-5.x lineup plus everything older. The shortlist:
| Model ID | When to use |
|---|---|
| gpt-5.5-pro | Deepest reasoning; coding agents, complex analysis |
| gpt-5.5 | Default daily driver; strong cost/quality balance |
| gpt-5.4-nano | Cheapest, fastest text; high-volume classification |
| gpt-5.4-mini | Mid-tier; good for chat UIs |
| gpt-5.3-codex | Code-specific tasks |
| o3 | Complex reasoning chains |
| o1-pro | Agentic multi-step planning |
| gpt-4.1, gpt-4o, gpt-4o-mini | Stable, well-understood baseline |
Image generation:
- gpt-image-2: latest, sharp output, fast.
- gpt-image-1.5 / gpt-image-1 / dall-e-3 / dall-e-2: older but stable.
Text-to-speech:
- gpt-4o-mini-tts: latest, sounds the most natural.
- tts-1, tts-1-hd: classic TTS, lower latency.
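One way to keep model choice out of your call sites is a small lookup built from the shortlist above. This is a minimal sketch: the task labels ("reasoning", "classification", and so on) and the pickModel helper are hypothetical names, not part of Puter.js; only the model IDs come from the lists above.

```javascript
// Task-to-model lookup built from the shortlist above.
// The task labels are hypothetical; rename them to fit your app.
const MODEL_BY_TASK = new Map([
  ["default", "gpt-5.5"],
  ["reasoning", "gpt-5.5-pro"],
  ["classification", "gpt-5.4-nano"],
  ["chat", "gpt-5.4-mini"],
  ["code", "gpt-5.3-codex"],
  ["image", "gpt-image-2"],
  ["speech", "gpt-4o-mini-tts"],
]);

// Returns the model ID for a task, falling back to the default daily driver.
function pickModel(task) {
  return MODEL_BY_TASK.get(task) ?? MODEL_BY_TASK.get("default");
}

// Browser usage:
// puter.ai.chat("Label this ticket: ...", { model: pickModel("classification") });
```

Swapping a model later then means editing one entry instead of hunting through every call.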
Step 3: Make GPT-5.5 talk
The minimum viable chat call:
<!DOCTYPE html>
<html>
<body>
<script src="https://js.puter.com/v2/"></script>
<script>
  puter.ai.chat(
    "Explain WebSockets in three sentences",
    { model: "gpt-5.5" }
  ).then(response => {
    puter.print(response);
  });
</script>
</body>
</html>
Open in a browser. Puter handles the call, the user signs in (or creates a free Puter account on first run), and the response prints to the page. No API key, no environment variable, no server.
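If you would rather trigger the sign-in from your own button than on the first AI call, a sketch like the following may help. It assumes the auth object behaves like puter.auth, with isSignedIn() returning a boolean and signIn() returning a promise; ensureSignedIn is a hypothetical helper name.

```javascript
// Ensures a Puter session exists before the first AI call.
// `auth` is assumed to look like puter.auth: isSignedIn() and signIn().
async function ensureSignedIn(auth) {
  if (auth.isSignedIn()) return false; // session already exists, nothing shown
  await auth.signIn();                 // starts Puter's sign-in flow
  return true;                         // a sign-in flow was started and completed
}

// Browser usage (sign-in opens a popup, so call it from a click handler):
// button.addEventListener("click", async () => {
//   await ensureSignedIn(puter.auth);
//   puter.print(await puter.ai.chat("Hello", { model: "gpt-5.5" }));
// });
```

Tying sign-in to a click keeps popup blockers out of the way and lets you explain the step in your own UI first.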
Step 4: Stream the response
For chat UIs and long answers, streaming is the right default. Pass stream: true and consume the response as an async iterator with for await:
const response = await puter.ai.chat(
  "Explain the theory of relativity in detail",
  { model: "gpt-5.5", stream: true }
);
for await (const part of response) {
  puter.print(part?.text);
}
Each part.text is a token chunk. Append to your UI bubble; the user sees text appear word by word.
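In a real chat UI you usually want both the live chunks and the final text (for history or copy buttons). A minimal sketch, assuming each streamed part is shaped like { text } as in the loop above; collectStream and onChunk are hypothetical names.

```javascript
// Consumes a streamed reply, forwarding each chunk to the UI
// and returning the full text once the stream ends.
// `stream` is any async iterable of parts shaped like { text }.
async function collectStream(stream, onChunk) {
  let full = "";
  for await (const part of stream) {
    const text = part?.text;
    if (!text) continue; // skip parts that carry no text
    full += text;
    onChunk(text);
  }
  return full;
}

// Browser usage:
// const stream = await puter.ai.chat(prompt, { model: "gpt-5.5", stream: true });
// const reply = await collectStream(stream, (t) => { bubble.textContent += t; });
```

Skipping empty parts avoids printing "undefined" if a chunk arrives without a text field.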
Step 5: Vision (image input)
Pass an image URL as the second argument; the model reads the image and answers the prompt about it:
puter.ai.chat(
  "What do you see in this image? Describe colors, objects, and mood.",
  "https://assets.puter.site/doge.jpeg",
  { model: "gpt-5.5" }
).then(response => {
  puter.print(response);
});
This works on every GPT-5.x model and the GPT-4o variants. Use cases: alt-text generation, visual QA, screenshot analysis, OCR, accessibility tooling.
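For the alt-text use case, the same call can be wrapped in a helper that returns a plain string. This is a sketch under two assumptions: that chat behaves like puter.ai.chat(prompt, imageUrl, options), and that the reply text sits at response.message.content (with String(response) as a fallback). altTextFor is a hypothetical name.

```javascript
// Asks a vision-capable model for one sentence of alt text for an image URL.
// `chat` is assumed to behave like puter.ai.chat(prompt, imageUrl, options).
async function altTextFor(chat, imageUrl, model = "gpt-5.5") {
  const response = await chat(
    "Write one concise sentence of alt text for this image.",
    imageUrl,
    { model }
  );
  // Assumption: the text is at response.message.content; otherwise stringify the response.
  return String(response?.message?.content ?? response).trim();
}

// Browser usage:
// img.alt = await altTextFor(puter.ai.chat.bind(puter.ai), img.src);
```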
Step 6: Generate images
Puter’s txt2img returns a promise that resolves to an <img> element with the generated image already loaded:
puter.ai.txt2img(
  "A futuristic cityscape at night, cinematic, neon, rain",
  { model: "gpt-image-2" }
).then(imageElement => {
  document.body.appendChild(imageElement);
});
The user pays the image generation cost from their Puter account (typically a few cents per image). For a free public image generator, this is the cleanest setup that exists today.
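Because each image costs the user money, it can be worth making sure an identical prompt never triggers a second generation. A minimal sketch of that idea: cachedGenerator is a hypothetical wrapper, and generate is assumed to behave like puter.ai.txt2img(prompt, options).

```javascript
// Wraps an image generator so an identical prompt and model reuse the first result.
// `generate` is assumed to behave like puter.ai.txt2img(prompt, options).
function cachedGenerator(generate) {
  const cache = new Map();
  return function generateOnce(prompt, options = {}) {
    const key = JSON.stringify([prompt.trim(), options.model ?? null]);
    if (!cache.has(key)) {
      // Store the promise itself so concurrent calls share one request.
      const pending = Promise.resolve(generate(prompt, options));
      cache.set(key, pending);
      // Forget failed requests so the user can retry.
      pending.catch(() => cache.delete(key));
    }
    return cache.get(key);
  };
}

// Browser usage:
// const txt2imgOnce = cachedGenerator((p, o) => puter.ai.txt2img(p, o));
// const img = await txt2imgOnce("A neon city in the rain", { model: "gpt-image-2" });
// container.appendChild(img.cloneNode(true)); // clone if the image appears in several places
```

The cache lives only as long as the page, which is usually enough to absorb double clicks and repeated submissions.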
Step 7: Text-to-speech
OpenAI’s TTS line is exposed through txt2speech. The function returns a promise that resolves to an <audio> element with the generated voice:
puter.ai.txt2speech(
  "Welcome back. Your account balance is $1,247.50.",
  { provider: "openai", model: "gpt-4o-mini-tts" }
).then(audio => {
  audio.setAttribute("controls", "");
  document.body.appendChild(audio);
});
Use it for voice prompts, app voiceovers, podcast intros, or accessibility narration.
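For longer narration you will likely need to send the text in pieces. The sketch below splits text at sentence boundaries, assuming the service enforces some per-call character limit; the 3000 default is an assumption rather than a figure from this article, so check the current Puter documentation. splitForSpeech is a hypothetical helper.

```javascript
// Splits text into chunks of whole sentences, each at most `maxChars` long
// where possible. A single sentence longer than `maxChars` is kept intact.
// The 3000 default is an assumed limit; adjust it to the real one.
function splitForSpeech(text, maxChars = 3000) {
  // Split after ., ! or ? only when whitespace follows, so "$1,247.50" stays intact.
  const sentences = text.trim().split(/(?<=[.!?])\s+/);
  const chunks = [];
  let current = "";
  for (const sentence of sentences) {
    const candidate = current ? current + " " + sentence : sentence;
    if (current && candidate.length > maxChars) {
      chunks.push(current); // current chunk is full, start a new one
      current = sentence;
    } else {
      current = candidate;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

// Browser usage: generate one <audio> element per chunk.
// for (const chunk of splitForSpeech(longText)) {
//   const audio = await puter.ai.txt2speech(chunk, { provider: "openai", model: "gpt-4o-mini-tts" });
//   document.body.appendChild(audio);
// }
```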
Step 8: Function calling
Standard OpenAI shape. Declare tools, the model emits a tool_calls array, you execute, you reply:
const tools = [{
  type: "function",
  function: {
    name: "get_weather",
    description: "Get the current weather for a city.",
    parameters: {
      type: "object",
      properties: { city: { type: "string" } },
      required: ["city"],
    },
  },
}];
const response = await puter.ai.chat(
  "What's the weather in Tokyo right now?",
  { model: "gpt-5.5", tools }
);
const toolCalls = response.message.tool_calls;
if (toolCalls?.length) {
  // Execute the function on your side, then reply with the result
  console.log(toolCalls[0].function.name, toolCalls[0].function.arguments);
}
The function-calling shape mirrors OpenAI’s, so any tool definitions you have today port directly. For testing tool-driven flows in production-grade settings, see MCP server testing in Apidog.
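The "you execute, you reply" half of the loop can be sketched as follows. It assumes chat accepts a messages array the way puter.ai.chat does, and that tool results go back as role "tool" messages keyed by tool_call_id, per the OpenAI Chat Completions convention. chatWithTools and the handlers map are hypothetical names.

```javascript
// Runs one round of tool calling: ask, execute the requested tools, send results back.
// `chat` is assumed to behave like puter.ai.chat(messages, options).
// `handlers` maps tool names to local functions (hypothetical; supply your own).
async function chatWithTools(chat, prompt, tools, handlers, model = "gpt-5.5") {
  const messages = [{ role: "user", content: prompt }];
  const first = await chat(messages, { model, tools });
  const toolCalls = first.message?.tool_calls ?? [];
  if (toolCalls.length === 0) return first; // the model answered directly

  messages.push(first.message); // keep the assistant turn that requested the tools
  for (const call of toolCalls) {
    const handler = handlers[call.function.name];
    // Arguments arrive as a JSON string in the OpenAI shape.
    const args = JSON.parse(call.function.arguments || "{}");
    const result = handler
      ? await handler(args)
      : { error: `Unknown tool: ${call.function.name}` };
    messages.push({
      role: "tool",
      tool_call_id: call.id,
      content: JSON.stringify(result),
    });
  }
  // Second call: the model now sees the tool output and writes the final answer.
  return chat(messages, { model, tools });
}

// Browser usage:
// const reply = await chatWithTools(puter.ai.chat.bind(puter.ai), "Weather in Tokyo?", tools, {
//   get_weather: async ({ city }) => ({ city, tempC: 21 }), // stand-in for a real lookup
// });
```

This handles a single round; a model that chains several tool calls would need the same steps in a loop with an iteration cap.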
Step 9: Tune temperature and max_tokens
Pass standard OpenAI parameters in the options object:
const response = await puter.ai.chat(
  "Tell me about Mars",
  {
    model: "gpt-5.5",
    temperature: 0.2,
    max_tokens: 200,
  }
);
Lower temperature (0.0–0.3) for factual answers, higher (0.7–1.0) for creative writing. max_tokens caps cost on the user’s side; useful for keeping per-call charges predictable when you ship a public app.
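To see what a max_tokens cap means in money, here is a rough worst-case estimate using the GPT-5.5 list prices quoted at the top of this article ($5 per million input tokens, $30 per million output). It assumes the user is charged at roughly those rates, which Puter may not match exactly; worstCaseCostUsd is a hypothetical helper.

```javascript
// USD per million tokens, from the GPT-5.5 list prices quoted in this article.
// Assumption: the user's Puter charge roughly tracks these rates.
const PRICE_PER_MILLION = { input: 5, output: 30 };

// Upper bound in USD for one call: all input tokens plus a fully used max_tokens budget.
function worstCaseCostUsd(inputTokens, maxTokens, price = PRICE_PER_MILLION) {
  return (inputTokens * price.input + maxTokens * price.output) / 1e6;
}

// Example: a 1,000-token prompt with max_tokens: 200
// (1000 * 5 + 200 * 30) / 1,000,000 = 0.011 USD at most
console.log(worstCaseCostUsd(1000, 200));
```

At these rates, output tokens are six times the price of input tokens, so max_tokens is the lever that matters most for per-call predictability.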
What you get and what you don’t
Puter’s free unlimited GPT-5.5 access is real, but it is a subset of the official OpenAI API surface. The honest split:
You get:
- Full GPT-5.x catalog including 5.5, 5.5 Pro, 5.4 (mini, nano, pro), and every codex variant
- All older OpenAI models (GPT-4.1, GPT-4o, o1, o3)
- GPT-Image-2 and DALL-E for free image generation
- OpenAI TTS line including gpt-4o-mini-tts
- Streaming, vision, function calling, temperature control, max_tokens
You may not get:
- The Responses API (Puter uses Chat Completions shape)
- Prompt caching cost reduction
- The Files API (uploaded document context)
- Server-side use without a browser context
- Direct rate limit headers from OpenAI
- OpenAI’s structured output mode and JSON schema enforcement
For deep production-grade flows, the official OpenAI API is the right answer. For browser apps, side projects, and public tools, Puter is enough.
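If you need JSON back and cannot rely on schema enforcement, you can still ask for JSON in the prompt and validate the reply client-side. A minimal sketch with a hypothetical parseJsonReply helper; it checks for required keys only and is not a full JSON Schema validator.

```javascript
// Parses a model reply that should contain a JSON object and checks required keys.
// Returns { ok: true, value } or { ok: false, error } instead of throwing.
function parseJsonReply(text, requiredKeys = []) {
  // Models sometimes wrap JSON in prose or code fences, so take the outermost braces.
  const start = text.indexOf("{");
  const end = text.lastIndexOf("}");
  if (start === -1 || end <= start) {
    return { ok: false, error: "No JSON object found" };
  }
  let value;
  try {
    value = JSON.parse(text.slice(start, end + 1));
  } catch (err) {
    return { ok: false, error: `Invalid JSON: ${err.message}` };
  }
  const missing = requiredKeys.filter((key) => !(key in value));
  return missing.length
    ? { ok: false, error: `Missing keys: ${missing.join(", ")}` }
    : { ok: true, value };
}

// Browser usage:
// const reply = await puter.ai.chat('Reply with JSON: {"city": string, "tempC": number}', { model: "gpt-5.5" });
// const parsed = parseJsonReply(String(reply), ["city", "tempC"]);
```

On a failed parse, a common pattern is to retry once with the error message appended to the prompt.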
When to use Puter vs official OpenAI
The split:
Use Puter when:
- You are shipping a free public app and do not want billing exposure.
- You are prototyping and do not want to set up an OpenAI billing relationship.
- You want OpenAI access in a static site, hackathon project, or browser extension without a backend.
- Your users are happy to sign in to Puter (or already use it).
Use the official OpenAI API when:
- You need server-side calls (cron jobs, webhook handlers, batch processing).
- You need prompt caching for cost savings on stable system prompts.
- You need the Responses API, Files, or full structured outputs.
- You need a contractual relationship for compliance (BAAs, SOC 2, residency).
- Your users will not tolerate a Puter sign-in step.
Most projects start on Puter for prototyping and migrate to the official API when they hit one of the limits above. Migration is straightforward; the message shape is the same.
For paid production setup, see How to use the GPT-5.5 API.
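To keep that migration cheap, you can hide the backend behind a single ask(prompt) function from day one. The sketch below is one possible shape: puterBackend and openAiBackend are hypothetical names, the Puter side assumes the reply text is at response.message.content, and the OpenAI side calls the Chat Completions endpoint, which must run server-side so the key stays private.

```javascript
// Two interchangeable backends behind one `ask(prompt)` function.
// Both resolve to the assistant's text as a string.

// Browser prototype: `puterChat` is assumed to behave like puter.ai.chat.
function puterBackend(puterChat, model = "gpt-5.5") {
  return async (prompt) => {
    const response = await puterChat([{ role: "user", content: prompt }], { model });
    return String(response?.message?.content ?? response);
  };
}

// Server-side production: calls OpenAI's Chat Completions endpoint directly.
// Keep `apiKey` on the server; `fetchFn` defaults to the global fetch.
function openAiBackend(apiKey, model = "gpt-5.5", fetchFn = fetch) {
  return async (prompt) => {
    const res = await fetchFn("https://api.openai.com/v1/chat/completions", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages: [{ role: "user", content: prompt }] }),
    });
    if (!res.ok) throw new Error(`OpenAI request failed: ${res.status}`);
    const data = await res.json();
    return data.choices[0].message.content;
  };
}

// Usage: pick one backend at startup, call it the same way everywhere.
// const ask = puterBackend(puter.ai.chat.bind(puter.ai));
// const answer = await ask("Explain WebSockets in three sentences");
```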
Testing the integration in Apidog
Puter calls happen in the browser, so you cannot script them from a backend test runner directly. The pattern that works:
- Build a small static page with the Puter script and a query parameter for the prompt.
- Use Apidog to validate the upstream OpenAI API surface (when you eventually migrate).
- Keep both as separate environments in the same Apidog collection so you can swap with one click.

Download Apidog and set up two environments: puter-prototype (a localhost URL hosting your Puter page) and openai-prod (https://api.openai.com/v1). The collection ports cleanly when you graduate. For broader API testing patterns, see API testing tool for QA engineers.
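For the static test page mentioned above, reading the prompt from the query string takes a few lines. A minimal sketch: the parameter names "prompt" and "model" and the requestFromUrl helper are hypothetical choices, not something Puter or Apidog prescribes.

```javascript
// Reads the prompt and optional model from a page URL such as
// http://localhost:8080/?prompt=Explain+WebSockets&model=gpt-5.4-mini
// The parameter names "prompt" and "model" are hypothetical; pick your own.
function requestFromUrl(href, defaultModel = "gpt-5.5") {
  const params = new URL(href).searchParams;
  const prompt = (params.get("prompt") ?? "").trim();
  if (!prompt) return null; // nothing to send
  return { prompt, model: params.get("model") || defaultModel };
}

// Browser usage on the test page:
// const request = requestFromUrl(window.location.href);
// if (request) {
//   puter.ai.chat(request.prompt, { model: request.model }).then((r) => puter.print(r));
// }
```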
FAQ
Is this truly unlimited, or is there a hidden cap?
Unlimited from the developer’s side, yes. The end user has whatever balance is in their Puter account; new accounts get starter credit and users top up if they want more. There is no per-developer cap.
Do I need an OpenAI account?
No. Puter handles the OpenAI relationship. You never see an OpenAI key.
Can I use this in production?
Yes, for browser-based apps. Puter runs production infrastructure. The right question is whether your users are willing to sign in to Puter; if yes, ship it.
Does GPT-5.5 through Puter perform identically to the official API?
The model output is the same; Puter calls the official OpenAI API on the user’s behalf. Latency may be marginally higher because of the extra hop, but the model behavior is unchanged.
What about prompt caching savings?
Puter does not expose OpenAI’s prompt caching pricing controls today. If you have a stable 50k-token system prompt and need the cache discount, use the official API.
Can I use this in a backend service?
Not cleanly. Puter is browser-first and assumes a user session. Backend services should use the official OpenAI API. For free server-side options, see How to use the GPT-5.5 API for free.
What model should I default to?
gpt-5.5 for daily reasoning. gpt-5.4-nano for high-volume classification. gpt-5.5-pro for hard reasoning tasks. o3 when you need long reasoning chains.
Will my users be charged a lot?
Most chat-style usage costs cents per session at OpenAI’s rates. A casual user can run dozens of conversations on Puter’s starter credit before they need to top up. Image generation costs more per call, so cap max_tokens on chat calls and avoid wasted image generation calls.
Can I generate images for free with Puter?
Yes, through txt2img with gpt-image-2 or DALL-E. It is free for you as the developer; the user pays the image generation cost from their Puter balance. For the official paid API guide, see How to use the GPT-Image-2 API.
Wrapping up
Free unlimited GPT-5.5 through Puter.js is the cleanest path for any browser-based app that wants OpenAI-quality output without OpenAI-quality billing. Drop in the script, pick a model, write the prompt. The end user covers usage; you ship without a key.
For server-side workloads, prompt caching, the Responses API, or full structured outputs, the official OpenAI API is still the right answer. For prototypes, hackathon builds, free public apps, and static sites, Puter is the answer.
Build the request once in Apidog, benchmark Puter against the official API, and pick the path that matches your shape.
