ElevenLabs turns text into natural speech and supports a wide range of voices, languages, and styles. The API makes it easy to embed voice into apps, automate narration pipelines, or build real-time experiences like voice agents. If you can send an HTTP request, you can generate audio in seconds.
What Is the ElevenLabs API?
The ElevenLabs API provides programmatic access to AI models that generate, transform, and analyze audio. The platform started as a text-to-speech service but has expanded into a full audio AI suite.

Core capabilities:
- Text-to-Speech (TTS): Convert written text into spoken audio with control over voice characteristics, emotion, and pacing
- Speech-to-Speech (STS): Transform one voice into another while preserving the original intonation and timing
- Voice Cloning: Create a digital replica of any voice from as little as 60 seconds of clean audio
- AI Dubbing: Translate and dub audio/video content into different languages while maintaining the speaker's voice characteristics
- Sound Effects: Generate sound effects from text descriptions
- Speech-to-Text: Transcribe audio into text with high accuracy
The API works over standard HTTP and WebSocket protocols. You can call it from any language, but official SDKs exist for Python and JavaScript/TypeScript with type safety and streaming support built in.
Getting the ElevenLabs API Key
Before making any API call, you need an API key. Here's how to get one:
Step 1: Create a free account. Even the free plan includes API access with 10,000 characters per month.
Step 2: Log in and navigate to the Profile + API Key section. You can find this by clicking your profile icon in the bottom-left corner, or by going directly to the developer settings.

Step 3: Click Create API Key. Copy the key and store it securely—you won't be able to see the full key again.

Important security notes:
- Never commit your API key to version control
- Use environment variables or a secrets manager in production
- API keys can be scoped to specific workspaces for team environments
- Rotate keys regularly and revoke compromised keys immediately
Set it as an environment variable for the examples in this guide:
# Linux/macOS
export ELEVENLABS_API_KEY="your_api_key_here"
# Windows (PowerShell)
$env:ELEVENLABS_API_KEY="your_api_key_here"
ElevenLabs API Endpoints Overview
The API is organized around several resource groups. Here are the most commonly used endpoints:
| Endpoint | Method | Description |
|---|---|---|
| /v1/text-to-speech/{voice_id} | POST | Convert text to speech audio |
| /v1/text-to-speech/{voice_id}/stream | POST | Stream audio as it's generated |
| /v1/speech-to-speech/{voice_id} | POST | Convert speech from one voice to another |
| /v1/voices | GET | List all available voices |
| /v1/voices/{voice_id} | GET | Get details for a specific voice |
| /v1/models | GET | List all available models |
| /v1/user | GET | Get user account info and usage |
| /v1/voice-generation/generate-voice | POST | Generate a new random voice |
Base URL: https://api.elevenlabs.io
Authentication: All requests require the xi-api-key header:
xi-api-key: your_api_key_here
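Because authentication is just this one header, you can sanity-check your key from any HTTP client. Here's a minimal sketch using Python's requests library against the /v1/user endpoint from the table above (it assumes ELEVENLABS_API_KEY is set in your environment, as shown earlier):

```python
import os
import requests

# Verify the API key by fetching account info (GET /v1/user)
response = requests.get(
    "https://api.elevenlabs.io/v1/user",
    headers={"xi-api-key": os.environ["ELEVENLABS_API_KEY"]},
)
response.raise_for_status()  # raises on 401 if the key is invalid
print(response.json())
```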
Text-to-Speech with cURL
The fastest way to test the API is with a cURL command. This example uses the Rachel voice (ID: 21m00Tcm4TlvDq8ikWAM), one of the default voices available on all plans:
curl -X POST "https://api.elevenlabs.io/v1/text-to-speech/21m00Tcm4TlvDq8ikWAM" \
  -H "xi-api-key: $ELEVENLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Welcome to our application. This audio was generated using the ElevenLabs API.",
    "model_id": "eleven_flash_v2_5",
    "voice_settings": {
      "stability": 0.5,
      "similarity_boost": 0.75,
      "style": 0.0,
      "use_speaker_boost": true
    }
  }' \
  --output speech.mp3
If successful, you'll get a speech.mp3 file with the generated audio. Play it with any media player.
Breaking down the request:
- voice_id (in URL): The ID of the voice to use. Every voice in ElevenLabs has a unique ID.
- text: The content to convert to speech. The Flash v2.5 model supports up to 40,000 characters per request.
- model_id: Which AI model to use. eleven_flash_v2_5 offers the best balance of speed and quality.
- voice_settings: Optional fine-tuning parameters (covered in detail below).
The response returns raw audio data. The default format is MP3, but you can request other formats by adding the output_format query parameter:
# Get PCM audio instead of MP3
curl -X POST "https://api.elevenlabs.io/v1/text-to-speech/21m00Tcm4TlvDq8ikWAM?output_format=pcm_44100" \
  -H "xi-api-key: $ELEVENLABS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello world", "model_id": "eleven_flash_v2_5"}' \
  --output speech.pcm
Using the Python SDK
The official Python SDK simplifies integration with type hints, built-in audio playback, and streaming support.
Installation
pip install elevenlabs
To play audio directly through your speakers, you may also need mpv or ffmpeg:
# macOS
brew install mpv
# Ubuntu/Debian
sudo apt install mpv
Basic Text-to-Speech
import os
from elevenlabs.client import ElevenLabs
from elevenlabs import play
client = ElevenLabs(
    api_key=os.getenv("ELEVENLABS_API_KEY")
)

audio = client.text_to_speech.convert(
    text="The ElevenLabs API makes it easy to add realistic voice output to any application.",
    voice_id="JBFqnCBsd6RMkjVDRZzb",  # George voice
    model_id="eleven_multilingual_v2",
    output_format="mp3_44100_128",
)

play(audio)
Save Audio to File
from elevenlabs.client import ElevenLabs
client = ElevenLabs(api_key="your_api_key")
audio = client.text_to_speech.convert(
    text="This audio will be saved to a file.",
    voice_id="21m00Tcm4TlvDq8ikWAM",
    model_id="eleven_flash_v2_5",
)

with open("output.mp3", "wb") as f:
    for chunk in audio:
        f.write(chunk)

print("Audio saved to output.mp3")
List Available Voices
from elevenlabs.client import ElevenLabs
client = ElevenLabs(api_key="your_api_key")
response = client.voices.search()
for voice in response.voices:
    print(f"Name: {voice.name}, ID: {voice.voice_id}, Category: {voice.category}")
This prints all voices available in your account, including premade voices, cloned voices, and community voices you've added.
Async Support
For applications using asyncio, the SDK provides AsyncElevenLabs:
import asyncio
from elevenlabs.client import AsyncElevenLabs
client = AsyncElevenLabs(api_key="your_api_key")
async def generate_speech():
    audio = await client.text_to_speech.convert(
        text="This was generated asynchronously.",
        voice_id="21m00Tcm4TlvDq8ikWAM",
        model_id="eleven_flash_v2_5",
    )
    with open("async_output.mp3", "wb") as f:
        async for chunk in audio:
            f.write(chunk)
    print("Async audio saved.")

asyncio.run(generate_speech())

Using the JavaScript SDK
The official Node.js SDK (@elevenlabs/elevenlabs-js) provides full TypeScript support and works in Node.js environments.
Installation
npm install @elevenlabs/elevenlabs-js
Basic Text-to-Speech
import { ElevenLabsClient, play } from "@elevenlabs/elevenlabs-js";
const elevenlabs = new ElevenLabsClient({
  apiKey: process.env.ELEVENLABS_API_KEY,
});

const audio = await elevenlabs.textToSpeech.convert(
  "21m00Tcm4TlvDq8ikWAM", // Rachel voice ID
  {
    text: "Hello from the ElevenLabs JavaScript SDK!",
    modelId: "eleven_multilingual_v2",
  }
);

await play(audio);
Save to File (Node.js)
import { ElevenLabsClient } from "@elevenlabs/elevenlabs-js";
import { createWriteStream } from "fs";
import { Readable } from "stream";
import { pipeline } from "stream/promises";
const elevenlabs = new ElevenLabsClient({
  apiKey: process.env.ELEVENLABS_API_KEY,
});

const audio = await elevenlabs.textToSpeech.convert(
  "21m00Tcm4TlvDq8ikWAM",
  {
    text: "This audio will be written to a file using Node.js streams.",
    modelId: "eleven_flash_v2_5",
  }
);

const readable = Readable.from(audio);
const writeStream = createWriteStream("output.mp3");
await pipeline(readable, writeStream);

console.log("Audio saved to output.mp3");
Error Handling
import { ElevenLabsClient, ElevenLabsError } from "@elevenlabs/elevenlabs-js";
const elevenlabs = new ElevenLabsClient({
  apiKey: process.env.ELEVENLABS_API_KEY,
});

try {
  const audio = await elevenlabs.textToSpeech.convert(
    "21m00Tcm4TlvDq8ikWAM",
    {
      text: "Testing error handling.",
      modelId: "eleven_flash_v2_5",
    }
  );
  await play(audio);
} catch (error) {
  if (error instanceof ElevenLabsError) {
    console.error(`API Error: ${error.message}, Status: ${error.statusCode}`);
  } else {
    console.error("Unexpected error:", error);
  }
}
The SDK retries failed requests up to 2 times by default, with a 60-second timeout. Both values are configurable.
Streaming Audio in Real Time
For chatbots, voice assistants, or any application where latency matters, streaming lets you start playing audio before the entire response is generated. This is critical for conversational AI where users expect near-instant responses.
Python Streaming
from elevenlabs import stream
from elevenlabs.client import ElevenLabs
client = ElevenLabs(api_key="your_api_key")
audio_stream = client.text_to_speech.stream(
    text="Streaming allows you to start hearing audio almost instantly, without waiting for the entire generation to complete.",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    model_id="eleven_flash_v2_5",
)
# Play streamed audio through speakers in real time
stream(audio_stream)
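The stream object is a plain iterator of audio byte chunks, so as an alternative to stream() you can consume it yourself, for example to write the file while it downloads or forward chunks to a connected client. A minimal sketch (it creates a fresh stream, since an iterator can only be consumed once):

```python
# Alternative to stream(): consume the chunks yourself
audio_stream = client.text_to_speech.stream(
    text="Saving audio to disk while it streams.",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    model_id="eleven_flash_v2_5",
)

with open("streamed.mp3", "wb") as f:
    for chunk in audio_stream:
        if isinstance(chunk, bytes):
            f.write(chunk)  # each chunk is a slice of the MP3 byte stream
```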
JavaScript Streaming
import { ElevenLabsClient, stream } from "@elevenlabs/elevenlabs-js";
const elevenlabs = new ElevenLabsClient();
const audioStream = await elevenlabs.textToSpeech.stream(
  "JBFqnCBsd6RMkjVDRZzb",
  {
    text: "This audio streams in real time with minimal latency.",
    modelId: "eleven_flash_v2_5",
  }
);

stream(audioStream);
WebSocket Streaming
For the lowest latency, use WebSocket connections. This is ideal for real-time voice agents where text arrives in chunks (e.g., from an LLM):
import asyncio
import websockets
import json
import base64
async def stream_tts_websocket():
    voice_id = "21m00Tcm4TlvDq8ikWAM"
    model_id = "eleven_flash_v2_5"
    uri = f"wss://api.elevenlabs.io/v1/text-to-speech/{voice_id}/stream-input?model_id={model_id}"

    async with websockets.connect(uri) as ws:
        # Send initial config
        await ws.send(json.dumps({
            "text": " ",
            "voice_settings": {"stability": 0.5, "similarity_boost": 0.75},
            "xi_api_key": "your_api_key",
        }))

        # Send text chunks as they arrive (e.g., from an LLM)
        text_chunks = [
            "Hello! ",
            "This is streaming ",
            "via WebSockets. ",
            "Each chunk is sent separately."
        ]
        for chunk in text_chunks:
            await ws.send(json.dumps({"text": chunk}))

        # Signal end of input
        await ws.send(json.dumps({"text": ""}))

        # Receive audio chunks
        audio_data = b""
        async for message in ws:
            data = json.loads(message)
            if data.get("audio"):
                audio_data += base64.b64decode(data["audio"])
            if data.get("isFinal"):
                break

    with open("websocket_output.mp3", "wb") as f:
        f.write(audio_data)
    print("WebSocket audio saved.")

asyncio.run(stream_tts_websocket())
Voice Selection and Management
ElevenLabs offers hundreds of voices. Picking the right one matters for your application's user experience.
Default Voices
These voices are available on all plans, including the free tier:
| Voice Name | Voice ID | Description |
|---|---|---|
| Rachel | 21m00Tcm4TlvDq8ikWAM | Calm, young female |
| Drew | 29vD33N1CtxCmqQRPOHJ | Well-rounded male |
| Clyde | 2EiwWnXFnvU5JabPnv8n | War veteran character |
| Paul | 5Q0t7uMcjvnagumLfvZi | Ground reporter |
| Domi | AZnzlk1XvdvUeBnXmlld | Strong, assertive female |
| Dave | CYw3kZ02Hs0563khs1Fj | Conversational British male |
| Fin | D38z5RcWu1voky8WS1ja | Irish male |
| Sarah | EXAVITQu4vr4xnSDxMaL | Soft, young female |
Finding Voice IDs
Use the API to search all available voices:
curl -X GET "https://api.elevenlabs.io/v1/voices" \
-H "xi-api-key: $ELEVENLABS_API_KEY" | python3 -m json.tool
Or filter by category (premade, cloned, generated):
from elevenlabs.client import ElevenLabs
client = ElevenLabs(api_key="your_api_key")
# List only premade voices
response = client.voices.search(category="premade")
for voice in response.voices:
    print(f"{voice.name}: {voice.voice_id}")
You can also copy a voice ID directly from the ElevenLabs website: select a voice, click the three-dot menu, and choose Copy Voice ID.
Choosing the Right Model
ElevenLabs offers multiple models, each optimized for different use cases:

# List all available models with details
from elevenlabs.client import ElevenLabs
client = ElevenLabs(api_key="your_api_key")
models = client.models.list()
for model in models:
    print(f"Model: {model.name}")
    print(f"  ID: {model.model_id}")
    print(f"  Languages: {len(model.languages)}")
    print(f"  Max chars: {model.max_characters_request_free_user}")
    print()
Testing ElevenLabs API with Apidog
Before writing integration code, it helps to test API endpoints interactively. Apidog makes this straightforward—you can configure requests visually, inspect responses (including audio), and generate client code once you're satisfied.

Step 1: Set Up a New Project
Open Apidog and create a new project. Name it "ElevenLabs API" or add the endpoints to an existing project.

Step 2: Configure Authentication
Go to Project Settings > Auth and set up a global header:
- Header name: xi-api-key
- Header value: your ElevenLabs API key
This automatically attaches authentication to every request in the project.
Step 3: Create a Text-to-Speech Request
Create a new POST request:
- URL: https://api.elevenlabs.io/v1/text-to-speech/21m00Tcm4TlvDq8ikWAM
- Body (JSON):
{
  "text": "Testing the ElevenLabs API through Apidog. This makes it easy to experiment with different voices and settings.",
  "model_id": "eleven_flash_v2_5",
  "voice_settings": {
    "stability": 0.5,
    "similarity_boost": 0.75
  }
}

Click Send. Apidog displays the response headers and lets you download or play the audio directly.
Step 4: Experiment with Parameters
Use Apidog's interface to quickly swap voice IDs, change models, or adjust voice settings without editing raw JSON. Save different configurations as separate endpoints in your collection for easy comparison.
Step 5: Generate Client Code
Once you've confirmed the request works, click Generate Code in Apidog to get ready-to-use client code in Python, JavaScript, cURL, Go, Java, and more. This eliminates manual translation from API docs to working code.
Try it now: Download Apidog free to test ElevenLabs API endpoints, compare voice outputs, and share API collections with your team—no credit card required.
Voice Settings and Fine-Tuning
Voice settings let you adjust how a voice sounds. These parameters are sent in the voice_settings object:
| Parameter | Range | Default | Effect |
|---|---|---|---|
| stability | 0.0 - 1.0 | 0.5 | Higher = more consistent, less expressive. Lower = more variable, more emotional. |
| similarity_boost | 0.0 - 1.0 | 0.75 | Higher = closer to the original voice. Lower = more variation. |
| style | 0.0 - 1.0 | 0.0 | Higher = more exaggerated style. Increases latency. Only for Multilingual v2. |
| use_speaker_boost | boolean | true | Boosts similarity to the original speaker. Minor latency increase. |
Practical examples:
from elevenlabs.client import ElevenLabs
client = ElevenLabs(api_key="your_api_key")
# Narration voice: consistent, stable
narration = client.text_to_speech.convert(
    text="Chapter One. It was a bright cold day in April.",
    voice_id="21m00Tcm4TlvDq8ikWAM",
    model_id="eleven_multilingual_v2",
    voice_settings={
        "stability": 0.8,
        "similarity_boost": 0.8,
        "style": 0.2,
        "use_speaker_boost": True,
    },
)

# Conversational voice: expressive, natural
conversational = client.text_to_speech.convert(
    text="Oh wow, that's actually a great idea! Let me think about how we could make it work.",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    model_id="eleven_multilingual_v2",
    voice_settings={
        "stability": 0.3,
        "similarity_boost": 0.6,
        "style": 0.5,
        "use_speaker_boost": True,
    },
)
Guidelines:
- For audiobooks and narration, use higher stability (0.7-0.9) for consistent delivery
- For chatbots and conversational AI, use lower stability (0.3-0.5) for natural variation
- For character voices, experiment with lower similarity_boost (0.4-0.6) to create distinct personalities
- The style parameter only works with Multilingual v2 and adds latency—skip it for real-time applications
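One convenient pattern is to centralize these guidelines as named presets so application code just picks a use case. A minimal sketch (the preset values mirror the ranges above; the helper and its names are hypothetical, not part of the SDK):

```python
from elevenlabs.client import ElevenLabs

# Hypothetical presets derived from the guidelines above
VOICE_PRESETS = {
    "narration": {"stability": 0.8, "similarity_boost": 0.8, "style": 0.0, "use_speaker_boost": True},
    "conversational": {"stability": 0.4, "similarity_boost": 0.75, "style": 0.0, "use_speaker_boost": True},
    "character": {"stability": 0.5, "similarity_boost": 0.5, "style": 0.0, "use_speaker_boost": True},
}

def speak(client: ElevenLabs, text: str, voice_id: str, preset: str = "conversational"):
    """Generate speech using a named settings preset."""
    return client.text_to_speech.convert(
        text=text,
        voice_id=voice_id,
        model_id="eleven_multilingual_v2",
        voice_settings=VOICE_PRESETS[preset],
    )
```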
ElevenLabs API Pricing and Rate Limits
ElevenLabs uses credit-based pricing: every plan, including the free tier, comes with a monthly credit allowance, and text-to-speech consumes credits per character of input. Concurrency and rate limits also scale with the plan tier, so check the official pricing page for current numbers.

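Since every plan is metered, it's worth checking remaining quota programmatically before large batch jobs. A minimal sketch against the GET /v1/user/subscription endpoint (the character_count and character_limit field names are per the current API; treat them as assumptions if your account's response differs):

```python
import os
import requests

# Check remaining credits before kicking off a batch job
resp = requests.get(
    "https://api.elevenlabs.io/v1/user/subscription",
    headers={"xi-api-key": os.environ["ELEVENLABS_API_KEY"]},
)
resp.raise_for_status()
sub = resp.json()
used, limit = sub["character_count"], sub["character_limit"]
print(f"Used {used} of {limit} characters ({limit - used} remaining)")
```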
Troubleshooting
| Error | Cause | Fix |
|---|---|---|
| 401 Unauthorized | Invalid or missing API key | Check your xi-api-key header value |
| 422 Unprocessable Entity | Invalid request body | Verify voice_id exists and text isn't empty |
| 429 Too Many Requests | Rate limit exceeded | Add exponential backoff, or upgrade your plan |
| Audio sounds robotic | Wrong model or settings | Try Multilingual v2 with stability at 0.5 |
| Pronunciation errors | Text normalization issue | Spell out numbers/abbreviations, or use SSML-like formatting |
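For the 429 case in particular, a simple exponential backoff wrapper keeps batch jobs resilient. A minimal sketch using requests (the retry count and delays are arbitrary choices, not official guidance):

```python
import time
import requests

def tts_with_backoff(url: str, headers: dict, payload: dict, max_retries: int = 5) -> bytes:
    """POST a TTS request, backing off exponentially on 429 responses."""
    for attempt in range(max_retries):
        resp = requests.post(url, headers=headers, json=payload)
        if resp.status_code != 429:
            resp.raise_for_status()  # surface 4xx/5xx errors other than 429
            return resp.content      # raw audio bytes
        time.sleep(2 ** attempt)     # 1s, 2s, 4s, 8s, 16s
    raise RuntimeError("Rate limited after all retries")
```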
Conclusion
The ElevenLabs API gives developers access to some of the most realistic speech synthesis available today. Whether you need a few lines of narration or a full real-time voice pipeline, the API scales from simple cURL calls to production WebSocket streams.
Ready to add realistic voice to your application? Download Apidog to test ElevenLabs API endpoints, experiment with voice settings, and generate client code—all free, no credit card required.