How to Use Google Veo 3 API on Vertex AI

This article explores the transformative potential of Veo 3.

Nikki Alessandro

Updated on May 21, 2025

For years, creating high-quality video content has been a complex, time-consuming, and often expensive endeavor, requiring specialized skills in cinematography, editing, sound design, and animation. Generative AI, particularly in video, is set to lower these barriers significantly. Imagine generating compelling b-roll footage, crafting dynamic social media animations, or even producing short cinematic sequences, all from textual descriptions or still images. This is the promise of models like Veo 3.

Google has been a significant contributor to AI research and development, and its commitment to generative media is evident in the continuous evolution of models available through Vertex AI. Vertex AI serves as a unified machine learning platform, providing access to Google's cutting-edge AI models, including those from DeepMind, and enabling users to build, deploy, and scale ML applications with ease. The introduction of Veo 3, Imagen 4, and Lyria 2 further solidifies Vertex AI as a powerhouse for creative AI.

💡
Want a great API Testing tool that generates beautiful API Documentation?

Want an integrated, All-in-One platform for your Developer Team to work together with maximum productivity?

Apidog delivers all your demands, and replaces Postman at a much more affordable price!

Introducing Veo 3: The Next Leap in AI Video Generation

Prompt: A medium shot, historical adventure setting: Warm lamplight illuminates a cartographer in a cluttered study, poring over an ancient, sprawling map spread across a large table. Cartographer: "According to this old sea chart, the lost island isn't myth! We must prepare an expedition immediately!"


Veo 3, developed by Google DeepMind, represents the latest advancement in Google's video generation technology. It aims to provide users with the ability to generate high-quality videos that are not only visually impressive but also rich in auditory detail. Key enhancements and features announced for Veo 3 include:

  • Improved Video Quality: Veo 3 is engineered to produce videos of superior quality from both text and image prompts, with more realistic textures, better motion coherence, and faithful adherence to intricate prompt details, translating nuanced textual descriptions into compelling visual narratives.
  • Integrated Speech Generation: A significant step forward is Veo 3's ability to incorporate speech, such as dialogue and voice-overs, directly into the generated videos. This feature opens up vast possibilities for storytelling, marketing content, and educational materials, allowing creators to add another layer of narrative depth without needing separate audio production workflows for basic speech.
  • Comprehensive Audio Integration: Beyond speech, Veo 3 can generate other audio elements, including music and sound effects. This means the model doesn't just create silent movies; it can produce videos with a more complete soundscape, enhancing the viewing experience and aligning the audio with the visual mood and events depicted.

The potential impact of these features is already being recognized by early adopters. Klarna, a leader in digital payments, has been leveraging Veo (and Imagen) on Vertex AI to boost content creation efficiency. They've noted significant reductions in production timelines for assets ranging from b-roll to YouTube bumpers. Justin Thomas, Head of Digital Experience & Growth at Klarna, remarked on the transformation: "With Veo and Imagen, we’ve transformed what used to be time-intensive production processes into quick, efficient tasks that allow us to scale content creation rapidly... What once took us eight weeks is now only taking eight hours, resulting in substantial cost savings.”

How to Use Google Veo API with Vertex AI

Google's Veo models are accessible on Vertex AI, allowing you to generate videos from text or image prompts. You can interact with Veo through the Google Cloud console or by making requests to the Vertex AI API. This guide focuses on using the API, with examples primarily using the Gen AI SDK for Python and REST calls.

Vertex AI Platform
A fully managed, unified AI development platform for enterprises. Access and leverage Vertex AI Studio, Agent Builder, and more than 160 foundation models.

Prerequisites for Using Veo on Vertex AI

Before you can start generating videos with Veo, ensure you have the following set up:

  • Google Cloud Account and Project:
  • You'll need a Google Cloud account. New accounts often come with free credits.
  • Within the Google Cloud console, select an existing Google Cloud project or create a new one. If you're experimenting, creating a new project can make cleanup easier by allowing you to delete the project and all its associated resources afterward.
  • Enable Vertex AI API:
  • Navigate to the project selector page in the Google Cloud console.
  • Ensure the Vertex AI API is enabled for your project.
  • Authentication:
  • You need to set up authentication for your environment.
  • For REST API (local development): If you plan to use the REST API samples locally, the credentials you provide to the Google Cloud CLI (gcloud CLI) are used. Install the gcloud CLI and initialize it by running:
gcloud init

If you're using an external identity provider (IdP), sign in to the gcloud CLI with your federated identity first.

  • For Python SDK: The Gen AI SDK typically uses Application Default Credentials (ADC). Setting the GOOGLE_CLOUD_PROJECT environment variable and ensuring GOOGLE_GENAI_USE_VERTEXAI=True (as shown in later examples) helps configure the SDK to work with Vertex AI, leveraging your authenticated gcloud environment or service account credentials if configured.
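The prerequisites above can be completed from the command line. The following sketch assumes the gcloud CLI is installed and you have already run `gcloud init` against your project; it enables the Vertex AI API and sets up Application Default Credentials for local SDK use:

```shell
# Enable the Vertex AI API for the active project.
gcloud services enable aiplatform.googleapis.com

# Create Application Default Credentials so the Gen AI SDK
# can authenticate without an explicit key file.
gcloud auth application-default login
```

If you run your code on a Google Cloud service (Compute Engine, Cloud Run, etc.), the attached service account is used automatically and the second command is unnecessary.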

Accessing Veo Models and Locations

  • Model Versions: Veo offers multiple video generation models. The documentation provides examples using veo-2.0-generate-001 and mentions veo-3.0-generate-preview (currently in Preview). Always refer to the official "Veo models" documentation for the most current list and their capabilities.
  • Locations: When making requests, you can specify a region (location) to control where your data is stored at rest. For a list of available regions, consult the "Generative AI on Vertex AI locations" documentation. The Python SDK examples often use environment variables to set the location.

Using the Veo API with the Python SDK (Gen AI SDK)

The Gen AI SDK for Python provides a convenient way to interact with Veo models on Vertex AI.

Installation

Install or upgrade the google-genai library:

pip install --upgrade google-genai

Environment Variable Setup

Set the following environment variables. Replace GOOGLE_CLOUD_PROJECT and GOOGLE_CLOUD_LOCATION with your project ID and desired Google Cloud location (e.g., global or a specific region like us-central1).

export GOOGLE_CLOUD_PROJECT=YOUR_PROJECT_ID
export GOOGLE_CLOUD_LOCATION=YOUR_LOCATION
export GOOGLE_GENAI_USE_VERTEXAI=True

Initializing the Client

from google import genai

client = genai.Client()

Generating Video from Text

You can generate videos using a descriptive text prompt. The output is a long-running operation, and the generated video is typically saved to a Google Cloud Storage (GCS) URI you specify.

import time
from google import genai
from google.genai.types import GenerateVideosConfig

client = genai.Client()

# !!! IMPORTANT: Update and uncomment the GCS URI for output !!!
# output_gcs_uri = "gs://your-bucket-name/your-output-prefix/"
# Ensure this bucket exists and your project/service account has write permissions.

try:
    operation = client.models.generate_videos(
        model="veo-2.0-generate-001",  # Or other available Veo model
        prompt="a cat reading a book",
        config=GenerateVideosConfig(
            aspect_ratio="16:9",
            output_gcs_uri=output_gcs_uri, # Specify your GCS path
        ),
    )

    print("Video generation operation started. Polling for completion...")
    while not operation.done:
        time.sleep(15) # Wait for 15 seconds before checking status
        operation = client.operations.get(operation) # Refresh operation status
        print(f"Operation status: {operation.metadata.state if operation.metadata else 'Processing...'}")

    if operation.response and operation.result.generated_videos:
        print(f"Video generated successfully: {operation.result.generated_videos[0].video.uri}")
    elif operation.error:
        print(f"Error during video generation: {operation.error.message}")
    else:
        print("Operation finished but no video URI found or an unknown error occurred.")

except NameError:
    print("Error: 'output_gcs_uri' is not defined. Please set the 'output_gcs_uri' variable.")
except Exception as e:
    print(f"An unexpected error occurred: {e}")

Remember to replace "gs://your-bucket-name/your-output-prefix/" with your actual GCS bucket and desired output path.

Generating Video from an Image (and optional text)

You can also generate videos starting from an input image, optionally guided by a text prompt.

import time
from google import genai
from google.genai.types import GenerateVideosConfig, Image

client = genai.Client()

# !!! IMPORTANT: Update and uncomment the GCS URI for output !!!
# output_gcs_uri = "gs://your-bucket-name/your-output-prefix-image/"
# Ensure this bucket exists and your project/service account has write permissions.

# Example using a public GCS image. Replace with your image URI.
input_image_gcs_uri = "gs://cloud-samples-data/generative-ai/image/flowers.png"

try:
    operation = client.models.generate_videos(
        model="veo-2.0-generate-001",  # Or other available Veo model
        image=Image(
            gcs_uri=input_image_gcs_uri,
            mime_type="image/png", # Adjust mime_type based on your image
        ),
        prompt="the flowers sway gently in the breeze", # Optional text prompt
        config=GenerateVideosConfig(
            aspect_ratio="16:9", # Or match to your image/desired output
            output_gcs_uri=output_gcs_uri,
        ),
    )

    print("Image-to-video generation operation started. Polling for completion...")
    while not operation.done:
        time.sleep(15)
        operation = client.operations.get(operation)
        print(f"Operation status: {operation.metadata.state if operation.metadata else 'Processing...'}")

    if operation.response and operation.result.generated_videos:
        print(f"Video generated successfully: {operation.result.generated_videos[0].video.uri}")
    elif operation.error:
        print(f"Error during video generation: {operation.error.message}")
    else:
        print("Operation finished but no video URI found or an unknown error occurred.")

except NameError:
    print("Error: 'output_gcs_uri' is not defined. Please set the 'output_gcs_uri' variable.")
except Exception as e:
    print(f"An unexpected error occurred: {e}")

Using the Veo API with REST

You can directly call the Veo API using HTTP requests. This involves sending a POST request to a specific endpoint.

Endpoint and HTTP Method

POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:predictLongRunning

Replace PROJECT_ID and MODEL_ID (e.g., veo-2.0-generate-001 or veo-3.0-generate-preview).

Request JSON Body

{
  "instances": [
    {
      "prompt": "TEXT_PROMPT"
      // For image input, the structure within "instances" will differ. Consult API reference.
    }
  ],
  "parameters": {
    "storageUri": "OUTPUT_STORAGE_URI", // Optional: GCS URI for output. e.g., "gs://video-bucket/output/"
                                        // If not provided, video bytes might be returned in the operation response for some configurations (check docs).
    "sampleCount": "RESPONSE_COUNT",    // Number of videos to generate (e.g., 1-4).
    "durationSeconds": "DURATION",      // Desired video length in seconds (e.g., 5-8).
    "enhancePrompt": "ENHANCED_PROMPT"  // Boolean: True (default) or False.
    // Add other parameters like "aspectRatio", "fps" as per the API reference.
  }
}

Make sure to replace placeholders like TEXT_PROMPT and OUTPUT_STORAGE_URI with actual values. Note that sampleCount and durationSeconds take numbers and enhancePrompt takes a boolean, so omit the quotes when filling in those fields.
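A concrete request body can be assembled in Python and written to `request.json` for use with curl. The prompt and bucket path below are illustrative placeholders, not values from the Veo documentation:

```python
import json

# Hypothetical example values; substitute your own prompt and GCS bucket.
request_body = {
    "instances": [
        {"prompt": "a timelapse of clouds rolling over a mountain ridge"}
    ],
    "parameters": {
        "storageUri": "gs://your-bucket/veo-output/",
        "sampleCount": 1,          # number, not a quoted string
        "durationSeconds": 8,      # number, not a quoted string
        "enhancePrompt": True,     # boolean, not a quoted string
    },
}

# Write the body to request.json for the curl invocation.
with open("request.json", "w") as f:
    json.dump(request_body, f, indent=2)

print("Wrote request.json")
```

Building the body programmatically avoids JSON typos such as quoting numeric fields or leaving comments in the file, which the API would reject.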

Authentication and Sending the Request (Example with curl)

Save your request body in a file (e.g., request.json).

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://us-central1-aiplatform.googleapis.com/v1/projects/YOUR_PROJECT_ID/locations/us-central1/publishers/google/models/YOUR_MODEL_ID:predictLongRunning"

This command returns an operation name (e.g., projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID/operations/OPERATION_ID).

Handling Long-Running Operations

The predictLongRunning endpoint initiates an asynchronous operation. You'll need to use the returned operation name to poll its status until completion, similar to how the Python SDK handles it.
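Polling with curl can be sketched as follows. Google's Vertex AI documentation describes a `fetchPredictOperation` method for checking Veo operations, but verify the exact endpoint against the current API reference; `OPERATION_NAME` is the full operation name returned by the initial `predictLongRunning` call:

```shell
# Poll the long-running operation; repeat until "done": true appears.
curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json" \
     -d '{"operationName": "OPERATION_NAME"}' \
     "https://us-central1-aiplatform.googleapis.com/v1/projects/YOUR_PROJECT_ID/locations/us-central1/publishers/google/models/YOUR_MODEL_ID:fetchPredictOperation"
```

When the response contains `"done": true`, the generated video URIs (or error details) appear in the response body.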

What are Veo 3 Prompts & How to Write Better Prompts for Veo 3

Google's Veo models generate videos based on your textual descriptions. More detailed prompts generally result in higher-quality and more relevant videos. Consider describing:

  • Subjects and actions.
  • Setting and environment.
  • Cinematic styles, camera motions.
  • Mood and tone.

For models supporting audio (like veo-3.0-generate-preview), you can include descriptions for transcription (dialogue) and sound effects.
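When generating many videos, the prompt elements listed above can be assembled programmatically. The helper below is my own sketch, not part of the Veo API; it simply concatenates the recommended components into one descriptive prompt string:

```python
def build_veo_prompt(subject, action, setting,
                     style=None, camera=None, mood=None):
    """Assemble a detailed Veo prompt from its component parts."""
    parts = [f"{subject} {action} in {setting}"]
    if style:
        parts.append(f"{style} style")
    if camera:
        parts.append(camera)
    if mood:
        parts.append(f"{mood} mood")
    return ", ".join(parts)

prompt = build_veo_prompt(
    subject="a lone lighthouse keeper",
    action="climbing a spiral staircase",
    setting="a storm-battered lighthouse at night",
    style="cinematic film noir",
    camera="slow upward tracking shot",
    mood="tense, foreboding",
)
print(prompt)
```

Templating prompts this way keeps the subject, setting, camera work, and mood explicit, which tends to produce more consistent results than free-form one-liners.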

  • Prompt Rewriter (Prompt Enhancement):
    Veo includes an LLM-based prompt enhancement tool. This feature can rewrite your prompts to add more descriptive details, camera motions, transcriptions, and sound effects, aiming for higher quality video output.
  • Enabled by Default: This feature is enabled by default for models like veo-2.0-generate-001 and veo-3.0-generate-preview.
  • Disabling: You can turn prompt enhancement off by setting the enhancePrompt parameter to False in your REST API call (or a similar parameter in the SDK if available).
  • Important for veo-3.0-generate-preview: You cannot disable the prompt rewriter when using the veo-3.0-generate-preview model.
  • Rewritten Prompt in Response: If the original prompt is fewer than 30 words long, the rewritten prompt used by the model is delivered in the API response.


Testing the Veo REST API with a Tool like Apidog

While curl is excellent for command-line testing, GUI-based API testing tools like Apidog, Postman, or Insomnia offer a more visual and organized way to construct and manage your API requests, especially when dealing with complex JSON bodies or multiple endpoints.


Conclusion

Google's Veo models on Vertex AI represent a significant advancement in generative AI, particularly for video creation. By providing intuitive API access through both the Gen AI SDK for Python and direct REST endpoints, Google empowers developers and creators to integrate powerful text-to-video and image-to-video capabilities into their workflows and applications.
