For years, creating high-quality video content has been a complex, time-consuming, and often expensive endeavor, requiring specialized skills in cinematography, editing, sound design, and animation. Generative AI, particularly in video, is set to lower these barriers significantly. Imagine generating compelling b-roll footage, crafting dynamic social media animations, or even producing short cinematic sequences, all from textual descriptions or still images. This is the promise of models like Veo 3.
Google has been a significant contributor to AI research and development, and its commitment to generative media is evident in the continuous evolution of models available through Vertex AI. Vertex AI serves as a unified machine learning platform, providing access to Google's cutting-edge AI models, including those from DeepMind, and enabling users to build, deploy, and scale ML applications with ease. The introduction of Veo 3, Imagen 4, and Lyria 2 further solidifies Vertex AI as a powerhouse for creative AI.
Want an integrated, All-in-One platform for your Developer Team to work together with maximum productivity?
Apidog delivers all your demands, and replaces Postman at a much more affordable price!
Introducing Veo 3: The Next Leap in AI Video Generation
Prompt: A medium shot, historical adventure setting: Warm lamplight illuminates a cartographer in a cluttered study, poring over an ancient, sprawling map spread across a large table. Cartographer: "According to this old sea chart, the lost island isn't myth! We must prepare an expedition immediately!"
Veo 3, developed by Google DeepMind, represents the latest advancement in Google's video generation technology. It aims to provide users with the ability to generate high-quality videos that are not only visually impressive but also rich in auditory detail. Key enhancements and features announced for Veo 3 include:
- Improved Video Quality: Veo 3 is engineered to produce videos of superior quality when generated from both text and image prompts. This means more realistic textures, better motion coherence, and more faithful adherence to complex prompt details. The model is capable of handling intricate prompt details, translating nuanced textual descriptions into compelling visual narratives.
- Integrated Speech Generation: A significant step forward is Veo 3's ability to incorporate speech, such as dialogue and voice-overs, directly into the generated videos. This feature opens up vast possibilities for storytelling, marketing content, and educational materials, allowing creators to add another layer of narrative depth without needing separate audio production workflows for basic speech.
- Comprehensive Audio Integration: Beyond speech, Veo 3 can generate other audio elements, including music and sound effects. This means the model doesn't just create silent movies; it can produce videos with a more complete soundscape, enhancing the viewing experience and aligning the audio with the visual mood and events depicted.
The potential impact of these features is already being recognized by early adopters. Klarna, a leader in digital payments, has been leveraging Veo (and Imagen) on Vertex AI to boost content creation efficiency. They've noted significant reductions in production timelines for assets ranging from b-roll to YouTube bumpers. Justin Thomas, Head of Digital Experience & Growth at Klarna, remarked on the transformation: "With Veo and Imagen, we’ve transformed what used to be time-intensive production processes into quick, efficient tasks that allow us to scale content creation rapidly... What once took us eight weeks is now only taking eight hours, resulting in substantial cost savings.”
How to Use Google Veo API with Vertex AI

Google's Veo models are accessible on Vertex AI, allowing you to generate videos from text or image prompts. You can interact with Veo through the Google Cloud console or by making requests to the Vertex AI API. This guide focuses on using the API, with examples primarily using the Gen AI SDK for Python and REST calls.

Prerequisites for Using Veo on Vertex AI
Before you can start generating videos with Veo, ensure you have the following set up:
- Google Cloud Account and Project:
- You'll need a Google Cloud account. New accounts often come with free credits.
- Within the Google Cloud console, select an existing Google Cloud project or create a new one. If you're experimenting, creating a new project can make cleanup easier by allowing you to delete the project and all its associated resources afterward.
- Enable Vertex AI API:
- Navigate to the project selector page in the Google Cloud console.
- Ensure the Vertex AI API is enabled for your project.
- Authentication:
- You need to set up authentication for your environment.
- For REST API (local development): If you plan to use the REST API samples locally, the credentials you provide to the Google Cloud CLI (gcloud CLI) are used. Install the gcloud CLI and initialize it by running:
gcloud init
If you're using an external identity provider (IdP), sign in to the gcloud CLI with your federated identity first.
- For Python SDK: The Gen AI SDK typically uses Application Default Credentials (ADC). Setting the `GOOGLE_CLOUD_PROJECT` environment variable and ensuring `GOOGLE_GENAI_USE_VERTEXAI=True` (as shown in later examples) helps configure the SDK to work with Vertex AI, leveraging your authenticated gcloud environment or service account credentials if configured.
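If you are running the Python samples locally, one common way to establish Application Default Credentials is with the gcloud CLI. This is a setup sketch; replace `YOUR_PROJECT_ID` with your own project ID:

```shell
# Create Application Default Credentials for local development
gcloud auth application-default login

# Tell the Gen AI SDK which project to use and to target Vertex AI
export GOOGLE_CLOUD_PROJECT=YOUR_PROJECT_ID
export GOOGLE_GENAI_USE_VERTEXAI=True
```

If you are running in a Google Cloud environment (e.g., Compute Engine or Cloud Run), the attached service account's credentials are typically picked up automatically instead.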
Accessing Veo Models and Locations
- Model Versions: Veo offers multiple video generation models. The documentation provides examples using `veo-2.0-generate-001` and mentions `veo-3.0-generate-preview` (currently in Preview). Always refer to the official "Veo models" documentation for the most current list and their capabilities.
- Locations: When making requests, you can specify a region (location) to control where your data is stored at rest. For a list of available regions, consult the "Generative AI on Vertex AI locations" documentation. The Python SDK examples often use environment variables to set the location.
Using the Veo API with the Python SDK (Gen AI SDK)
The Gen AI SDK for Python provides a convenient way to interact with Veo models on Vertex AI.
Installation
Install or upgrade the `google-genai` library:
pip install --upgrade google-genai
Environment Variable Setup
Set the following environment variables. Replace `YOUR_PROJECT_ID` and `YOUR_LOCATION` with your project ID and desired Google Cloud location (e.g., `global` or a specific region like `us-central1`).
export GOOGLE_CLOUD_PROJECT=YOUR_PROJECT_ID
export GOOGLE_CLOUD_LOCATION=YOUR_LOCATION
export GOOGLE_GENAI_USE_VERTEXAI=True
Initializing the Client
from google import genai
client = genai.Client()
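If you prefer not to rely on environment variables, the client can also be pointed at Vertex AI explicitly. This is a configuration sketch with placeholder project and location values:

```python
from google import genai

# Explicitly target Vertex AI instead of reading GOOGLE_GENAI_USE_VERTEXAI,
# GOOGLE_CLOUD_PROJECT, and GOOGLE_CLOUD_LOCATION from the environment.
# "your-project-id" and "us-central1" are placeholder values.
client = genai.Client(
    vertexai=True,
    project="your-project-id",
    location="us-central1",
)
```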
Generating Video from Text
You can generate videos using a descriptive text prompt. The output is a long-running operation, and the generated video is typically saved to a Google Cloud Storage (GCS) URI you specify.
import time

from google import genai
from google.genai.types import GenerateVideosConfig

client = genai.Client()

# !!! IMPORTANT: Update and uncomment the GCS URI for output !!!
# output_gcs_uri = "gs://your-bucket-name/your-output-prefix/"
# Ensure this bucket exists and your project/service account has write permissions.

try:
    operation = client.models.generate_videos(
        model="veo-2.0-generate-001",  # Or other available Veo model
        prompt="a cat reading a book",
        config=GenerateVideosConfig(
            aspect_ratio="16:9",
            output_gcs_uri=output_gcs_uri,  # Specify your GCS path
        ),
    )
    print("Video generation operation started. Polling for completion...")
    while not operation.done:
        time.sleep(15)  # Wait for 15 seconds before checking status
        operation = client.operations.get(operation)  # Refresh operation status
        print(f"Operation status: {operation.metadata.state if operation.metadata else 'Processing...'}")

    if operation.response and operation.result.generated_videos:
        print(f"Video generated successfully: {operation.result.generated_videos[0].video.uri}")
    elif operation.error:
        print(f"Error during video generation: {operation.error.message}")
    else:
        print("Operation finished but no video URI found or an unknown error occurred.")
except NameError:
    print("Error: 'output_gcs_uri' is not defined. Please set the 'output_gcs_uri' variable.")
except Exception as e:
    print(f"An unexpected error occurred: {e}")
Remember to replace `"gs://your-bucket-name/your-output-prefix/"` with your actual GCS bucket and desired output path.
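Once generation completes, the video lands in your bucket at the URI reported in the operation result. It can then be copied locally with the Cloud Storage CLI; the path below is a placeholder, and the actual object name comes from the URI in the response:

```shell
# Copy generated videos from your output prefix to the current directory
# (placeholder bucket and prefix; use the URI returned by the operation)
gcloud storage cp "gs://your-bucket-name/your-output-prefix/*.mp4" .
```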
Generating Video from an Image (and optional text)
You can also generate videos starting from an input image, optionally guided by a text prompt.
import time

from google import genai
from google.genai.types import GenerateVideosConfig, Image

client = genai.Client()

# !!! IMPORTANT: Update and uncomment the GCS URI for output !!!
# output_gcs_uri = "gs://your-bucket-name/your-output-prefix-image/"
# Ensure this bucket exists and your project/service account has write permissions.

# Example using a public GCS image. Replace with your image URI.
input_image_gcs_uri = "gs://cloud-samples-data/generative-ai/image/flowers.png"

try:
    operation = client.models.generate_videos(
        model="veo-2.0-generate-001",  # Or other available Veo model
        image=Image(
            gcs_uri=input_image_gcs_uri,
            mime_type="image/png",  # Adjust mime_type based on your image
        ),
        prompt="the flowers sway gently in the breeze",  # Optional text prompt
        config=GenerateVideosConfig(
            aspect_ratio="16:9",  # Or match to your image/desired output
            output_gcs_uri=output_gcs_uri,
        ),
    )
    print("Image-to-video generation operation started. Polling for completion...")
    while not operation.done:
        time.sleep(15)
        operation = client.operations.get(operation)
        print(f"Operation status: {operation.metadata.state if operation.metadata else 'Processing...'}")

    if operation.response and operation.result.generated_videos:
        print(f"Video generated successfully: {operation.result.generated_videos[0].video.uri}")
    elif operation.error:
        print(f"Error during video generation: {operation.error.message}")
    else:
        print("Operation finished but no video URI found or an unknown error occurred.")
except NameError:
    print("Error: 'output_gcs_uri' is not defined. Please set the 'output_gcs_uri' variable.")
except Exception as e:
    print(f"An unexpected error occurred: {e}")
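The `mime_type` you pass must match the input image. If image files are chosen dynamically, a small standard-library helper can guess the type from the file name. `guess_image_mime_type` here is an illustrative helper, not part of the SDK:

```python
import mimetypes

def guess_image_mime_type(uri: str, default: str = "image/png") -> str:
    """Guess an image MIME type (e.g., 'image/png', 'image/jpeg') from a
    file name or GCS URI, falling back to a default for non-image types."""
    mime, _ = mimetypes.guess_type(uri)
    return mime if mime and mime.startswith("image/") else default

print(guess_image_mime_type("gs://cloud-samples-data/generative-ai/image/flowers.png"))  # image/png
print(guess_image_mime_type("photos/cat.jpg"))  # image/jpeg
```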
Using the Veo API with REST
You can directly call the Veo API using HTTP requests. This involves sending a POST request to a specific endpoint.
Endpoint and HTTP Method
POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID:predictLongRunning
Replace `PROJECT_ID` and `MODEL_ID` (e.g., `veo-2.0-generate-001` or `veo-3.0-generate-preview`).
Request JSON Body
{
  "instances": [
    {
      "prompt": "TEXT_PROMPT"
      // For image input, the structure within "instances" will differ. Consult the API reference.
    }
  ],
  "parameters": {
    "storageUri": "OUTPUT_STORAGE_URI", // Optional: GCS URI for output, e.g., "gs://video-bucket/output/".
    // If not provided, video bytes might be returned in the operation response for some configurations (check docs).
    "sampleCount": RESPONSE_COUNT,      // Integer: number of videos to generate (e.g., 1-4).
    "durationSeconds": DURATION,        // Integer: desired video length in seconds (e.g., 5-8).
    "enhancePrompt": ENHANCE_PROMPT     // Boolean: true (default) or false.
    // Add other parameters like "aspectRatio", "fps" as per the API reference.
  }
}
Make sure to replace placeholders like `TEXT_PROMPT`, `OUTPUT_STORAGE_URI`, etc., with actual values.
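Because JSON allows neither comments nor unquoted placeholders, building the body programmatically avoids copy-paste errors. A minimal sketch using only the Python standard library (the helper name `build_veo_request` is illustrative; the field names follow the template above):

```python
import json

def build_veo_request(prompt: str,
                      storage_uri: str,
                      sample_count: int = 1,
                      duration_seconds: int = 8,
                      enhance_prompt: bool = True) -> str:
    """Build the JSON body for a Veo predictLongRunning request,
    with correctly typed parameter values."""
    body = {
        "instances": [{"prompt": prompt}],
        "parameters": {
            "storageUri": storage_uri,
            "sampleCount": sample_count,
            "durationSeconds": duration_seconds,
            "enhancePrompt": enhance_prompt,
        },
    }
    return json.dumps(body, indent=2)

# Write the body to request.json for use with the curl command below
with open("request.json", "w") as f:
    f.write(build_veo_request("a cat reading a book", "gs://video-bucket/output/"))
```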
Authentication and Sending the Request (Example with curl)
Save your request body in a file (e.g., `request.json`).
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://us-central1-aiplatform.googleapis.com/v1/projects/YOUR_PROJECT_ID/locations/us-central1/publishers/google/models/YOUR_MODEL_ID:predictLongRunning"
This command returns an operation name (e.g., `projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID/operations/OPERATION_ID`).
Handling Long-Running Operations
The `predictLongRunning` endpoint initiates an asynchronous operation. You'll need to use the returned operation name to poll its status until completion, similar to how the Python SDK handles it.
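Whichever transport you use, the polling loop has the same shape. A transport-agnostic sketch, where `poll_operation` is an illustrative name and the `fetch_status` callable stands in for whatever authenticated request refreshes the operation's status:

```python
import time
from typing import Callable, Dict

def poll_operation(fetch_status: Callable[[], Dict],
                   interval_seconds: float = 15.0,
                   timeout_seconds: float = 600.0) -> Dict:
    """Call fetch_status repeatedly until the operation reports done,
    sleeping between attempts, and give up after the timeout."""
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        status = fetch_status()
        if status.get("done"):
            return status
        time.sleep(interval_seconds)
    raise TimeoutError("Operation did not complete within the timeout.")
```

In practice, `fetch_status` would issue the authenticated HTTP request for the operation name returned by `predictLongRunning` and return the parsed JSON response.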
What are Veo 3 Prompts & How to Write Better Prompts for Veo 3

Google's Veo models generate videos based on your textual descriptions. More detailed prompts generally produce higher-quality, more relevant videos. Consider describing:
- Subjects and actions.
- Setting and environment.
- Cinematic styles, camera motions.
- Mood and tone.
For models supporting audio (like `veo-3.0-generate-preview`), you can include descriptions for dialogue (transcription) and sound effects.
- Prompt Rewriter (Prompt Enhancement): Veo includes an LLM-based prompt enhancement tool. This feature can rewrite your prompts to add more descriptive details, camera motions, transcriptions, and sound effects, aiming for higher quality video output.
- Enabled by Default: This feature is enabled by default for models like `veo-2.0-generate-001` and `veo-3.0-generate-preview`.
- Disabling: You can turn prompt enhancement off by setting the `enhancePrompt` parameter to `false` in your REST API call (or a similar parameter in the SDK, if available).
- Important for `veo-3.0-generate-preview`: You cannot disable the prompt rewriter when using the `veo-3.0-generate-preview` model.
- Rewritten Prompt in Response: If the original prompt is fewer than 30 words long, the rewritten prompt used by the model is delivered in the API response.
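To put the guidance above into practice, prompts can be assembled from the recommended components. A small illustrative sketch (`build_video_prompt` is a made-up helper, not part of any Google SDK):

```python
def build_video_prompt(subject: str,
                       setting: str = "",
                       camera: str = "",
                       mood: str = "",
                       audio: str = "") -> str:
    """Join prompt components (subject/action, setting, camera style,
    mood, and audio cues) into a single descriptive prompt."""
    parts = [subject, setting, camera, mood, audio]
    return ". ".join(p.strip().rstrip(".") for p in parts if p.strip()) + "."

prompt = build_video_prompt(
    subject="A cartographer pores over an ancient map in a cluttered study",
    setting="warm lamplight, historical adventure setting",
    camera="medium shot, slow push-in",
    mood="tense and wondrous",
    audio="Dialogue: 'The lost island isn't myth!'",
)
print(prompt)
```

Structuring prompts this way makes it easy to experiment with one component (say, camera motion) while holding the rest constant.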
Testing the Veo REST API with a Tool like APIDog
While `curl` is excellent for command-line testing, GUI-based API testing tools like APIDog, Postman, or Insomnia can offer a more visual and organized way to construct and manage your API requests, especially when dealing with complex JSON bodies or managing multiple API endpoints.
Conclusion
Google's Veo models on Vertex AI represent a significant advancement in generative AI, particularly for video creation. By providing intuitive API access through both the Gen AI SDK for Python and direct REST endpoints, Google empowers developers and creators to integrate powerful text-to-video and image-to-video capabilities into their workflows and applications.