Welcome to the definitive beginner's guide to the Vercel AI SDK. In a world where artificial intelligence is rapidly transforming the digital landscape, the ability to integrate AI into web applications has shifted from a niche specialization to a core competency for modern developers. This guide is designed to take you from a curious beginner to a capable AI application developer.
For a long time, bridging the gap between a powerful Large Language Model (LLM) and a user-friendly web interface was a complex endeavor. Developers had to wrestle with disparate provider APIs, manage intricate state, and manually implement features like response streaming. The Vercel AI SDK was created to solve these exact problems. It's a TypeScript-first toolkit that provides a unified, elegant abstraction layer over the complexities of building AI-powered experiences.
This is not just a quickstart. Over the course of this tutorial, we will build a complete, feature-rich AI chatbot from the ground up using Next.js and Google's Gemini model. We will go far beyond a simple "hello world" example. You will learn:
- The "Why": A deeper understanding of the core concepts and the architectural patterns of modern AI applications.
- The "How": A detailed, step-by-step process for setting up your project, writing the server-side logic, and building a polished, interactive frontend.
- Advanced Capabilities: How to empower your chatbot with "Tools" to access real-time information, and how to orchestrate complex, multi-step interactions.
- Production-Ready Practices: How to handle loading states, manage errors gracefully, and structure your code for a real-world application.
By the end of this comprehensive guide, you will have not only a working, advanced chatbot but also the deep conceptual knowledge needed to confidently build your own unique and powerful AI-powered applications with the Vercel AI SDK.
Chapter 1: Foundations and Setup
Every great structure needs a solid foundation. In this chapter, we'll set up our development environment, install the necessary tools, and get our API keys in order. We'll also take a moment to understand the "why" behind each choice we make.
Prerequisites
Before we write a single line of code, let's ensure your toolbox is ready.
- Node.js (version 18 or newer): The Vercel AI SDK and modern JavaScript frameworks like Next.js rely on features available in recent versions of Node.js. You can verify your version by running `node -v` in your terminal. If you don't have it, you can download it from the official Node.js website.
- A Google AI API Key: This key is your authenticated pass to use Google's powerful Gemini family of models. The Vercel AI SDK is provider-agnostic, but for this guide, we'll focus on Gemini.
- Navigate to Google AI Studio.
- Sign in with your Google account.
- Click "Get API key" and then "Create API key in new project."
- Copy the generated key and store it somewhere secure for now. Treat this key like a password; never expose it publicly.
Step 1: Initializing the Next.js Project
We'll use Next.js, the premier React framework for building production-grade applications. Its App Router paradigm integrates perfectly with the server-centric nature of AI applications.
Open your terminal and execute this command to create a new project:
npx create-next-app@latest vercel-ai-tutorial
The installer will prompt you with several questions. Use these settings to follow along seamlessly:
- Would you like to use TypeScript? Yes (TypeScript is crucial for type-safe AI interactions)
- Would you like to use ESLint? Yes (For code quality)
- Would you like to use Tailwind CSS? Yes (For rapidly styling our UI)
- Would you like to use `src/` directory? Yes (A common convention for organizing code)
- Would you like to use App Router? Yes (This is essential for this guide)
- Would you like to customize the default import alias? No (Defaults are fine)
Once the installation is complete, navigate into your newly created project directory:
cd vercel-ai-tutorial
Step 2: Installing the Vercel AI SDK
Now, let's add the AI SDK packages to our project.
npm install ai @ai-sdk/react @ai-sdk/google zod
Let's break down what each of these packages does:
- `ai`: This is the heart of the SDK. It contains the core, framework-agnostic functions like `streamText` and `generateObject` that handle the direct communication with LLM providers.
- `@ai-sdk/react`: This package provides the React hooks—specifically `useChat`—that make building interactive UIs a breeze. It abstracts away the complexities of state management, streaming, and API communication.
- `@ai-sdk/google`: This is a provider package. It's the specific adapter that allows the core `ai` package to communicate with Google's AI models. If you wanted to use OpenAI, you'd install `@ai-sdk/openai` instead.
- `zod`: A powerful schema declaration and validation library. While not strictly part of the AI SDK, it's an indispensable partner for defining the structure of data for advanced features like Tool Calling, ensuring the AI's output is predictable and type-safe (see the quick sketch after this list).
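If `zod` is new to you, here is a minimal sketch of how it validates data at runtime. The schema and values are purely illustrative:

```typescript
import { z } from 'zod';

// Define a schema describing the shape we expect
const weatherQuerySchema = z.object({
  location: z.string().describe('The city and state, e.g., San Francisco, CA'),
});

// parse() throws if the data doesn't match; safeParse() returns a result object instead
const valid = weatherQuerySchema.parse({ location: 'London, UK' }); // OK
const invalid = weatherQuerySchema.safeParse({ location: 42 });     // { success: false, ... }
```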
Step 3: Securing Your API Key
Never hardcode an API key in your application code. It's a major security risk. The professional standard is to use environment variables. Next.js has built-in support for this with `.env.local` files.
Create the file in your project's root:
touch .env.local
Now, open this new file and add your Google AI key:
# .env.local
# This file is for local development and should NOT be committed to git.
GOOGLE_GENERATIVE_AI_API_KEY=YOUR_GOOGLE_AI_API_KEY
Replace `YOUR_GOOGLE_AI_API_KEY` with the key you copied earlier. Next.js automatically loads this file and makes the key available on the server, which is exactly where we need it.
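By default, the `@ai-sdk/google` provider reads `GOOGLE_GENERATIVE_AI_API_KEY` from the environment automatically, so no extra code is needed. If you ever want to pass the key explicitly (for example, to read it from a differently named variable), a minimal sketch looks like this:

```typescript
// Only needed if you don't want to rely on the default environment variable
import { createGoogleGenerativeAI } from '@ai-sdk/google';

const google = createGoogleGenerativeAI({
  // Any variable name works here; the default lookup uses GOOGLE_GENERATIVE_AI_API_KEY
  apiKey: process.env.GOOGLE_GENERATIVE_AI_API_KEY,
});
```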
Chapter 2: Building the Chatbot's Backbone
With our project set up, it's time to build the core components of our application: the server-side API endpoint that talks to the AI, and the client-side UI that users will interact with.
The Client-Server Architecture of an AI App
Our chatbot will have two main parts:
- A Server-Side API Route (`/api/chat/route.ts`): This is a secure environment that runs on a server. Its primary job is to receive the chat history from the user's browser, add our secret API key, forward the request to the Google AI service, and then stream the response back to the user. Keeping this logic on the server is critical for security—it ensures our API key is never exposed to the public.
- A Client-Side UI (`page.tsx`): This is the React component that runs in the user's browser. It's responsible for rendering the chat history, capturing user input, and sending that input to our API route.
This separation is fundamental to building secure and performant web applications.
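To make the contract between the two parts concrete, here is a simplified sketch of the JSON payload the client will POST to our API route. The real message objects carry a few extra fields (such as `id`), but this is the essential shape:

```typescript
// Simplified shape of the request body the client sends
type ChatRequestBody = {
  messages: Array<{
    role: 'user' | 'assistant' | 'system';
    content: string;
  }>;
};

// The full history is resent on every turn so the model has conversational context
const exampleBody: ChatRequestBody = {
  messages: [
    { role: 'user', content: 'What is the Vercel AI SDK?' },
    { role: 'assistant', content: 'A TypeScript toolkit for building AI apps.' },
    { role: 'user', content: 'How do I install it?' },
  ],
};
```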
Step 4: Creating the API Route Handler
Let's create the server-side endpoint. In your `src/app` directory, create a new folder `api`, and inside that, another folder `chat`. Finally, create a file named `route.ts` inside the `chat` folder.
The final path should be `src/app/api/chat/route.ts`.
Populate this file with the following code:
// src/app/api/chat/route.ts
import { google } from '@ai-sdk/google';
import { streamText } from 'ai';
// Vercel-specific configuration to allow streaming responses for up to 30 seconds
export const maxDuration = 30;
// The main API route handler
export async function POST(req: Request) {
try {
// Extract the `messages` array from the request body
const { messages } = await req.json();
// Call the AI provider with the conversation history
const result = await streamText({
model: google('models/gemini-1.5-pro-latest'),
// The `messages` array provides the model with context for the conversation
messages,
});
// Respond with a streaming response
return result.toDataStreamResponse();
} catch (error) {
// It's a good practice to handle potential errors
if (error instanceof Error) {
return new Response(JSON.stringify({ error: error.message }), { status: 500 });
}
return new Response(JSON.stringify({ error: 'An unknown error occurred' }), { status: 500 });
}
}
Let's dissect this crucial file:
- `export const maxDuration = 30;`: This is a Vercel-specific setting. Serverless functions have a default timeout. Since AI responses can sometimes take a moment to start generating, we're extending the timeout to 30 seconds to prevent the request from being terminated prematurely.
- `export async function POST(req: Request)`: In the Next.js App Router, exporting an async function named after an HTTP method (like `POST`) in a `route.ts` file creates an API endpoint.
- `const { messages } = await req.json();`: The frontend will send a JSON object in its request, and we're destructuring the `messages` array from it. This array is the complete history of the conversation, which is essential for the LLM to provide a contextually aware response.
- `const result = await streamText(...)`: This is the core call to the Vercel AI SDK. We provide it with the `model` we want to use and the `messages` history. The SDK handles the authenticated request to the Google API in the background (a sketch of two useful optional parameters follows this list).
- `return result.toDataStreamResponse();`: This is a powerful helper function. It takes the `ReadableStream` returned by `streamText` and wraps it in a `Response` object with the correct headers and format, making it incredibly easy for our client-side hooks to consume the stream.
- `try...catch`: We've wrapped our logic in a `try...catch` block to gracefully handle any potential errors during the API call, returning a clear error message to the client.
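Two optional `streamText` parameters are worth knowing about early. Here is a minimal sketch of the same call inside our handler, extended with a system prompt and a completion callback (the system prompt text is just an example):

```typescript
const result = await streamText({
  model: google('models/gemini-1.5-pro-latest'),
  // Steers the model's tone and behavior across the whole conversation
  system: 'You are a concise, friendly assistant.',
  messages,
  // Runs on the server once the full response has finished streaming --
  // handy for logging, analytics, or persisting the conversation
  onFinish: ({ text, usage }) => {
    console.log(`Streamed ${text.length} characters; token usage:`, usage);
  },
});
```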
Step 5: Crafting the User Interface
Now for the fun part: building the UI. Thanks to the `@ai-sdk/react` package, this is surprisingly simple. Open the main page file at `src/app/page.tsx` and replace its entire content with the following:
// src/app/page.tsx
'use client';
import { useChat } from '@ai-sdk/react';
import { useRef, useEffect } from 'react';
export default function Chat() {
const { messages, input, handleInputChange, handleSubmit, isLoading, error } = useChat();
// A ref to the scrollable container of messages
const messagesContainerRef = useRef<HTMLDivElement>(null);
// Effect to scroll to the bottom of the messages container whenever messages change
useEffect(() => {
if (messagesContainerRef.current) {
messagesContainerRef.current.scrollTop = messagesContainerRef.current.scrollHeight;
}
}, [messages]);
return (
<div className="flex flex-col h-screen bg-gray-50">
{/* Messages container */}
<div ref={messagesContainerRef} className="flex-1 overflow-y-auto p-8 space-y-4">
{messages.map(m => (
<div
key={m.id}
className={`flex gap-3 ${m.role === 'user' ? 'justify-end' : 'justify-start'}`}
>
{/* Display user's avatar */}
{m.role === 'user' && (
<div className="w-10 h-10 rounded-full bg-blue-500 flex items-center justify-center text-white font-bold">U</div>
)}
{/* Message bubble */}
<div
className={`max-w-xl p-3 rounded-2xl shadow-md whitespace-pre-wrap ${
m.role === 'user'
? 'bg-blue-500 text-white rounded-br-none'
: 'bg-white text-black rounded-bl-none'
}`}
>
<span className="font-bold block">{m.role === 'user' ? 'You' : 'AI Assistant'}</span>
{m.content}
</div>
{/* Display AI's avatar */}
{m.role !== 'user' && (
<div className="w-10 h-10 rounded-full bg-gray-700 flex items-center justify-center text-white font-bold">AI</div>
)}
</div>
))}
</div>
{/* Input form */}
<div className="p-4 bg-white border-t">
<form onSubmit={handleSubmit} className="flex items-center gap-4 max-w-4xl mx-auto">
<input
className="flex-1 p-3 border rounded-full focus:outline-none focus:ring-2 focus:ring-blue-500"
value={input}
placeholder="Ask me anything..."
onChange={handleInputChange}
disabled={isLoading}
/>
<button
type="submit"
className="px-6 py-3 bg-blue-500 text-white rounded-full font-semibold hover:bg-blue-600 disabled:bg-blue-300 disabled:cursor-not-allowed"
disabled={isLoading}
>
Send
</button>
</form>
{error && (
<p className="text-red-500 mt-2 text-center">{error.message}</p>
)}
</div>
</div>
);
}
This is a significant amount of code, but most of it is for creating a polished UI with Tailwind CSS. Let's focus on the logic:
- `'use client';`: This is essential. It marks this component as a Client Component, meaning it will execute in the browser and can use state and effects.
- `const { ... } = useChat();`: This one line is the magic of the AI SDK UI library. It provides all the state and functionality we need (plus a few extra helpers—see the sketch after this list):
  - `messages`: The array of chat messages, automatically kept in sync.
  - `input`, `handleInputChange`, `handleSubmit`: The state and handlers for our input form. `handleSubmit` automatically packages the messages and calls our `/api/chat` endpoint.
  - `isLoading`: A boolean that is `true` while the AI is generating a response. We use this to disable the form while waiting.
  - `error`: An error object that will be populated if our API call fails. We display this to the user.
- `useRef` and `useEffect`: This is a standard React pattern to make the chat view automatically scroll to the bottom as new messages are added, ensuring the latest message is always visible.
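`useChat` also returns helpers we haven't wired up yet, such as `stop` (abort the in-flight response) and `reload` (regenerate the last answer). A minimal sketch of a controls component, assuming you pass the same `id` ('main-chat' here is an arbitrary name) to every `useChat` call so the components share one conversation's state:

```tsx
'use client';

import { useChat } from '@ai-sdk/react';

// A minimal sketch: sharing the same `id` lets separate components
// read and control the same chat session.
export function ChatControls() {
  const { isLoading, stop, reload } = useChat({ id: 'main-chat' });

  return (
    <div className="flex gap-2">
      {/* stop() aborts the streaming response currently in flight */}
      <button type="button" onClick={stop} disabled={!isLoading}>
        Stop
      </button>
      {/* reload() re-requests the last assistant answer */}
      <button type="button" onClick={() => reload()} disabled={isLoading}>
        Regenerate
      </button>
    </div>
  );
}
```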
Step 6: Run Your Application
You have now built a complete, well-structured AI chatbot. Let's fire it up!
npm run dev
Navigate to `http://localhost:3000` in your browser. You should be greeted by a polished chat interface. Ask it a question. You will see your message appear instantly, and the AI's response will stream in token by token.
Chapter 3: Advanced Capabilities - Giving Your Chatbot Superpowers
Our chatbot is smart, but its knowledge is limited to its training data. It can't access live information or perform actions in the real world. In this chapter, we'll give it "Tools" to overcome these limitations.
What are Tools?
A Tool is a function you define that the LLM can choose to execute. You describe the tool to the model, and when it thinks the tool is necessary to answer a user's query, it will pause its text generation and instead output a special "tool call" object. Your code then executes the function with the arguments provided by the model, and the result is sent back to the model. The model then uses this new information to generate its final, more accurate response.
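To make that round trip concrete, here is a simplified, illustrative sketch of the data that flows back and forth. The field names are paraphrased; the SDK handles the exact wire format for you:

```typescript
// 1. The model decides it cannot answer from memory and emits a tool call:
const toolCall = {
  toolName: 'getWeather',
  args: { location: 'London, UK' },
};

// 2. Our server runs the matching execute() function and captures its result:
const toolResult = {
  toolName: 'getWeather',
  result: { temperature: 54, conditions: 'Cloudy' },
};

// 3. The result is appended to the conversation, and the model generates
//    its final natural-language answer using this fresh information.
```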
Let's empower our chatbot with two tools:
- A tool to get the current weather for a location.
- A tool to convert temperatures from Fahrenheit to Celsius.
This will allow our bot to answer questions like, "What's the weather in London in Celsius?"—a task requiring multiple steps and external data.
Step 7: Upgrading the API to Support Tools
We need to define our tools in the `streamText` call on the server. Open `src/app/api/chat/route.ts` and modify it to include the new `tools` definition.
// src/app/api/chat/route.ts
import { google } from '@ai-sdk/google';
import { streamText, tool } from 'ai';
import { z } from 'zod';
export const maxDuration = 30;
export async function POST(req: Request) {
const { messages } = await req.json();
const result = await streamText({
model: google('models/gemini-1.5-pro-latest'),
messages,
// Define the tools the model can use
tools: {
getWeather: tool({
description: 'Get the current weather for a specific location. Always returns temperature in Fahrenheit.',
parameters: z.object({
location: z.string().describe('The city and state, e.g., San Francisco, CA'),
}),
execute: async ({ location }) => {
// In a real app, you would fetch from a real weather API
console.log(`Fetching weather for ${location}`);
return {
temperature: Math.floor(Math.random() * (100 - 30 + 1) + 30),
high: Math.floor(Math.random() * (100 - 80 + 1) + 80),
low: Math.floor(Math.random() * (50 - 30 + 1) + 30),
conditions: ['Sunny', 'Cloudy', 'Rainy'][Math.floor(Math.random() * 3)],
};
},
}),
convertFahrenheitToCelsius: tool({
description: 'Convert a temperature from Fahrenheit to Celsius.',
parameters: z.object({
temperature: z.number().describe('The temperature in Fahrenheit'),
}),
execute: async ({ temperature }) => {
console.log(`Converting ${temperature}°F to Celsius`);
return {
celsius: Math.round((temperature - 32) * (5 / 9)),
};
},
}),
},
});
return result.toDataStreamResponse();
}
Let's analyze the `tools` object:
- Each key (`getWeather`, `convertFahrenheitToCelsius`) is the name of our tool.
- `description`: This is the most important part for the model. It reads this description to understand what the tool does and when it should be used. Be clear and specific.
- `parameters`: We use `zod` to define the function's signature. This tells the model exactly what arguments it needs to provide. `z.string().describe(...)` gives the model a hint about the expected format.
- `execute`: This is the actual server-side function that runs when the tool is called. Here, we simulate API calls with random data, but you could easily replace this with a `fetch` call to a real weather service (see the sketch after this list).
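Here is a hedged sketch of what a real implementation might look like. The endpoint URL, auth header, and response fields (`temp_f`, `condition`) are hypothetical placeholders; adapt them to whichever weather API you actually use:

```typescript
// Hypothetical drop-in for the getWeather tool's execute function
async function fetchWeather({ location }: { location: string }) {
  const response = await fetch(
    `https://api.example-weather.com/current?q=${encodeURIComponent(location)}`,
    { headers: { Authorization: `Bearer ${process.env.WEATHER_API_KEY}` } },
  );
  if (!response.ok) {
    // Returning an error object lets the model explain the failure to the user
    return { error: `Weather lookup failed with status ${response.status}` };
  }
  const data = await response.json();
  return { temperature: data.temp_f, conditions: data.condition };
}

// Then, inside the tool definition: execute: fetchWeather
```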
Step 8: Enabling Multi-Step Tool Calls in the UI
Just defining the tools on the server isn't enough. By default, when the model makes a tool call, the conversation stops. We need to tell our `useChat` hook to allow additional round trips, so the result of each tool call is automatically sent back to the model and it can continue its reasoning and formulate a final answer.
This is incredibly simple. In `src/app/page.tsx`, update the `useChat` hook initialization:
// src/app/page.tsx
// ...
export default function Chat() {
const { messages, input, handleInputChange, handleSubmit, isLoading, error } = useChat({
// Allow multi-step tool use: tool results are sent back to the model automatically
maxSteps: 5,
});
// ... rest of the component
}
That's it. The `maxSteps` option (called `maxToolRoundtrips` in older AI SDK 3.x releases) activates the multi-step tool-use flow, letting the hook automatically send tool results back to the model until it produces a final answer.
Step 9: A Better UI for Tool Invocations
Our current UI only displays `m.content`. When a tool is called, the interesting information is in a different property on the message object. Let's add dedicated rendering so tool calls show up nicely.
Update the main message loop in `src/app/page.tsx` to render these invocations.
// src/app/page.tsx
// ... inside the Chat component's return statement
<div ref={messagesContainerRef} className="flex-1 overflow-y-auto p-8 space-y-4">
{messages.map(m => (
<div
key={m.id}
className={`flex gap-3 ${m.role === 'user' ? 'justify-end' : 'justify-start'}`}
>
{/* ... avatars ... */}
<div
className={`max-w-xl p-3 rounded-2xl shadow-md whitespace-pre-wrap ${
m.role === 'user'
? 'bg-blue-500 text-white rounded-br-none'
: 'bg-white text-black rounded-bl-none'
}`}
>
<span className="font-bold block">{m.role === 'user' ? 'You' : 'AI Assistant'}</span>
{/* Render tool invocations */}
{m.toolInvocations?.map(tool => (
<div key={tool.toolCallId} className="my-2 p-2 bg-gray-100 rounded text-sm text-gray-700">
<p className="font-semibold">Tool Call: `{tool.toolName}`</p>
<pre className="mt-1 p-1 bg-gray-200 rounded text-xs">
{JSON.stringify(tool.args, null, 2)}
</pre>
</div>
))}
{m.content}
</div>
{/* ... avatars ... */}
</div>
))}
{isLoading && messages[messages.length - 1]?.role === 'assistant' && (
<div className="flex justify-start p-8 space-x-3">
<div className="w-10 h-10 rounded-full bg-gray-700 flex items-center justify-center text-white font-bold">AI</div>
<div className="p-3 rounded-2xl shadow-md bg-white">
<div className="typing-indicator">
<span></span><span></span><span></span>
</div>
</div>
</div>
)}
</div>
// ...
I've also added a simple typing indicator that appears while the assistant is thinking. You'll need to add a bit of CSS for it. In your `src/app/globals.css` file, add:
/* src/app/globals.css */
.typing-indicator span {
height: 8px;
width: 8px;
background-color: #9E9EA1;
border-radius: 50%;
display: inline-block;
animation: typing-bounce 1.2s infinite ease-in-out;
}
.typing-indicator span:nth-child(1) { animation-delay: -0.4s; }
.typing-indicator span:nth-child(2) { animation-delay: -0.2s; }
@keyframes typing-bounce {
0%, 60%, 100% { transform: scale(0.2); }
30% { transform: scale(1); }
}
Now, run the application again. Ask it, "What is the weather in New York in Celsius?" You will see a fascinating chain of events unfold in your UI:
- The model will first call the `getWeather` tool. You'll see the rendered tool call in the UI.
- The result (a random temperature in Fahrenheit) is sent back to the model.
- The model, knowing it needs Celsius, will then call the `convertFahrenheitToCelsius` tool, using the temperature from the first tool's result as its input.
- Finally, with the Celsius temperature in hand, it will generate a natural language response answering your original question.
This is the power of building AI Agents, and the Vercel AI SDK makes this complex orchestration remarkably straightforward.
Chapter 4: Where to Go From Here?
You have successfully built an advanced, AI-powered chatbot. You've gone from a blank canvas to a feature-rich application that can stream responses, handle loading and error states, and leverage tools to interact with external data in a multi-step fashion.
This guide has given you a strong foundation, but it's just the beginning. The Vercel AI SDK has even more to offer. Here are some paths for your continued exploration:
- Generative UI: We've only streamed text and data. With React Server Components, the AI SDK allows you to have the AI generate and stream fully-formed, interactive React components. Imagine asking for the weather and getting back a beautiful, interactive weather widget instead of just text. This is a cutting-edge feature with enormous potential.
- Retrieval-Augmented Generation (RAG): Build chatbots that can reason over your own private documents. You can create a chatbot that answers questions about a PDF, a set of Markdown files, or your company's internal knowledge base.
- Explore Other Providers: The architecture we've built is highly modular. Try swapping Google's model for one from OpenAI or Anthropic. It's often as simple as changing one line of code in your API route (see the sketch after this list), allowing you to experiment and find the best model for your specific use case.
- Vercel AI Playground: The AI Playground is an invaluable tool for testing prompts and comparing the output, performance, and cost of different models side-by-side.
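For instance, here is a minimal sketch of the provider swap inside our route handler, assuming you've run `npm install @ai-sdk/openai` and added `OPENAI_API_KEY` to `.env.local`:

```typescript
// src/app/api/chat/route.ts (excerpt)
import { openai } from '@ai-sdk/openai'; // was: import { google } from '@ai-sdk/google';
import { streamText } from 'ai';

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = await streamText({
    model: openai('gpt-4o'), // was: google('models/gemini-1.5-pro-latest')
    messages,
  });

  return result.toDataStreamResponse();
}
```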
The future of web development is intelligent, interactive, and personalized. With the Vercel AI SDK, you now possess the tools and the knowledge to be at the forefront of this revolution. Happy building!