Vercel AI SDK with Next.js

The Vercel AI SDK is useful because it removes a lot of streaming glue. The mistake is treating that as the whole app.
The small setup I like is: one server route that owns model calls, one client component that owns chat state, explicit environment variables, and enough error handling that a failed stream does not look like a frozen UI.
Install the pieces
Start with a Next.js App Router project:

pnpm create next-app@latest my-ai-app --typescript --app
cd my-ai-app

Install the SDK and the provider package you plan to use:

pnpm add ai @ai-sdk/openai

Add the key locally:

OPENAI_API_KEY="sk-your-key"

Keep it in .env.local and configure the same variable in Vercel for preview and production.
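A missing key tends to show up as an opaque provider error on the first request. One way to fail earlier is a small guard like the sketch below. `requireEnv` is a hypothetical helper, not part of the AI SDK or Next.js, and it takes the environment as an argument so it can be checked without touching real settings.

```typescript
// Hypothetical helper: not part of the AI SDK or Next.js.
type Env = Record<string, string | undefined>

// Returns the trimmed value, or throws with the variable name if it is unset or blank.
export function requireEnv(name: string, env: Env): string {
  const value = env[name]?.trim()
  if (!value) {
    throw new Error(`Missing environment variable: ${name}`)
  }
  return value
}
```

In a route file this would be called as `requireEnv('OPENAI_API_KEY', process.env)` before the first model call.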
Server route
Put the model call behind a route. That keeps provider secrets on the server and gives the client a stable API.

// app/api/chat/route.ts
import { openai } from '@ai-sdk/openai'
import { streamText } from 'ai'

export const runtime = 'edge'

export async function POST(req: Request) {
  const { messages } = await req.json()

  const result = await streamText({
    model: openai('gpt-4o-mini'),
    messages,
    system: 'You are a concise technical assistant.',
  })

  return result.toDataStreamResponse()
}

The important part is toDataStreamResponse(). It gives the client an incremental stream instead of forcing the UI to wait for the whole answer.
Client component
The useChat hook handles optimistic messages, input state, stream parsing, and
basic errors:

'use client'

import { useChat } from 'ai/react'

export default function ChatPage() {
  const { messages, input, handleInputChange, handleSubmit, isLoading, error } =
    useChat({ api: '/api/chat' })

  return (
    <main className="mx-auto flex h-screen max-w-3xl flex-col gap-4 p-6">
      <div className="flex-1 space-y-3 overflow-y-auto">
        {messages.map((message) => (
          <article key={message.id}>
            <p className="text-xs uppercase text-zinc-500">{message.role}</p>
            <p className="whitespace-pre-wrap text-sm">{message.content}</p>
          </article>
        ))}
        {isLoading && <p className="text-sm text-zinc-500">Thinking...</p>}
        {error && <p className="text-sm text-red-400">{error.message}</p>}
      </div>
      <form onSubmit={handleSubmit} className="flex gap-2">
        <input
          value={input}
          onChange={handleInputChange}
          className="flex-1 rounded-md border border-zinc-800 bg-zinc-950 px-3 py-2"
          placeholder="Ask a question"
        />
        <button
          type="submit"
          disabled={isLoading}
          className="rounded-md bg-white px-4 py-2 text-sm font-medium text-black disabled:opacity-50"
        >
          Send
        </button>
      </form>
    </main>
  )
}

This is intentionally plain. Once the stream is reliable, you can add markdown, message persistence, tool traces, eval links, or whatever the product actually needs.
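One small step past plain is telling the user whether a reply is still coming, never started, or was cut off. The sketch below is one way to derive that from values the component already has. `describeChatState` and its wording are hypothetical, and how you trigger the retry depends on what your SDK version exposes.

```typescript
// Hypothetical helper; the inputs mirror values the chat component already holds.
type ChatNotice = { text: string; canRetry: boolean }

export function describeChatState(
  isLoading: boolean,
  error: Error | undefined,
  lastRole: string | undefined,
): ChatNotice | null {
  if (error) {
    // If the last message is from the assistant, a partial answer is on screen.
    const partial = lastRole === 'assistant'
    return {
      text: partial
        ? 'The reply was cut off. You can try again.'
        : 'The request failed before a reply started.',
      canRetry: true,
    }
  }
  if (isLoading) {
    return { text: 'Thinking...', canRetry: false }
  }
  // Idle with no error: nothing to show.
  return null
}
```

In the component above it could be called as `describeChatState(isLoading, error, messages.at(-1)?.role)`, with `canRetry` deciding whether to render a retry button.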
Structured output
For structured data, use schema validation instead of parsing prose:

import { openai } from '@ai-sdk/openai'
import { streamObject } from 'ai'
import { z } from 'zod'

const WeatherSchema = z.object({
  summary: z.string(),
  temperatureC: z.number(),
})

export async function POST() {
  const result = await streamObject({
    model: openai('gpt-4o-mini'),
    schema: WeatherSchema,
    prompt: 'Return the current weather summary for Paris.',
  })

  // Streams the object as partial JSON text while it is generated.
  return result.toTextStreamResponse()
}

The schema is not decoration. It is the boundary between "the model said something plausible" and "the app received a shape it can use."
Notes from building with it
- Keep provider keys server-side.
- Treat streaming errors as normal product states.
- Log model, latency, token usage, and tool calls early.
- Test slow streams and aborted requests.
- Do not add tools until the plain chat loop is stable.
The SDK is good glue. The product still needs boring engineering around it: validation, state, observability, retries, and a UI that explains what happened when the model or network fails.
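For the logging note, a minimal sketch of a per-request record is below. `startStreamTimer` and the `StreamLog` fields are hypothetical names, not SDK types. The clock is injectable so the latency math can be tested without waiting.

```typescript
// Hypothetical log shape; field names are assumptions, not SDK types.
type StreamLog = {
  model: string
  latencyMs: number
  finishReason: string
  totalTokens: number | null
}

// Call when the request starts; call the returned function when the stream ends.
export function startStreamTimer(model: string, now: () => number = Date.now) {
  const startedAt = now()
  return (finishReason: string, totalTokens: number | null = null): StreamLog => ({
    model,
    latencyMs: now() - startedAt,
    finishReason,
    totalTokens,
  })
}
```

The returned function can be called from wherever your route observes completion, for example a finish callback on the stream call if your SDK version provides one, and the result passed to whatever logger you already use.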
What I would not skip
Before calling this production-ready, I would add a few checks that most demos leave out:
- reject empty or oversized prompts before they reach the provider,
- cap response length and tool-call depth,
- persist messages after the stream completes, not before,
- show a retry affordance when the stream fails halfway through,
- log provider status, model name, latency, and finish reason,
- keep system prompts and tool descriptions versioned with the code.
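
The first check in that list can be sketched as a pure function the route calls before it touches the provider. The limits, the name `validateChatBody`, and the choice to reject client-supplied system messages are all assumptions for illustration. Tune them for your product.

```typescript
// Hypothetical limits and names; adjust for your product.
const MAX_MESSAGES = 50
const MAX_CHARS = 4_000

type ChatMessage = { role: 'user' | 'assistant'; content: string }
type Validation =
  | { ok: true; messages: ChatMessage[] }
  | { ok: false; status: number; error: string }

export function validateChatBody(body: unknown): Validation {
  const raw = (body as { messages?: unknown } | null)?.messages
  if (!Array.isArray(raw) || raw.length === 0) {
    return { ok: false, status: 400, error: 'messages must be a non-empty array' }
  }
  if (raw.length > MAX_MESSAGES) {
    return { ok: false, status: 413, error: 'too many messages' }
  }

  const messages: ChatMessage[] = []
  for (const m of raw) {
    // Only user and assistant turns are accepted; the system prompt stays server-side.
    const roleOk = m?.role === 'user' || m?.role === 'assistant'
    if (!roleOk || typeof m.content !== 'string') {
      return { ok: false, status: 400, error: 'invalid message' }
    }
    if (m.content.length > MAX_CHARS) {
      return { ok: false, status: 413, error: 'message too long' }
    }
    // Keep only the fields the model call needs.
    messages.push({ role: m.role, content: m.content })
  }

  const last = messages[messages.length - 1]
  if (last.role !== 'user' || last.content.trim() === '') {
    return { ok: false, status: 400, error: 'last message must be a non-empty user prompt' }
  }
  return { ok: true, messages }
}
```

In the route, a failed result maps directly to a response such as `new Response(result.error, { status: result.status })`, and only `result.messages` is passed on to the model call.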
The stream itself is not the product. The product is what happens around the stream when the model is slow, the browser reconnects, the provider rate-limits, or the user asks for something the app should not do. The AI SDK gives you a good transport. You still need the boring boundaries.