Vercel AI SDK with Next.js

The Vercel AI SDK is useful because it removes a lot of streaming glue. The mistake is treating that as the whole app.
The small setup I like is: one server route that owns model calls, one client component that owns chat state, explicit environment variables, and enough error handling that a failed stream does not look like a frozen UI.
That last part is where most demo code falls apart. The happy path streams a response. The product path has empty prompts, slow providers, aborted requests, half-written assistant messages, and users clicking submit twice.
Install the pieces
Start with a Next.js App Router project:

    pnpm create next-app@latest my-ai-app --typescript --app
    cd my-ai-app

Install the SDK and the provider package you plan to use:

    pnpm add ai @ai-sdk/openai

Add the key locally:

    OPENAI_API_KEY="sk-your-key"

Keep it in .env.local locally and configure the same variable in Vercel for preview and production.
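A missing key usually surfaces as a confusing provider error on the first request. A small startup check makes the failure explicit instead. This is a minimal sketch: requireEnv is a hypothetical helper of my own, not part of the AI SDK, and it takes the environment as a plain record so it can be tested without touching process.env.

```typescript
// Hypothetical helper (not an AI SDK API): read a required variable from an
// env-like record and fail with a clear message when it is missing or blank.
type Env = Record<string, string | undefined>

function requireEnv(env: Env, name: string): string {
  const value = env[name]?.trim()
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`)
  }
  return value
}

// In a server-only module this would be called as:
//   const apiKey = requireEnv(process.env, 'OPENAI_API_KEY')
```

The default openai provider reads OPENAI_API_KEY on its own, so the check is not required for the call to work. Its only job is to fail early, with a readable message, on the server.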
Server route
Put the model call behind a route. That keeps provider secrets on the server and gives the client a stable API.
The code here uses pre-5.0 AI SDK names (toDataStreamResponse, useChat from ai/react). AI SDK 5 renamed several of these, so pin the major version or check the migration guide if you install the latest release.

    // app/api/chat/route.ts
    import { openai } from '@ai-sdk/openai'
    import { streamText } from 'ai'

    // Opt this route into the Edge runtime.
    export const runtime = 'edge'

    export async function POST(req: Request) {
      // useChat posts the conversation so far as `messages` in the JSON body.
      const { messages } = await req.json()

      const result = await streamText({
        model: openai('gpt-4o-mini'),
        messages,
        system: 'You are a concise technical assistant.',
      })

      // Send the response incrementally in the format useChat parses.
      return result.toDataStreamResponse()
    }

The important part is toDataStreamResponse(). It gives the client an incremental stream instead of forcing the UI to wait for the whole answer. Once that works, the rest of the app can treat streaming as normal UI state instead of a special effect.
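One way to picture "streaming as normal UI state" is a small state machine. The sketch below is my own illustration, not something the SDK exports: ChatStatus, ChatEvent, and nextStatus are hypothetical names, and useChat tracks its own equivalent of this internally.

```typescript
// Hypothetical state model; these names are illustrative, not SDK types.
type ChatStatus =
  | { kind: 'idle' }
  | { kind: 'submitting' }
  | { kind: 'streaming'; partial: string }
  | { kind: 'failed'; partial: string; reason: string }

type ChatEvent =
  | { type: 'submit' }
  | { type: 'chunk'; text: string }
  | { type: 'finish' }
  | { type: 'error'; reason: string }

function nextStatus(status: ChatStatus, event: ChatEvent): ChatStatus {
  const inFlight = status.kind === 'submitting' || status.kind === 'streaming'
  switch (event.type) {
    case 'submit':
      // A second submit while a request is in flight is ignored.
      return inFlight ? status : { kind: 'submitting' }
    case 'chunk': {
      if (!inFlight) return status
      const previous = status.kind === 'streaming' ? status.partial : ''
      return { kind: 'streaming', partial: previous + event.text }
    }
    case 'finish':
      return inFlight ? { kind: 'idle' } : status
    case 'error': {
      if (!inFlight) return status
      // Keep the text that already arrived so the UI can mark it incomplete.
      const partial = status.kind === 'streaming' ? status.partial : ''
      return { kind: 'failed', partial, reason: event.reason }
    }
  }
}
```

The point of the failed state is that it carries both the partial text and a reason. That is what lets the UI show a half-written answer as incomplete instead of leaving it looking finished or frozen.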
Client component
The useChat hook handles optimistic messages, input state, stream parsing, and
basic errors:

    'use client'

    import { useChat } from 'ai/react'

    export default function ChatPage() {
      // useChat owns the message list, the input value, and the request lifecycle.
      const { messages, input, handleInputChange, handleSubmit, isLoading, error } =
        useChat({ api: '/api/chat' })

      return (
        <main className="mx-auto flex h-screen max-w-3xl flex-col gap-4 p-6">
          <div className="flex-1 space-y-3 overflow-y-auto">
            {messages.map((message) => (
              <article key={message.id}>
                <p className="text-xs uppercase text-zinc-500">{message.role}</p>
                <p className="whitespace-pre-wrap text-sm">{message.content}</p>
              </article>
            ))}
            {/* Show in-flight and failed states so a stalled stream is visible. */}
            {isLoading && <p className="text-sm text-zinc-500">Thinking...</p>}
            {error && <p className="text-sm text-red-400">{error.message}</p>}
          </div>
          <form onSubmit={handleSubmit} className="flex gap-2">
            <input
              value={input}
              onChange={handleInputChange}
              className="flex-1 rounded-md border border-zinc-800 bg-zinc-950 px-3 py-2"
              placeholder="Ask a question"
            />
            {/* Disabled while a request is in flight to block double submits. */}
            <button
              type="submit"
              disabled={isLoading}
              className="rounded-md bg-white px-4 py-2 text-sm font-medium text-black disabled:opacity-50"
            >
              Send
            </button>
          </form>
        </main>
      )
    }

I keep the first client boring on purpose. Markdown, persistence, tool traces, and eval links are easier to add after the stream is reliable. If the basic chat loop is flaky, richer UI only makes the failure harder to see.
Structured output
For structured data, use schema validation instead of parsing prose:

    import { openai } from '@ai-sdk/openai'
    import { streamObject } from 'ai'
    import { z } from 'zod'

    // The shape the rest of the app is allowed to rely on.
    const WeatherSchema = z.object({
      summary: z.string(),
      temperatureC: z.number(),
    })

    export async function POST() {
      const result = await streamObject({
        model: openai('gpt-4o-mini'),
        schema: WeatherSchema,
        prompt: 'Return the current weather summary for Paris.',
      })

      // streamObject results are sent to the client as a text stream of JSON.
      return result.toTextStreamResponse()
    }

The schema is not decoration. It is the boundary between "the model said something plausible" and "the app received a shape it can use."
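The same boundary applies wherever the JSON is consumed. The sketch below spells it out without any library. Weather, isWeather, and parseWeather are hypothetical names, and the hand-written guard is only a rough stand-in for what WeatherSchema.safeParse would do in real code.

```typescript
// Mirrors the fields of WeatherSchema. Written by hand only to keep this
// sketch dependency-free; in the app, the zod schema does this work.
type Weather = { summary: string; temperatureC: number }

function isWeather(value: unknown): value is Weather {
  if (typeof value !== 'object' || value === null) return false
  const candidate = value as Record<string, unknown>
  return (
    typeof candidate['summary'] === 'string' &&
    typeof candidate['temperatureC'] === 'number'
  )
}

// Parse untrusted JSON text and return a typed value, or null if the text
// is not JSON or does not match the expected shape.
function parseWeather(json: string): Weather | null {
  try {
    const value: unknown = JSON.parse(json)
    return isWeather(value) ? value : null
  } catch {
    return null
  }
}
```

Plausible prose and a wrong field type both end up as null here. That forces the caller to handle "no usable data" as a normal case, not an exception.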
The parts I would wire early
- Keep provider keys server-side.
- Treat streaming errors as normal product states.
- Log model, latency, token usage, and tool calls early.
- Test slow streams and aborted requests.
- Do not add tools until the plain chat loop is stable.
The SDK is good glue. It does not remove the need for validation, state, observability, retries, and a UI that explains what happened when the model or network fails.
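For the logging item, it helps to settle on one record shape before wiring any sink. This is a sketch under my own assumptions: ModelCallLog, CallOutcome, and buildModelCallLog are hypothetical names, and the field names are mine, not the SDK's.

```typescript
// Hypothetical log shape; field names are illustrative, not SDK types.
type ModelCallLog = {
  model: string
  latencyMs: number
  inputTokens: number | null
  outputTokens: number | null
  toolCalls: string[]
  finishReason: string
}

// Whatever the provider reported. Every field is optional because a failed
// or aborted stream may report none of them.
type CallOutcome = {
  inputTokens?: number
  outputTokens?: number
  toolNames?: string[]
  finishReason?: string
}

function buildModelCallLog(
  model: string,
  startedAtMs: number,
  finishedAtMs: number,
  outcome: CallOutcome,
): ModelCallLog {
  return {
    model,
    // Clamp at zero in case the two timestamps come from different clocks.
    latencyMs: Math.max(0, finishedAtMs - startedAtMs),
    inputTokens: outcome.inputTokens ?? null,
    outputTokens: outcome.outputTokens ?? null,
    toolCalls: outcome.toolNames ?? [],
    finishReason: outcome.finishReason ?? 'unknown',
  }
}
```

A natural place to call this is a completion hook such as the onFinish callback of streamText. The exact usage field names in that callback depend on the SDK version, so map them into CallOutcome at the call site.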
What I would not skip
Before calling this production-ready, I would add checks that most demos leave out:
- reject empty or oversized prompts before they reach the provider,
- cap response length and tool-call depth,
- persist messages after the stream completes, not before,
- show a retry affordance when the stream fails halfway through,
- log provider status, model name, latency, and finish reason,
- keep system prompts and tool descriptions versioned with the code.
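The first check in that list is the cheapest to add. Here is a minimal sketch of it, assuming a character limit I picked arbitrarily. checkPrompt, PromptCheck, and MAX_PROMPT_CHARS are hypothetical names, not SDK APIs.

```typescript
// Hypothetical limit; tune it to the model's context window and your budget.
const MAX_PROMPT_CHARS = 4000

type PromptCheck =
  | { ok: true; prompt: string }
  | { ok: false; status: 400 | 413; error: string }

// Validate untrusted input before any provider call is made.
function checkPrompt(input: unknown, maxChars: number = MAX_PROMPT_CHARS): PromptCheck {
  if (typeof input !== 'string') {
    return { ok: false, status: 400, error: 'Prompt must be a string.' }
  }
  const prompt = input.trim()
  if (prompt.length === 0) {
    return { ok: false, status: 400, error: 'Prompt is empty.' }
  }
  if (prompt.length > maxChars) {
    return { ok: false, status: 413, error: `Prompt exceeds ${maxChars} characters.` }
  }
  return { ok: true, prompt }
}
```

In the chat route, the content of the last user message would go through checkPrompt before streamText is called. A failed check turns into an early JSON response with the given status, so the provider never sees the request.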
The stream itself is not the product. The product is what happens around the stream when the model is slow, the browser reconnects, the provider rate-limits, or the user asks for something the app should not do. The AI SDK gives you a good transport. You still have to decide where the guardrails live.