How to Add AI Features to Any SaaS Boilerplate 2026
By StarterPick Team
Tags: ai, openai, vercel-ai-sdk, saas-boilerplate, nextjs, 2026
TL;DR
Vercel AI SDK (the ai package) is the standard for adding AI to Next.js SaaS boilerplates in 2026. It handles streaming, multi-provider support (OpenAI, Anthropic, Google), and React hooks out of the box. Add it to any boilerplate in 3 steps: install the ai package, create a route handler, and use useChat or useCompletion in your components. For production, add usage tracking, per-user rate limiting, and cost controls.
Key Takeaways
- Vercel AI SDK: the ai package handles streaming, React hooks, and multiple providers
- OpenAI vs Anthropic: both work identically through the AI SDK; swap providers in one line
- Streaming: Server-Sent Events via streamText(), no extra infrastructure needed
- Cost control: track tokens per user, set monthly limits, use maxTokens
- RAG: embed documents → store in pgvector → retrieve at query time
- Rate limiting: per-user AI limits prevent abuse and runaway bills
Step 1: Install AI SDK
npm install ai @ai-sdk/openai
# For Anthropic:
npm install @ai-sdk/anthropic
# For Google:
npm install @ai-sdk/google
Step 2: Create Chat Route Handler
// app/api/ai/chat/route.ts — streaming chat endpoint:
import { openai } from '@ai-sdk/openai';
import { anthropic } from '@ai-sdk/anthropic';
import { streamText, convertToCoreMessages } from 'ai';
import { auth } from '@/lib/auth';
import { checkUserAILimit, incrementAIUsage } from '@/lib/ai-limits';
export const runtime = 'edge'; // Run on edge for lower latency
export const maxDuration = 30;
export async function POST(request: Request) {
  const session = await auth();
  if (!session?.user) return new Response('Unauthorized', { status: 401 });

  // Check usage limits:
  const canUse = await checkUserAILimit(session.user.id);
  if (!canUse) {
    return new Response(
      JSON.stringify({ error: 'Monthly AI limit reached. Upgrade to Pro.' }),
      { status: 429, headers: { 'Content-Type': 'application/json' } }
    );
  }

  const { messages } = await request.json();

  const result = await streamText({
    model: openai('gpt-4o-mini'), // Or anthropic('claude-3-5-haiku-latest')
    messages: convertToCoreMessages(messages),
    system: 'You are a helpful SaaS assistant. Be concise and actionable.',
    maxTokens: 1000, // Cost control
    // Track usage after completion:
    onFinish: async ({ usage }) => {
      await incrementAIUsage(session.user.id, {
        promptTokens: usage.promptTokens,
        completionTokens: usage.completionTokens,
      });
    },
  });

  return result.toDataStreamResponse();
}
Step 3: Add Chat UI Component
// components/ai/AIChatPanel.tsx:
'use client';
import { useChat } from 'ai/react';
import { Button } from '@/components/ui/button';
import { Textarea } from '@/components/ui/textarea';
import { ScrollArea } from '@/components/ui/scroll-area';

export function AIChatPanel() {
  const { messages, input, handleInputChange, handleSubmit, isLoading } = useChat({
    api: '/api/ai/chat',
    onError: (error) => {
      if (error.message.includes('429')) {
        alert('AI limit reached! Upgrade to Pro for unlimited access.');
      }
    },
  });

  return (
    <div className="flex flex-col h-[500px] border rounded-lg">
      <ScrollArea className="flex-1 p-4">
        {messages.map((message) => (
          <div
            key={message.id}
            className={`mb-4 ${message.role === 'user' ? 'text-right' : 'text-left'}`}
          >
            <div
              className={`inline-block px-4 py-2 rounded-lg max-w-[80%] ${
                message.role === 'user'
                  ? 'bg-primary text-primary-foreground'
                  : 'bg-muted'
              }`}
            >
              {message.content}
            </div>
          </div>
        ))}
        {isLoading && (
          <div className="text-muted-foreground text-sm animate-pulse">
            AI is thinking...
          </div>
        )}
      </ScrollArea>
      <form onSubmit={handleSubmit} className="p-4 border-t flex gap-2">
        <Textarea
          value={input}
          onChange={handleInputChange}
          placeholder="Ask anything..."
          className="resize-none"
          rows={2}
          onKeyDown={(e) => {
            if (e.key === 'Enter' && !e.shiftKey) {
              e.preventDefault();
              handleSubmit(e as any);
            }
          }}
        />
        <Button type="submit" disabled={isLoading}>
          {isLoading ? '...' : 'Send'}
        </Button>
      </form>
    </div>
  );
}
Usage Tracking and Rate Limiting
// lib/ai-limits.ts — per-user AI usage tracking:
import { db } from './db';
const FREE_MONTHLY_TOKENS = 10_000;
const PRO_MONTHLY_TOKENS = 500_000;
export async function checkUserAILimit(userId: string): Promise<boolean> {
  const user = await db.user.findUnique({
    where: { id: userId },
    include: {
      aiUsage: {
        where: {
          // Current calendar month:
          createdAt: {
            gte: new Date(new Date().getFullYear(), new Date().getMonth(), 1),
          },
        },
      },
    },
  });
  if (!user) return false;

  const totalTokens = user.aiUsage.reduce(
    (sum, u) => sum + u.promptTokens + u.completionTokens,
    0
  );
  const limit = user.plan === 'pro' ? PRO_MONTHLY_TOKENS : FREE_MONTHLY_TOKENS;
  return totalTokens < limit;
}
export async function incrementAIUsage(
  userId: string,
  tokens: { promptTokens: number; completionTokens: number }
) {
  await db.aiUsage.create({
    data: {
      userId,
      promptTokens: tokens.promptTokens,
      completionTokens: tokens.completionTokens,
      // Estimated cost (OpenAI gpt-4o-mini pricing: $0.15/1M input, $0.60/1M output):
      costUsd: tokens.promptTokens * 0.00000015 + tokens.completionTokens * 0.0000006,
    },
  });
}
// Add to schema.prisma:
model AiUsage {
  id               String   @id @default(cuid())
  userId           String
  user             User     @relation(fields: [userId], references: [id])
  promptTokens     Int
  completionTokens Int
  costUsd          Decimal  @db.Decimal(10, 8)
  createdAt        DateTime @default(now())
}
Common AI Feature Patterns
// 1. Text generation (one-shot, no streaming):
import { generateText } from 'ai';

const { text } = await generateText({
  model: openai('gpt-4o-mini'),
  prompt: `Summarize this in 2 sentences: ${userContent}`,
  maxTokens: 200,
});

// 2. Structured output (JSON):
import { generateObject } from 'ai';
import { z } from 'zod';

const { object } = await generateObject({
  model: openai('gpt-4o'),
  schema: z.object({
    sentiment: z.enum(['positive', 'negative', 'neutral']),
    score: z.number().min(0).max(1),
    summary: z.string(),
  }),
  prompt: `Analyze sentiment: "${userReview}"`,
});
// object.sentiment and object.score are fully typed!

// 3. Image generation:
import OpenAI from 'openai';

const openaiClient = new OpenAI();
const image = await openaiClient.images.generate({
  model: 'dall-e-3',
  prompt: userDescription,
  size: '1024x1024',
  quality: 'standard',
});
const imageUrl = image.data[0].url;

// 4. Embeddings for RAG:
import { embed } from 'ai';

const { embedding } = await embed({
  model: openai.embedding('text-embedding-3-small'),
  value: documentContent,
});
// Store the embedding in pgvector, search later
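The retrieve-at-query-time step is a nearest-neighbor search over the stored embeddings. Here is a minimal in-memory sketch of the cosine-similarity ranking that pgvector performs for you in SQL; the Doc shape and function names are illustrative, not part of the AI SDK:

```typescript
// Illustrative document record: content plus its precomputed embedding.
type Doc = { id: string; content: string; embedding: number[] };

// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank stored docs against the query embedding and return the top k,
// which you would then inject into the system prompt.
function retrieve(query: number[], docs: Doc[], k = 3): Doc[] {
  return [...docs]
    .sort(
      (x, y) =>
        cosineSimilarity(query, y.embedding) - cosineSimilarity(query, x.embedding)
    )
    .slice(0, k);
}
```

In production you would replace this loop with a pgvector ORDER BY distance query; the point is only to show what "retrieve at query time" computes.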
Multi-Provider Setup
// lib/ai.ts — switch providers in one place:
import { openai } from '@ai-sdk/openai';
import { anthropic } from '@ai-sdk/anthropic';
import { google } from '@ai-sdk/google';
// Your app's AI model config:
export const AI_MODELS = {
  chat: openai('gpt-4o-mini'),                // Fast + cheap for most tasks
  smart: openai('gpt-4o'),                    // Complex reasoning
  fast: anthropic('claude-3-5-haiku-latest'), // Fastest for simple tasks
  embedding: openai.embedding('text-embedding-3-small'),
} as const;
// Swap providers by changing one line:
// chat: anthropic('claude-3-5-haiku-latest'),
// chat: google('gemini-2.0-flash'),
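One way to consume a config like AI_MODELS is a small routing helper that maps a task to a model key, so model choice stays in one place. The task names and the pickModelKey helper below are hypothetical, purely to illustrate the pattern:

```typescript
// Hypothetical task categories for routing; keys returned mirror the
// AI_MODELS map above.
type TaskKind = 'chat' | 'summarize' | 'classify' | 'reasoning';

function pickModelKey(task: TaskKind): 'chat' | 'smart' | 'fast' {
  switch (task) {
    case 'reasoning':
      return 'smart'; // complex multi-step work justifies the pricier model
    case 'classify':
      return 'fast';  // cheap, latency-sensitive labeling
    default:
      return 'chat';  // sensible default for chat and summarization
  }
}
```

A route handler would then call something like streamText({ model: AI_MODELS[pickModelKey('chat')], ... }), so upgrading or swapping a provider touches only the config file.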
Decision Guide
For chat/assistant features:
→ useChat() + /api/ai/chat route (streaming)
→ gpt-4o-mini for cost-effective responses
→ Add system prompt for your app's persona
For content generation (one-shot):
→ generateText() with maxTokens limit
→ No streaming needed — simpler UX
For structured AI output:
→ generateObject() with Zod schema
→ Perfect for form prefill, analysis, classification
For semantic search / RAG:
→ embed() → store in pgvector or Pinecone
→ Retrieve on query, inject into system prompt
Cost control checklist:
→ maxTokens on every call (without it, output is capped only by the model's maximum)
→ Per-user monthly token limits
→ Free tier: 10K tokens/month (~50 chat messages)
→ Track costs in DB to watch for abuse
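The checklist can be backed by a small cost estimator. The rates below mirror the gpt-4o-mini per-token numbers used in incrementAIUsage and OpenAI list prices at the time of writing; treat them as assumptions and verify current pricing before relying on them:

```typescript
// Assumed USD prices per 1M tokens; verify against current provider pricing.
const PRICING_PER_MILLION: Record<string, { input: number; output: number }> = {
  'gpt-4o-mini': { input: 0.15, output: 0.6 },
  'gpt-4o': { input: 2.5, output: 10 },
};

// Estimate the USD cost of one call from its token usage.
function estimateCostUsd(
  model: string,
  promptTokens: number,
  completionTokens: number
): number {
  const price = PRICING_PER_MILLION[model];
  if (!price) throw new Error(`Unknown model: ${model}`);
  return (promptTokens * price.input + completionTokens * price.output) / 1_000_000;
}
```

Keeping the table in one module means a pricing change is a one-line edit, and the same function can feed both the per-call DB record and an admin dashboard.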
Find AI-ready SaaS boilerplates at StarterPick.