Best Boilerplates for Building an AI Wrapper SaaS 2026
TL;DR
An AI wrapper SaaS has different requirements than a standard SaaS — you need streaming responses, token usage tracking, per-user rate limiting, and cost management on top of your own billing. Most general SaaS boilerplates don't ship with these. In 2026, the best options are: Shipixen (AI-native, ships with Vercel AI SDK pre-configured), T3 Stack + AI SDK (most flexible, build to your exact needs), and adapted versions of ShipFast/Makerkit (add AI layer yourself). Here's how to evaluate and set up each.
Key Takeaways
- AI wrapper requirements: streaming, token metering, rate limiting, model switching, cost passthrough
- Vercel AI SDK: the de facto standard for streaming in Next.js AI apps — use it in any boilerplate
- T3 Stack + AI SDK: most flexible foundation — build exactly what you need
- Shipixen: AI-native starter with streaming pre-built, good for rapid prototyping
- ShipFast + AI layer: large community, add the ai package on top — fastest path if you already own ShipFast
- Token metering: use Stripe Meters (usage-based billing) — no boilerplate ships this pre-built
What Makes AI Wrapper SaaS Different
A standard SaaS boilerplate gives you auth + billing + dashboard. An AI wrapper needs:
Standard SaaS:
Auth → Dashboard → Features → Billing (flat subscription)
AI Wrapper SaaS:
Auth → Dashboard → AI Features (streaming) → Usage Tracking
→ Rate Limiting (per-user)
→ Token Metering → Cost Management
→ Billing (usage-based OR credit-based)
→ Prompt Management / Versioning
The unique technical requirements:
| Requirement | Why It Matters | Typical Solution |
|---|---|---|
| Streaming responses | LLM responses are slow — stream for UX | Vercel AI SDK useChat / useCompletion |
| Token tracking | API costs scale with usage | Count tokens per request, store in DB |
| Rate limiting | Prevent abuse / cost overruns | Redis + sliding window (Upstash) |
| Model switching | GPT-4 vs Claude vs Gemini | Abstraction layer via AI SDK |
| Prompt management | Version prompts, A/B test | DB-stored prompts or separate config |
| Cost passthrough | Charge users for AI usage | Stripe Meters or credit system |
| Abort/cancel | Users stop mid-generation | AbortController in streaming handler |
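The abort/cancel row is worth a closer look, because under the hood it is plain AbortController. A minimal sketch of the signal mechanics (Node 18+ or any modern browser):

```typescript
// "Abort/cancel" from the requirements table: AbortController is the
// primitive behind the AI SDK's stop(). Minimal demonstration:
const controller = new AbortController();
let aborted = false;
controller.signal.addEventListener('abort', () => { aborted = true; });

// Calling abort() fires the listener synchronously:
controller.abort();

// In a route handler you would forward the request's signal (or your own
// controller's signal) to the model call so the upstream provider request
// is cancelled as well, instead of billing you for a response nobody reads.
```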
The Core: Vercel AI SDK
Regardless of which boilerplate you pick, the Vercel AI SDK (ai package) is the foundation for all AI interactions:
npm install ai @ai-sdk/openai @ai-sdk/anthropic
// app/api/chat/route.ts — streaming chat endpoint (works in any boilerplate):
import { openai } from '@ai-sdk/openai';
import { anthropic } from '@ai-sdk/anthropic';
import { streamText } from 'ai';
import { auth } from '@/lib/auth';
import { checkRateLimit } from '@/lib/rate-limit';
import { trackTokenUsage } from '@/lib/usage';
export async function POST(req: Request) {
const session = await auth();
if (!session) return new Response('Unauthorized', { status: 401 });
// Rate limit: 20 requests/hour for free, 200 for pro:
const { success, remaining } = await checkRateLimit(session.user.id, session.user.plan);
if (!success) return new Response('Rate limit exceeded', { status: 429 });
const { messages, model = 'gpt-4o-mini' } = await req.json();
// Model routing:
const modelProvider = model.startsWith('claude')
? anthropic(model)
: openai(model);
const result = await streamText({
model: modelProvider,
messages,
system: 'You are a helpful assistant.',
onFinish: async ({ usage }) => {
// Track token usage after generation completes:
await trackTokenUsage({
userId: session.user.id,
model,
inputTokens: usage.promptTokens,
outputTokens: usage.completionTokens,
totalTokens: usage.totalTokens,
});
},
});
return result.toDataStreamResponse();
}
// app/chat/page.tsx — streaming UI (any boilerplate):
'use client';
import { useChat } from 'ai/react'; // in AI SDK 4+, import from '@ai-sdk/react' instead
export default function ChatPage() {
const { messages, input, handleInputChange, handleSubmit, isLoading, stop } = useChat({
api: '/api/chat',
body: { model: 'gpt-4o-mini' },
onError: (err) => console.error('Chat error:', err),
});
return (
<div className="flex flex-col h-screen max-w-2xl mx-auto p-4">
<div className="flex-1 overflow-y-auto space-y-4">
{messages.map((m) => (
<div key={m.id} className={m.role === 'user' ? 'text-right' : 'text-left'}>
<span className="inline-block p-3 rounded-lg bg-muted max-w-[80%]">
{m.content}
</span>
</div>
))}
</div>
<form onSubmit={handleSubmit} className="flex gap-2 mt-4">
<input
value={input}
onChange={handleInputChange}
placeholder="Type a message..."
className="flex-1 rounded border p-2"
/>
{isLoading
? <button type="button" onClick={stop}>Stop</button>
: <button type="submit">Send</button>
}
</form>
</div>
);
}
This pattern works with any boilerplate — add it to ShipFast, Makerkit, T3 Stack, etc.
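One thing the route above does not guard against: it forwards whatever message history the client sends, and unbounded history inflates input-token cost on every request. A small trimming helper keeps it bounded; the 4-characters-per-token estimate below is a rough heuristic, not a real tokenizer:

```typescript
type ChatMessage = { role: string; content: string };

// Rough heuristic: ~4 characters per token for English text. For accuracy,
// use a real tokenizer (e.g. tiktoken); this only sets a cost ceiling.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep the most recent messages that fit within the token budget:
export function trimHistory(messages: ChatMessage[], maxTokens: number): ChatMessage[] {
  const kept: ChatMessage[] = [];
  let total = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    const t = estimateTokens(messages[i].content);
    if (total + t > maxTokens) break;
    kept.unshift(messages[i]);
    total += t;
  }
  return kept;
}
```

Call `trimHistory(messages, budget)` in the route before handing `messages` to `streamText`.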
Option 1: T3 Stack — Most Flexible
Best for: developers who want full control and are comfortable assembling their own AI layer.
npm create t3-app@latest my-ai-saas
# Select: Next.js, TypeScript, Prisma, tRPC, Tailwind
npm install ai @ai-sdk/openai @ai-sdk/anthropic @upstash/ratelimit @upstash/redis
Database schema for AI usage tracking:
// prisma/schema.prisma additions:
model Conversation {
id String @id @default(cuid())
userId String
user User @relation(fields: [userId], references: [id])
title String?
messages Message[]
createdAt DateTime @default(now())
updatedAt DateTime @updatedAt
}
model Message {
id String @id @default(cuid())
conversationId String
conversation Conversation @relation(fields: [conversationId], references: [id])
role String // 'user' | 'assistant' | 'system'
content String @db.Text
model String? // 'gpt-4o-mini', 'claude-3-haiku', etc.
inputTokens Int @default(0)
outputTokens Int @default(0)
createdAt DateTime @default(now())
}
model UsageSummary {
id String @id @default(cuid())
userId String
user User @relation(fields: [userId], references: [id])
month String // '2026-03'
totalTokens Int @default(0)
totalCost Float @default(0) // in USD
updatedAt DateTime @updatedAt
@@unique([userId, month])
}
Rate limiting with Upstash Redis:
// lib/rate-limit.ts
import { Ratelimit } from '@upstash/ratelimit';
import { Redis } from '@upstash/redis';
const redis = new Redis({
url: process.env.UPSTASH_REDIS_REST_URL!,
token: process.env.UPSTASH_REDIS_REST_TOKEN!,
});
const limits = {
free: new Ratelimit({
redis,
limiter: Ratelimit.slidingWindow(20, '1 h'), // 20 requests/hour
}),
pro: new Ratelimit({
redis,
limiter: Ratelimit.slidingWindow(200, '1 h'), // 200/hour
}),
enterprise: new Ratelimit({
redis,
limiter: Ratelimit.slidingWindow(2000, '1 h'),
}),
};
export async function checkRateLimit(userId: string, plan: string) {
const limiter = limits[plan as keyof typeof limits] ?? limits.free;
return limiter.limit(userId);
}
Token cost tracking:
// lib/usage.ts
const TOKEN_COSTS = {
'gpt-4o': { input: 0.000005, output: 0.000015 }, // per token
'gpt-4o-mini': { input: 0.00000015, output: 0.0000006 },
'claude-3-5-sonnet': { input: 0.000003, output: 0.000015 },
'claude-3-haiku': { input: 0.00000025, output: 0.00000125 },
'gemini-1.5-pro': { input: 0.00000125, output: 0.000005 },
} as const;
export async function trackTokenUsage({
userId, model, inputTokens, outputTokens, totalTokens,
}: {
userId: string;
model: string;
inputTokens: number;
outputTokens: number;
totalTokens: number;
}) {
const costs = TOKEN_COSTS[model as keyof typeof TOKEN_COSTS];
const cost = costs
? inputTokens * costs.input + outputTokens * costs.output
: 0;
const month = new Date().toISOString().slice(0, 7); // '2026-03'
await db.usageSummary.upsert({
where: { userId_month: { userId, month } },
update: {
totalTokens: { increment: totalTokens },
totalCost: { increment: cost },
},
create: { userId, month, totalTokens, totalCost: cost },
});
}
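Since trackTokenUsage only records cost after the fact, you may also want to show users a projected cost before a request runs. The same rate table drives a pure estimator; the per-token prices below are the same illustrative figures as lib/usage.ts, so verify them against current provider pricing:

```typescript
// Pure cost estimator mirroring the TOKEN_COSTS lookup in lib/usage.ts.
// Prices are illustrative; check the providers' current pricing pages.
type Rate = { input: number; output: number };

const RATES: Record<string, Rate> = {
  'gpt-4o-mini': { input: 0.00000015, output: 0.0000006 },
  'claude-3-haiku': { input: 0.00000025, output: 0.00000125 },
};

export function estimateCostUsd(
  model: string,
  inputTokens: number,
  outputTokens: number,
): number {
  const rate = RATES[model];
  // Unknown models estimate to 0 here; log and extend RATES in production:
  if (!rate) return 0;
  return inputTokens * rate.input + outputTokens * rate.output;
}
```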
T3 Stack is right for AI SaaS if you need custom billing logic, multi-model support, or are building something that doesn't fit a template.
Option 2: ShipFast + AI Layer
Best for: those who already own ShipFast and want to add AI features fast.
ShipFast doesn't ship with AI pre-built, but adding the Vercel AI SDK on top takes ~2 hours:
# In your ShipFast project:
npm install ai @ai-sdk/openai
// Add to ShipFast's existing API structure:
// app/api/ai/chat/route.ts
import { openai } from '@ai-sdk/openai';
import { streamText } from 'ai';
import { getServerSession } from 'next-auth'; // or Supabase auth
import { authOptions } from '@/libs/next-auth';
export async function POST(req: Request) {
const session = await getServerSession(authOptions);
// Reuse ShipFast's existing auth check:
if (!session?.user) {
return new Response('Unauthorized', { status: 401 });
}
// Use ShipFast's plan detection:
const isPro = session.user.priceId === process.env.STRIPE_PRO_PRICE_ID;
if (!isPro) {
return new Response('Upgrade to Pro for AI features', { status: 403 });
}
const { messages } = await req.json();
const result = await streamText({
model: openai('gpt-4o-mini'),
messages,
});
return result.toDataStreamResponse();
}
ShipFast + AI path makes sense if:
- You already own ShipFast (no additional boilerplate cost)
- Your AI features are gated behind a paid plan (ShipFast's plan check is simple)
- You don't need per-token billing (flat-rate subscription is fine)
Option 3: Makerkit — Plugin-Based AI Integration
Makerkit's plugin system is well-suited for AI features:
// Makerkit plugin pattern for AI:
// packages/plugins/ai-assistant/src/api/chat.ts
import { createRouteHandlerClient } from '@supabase/auth-helpers-nextjs';
import { cookies } from 'next/headers';
import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';
export async function chatHandler(req: Request) {
const supabase = createRouteHandlerClient({ cookies });
const { data: { user } } = await supabase.auth.getUser();
if (!user) return new Response('Unauthorized', { status: 401 });
// Makerkit's org context:
const { organizationId, messages } = await req.json();
// Check org's AI quota:
const quota = await getOrganizationAIQuota(organizationId);
if (quota.used >= quota.limit) {
return new Response('AI quota exceeded for this organization', { status: 429 });
}
const result = await streamText({
model: openai('gpt-4o-mini'),
messages,
onFinish: async ({ usage }) => {
await incrementOrganizationAIUsage(organizationId, usage.totalTokens);
},
});
return result.toDataStreamResponse();
}
Makerkit shines for AI SaaS if:
- You're building B2B — organizations share an AI quota
- You want the AI feature as an add-on to a full SaaS (not AI as the core product)
- You'll use the team management and billing plugins alongside it
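Note that getOrganizationAIQuota and incrementOrganizationAIUsage in the handler above are not Makerkit built-ins; they are yours to implement against the organizations table. The core check-and-apply logic, sketched in memory with hypothetical names:

```typescript
// In-memory sketch of the logic behind the hypothetical
// getOrganizationAIQuota / incrementOrganizationAIUsage helpers.
// A real version reads and writes the organizations table.
type OrgQuota = { used: number; limit: number };

export function hasQuotaRemaining(q: OrgQuota): boolean {
  return q.used < q.limit;
}

export function applyUsage(q: OrgQuota, tokens: number): OrgQuota {
  // The in-flight request may overshoot the limit slightly; the next
  // request is then rejected by hasQuotaRemaining. That is acceptable
  // for soft quotas, but not for hard billing caps.
  return { ...q, used: q.used + tokens };
}
```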
Credit-Based vs Usage-Based Billing
Two billing models work for AI wrappers:
// Model 1: Credit-based (buy credits, spend on AI usage)
// Simpler UX — users buy packs, credits deduct per request
// Credit purchase:
const session = await stripe.checkout.sessions.create({
mode: 'payment', // One-time payment, not subscription
line_items: [{ price: 'price_1000_credits', quantity: 1 }],
// ...
});
// Credit deduction:
await db.user.update({
where: { id: userId },
data: { credits: { decrement: tokensUsed } },
});
// Guard before each AI call:
const user = await db.user.findUnique({ where: { id: userId } });
if (!user || user.credits <= 0) throw new Error('Insufficient credits');
// Model 2: Stripe Meters (usage-based subscription)
// More complex setup, but users pay for exactly what they use
// Record usage event:
await stripe.billing.meterEvents.create({
event_name: 'ai_tokens',
payload: {
stripe_customer_id: customerId,
value: String(tokensUsed),
},
});
// Create subscription with usage price:
await stripe.subscriptions.create({
customer: customerId,
items: [{ price: 'price_per_1k_tokens' }],
});
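A caveat on the credit-based snippet above: the findUnique guard and the decrement run as separate queries, so two concurrent requests can both pass the check and drive the balance negative. The fix is a single conditional update; with Prisma, an updateMany filtered on `credits: { gte: cost }`, treating an affected-row count of 0 as "insufficient". The semantics, as a pure function:

```typescript
// Compare-and-deduct semantics for credit billing. With Prisma, express this
// as one updateMany({ where: { id, credits: { gte: cost } },
// data: { credits: { decrement: cost } } }) and treat count === 0 as failure.
export function tryDeductCredits(
  balance: number,
  cost: number,
): { ok: boolean; balance: number } {
  if (cost < 0) throw new Error('cost must be non-negative');
  if (balance < cost) return { ok: false, balance };
  return { ok: true, balance: balance - cost };
}
```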
Which billing model to use:
| | Credit-Based | Usage-Based (Stripe Meters) |
|---|---|---|
| User experience | Predictable (buy credits, see balance) | Pay for what you use |
| Revenue predictability | Higher (bulk purchase) | Lower (variable) |
| Setup complexity | Lower | Higher |
| Best for | Consumer AI tools, indie hackers | B2B with high usage variance |
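If you go credit-based, you also need a pricing rule that maps raw provider cost to credits charged. A sketch; the credit value and margin below are illustrative assumptions, not figures from any boilerplate:

```typescript
// Illustrative pricing rule: 1 credit = $0.001 of provider cost, charged
// at a 2x margin so you are not reselling tokens at cost.
const CREDITS_PER_USD = 1000;
const MARGIN = 2;

export function costToCredits(costUsd: number): number {
  if (costUsd <= 0) return 0;
  const credits = Math.round(costUsd * MARGIN * CREDITS_PER_USD);
  // Any non-zero usage costs at least one credit:
  return Math.max(1, credits);
}
```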
Production Checklist for AI SaaS
Before launching an AI wrapper:
[ ] Streaming — responses stream, not wait for full generation
[ ] Error handling — API timeouts, rate limit errors, model failures
[ ] Token limits — enforce per-request max tokens (prevent cost bombs)
[ ] Rate limiting — per-user hourly/daily limits
[ ] Cost monitoring — alert when daily spend exceeds threshold
[ ] Prompt injection prevention — sanitize user input
[ ] PII handling — don't log PII in prompt logs
[ ] Fallback model — if GPT-4o fails, fall back to GPT-4o-mini
[ ] Abort/cancel — users can stop a generation
[ ] Content moderation — if user-facing, run through moderation API
// Content moderation (OpenAI). Note: this uses the official openai client,
// not the @ai-sdk/openai provider imported elsewhere:
import OpenAI from 'openai';
const openaiClient = new OpenAI();
const moderation = await openaiClient.moderations.create({
input: userMessage,
});
if (moderation.results[0].flagged) {
return new Response('Message flagged by content policy', { status: 400 });
}
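The fallback-model item from the checklist is easy to overlook. One simple approach is a preference chain that you walk on provider errors; the model names below are illustrative:

```typescript
// "Fallback model" from the checklist: pick the next model in a preference
// chain after a failure. Chain contents are an illustrative assumption.
const FALLBACK_CHAIN = ['gpt-4o', 'gpt-4o-mini', 'claude-3-haiku'];

export function nextFallback(failedModel: string): string | null {
  const i = FALLBACK_CHAIN.indexOf(failedModel);
  // Unknown model or end of chain: nothing left to try.
  if (i === -1 || i === FALLBACK_CHAIN.length - 1) return null;
  return FALLBACK_CHAIN[i + 1];
}
```

In the route, catch the provider error, call nextFallback, and retry streamText with the new model until the chain is exhausted.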
Find AI-ready SaaS boilerplates at StarterPick.