
Best Boilerplates for AI SaaS Products in 2026

StarterPick Team

Tags: ai · saas · boilerplate · llm · 2026

AI SaaS Has Different Infrastructure Requirements

Standard SaaS boilerplates (ShipFast, Supastarter, T3) work as a base for AI SaaS, but they're missing the AI-specific infrastructure that takes weeks to build:

  • LLM API integration with streaming and error handling
  • Token metering and per-user usage limits
  • Vector database for RAG and semantic search
  • Prompt management and versioning
  • AI credit system tied to Stripe billing

In 2026, developers are building this infrastructure layer on top of standard SaaS boilerplates. This article shows the patterns.

AI SaaS Architecture

User input
  ↓
Rate limit check + credit deduction (Redis)
  ↓
Context retrieval (pgvector RAG) [optional]
  ↓
LLM API call (OpenAI / Anthropic / Google)
  ↓
Token usage metering → update user credits
  ↓
Stream response to client (Vercel AI SDK)
  ↓
Store conversation history (PostgreSQL)
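The rate-limit step at the top of this pipeline is typically a Redis counter. Below is a minimal in-memory sketch of the fixed-window variant (names are illustrative; in production the counter lives in Redis via INCR + EXPIRE so it is shared across serverless instances):

```typescript
// In-memory stand-in for the per-user rate-limit window.
const windows = new Map<string, { count: number; resetAt: number }>();

// Returns true if the request is allowed within the current window.
function allowRequest(
  userId: string,
  limit: number,
  windowMs: number,
  now: number = Date.now(),
): boolean {
  const w = windows.get(userId);
  if (!w || now >= w.resetAt) {
    // New window: reset the counter and allow.
    windows.set(userId, { count: 1, resetAt: now + windowMs });
    return true;
  }
  if (w.count >= limit) return false; // over the limit, reject
  w.count++;
  return true;
}
```

The same check-then-deduct shape applies to the credit step that follows it; the key production concern in both cases is making the check and the increment atomic.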

Best Starting Combinations

ShipFast + Vercel AI SDK

The most common 2026 stack for AI SaaS:

  • ShipFast handles: auth, Stripe credits/subscriptions, email, landing page
  • Vercel AI SDK handles: multi-provider LLM calls, streaming, structured output
// app/api/chat/route.ts
import { streamText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import { checkAndDeductCredits, recordTokenUsage } from '@/libs/credits';
import { getServerSession } from 'next-auth';

export async function POST(req: Request) {
  const session = await getServerSession();
  if (!session) return new Response('Unauthorized', { status: 401 });

  // Check credits before LLM call
  const hasCredits = await checkAndDeductCredits(session.user.id, 1);
  if (!hasCredits) return new Response('No credits', { status: 402 });

  const { messages } = await req.json();

  const result = streamText({
    model: anthropic('claude-3-5-sonnet-20241022'),
    messages,
    onFinish: async ({ usage }) => {
      // Update actual token usage after completion
      await recordTokenUsage(session.user.id, usage);
    },
  });

  return result.toDataStreamResponse();
}
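The route imports checkAndDeductCredits from @/libs/credits, but ShipFast ships no such helper; you write it yourself. A minimal in-memory sketch of the contract (a real implementation would use an atomic conditional UPDATE in Postgres, or DECRBY in Redis, so concurrent requests can't double-spend):

```typescript
// In-memory stand-in for the user credit store (illustrative only).
const balances = new Map<string, number>();

// Returns true and deducts if the user can afford `cost`, else false.
async function checkAndDeductCredits(
  userId: string,
  cost: number,
): Promise<boolean> {
  const balance = balances.get(userId) ?? 0;
  if (balance < cost) return false; // maps to the 402 path in the route
  balances.set(userId, balance - cost);
  return true;
}
```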

T3 Stack + AI SDK + pgvector

For AI SaaS with RAG (search over user documents):

// packages/api/src/router/rag.ts
import { openai } from '@ai-sdk/openai';
import { embed } from 'ai';
import { db } from '@acme/db';
import { embeddings } from '@acme/db/schema';
import { cosineDistance, desc, gt, sql } from 'drizzle-orm';

// Store document embedding
export async function indexDocument(userId: string, content: string) {
  const { embedding } = await embed({
    model: openai.embedding('text-embedding-3-small'),
    value: content,
  });

  await db.insert(embeddings).values({
    userId,
    content,
    embedding,  // pgvector stores float[]
  });
}

// Retrieve relevant context
export async function findRelevantContent(userId: string, query: string) {
  const { embedding } = await embed({
    model: openai.embedding('text-embedding-3-small'),
    value: query,
  });

  const similarity = sql<number>`1 - (${cosineDistance(embeddings.embedding, embedding)})`;

  return db.select({ content: embeddings.content, similarity })
    .from(embeddings)
    .where(gt(similarity, 0.5))
    .orderBy(desc(similarity))
    .limit(5);
}
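For reference, this is the math the query above delegates to Postgres: cosine similarity between the query embedding and each stored embedding, filtered by threshold and sorted descending. pgvector does this with an index; the plain-TypeScript version is only to make the query's semantics concrete:

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Equivalent of the SQL: filter by similarity > minSim, order desc, limit k.
function topK(
  query: number[],
  docs: { content: string; embedding: number[] }[],
  k = 5,
  minSim = 0.5,
) {
  return docs
    .map(d => ({ content: d.content, similarity: cosineSimilarity(query, d.embedding) }))
    .filter(d => d.similarity > minSim)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, k);
}
```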

Token Billing Patterns

Credit System (Most Common)

// 1 credit = $0.01 at the base tier; larger bundles are discounted. Bundle credits into Stripe products.
const PRICING = {
  starter: { credits: 1000, price: 10 },   // $10 = 1000 credits
  pro: { credits: 5000, price: 40 },        // $40 = 5000 credits
  scale: { credits: 25000, price: 150 },    // $150 = 25000 credits
};

// LLM costs in credits per 1K tokens
const LLM_CREDIT_COST = {
  'gpt-4o': { input: 0.5, output: 1.5 },         // per 1K tokens
  'claude-3-5-sonnet': { input: 0.3, output: 1.5 },
};
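Turning actual token usage into a credit charge is then a small calculation. A sketch, assuming usage reports prompt and completion token counts (as the Vercel AI SDK's onFinish usage object does in v4; field names differ across SDK versions):

```typescript
// Credit rates per 1K tokens, matching the table above.
const LLM_CREDIT_COST = {
  'gpt-4o': { input: 0.5, output: 1.5 },
  'claude-3-5-sonnet': { input: 0.3, output: 1.5 },
} as const;

type Model = keyof typeof LLM_CREDIT_COST;

// Convert token counts to credits; round up so partial usage still bills.
function creditsForUsage(
  model: Model,
  promptTokens: number,
  completionTokens: number,
): number {
  const rate = LLM_CREDIT_COST[model];
  return Math.ceil(
    (promptTokens / 1000) * rate.input +
    (completionTokens / 1000) * rate.output,
  );
}
```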

Subscription with Soft Limits

// Plans include monthly token budget, warn at 80%, block at 100%
const PLANS = {
  starter: { tokensPerMonth: 500_000, price: 29 },
  pro: { tokensPerMonth: 5_000_000, price: 99 },
};
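The warn-at-80% / block-at-100% behavior is a pure function over the month's usage (names here are illustrative):

```typescript
type LimitStatus = 'ok' | 'warn' | 'blocked';

// Soft-limit check: warn once usage crosses 80% of the monthly budget,
// block at or above 100%.
function checkMonthlyTokens(used: number, budget: number): LimitStatus {
  if (used >= budget) return 'blocked';
  if (used >= budget * 0.8) return 'warn';
  return 'ok';
}
```

On 'warn', send the user an email and surface a banner; on 'blocked', return an upgrade prompt instead of calling the LLM.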

AI SaaS Launch Checklist

Before launching an AI product:

  • Rate limiting — Prevent users from consuming all credits in one burst
  • Error handling — LLM APIs are flaky; implement retry with exponential backoff
  • Streaming — Users expect character-by-character output, not wait-then-dump
  • Cost controls — Set monthly spend limits on your LLM provider account
  • Content moderation — Screen inputs for ToS violations (OpenAI Moderation API)
  • Fallback models — If the primary model fails, fall back to an alternative
  • Usage dashboard — Show users their credit balance and usage history
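The retry item from the checklist can be sketched as a generic wrapper; this is one common pattern (exponential backoff with jitter), not any specific library's API:

```typescript
// Retry an async call up to maxAttempts times, doubling the delay each
// attempt and adding random jitter to avoid thundering herds.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      const delay = baseDelayMs * 2 ** attempt + Math.random() * 100;
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
  throw lastError; // all attempts failed
}
```

Wrap your LLM call site in it; for fallback models, catch the final error and retry once against the alternative provider.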

Compare AI SaaS boilerplates and standard starters on StarterPick.
