Rate Limiting and Abuse Prevention for SaaS Apps 2026

TL;DR

Every SaaS gets abused, so rate limiting is table stakes. Without it, a single malicious user can exhaust your OpenAI budget, get your email provider account suspended for spam, or flood your free tier until it is unusable. The standard stack in 2026: Upstash Redis for distributed rate limiting (works on Edge), different limits per endpoint type (auth vs API vs AI), and Vercel's built-in DDoS protection as a base layer. Implementation takes about 30 minutes and prevents incidents that take days to recover from.

Key Takeaways

Upstash: serverless Redis with HTTP API — works in Next.js Edge Middleware, no cold starts
Sliding window vs fixed window: sliding window prevents burst abuse at window boundaries
Differentiated limits: auth endpoints need tighter limits than regular API
AI endpoints: 10-20x more expensive than regular API calls — protect them aggressively
IP vs user ID: IP rate limiting for unauthenticated routes; user ID for authenticated routes
Boilerplate gap: most SaaS boilerplates ship zero rate limiting — this is critical to add
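
To make the window-boundary point concrete, here is a minimal in-memory sketch comparing a fixed window counter with a sliding window log. The helper names (`makeFixedWindow`, `makeSlidingWindowLog`, `burst`) are made up for this illustration, and the sliding log is a textbook variant, not necessarily what a given library implements internally.

```typescript
// In-memory sketch for illustration only; production limiters keep this state in Redis.

// Fixed window: one counter per aligned window; the count starts over at each boundary.
function makeFixedWindow(limit: number, windowMs: number) {
  const counts = new Map<number, number>();
  return (nowMs: number): boolean => {
    const windowIndex = Math.floor(nowMs / windowMs);
    const used = counts.get(windowIndex) ?? 0;
    if (used >= limit) return false;
    counts.set(windowIndex, used + 1);
    return true;
  };
}

// Sliding window log: remembers accepted requests from the last windowMs milliseconds.
function makeSlidingWindowLog(limit: number, windowMs: number) {
  let accepted: number[] = [];
  return (nowMs: number): boolean => {
    accepted = accepted.filter((t) => nowMs - t < windowMs);
    if (accepted.length >= limit) return false;
    accepted.push(nowMs);
    return true;
  };
}

// Send `count` requests at the same instant and count how many are accepted.
function burst(allow: (nowMs: number) => boolean, atMs: number, count: number): number {
  let ok = 0;
  for (let i = 0; i < count; i++) {
    if (allow(atMs)) ok++;
  }
  return ok;
}

const fixed = makeFixedWindow(10, 60_000);
const sliding = makeSlidingWindowLog(10, 60_000);

// 10 requests one second before the minute boundary, 10 more one second after it.
const fixedAccepted = burst(fixed, 59_000, 10) + burst(fixed, 61_000, 10);
const slidingAccepted = burst(sliding, 59_000, 10) + burst(sliding, 61_000, 10);

console.log({ fixedAccepted, slidingAccepted });
```

With a limit of 10 per minute, the fixed window lets all 20 requests through within two seconds because the counter restarts at the boundary, while the sliding log accepts only the first 10.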

The Rate Limiting Stack

npm install @upstash/ratelimit @upstash/redis

# .env
UPSTASH_REDIS_REST_URL=https://xxx.upstash.io
UPSTASH_REDIS_REST_TOKEN=xxxxx

Pattern 1: Global Middleware Rate Limiting

Apply a base rate limit to all requests at the Edge:

// middleware.ts
import { Ratelimit } from '@upstash/ratelimit';
import { Redis } from '@upstash/redis';
import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';

const redis = Redis.fromEnv();

// Different limiters for different threat levels:
const limiters = {
  // Fallback for every route that is not auth or API: 200 requests per minute per IP (catches bots)
  global: new Ratelimit({
    redis,
    limiter: Ratelimit.slidingWindow(200, '1m'),
    prefix: 'rl:global',
  }),

  // Auth endpoints: 10 attempts per 15 minutes (brute force protection)
  auth: new Ratelimit({
    redis,
    limiter: Ratelimit.slidingWindow(10, '15m'),
    prefix: 'rl:auth',
  }),

  // API routes: 60 requests per minute (standard API usage)
  api: new Ratelimit({
    redis,
    limiter: Ratelimit.slidingWindow(60, '1m'),
    prefix: 'rl:api',
  }),
};

export async function middleware(request: NextRequest) {
  const pathname = request.nextUrl.pathname;
  const ip = request.headers.get('x-forwarded-for')?.split(',')[0]?.trim()
    ?? request.headers.get('x-real-ip')
    ?? 'anonymous';

  // Choose limiter based on route:
  let limiter = limiters.global;
  let limitKey = ip;

  if (pathname.startsWith('/api/auth') || pathname.startsWith('/auth')) {
    limiter = limiters.auth;
  } else if (pathname.startsWith('/api/')) {
    limiter = limiters.api;
  }

  const { success, limit, remaining, reset } = await limiter.limit(limitKey);

  if (!success) {
    return new NextResponse(
      JSON.stringify({ error: 'Too many requests. Please slow down.' }),
      {
        status: 429,
        headers: {
          'Content-Type': 'application/json',
          'X-RateLimit-Limit': limit.toString(),
          'X-RateLimit-Remaining': remaining.toString(),
          'X-RateLimit-Reset': reset.toString(),
          'Retry-After': Math.ceil((reset - Date.now()) / 1000).toString(),
        },
      }
    );
  }

  // Expose rate limit info to the client on allowed requests (these are response headers):
  const response = NextResponse.next();
  response.headers.set('X-RateLimit-Limit', limit.toString());
  response.headers.set('X-RateLimit-Remaining', remaining.toString());
  return response;
}

export const config = {
  matcher: [
    '/((?!_next/static|_next/image|favicon.ico|.*\\.png$).*)',
  ],
};
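
The IP lookup in the middleware can be pulled into a small helper that is easier to test. The sketch below is one possible version; `clientIpFrom` is a hypothetical name, and it assumes your hosting platform sets or overwrites `x-forwarded-for` (if clients can send that header unfiltered, they can pick their own rate limit key).

```typescript
// Hypothetical helper: derive a rate limit key from proxy headers.
function clientIpFrom(headers: Headers): string {
  // x-forwarded-for may hold a chain such as "client, proxy1, proxy2"; use the first entry.
  const first = headers.get('x-forwarded-for')?.split(',')[0]?.trim();
  if (first) return first;

  const realIp = headers.get('x-real-ip')?.trim();
  if (realIp) return realIp;

  // All requests without a usable header share this single bucket.
  return 'anonymous';
}

const viaProxyChain = clientIpFrom(new Headers({ 'x-forwarded-for': '203.0.113.7, 10.0.0.1' }));
const viaRealIp = clientIpFrom(new Headers({ 'x-forwarded-for': '', 'x-real-ip': '198.51.100.23' }));
const noHeaders = clientIpFrom(new Headers());
```

Unlike a plain `??` chain, the truthiness checks also skip an empty `x-forwarded-for` value, which would otherwise become an empty-string key.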

Pattern 2: Per-User Rate Limiting on Authenticated Routes

For logged-in users, rate limit by user ID (more precise than IP):

// lib/rate-limit.ts — reusable helper
import { Ratelimit } from '@upstash/ratelimit';
import { Redis } from '@upstash/redis';
import { NextResponse } from 'next/server';

const redis = Redis.fromEnv();

// Configurable limiters:
const createLimiter = (limit: number, window: string, prefix: string) =>
  new Ratelimit({
    redis,
    limiter: Ratelimit.slidingWindow(limit, window as `${number} ${'ms' | 's' | 'm' | 'h' | 'd'}`),
    prefix: `rl:${prefix}`,
  });

const LIMITERS = {
  // Standard user actions: 100/minute
  default: createLimiter(100, '1 m', 'default'),

  // AI chat: 20/minute (expensive endpoint)
  aiChat: createLimiter(20, '1 m', 'ai-chat'),

  // AI daily cap: 200/day (abuse ceiling)
  aiChatDaily: createLimiter(200, '24 h', 'ai-chat-daily'),

  // Email sending: 5/hour (prevent email spam)
  email: createLimiter(5, '1 h', 'email'),

  // Webhooks from Stripe: 1000/minute (don't block legitimate webhooks)
  webhook: createLimiter(1000, '1 m', 'webhook'),
} as const;

type LimiterKey = keyof typeof LIMITERS;

export async function checkRateLimit(
  identifier: string,
  limiter: LimiterKey = 'default'
): Promise<{ success: boolean; remaining: number; reset: number }> {
  return LIMITERS[limiter].limit(identifier);
}

export function rateLimitResponse(reset: number) {
  return NextResponse.json(
    { error: 'Rate limit exceeded. Please wait before retrying.' },
    {
      status: 429,
      headers: {
        'Retry-After': Math.ceil((reset - Date.now()) / 1000).toString(),
      },
    }
  );
}

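// app/api/ai/chat/route.ts: protect the expensive AI endpoint
import { NextResponse } from 'next/server';
import { auth } from '@/auth'; // assumption: your session helper (e.g. Auth.js) is exported here
import { checkRateLimit, rateLimitResponse } from '@/lib/rate-limit';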

export async function POST(req: Request) {
  const session = await auth();
  if (!session?.user) return new Response('Unauthorized', { status: 401 });

  // Check both per-minute and per-day limits:
  const [minuteLimit, dailyLimit] = await Promise.all([
    checkRateLimit(session.user.id, 'aiChat'),
    checkRateLimit(session.user.id, 'aiChatDaily'),
  ]);

  if (!minuteLimit.success) {
    return rateLimitResponse(minuteLimit.reset);
  }

  if (!dailyLimit.success) {
    // The 24 h window slides, so there is no fixed midnight reset to promise:
    return NextResponse.json(
      { error: 'Daily AI limit reached. Please try again later.' },
      { status: 429 }
    );
  }

  // Proceed with AI call...
}
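
One side effect of running both checks in `Promise.all` is that a request rejected by the per-minute limit can still use up one of the user's daily slots. A sequential variant avoids that at the cost of one extra round trip. The sketch below assumes a `check` function with the same shape as `checkRateLimit` above; `checkAiLimits` and the decision type are hypothetical names.

```typescript
type LimitResult = { success: boolean; reset: number };
type LimiterName = 'aiChat' | 'aiChatDaily';
// Same shape as checkRateLimit; passed in so this sketch has no Redis dependency.
type CheckFn = (identifier: string, limiter: LimiterName) => Promise<LimitResult>;

type AiLimitDecision =
  | { allowed: true }
  | { allowed: false; scope: 'minute' | 'day'; retryAfterSeconds: number };

function secondsUntil(resetMs: number, nowMs: number): number {
  // Never advertise a zero or negative wait.
  return Math.max(1, Math.ceil((resetMs - nowMs) / 1000));
}

async function checkAiLimits(
  userId: string,
  check: CheckFn,
  nowMs: number = Date.now(),
): Promise<AiLimitDecision> {
  const minute = await check(userId, 'aiChat');
  if (!minute.success) {
    // Stop here: the daily limiter is never called, so no daily slot is spent.
    return { allowed: false, scope: 'minute', retryAfterSeconds: secondsUntil(minute.reset, nowMs) };
  }

  const daily = await check(userId, 'aiChatDaily');
  if (!daily.success) {
    return { allowed: false, scope: 'day', retryAfterSeconds: secondsUntil(daily.reset, nowMs) };
  }

  return { allowed: true };
}
```

In the route handler this would be called as `checkAiLimits(session.user.id, checkRateLimit)`, with `scope` deciding which error message to return.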

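Pattern 3: Login Brute-Force Protection

Login endpoints get their own, tighter limiter keyed on the IP plus the targeted email:

// app/api/auth/sign-in/route.ts (or Server Action)
import { Ratelimit } from '@upstash/ratelimit';
import { Redis } from '@upstash/redis';
import { NextResponse } from 'next/server';
import bcrypt from 'bcryptjs'; // assumption: any bcrypt implementation works here
import { db } from '@/lib/db'; // assumption: your Prisma client export

const loginLimiter = new Ratelimit({
  redis: Redis.fromEnv(),
  // 5 attempts per 15 minutes per IP+email combo:
  limiter: Ratelimit.slidingWindow(5, '15 m'),
  prefix: 'rl:login',
});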

export async function POST(req: Request) {
  const { email, password } = await req.json();
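  const ip = req.headers.get('x-forwarded-for')?.split(',')[0]?.trim() ?? 'unknown';

  // Key combines IP + email: throttles one IP hammering one account without
  // locking that account for everyone else. Attempts spread across many IPs get
  // separate keys, so pair this with a per-IP or per-email limit for that case:
  const key = `${ip}:${email.toLowerCase()}`;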
  const { success } = await loginLimiter.limit(key);

  if (!success) {
    // Don't leak that rate limiting was triggered — return same error as bad password:
    return NextResponse.json(
      { error: 'Invalid email or password' },
      { status: 401 }
    );
  }

  const user = await db.user.findUnique({ where: { email } });
  const valid = user && await bcrypt.compare(password, user.passwordHash ?? '');

  if (!valid) {
    // Nothing extra to record: the limit() call above already counted this attempt.
    return NextResponse.json({ error: 'Invalid email or password' }, { status: 401 });
  }

  // Success — create session
}
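
Because the IP+email key above is different for every source IP, attempts against one account from many IPs never share a counter. One way to cover that case is a second, email-only key checked by its own limiter. The helper below is a hypothetical sketch of how both keys could be derived; the `email:` prefix is an arbitrary choice.

```typescript
// Hypothetical helper: build both rate limit keys for one login attempt.
function loginLimitKeys(ip: string, email: string): { ipAndEmail: string; emailOnly: string } {
  const normalizedEmail = email.trim().toLowerCase();
  return {
    // Throttles one client guessing passwords for one account.
    ipAndEmail: `${ip}:${normalizedEmail}`,
    // Shared by every IP targeting this account, so it also counts distributed attempts.
    emailOnly: `email:${normalizedEmail}`,
  };
}

const fromFirstIp = loginLimitKeys('203.0.113.7', 'Alice@Example.com');
const fromSecondIp = loginLimitKeys('198.51.100.23', 'alice@example.com ');
```

The email-only limiter probably wants a noticeably higher ceiling than the IP+email one, since a strict per-email limit lets an attacker lock a victim out of their own account.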

Pattern 4: API Key Rate Limiting

For apps that issue API keys to users:

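// lib/api-key-auth.ts
import { Ratelimit } from '@upstash/ratelimit';
import { Redis } from '@upstash/redis';
import { db } from '@/lib/db'; // assumption: your Prisma client export
import { hashApiKey } from '@/lib/hash-api-key'; // assumption: must match how keys were hashed when stored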

const redis = Redis.fromEnv();

// Tier-based limits:
const TIER_LIMITS = {
  free: { requests: 100, window: '1 h' },
  pro: { requests: 1000, window: '1 h' },
  enterprise: { requests: 100000, window: '1 h' },
} as const;

export async function validateApiKey(req: Request) {
  const apiKey = req.headers.get('x-api-key')
    ?? req.headers.get('Authorization')?.replace(/^Bearer\s+/i, '');

  if (!apiKey) {
    return { error: 'API key required', status: 401 };
  }

  const keyRecord = await db.apiKey.findUnique({
    where: { key: hashApiKey(apiKey) },
    include: { user: { select: { id: true, plan: true } } },
  });

  if (!keyRecord || !keyRecord.active) {
    return { error: 'Invalid API key', status: 401 };
  }

  // Check tier-based rate limit (unknown plans fall back to the free tier):
  const plan = keyRecord.user.plan;
  const tier = (plan in TIER_LIMITS ? plan : 'free') as keyof typeof TIER_LIMITS;
  const { requests, window } = TIER_LIMITS[tier];

  const limiter = new Ratelimit({
    redis,
    limiter: Ratelimit.slidingWindow(requests, window),
    prefix: `rl:api-key:${tier}`,
  });

  const { success, remaining, reset } = await limiter.limit(keyRecord.id);

  if (!success) {
    return {
      error: `Rate limit exceeded. Your ${tier} plan allows ${requests} requests per hour.`,
      status: 429,
      headers: { 'X-RateLimit-Reset': reset.toString() },
    };
  }

  // Update last used timestamp asynchronously:
  db.apiKey.update({
    where: { id: keyRecord.id },
    data: { lastUsedAt: new Date() },
  }).catch(() => {}); // Fire and forget

  return { userId: keyRecord.user.id, remaining };
}
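
`validateApiKey` looks keys up by `hashApiKey(apiKey)`, but the helper itself is not shown. A minimal sketch, assuming keys are long random strings and were stored as SHA-256 hex digests, might look like this. It relies on `node:crypto`, so it is meant for the Node.js runtime, not Edge.

```typescript
import { createHash } from 'node:crypto';

// Assumed implementation: an unsalted SHA-256 digest, hex-encoded.
// A fast hash is usually considered acceptable here because API keys are
// high-entropy random strings, unlike user-chosen passwords.
function hashApiKey(apiKey: string): string {
  return createHash('sha256').update(apiKey, 'utf8').digest('hex');
}

// The same input always yields the same 64-character digest, so it can serve
// as a unique lookup column without storing the raw key.
const digest = hashApiKey('abc');
```

Whatever function is used, it has to be the same one applied when the key was issued, otherwise the `findUnique` lookup will never match.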

Pattern 5: Cost Caps for AI Features

Prevent runaway AI costs from a single user:

// lib/cost-guard.ts
const MONTHLY_COST_CAPS_USD = {
  free: 0.50,     // $0.50/month max AI cost
  pro: 10.00,     // $10/month max AI cost
  enterprise: 100.00,
};

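export async function checkCostCap(userId: string, plan: string): Promise<boolean> {
  // Start of the current month in UTC, so the cap resets at the same instant on every server:
  const now = new Date();
  const startOfMonth = new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), 1));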

  const usage = await db.aiUsage.aggregate({
    where: { userId, createdAt: { gte: startOfMonth } },
    _sum: { estimatedCostUsd: true },
  });

  const currentCost = Number(usage._sum.estimatedCostUsd ?? 0);
  const cap = MONTHLY_COST_CAPS_USD[plan as keyof typeof MONTHLY_COST_CAPS_USD]
    ?? MONTHLY_COST_CAPS_USD.free;

  return currentCost < cap;
}

// Use in AI routes (assumes `session` and the `user` record with its plan are already loaded):
const withinCap = await checkCostCap(session.user.id, user.plan);
if (!withinCap) {
  return NextResponse.json(
    {
      error: 'monthly_cost_cap',
      message: 'AI usage limit reached for this month. Upgrade for higher limits.',
    },
    { status: 402 }
  );
}
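
The cap check sums `estimatedCostUsd`, so each AI call has to record an estimate. A rough sketch of that arithmetic follows; `estimateCostUsd`, the `Pricing` type, and the prices in the example are placeholders for illustration, not real provider rates.

```typescript
// Hypothetical pricing shape: USD per one million tokens, split by direction.
type Pricing = { inputPerMillionUsd: number; outputPerMillionUsd: number };

// Estimate the cost of one call from the token counts the provider reports.
function estimateCostUsd(inputTokens: number, outputTokens: number, pricing: Pricing): number {
  return (
    (inputTokens / 1_000_000) * pricing.inputPerMillionUsd +
    (outputTokens / 1_000_000) * pricing.outputPerMillionUsd
  );
}

// Start of the month in UTC for a given instant, usable as the lower bound of the usage query.
function startOfUtcMonth(now: Date): Date {
  return new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), 1));
}

// Placeholder prices; substitute your provider's current rates.
const examplePricing: Pricing = { inputPerMillionUsd: 1, outputPerMillionUsd: 2 };
const exampleCost = estimateCostUsd(1_000_000, 500_000, examplePricing);
const monthStart = startOfUtcMonth(new Date('2026-03-15T12:34:56Z'));
```

Writing the estimate to the usage table right after each completion keeps the monthly sum current; a pre-flight estimate based on a maximum token budget would be stricter but is not covered here.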

Recommended Limits by Endpoint Type

Authentication (login, register, password reset):
  → 10 attempts per 15 minutes per IP+email
  → 50 attempts per hour per IP

Standard API (CRUD operations):
  → 100 requests per minute per user
  → 1000 requests per hour per user

AI endpoints (chat, generation, embeddings):
  → 20 requests per minute per user (per-minute burst protection)
  → 200 requests per day per user (daily abuse ceiling)
  → $10/month cost cap (prevents runaway costs)

Email sending (user-triggered):
  → 5 per hour per user
  → 20 per day per user

Webhooks (from external services like Stripe):
  → 1000 per minute per source IP (don't block legitimate webhook delivery)
  → No user-based limit (Stripe uses a small set of IPs)

Public API (developer API keys):
  → Free tier: 100 requests/hour
  → Pro tier: 1,000 requests/hour
  → Enterprise: custom

Find boilerplates with pre-built rate limiting and security at StarterPick.
