Rate Limiting and Abuse Prevention for SaaS Apps 2026
TL;DR
Every SaaS gets abused. Rate limiting is table stakes, not optional. Without it, a single malicious user can exhaust your OpenAI budget, get your email provider account suspended for spam, or overload your free tier until it is unusable for everyone else. The standard stack in 2026: Upstash Redis for distributed rate limiting (works on Edge), different limits per endpoint type (auth vs API vs AI), and Vercel's built-in DDoS protection as a base layer. A basic implementation takes roughly 30 minutes and prevents incidents that can take days to recover from.
Key Takeaways
- Upstash: serverless Redis with an HTTP (REST) API, so it works in Next.js Edge Middleware without a persistent TCP connection
- Sliding window vs fixed window: sliding window prevents burst abuse at window boundaries
- Differentiated limits: auth endpoints need tighter limits than regular API
- AI endpoints: a single call can cost 10-20x more than a regular API call, so protect them aggressively
- IP vs user ID: IP rate limiting for unauthenticated routes; user ID for authenticated routes
- Boilerplate gap: most SaaS boilerplates ship zero rate limiting — this is critical to add
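To make the window-boundary point concrete, here is a minimal in-memory sketch comparing the two strategies. Both helper functions are hypothetical and exist only for illustration. The sliding version below is the exact "log" variant, which is a simplification; the algorithm inside `@upstash/ratelimit` may differ in how it approximates the window.

```typescript
// Illustrative in-memory checks; all timestamps are in milliseconds.
// Both functions are hypothetical helpers, not part of @upstash/ratelimit.

/** Fixed window: counts requests inside the aligned window that contains `now`. */
function fixedWindowAllows(
  timestamps: number[],
  now: number,
  limit: number,
  windowMs: number,
): boolean {
  const windowStart = Math.floor(now / windowMs) * windowMs;
  const used = timestamps.filter((t) => t >= windowStart && t <= now).length;
  return used < limit;
}

/** Sliding window (log variant): counts requests in the trailing `windowMs` ending at `now`. */
function slidingWindowAllows(
  timestamps: number[],
  now: number,
  limit: number,
  windowMs: number,
): boolean {
  const used = timestamps.filter((t) => t > now - windowMs && t <= now).length;
  return used < limit;
}

// 10 requests land in the last second of the first one-minute window...
const burst = Array.from({ length: 10 }, (_, i) => 59_000 + i * 100);
// ...and the client comes back 1 second into the next window.
const now = 61_000;

// The fixed window counter restarted at 60_000, so the request is allowed.
const fixedAllows = fixedWindowAllows(burst, now, 10, 60_000);
// The sliding window still sees 10 requests in the last 60 seconds and refuses.
const slidingAllows = slidingWindowAllows(burst, now, 10, 60_000);
```

With a fixed window, a client can send close to twice the limit in a short span by straddling the boundary; the sliding check closes that gap at the cost of tracking more state.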
The Rate Limiting Stack
npm install @upstash/ratelimit @upstash/redis
# .env
UPSTASH_REDIS_REST_URL=https://xxx.upstash.io
UPSTASH_REDIS_REST_TOKEN=xxxxx
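`Redis.fromEnv()` reads these two variables, so it can help to check them once at startup and fail with a clear message. The helper below is a hypothetical sketch; it takes the environment as a plain object so it can be tested without touching `process.env`.

```typescript
/** Hypothetical helper: names of required variables that are missing or blank. */
function missingEnv(
  required: readonly string[],
  env: Record<string, string | undefined>,
): string[] {
  return required.filter((name) => !env[name]?.trim());
}

const REQUIRED_UPSTASH_VARS = [
  'UPSTASH_REDIS_REST_URL',
  'UPSTASH_REDIS_REST_TOKEN',
] as const;

// Example with a literal object; in an app you would pass process.env instead.
const missing = missingEnv(REQUIRED_UPSTASH_VARS, {
  UPSTASH_REDIS_REST_URL: 'https://xxx.upstash.io',
  UPSTASH_REDIS_REST_TOKEN: '',
});
```

Here `missing` contains only the token name, because an empty string is treated the same as an unset variable.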
Pattern 1: Global Middleware Rate Limiting
Apply a base rate limit to all requests at the Edge:
// middleware.ts
import { Ratelimit } from '@upstash/ratelimit';
import { Redis } from '@upstash/redis';
import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';
const redis = Redis.fromEnv();
// Different limiters for different threat levels:
const limiters = {
// All routes: 200 requests per minute per IP (catches bots)
global: new Ratelimit({
redis,
limiter: Ratelimit.slidingWindow(200, '1m'),
prefix: 'rl:global',
}),
// Auth endpoints: 10 attempts per 15 minutes (brute force protection)
auth: new Ratelimit({
redis,
limiter: Ratelimit.slidingWindow(10, '15m'),
prefix: 'rl:auth',
}),
// API routes: 60 requests per minute (standard API usage)
api: new Ratelimit({
redis,
limiter: Ratelimit.slidingWindow(60, '1m'),
prefix: 'rl:api',
}),
};
export async function middleware(request: NextRequest) {
const pathname = request.nextUrl.pathname;
const ip = request.headers.get('x-forwarded-for')?.split(',')[0]?.trim()
?? request.headers.get('x-real-ip')
?? 'anonymous';
// Choose limiter based on route:
let limiter = limiters.global;
let limitKey = ip;
if (pathname.startsWith('/api/auth') || pathname.startsWith('/auth')) {
limiter = limiters.auth;
} else if (pathname.startsWith('/api/')) {
limiter = limiters.api;
}
const { success, limit, remaining, reset } = await limiter.limit(limitKey);
if (!success) {
return new NextResponse(
JSON.stringify({ error: 'Too many requests. Please slow down.' }),
{
status: 429,
headers: {
'Content-Type': 'application/json',
'X-RateLimit-Limit': limit.toString(),
'X-RateLimit-Remaining': remaining.toString(),
'X-RateLimit-Reset': reset.toString(),
'Retry-After': Math.ceil((reset - Date.now()) / 1000).toString(),
},
}
);
}
// Expose rate limit info to the client on allowed requests:
const response = NextResponse.next();
response.headers.set('X-RateLimit-Limit', limit.toString());
response.headers.set('X-RateLimit-Remaining', remaining.toString());
return response;
}
export const config = {
matcher: [
'/((?!_next/static|_next/image|favicon.ico|.*\\.png$).*)',
],
};
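The two decisions the middleware makes (which IP to key on, which limiter to use) are easier to test when pulled out as pure functions. The sketch below uses hypothetical names (`getClientIp`, `pickLimiterName`) and assumes the same header and path conventions as the middleware above.

```typescript
// Minimal header reader; the standard Headers class satisfies this shape.
type HeaderReader = { get(name: string): string | null };

/** First non-empty x-forwarded-for entry, then x-real-ip, else 'anonymous'. */
function getClientIp(headers: HeaderReader): string {
  const first = headers.get('x-forwarded-for')?.split(',')[0]?.trim();
  if (first) return first;
  const realIp = headers.get('x-real-ip')?.trim();
  return realIp || 'anonymous';
}

type LimiterName = 'auth' | 'api' | 'global';

/** Mirrors the route-to-limiter branching as a pure function. */
function pickLimiterName(pathname: string): LimiterName {
  if (pathname.startsWith('/api/auth') || pathname.startsWith('/auth')) return 'auth';
  if (pathname.startsWith('/api/')) return 'api';
  return 'global';
}

// Small stand-in for Headers so the example runs anywhere.
const fakeHeaders = (values: Record<string, string>): HeaderReader => ({
  get: (name) => values[name] ?? null,
});

const ipFromChain = getClientIp(fakeHeaders({ 'x-forwarded-for': '203.0.113.7, 10.0.0.1' }));
const ipFromFallback = getClientIp(fakeHeaders({ 'x-forwarded-for': '', 'x-real-ip': '198.51.100.2' }));
```

One design note: this version treats an empty `x-forwarded-for` header as absent, so an empty string never becomes the rate limit key. Keep in mind that `x-forwarded-for` is only trustworthy when your platform or proxy sets it.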
Pattern 2: Per-User Rate Limiting on Authenticated Routes
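For logged-in users, rate limit by user ID. It is more precise than IP, since many users can share one IP (offices, mobile carriers) and one user can switch between IPs: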
// lib/rate-limit.ts — reusable helper
import { Ratelimit } from '@upstash/ratelimit';
import { Redis } from '@upstash/redis';
import { NextResponse } from 'next/server';
const redis = Redis.fromEnv();
// Configurable limiters:
const createLimiter = (limit: number, window: string, prefix: string) =>
new Ratelimit({
redis,
limiter: Ratelimit.slidingWindow(limit, window as `${number} ${'ms' | 's' | 'm' | 'h' | 'd'}`),
prefix: `rl:${prefix}`,
});
const LIMITERS = {
// Standard user actions: 100/minute
default: createLimiter(100, '1 m', 'default'),
// AI chat: 20/minute (expensive endpoint)
aiChat: createLimiter(20, '1 m', 'ai-chat'),
// AI daily cap: 200/day (abuse ceiling)
aiChatDaily: createLimiter(200, '24 h', 'ai-chat-daily'),
// Email sending: 5/hour (prevent email spam)
email: createLimiter(5, '1 h', 'email'),
// Webhooks from Stripe: 1000/minute (don't block legitimate webhooks)
webhook: createLimiter(1000, '1 m', 'webhook'),
} as const;
type LimiterKey = keyof typeof LIMITERS;
export async function checkRateLimit(
identifier: string,
limiter: LimiterKey = 'default'
): Promise<{ success: boolean; remaining: number; reset: number }> {
return LIMITERS[limiter].limit(identifier);
}
export function rateLimitResponse(reset: number) {
return NextResponse.json(
{ error: 'Rate limit exceeded. Please wait before retrying.' },
{
status: 429,
headers: {
'Retry-After': Math.ceil((reset - Date.now()) / 1000).toString(),
},
}
);
}
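// app/api/ai/chat/route.ts — protect expensive AI endpoint:
import { NextResponse } from 'next/server';
import { auth } from '@/auth'; // assumed location of your auth helper (e.g. Auth.js)
import { checkRateLimit, rateLimitResponse } from '@/lib/rate-limit';
export async function POST(req: Request) {
const session = await auth();
if (!session?.user) return new Response('Unauthorized', { status: 401 });
// Check both per-minute and per-day limits:
const [minuteLimit, dailyLimit] = await Promise.all([
checkRateLimit(session.user.id, 'aiChat'),
checkRateLimit(session.user.id, 'aiChatDaily'),
]);
if (!minuteLimit.success) {
return rateLimitResponse(minuteLimit.reset);
}
if (!dailyLimit.success) {
// The daily cap is a sliding 24-hour window, so it does not reset at midnight;
// Retry-After tells the client when capacity frees up.
return NextResponse.json(
{ error: 'Daily AI limit reached. Please try again later.' },
{
status: 429,
headers: {
'Retry-After': Math.ceil((dailyLimit.reset - Date.now()) / 1000).toString(),
},
}
);
}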
// Proceed with AI call...
}
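When a route checks several limits at once, the response should usually reflect the one that blocks the longest. The sketch below shows one way to pick it; `blockingResult` and `retryAfterSeconds` are hypothetical helper names, and `reset` is assumed to be a Unix timestamp in milliseconds, as in the snippets above.

```typescript
type LimitResult = { success: boolean; remaining: number; reset: number };

/**
 * Hypothetical helper: of the failed checks, return the one whose window
 * resets last, or null when every check passed.
 */
function blockingResult(results: LimitResult[]): LimitResult | null {
  const failed = results.filter((r) => !r.success);
  if (failed.length === 0) return null;
  return failed.reduce((latest, r) => (r.reset > latest.reset ? r : latest));
}

/** Seconds until `reset`, never less than 1 so clients do not retry immediately. */
function retryAfterSeconds(reset: number, now: number = Date.now()): number {
  return Math.max(1, Math.ceil((reset - now) / 1000));
}

// Example: both the per-minute and the daily limit are exhausted.
const minute: LimitResult = { success: false, remaining: 0, reset: 61_000 };
const daily: LimitResult = { success: false, remaining: 0, reset: 3_600_000 };
const blocking = blockingResult([minute, daily]);
```

In this example `blocking` is the daily result, since retrying after the minute window would only hit the daily cap again.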
Pattern 3: Brute Force Protection for Login
// app/api/auth/sign-in/route.ts (or Server Action)
import { Ratelimit } from '@upstash/ratelimit';
import { Redis } from '@upstash/redis';
import { NextResponse } from 'next/server';
import bcrypt from 'bcrypt'; // or 'bcryptjs'
import { db } from '@/lib/db'; // assumed path to your database client
const loginLimiter = new Ratelimit({
redis: Redis.fromEnv(),
// 5 attempts per 15 minutes per IP+email combo:
limiter: Ratelimit.slidingWindow(5, '15 m'),
prefix: 'rl:login',
});
export async function POST(req: Request) {
const { email, password } = await req.json();
// x-forwarded-for can hold a comma-separated chain; use the first entry:
const ip = req.headers.get('x-forwarded-for')?.split(',')[0]?.trim() || 'unknown';
// Key combines IP + email, which caps repeated guesses against one account from one IP.
// An attacker who rotates IPs or emails gets a fresh key each time, so keep the
// per-IP auth limit from Pattern 1 in place as well:
const key = `${ip}:${email.toLowerCase()}`;
// limit() counts every attempt, successful or not:
const { success } = await loginLimiter.limit(key);
if (!success) {
// Don't leak that rate limiting was triggered; return the same error as a bad password:
return NextResponse.json(
{ error: 'Invalid email or password' },
{ status: 401 }
);
}
const user = await db.user.findUnique({ where: { email } });
const valid = user && await bcrypt.compare(password, user.passwordHash ?? '');
if (!valid) {
// This attempt already consumed a rate limit slot in the limit() call above:
return NextResponse.json({ error: 'Invalid email or password' }, { status: 401 });
}
// Success — create session
}
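If you would rather count only failed logins, so that a user who types the right password is never penalized, the check and the increment have to be separate steps. The class below is a hypothetical in-memory sketch of that flow; a production version would keep the counters in Redis so they are shared across instances.

```typescript
/** Hypothetical in-memory tracker that counts only failed logins per key. */
class FailedLoginTracker {
  private readonly failures = new Map<string, number[]>();
  private readonly maxFailures: number;
  private readonly windowMs: number;

  constructor(maxFailures: number, windowMs: number) {
    this.maxFailures = maxFailures;
    this.windowMs = windowMs;
  }

  /** True when the key has reached maxFailures within the trailing window. */
  isBlocked(key: string, now: number): boolean {
    return this.recent(key, now).length >= this.maxFailures;
  }

  /** Record one failed attempt at time `now` (milliseconds). */
  recordFailure(key: string, now: number): void {
    this.failures.set(key, [...this.recent(key, now), now]);
  }

  /** Clear the counter after a successful login. */
  recordSuccess(key: string): void {
    this.failures.delete(key);
  }

  /** Failure timestamps that still fall inside the trailing window. */
  private recent(key: string, now: number): number[] {
    return (this.failures.get(key) ?? []).filter((t) => t > now - this.windowMs);
  }
}

// Usage: 5 failures per 15 minutes for one IP+email key.
const tracker = new FailedLoginTracker(5, 15 * 60_000);
const loginKey = '203.0.113.7:user@example.com';
for (let i = 0; i < 5; i++) tracker.recordFailure(loginKey, 1_000 + i);

const blockedNow = tracker.isBlocked(loginKey, 2_000);
const blockedLater = tracker.isBlocked(loginKey, 1_000 + 15 * 60_000 + 10);
```

The route would call `isBlocked` before verifying the password, `recordFailure` when verification fails, and `recordSuccess` when it passes.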
Pattern 4: API Key Rate Limiting
For apps that issue API keys to users:
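// lib/api-key-auth.ts
import { Ratelimit } from '@upstash/ratelimit';
import { Redis } from '@upstash/redis';
import { db } from '@/lib/db'; // assumed path to your database client
import { hashApiKey } from '@/lib/api-key-hash'; // assumed helper that hashes keys before lookup
const redis = Redis.fromEnv();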
// Tier-based limits:
const TIER_LIMITS = {
free: { requests: 100, window: '1 h' },
pro: { requests: 1000, window: '1 h' },
enterprise: { requests: 100000, window: '1 h' },
} as const;
export async function validateApiKey(req: Request) {
const apiKey = req.headers.get('x-api-key')
?? req.headers.get('Authorization')?.replace('Bearer ', '');
if (!apiKey) {
return { error: 'API key required', status: 401 };
}
const keyRecord = await db.apiKey.findUnique({
where: { key: hashApiKey(apiKey) },
include: { user: { select: { id: true, plan: true } } },
});
if (!keyRecord || !keyRecord.active) {
return { error: 'Invalid API key', status: 401 };
}
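// Check tier-based rate limit (unknown plans fall back to the free tier):
const plan = keyRecord.user.plan;
const tier = (plan in TIER_LIMITS ? plan : 'free') as keyof typeof TIER_LIMITS;
const { requests, window } = TIER_LIMITS[tier];
// Note: this builds a new Ratelimit on every request; consider creating one per tier at module scope.
const limiter = new Ratelimit({
redis,
limiter: Ratelimit.slidingWindow(requests, window),
prefix: `rl:api-key:${tier}`,
});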
const { success, remaining, reset } = await limiter.limit(keyRecord.id);
if (!success) {
return {
error: `Rate limit exceeded. Your ${tier} plan allows ${requests} requests per hour.`,
status: 429,
headers: { 'X-RateLimit-Reset': reset.toString() },
};
}
// Update last used timestamp asynchronously:
db.apiKey.update({
where: { id: keyRecord.id },
data: { lastUsedAt: new Date() },
}).catch(() => {}); // Fire and forget
return { userId: keyRecord.user.id, remaining };
}
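Two details in this pattern are worth isolating: mapping an arbitrary plan string to a known tier, and building each tier's limiter once instead of on every request. The sketch below uses hypothetical helpers (`resolveTier`, `memoizeByKey`) and a stand-in factory in place of `new Ratelimit(...)` so it runs without any dependencies.

```typescript
const TIERS = ['free', 'pro', 'enterprise'] as const;
type Tier = (typeof TIERS)[number];

/** Hypothetical helper: map any plan string to a known tier, defaulting to 'free'. */
function resolveTier(plan: string | null | undefined): Tier {
  return (TIERS as readonly string[]).includes(plan ?? '') ? (plan as Tier) : 'free';
}

/** Hypothetical helper: build one value per key lazily and reuse it afterwards. */
function memoizeByKey<K, V>(factory: (key: K) => V): (key: K) => V {
  const cache = new Map<K, V>();
  return (key) => {
    if (!cache.has(key)) cache.set(key, factory(key));
    return cache.get(key) as V;
  };
}

// Stand-in factory that counts constructions. In the app this would return
// a Ratelimit instance configured from TIER_LIMITS[tier].
let constructed = 0;
const limiterFor = memoizeByKey((tier: Tier) => {
  constructed += 1;
  return { tier };
});

limiterFor(resolveTier('pro'));
limiterFor(resolveTier('pro')); // reuses the cached 'pro' limiter
limiterFor(resolveTier('platinum')); // unknown plan falls back to 'free'
```

After these three calls only two limiters have been constructed, one for `pro` and one for `free`.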
Pattern 5: Cost Caps for AI Features
Prevent runaway AI costs from a single user:
// lib/cost-guard.ts
const MONTHLY_COST_CAPS_USD = {
free: 0.50, // $0.50/month max AI cost
pro: 10.00, // $10/month max AI cost
enterprise: 100.00,
};
export async function checkCostCap(userId: string, plan: string): Promise<boolean> {
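const now = new Date();
// Use UTC so the billing month does not depend on the server's time zone:
const startOfMonth = new Date(
Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), 1)
);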
const usage = await db.aiUsage.aggregate({
where: { userId, createdAt: { gte: startOfMonth } },
_sum: { estimatedCostUsd: true },
});
const currentCost = Number(usage._sum.estimatedCostUsd ?? 0);
const cap = MONTHLY_COST_CAPS_USD[plan as keyof typeof MONTHLY_COST_CAPS_USD]
?? MONTHLY_COST_CAPS_USD.free;
return currentCost < cap;
}
// Use in AI routes:
const withinCap = await checkCostCap(session.user.id, user.plan);
if (!withinCap) {
return NextResponse.json(
{
error: 'monthly_cost_cap',
message: 'AI usage limit reached for this month. Upgrade for higher limits.',
},
{ status: 402 }
);
}
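The cost guard depends on two small calculations: where the current month starts, and how each request's `estimatedCostUsd` is derived. A minimal sketch of both follows. The function names are hypothetical, and the prices are caller-supplied placeholders, not real provider rates.

```typescript
/** First instant of the month containing `now`, in UTC. */
function startOfMonthUtc(now: Date): Date {
  return new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), 1));
}

/**
 * Hypothetical cost estimate. Prices are USD per one million tokens and must
 * be supplied by the caller from the provider's current price list.
 */
function estimateCostUsd(
  usage: { inputTokens: number; outputTokens: number },
  price: { inputPerMillion: number; outputPerMillion: number },
): number {
  return (
    (usage.inputTokens / 1_000_000) * price.inputPerMillion +
    (usage.outputTokens / 1_000_000) * price.outputPerMillion
  );
}

// Example with placeholder prices of $2 (input) and $10 (output) per million tokens.
const monthStart = startOfMonthUtc(new Date('2026-03-15T12:00:00Z'));
const cost = estimateCostUsd(
  { inputTokens: 500_000, outputTokens: 100_000 },
  { inputPerMillion: 2, outputPerMillion: 10 },
);
```

Storing the estimate alongside each request at write time keeps the monthly aggregate query simple, even if prices change later.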
Recommended Limits by Endpoint Type
Authentication (login, register, password reset):
→ 10 attempts per 15 minutes per IP+email
→ 50 attempts per hour per IP
Standard API (CRUD operations):
→ 100 requests per minute per user
→ 1000 requests per hour per user
AI endpoints (chat, generation, embeddings):
→ 20 requests per minute per user (per-minute burst protection)
→ 200 requests per day per user (daily abuse ceiling)
→ $10/month cost cap (prevents runaway costs)
Email sending (user-triggered):
→ 5 per hour per user
→ 20 per day per user
Webhooks (from external services like Stripe):
→ 1000 per minute per source IP (don't block legitimate webhook delivery)
→ No user-based limit (Stripe uses a small set of IPs)
Public API (developer API keys):
→ Free tier: 100 requests/hour
→ Pro tier: 1,000 requests/hour
→ Enterprise: custom
Find boilerplates with pre-built rate limiting and security at StarterPick.