Database Migrations in Production SaaS
TL;DR
Most SaaS boilerplates run prisma migrate deploy in CI and pray. That works until it doesn't. Dropping a column while the old code is still running causes instant 500 errors. Adding a non-nullable column without a default fails outright on any table with existing rows. Renaming a column is a multi-deploy operation. This guide covers the patterns that let you migrate without downtime: expand/contract, shadow databases, and coordinated deploy/migrate sequencing. None of this is exotic — just disciplined sequencing.
Key Takeaways
- Never drop or rename columns in the same deploy that changes code — use the multi-step expand/contract pattern
- New non-nullable columns need a default; without one the migration fails on existing rows, and on Postgres < 11 even a default forces a full table rewrite under an exclusive lock
- Prisma migrate deploy vs dev: `deploy` is for production (forward-only), `dev` is for local development (creates migration files)
- Shadow database: Prisma needs a second DB to generate migration files safely — use a separate schema in Postgres
- Lock timeouts: add `lock_timeout` so a long migration fails fast instead of blocking reads
- Rollback plan: always know how to undo the last migration before deploying
The Expand/Contract Pattern
The fundamental pattern for zero-downtime schema changes:
Instead of a single risky migration:
-- ❌ DANGEROUS: Rename column in one step
ALTER TABLE users RENAME COLUMN full_name TO display_name;
-- ^ Instantly breaks any code reading `full_name` that's still deployed
Use three separate deploys:
Deploy 1: Expand (add new structure, keep old)
→ Schema: ADD COLUMN display_name
→ Code: writes to BOTH full_name AND display_name
Deploy 2: Migrate data + transition (if needed)
→ Backfill: UPDATE display_name WHERE display_name IS NULL
→ Code: reads from display_name, writes to both
Deploy 3: Contract (remove old structure)
→ Schema: DROP COLUMN full_name
→ Code: only uses display_name
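During Deploy 2, reads must tolerate rows the backfill hasn't reached yet. A minimal TypeScript sketch of that read-side switch (the row shape and function name are illustrative, not from any library):

```typescript
// Deploy 2 read path: prefer the new column, fall back to the old
// one for rows that haven't been backfilled yet.
type UserRow = { full_name: string; display_name: string | null };

function readDisplayName(row: UserRow): string {
  // Nullish coalescing: use display_name unless it is still null.
  return row.display_name ?? row.full_name;
}

readDisplayName({ full_name: "Ada", display_name: null });     // "Ada"
readDisplayName({ full_name: "Ada", display_name: "Ada L." }); // "Ada L."
```

Once the backfill is verified complete, this fallback is removed in Deploy 3 along with the old column.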
Common Migrations and How to Do Them Safely
Adding a Non-Nullable Column
-- ❌ WRONG: fails on tables with existing rows (no default)
-- prisma/migrations/xxx_add_plan.sql
ALTER TABLE users ADD COLUMN plan VARCHAR NOT NULL;
-- ✅ CORRECT: add with a default, tighten later
Step 1: Add the column with a default value:
// schema.prisma — add with default
model User {
// ...
plan String @default("free")
}
-- Generated migration is safe: on Postgres 11+ a constant default is
-- applied as metadata only, with no table rewrite and no long lock:
ALTER TABLE "users" ADD COLUMN "plan" TEXT NOT NULL DEFAULT 'free';
Step 2: Later, if you want to remove the default (to force explicit values):
-- Safe to do after all existing rows have values:
ALTER TABLE "users" ALTER COLUMN "plan" DROP DEFAULT;
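Once the default is dropped, inserts that omit plan fail at the database level. If you want the same guarantee earlier in the stack, a hypothetical application-side guard might look like this (the plan names and helper are illustrative, not part of Prisma):

```typescript
// Hypothetical guard: once the DB default is gone, every create
// payload must name a plan explicitly. Plan names are assumptions.
const PLANS = ["free", "pro", "enterprise"] as const;
type Plan = (typeof PLANS)[number];

function requirePlan(input: { plan?: string }): Plan {
  // Reject missing or unknown plans before the INSERT ever runs.
  if (!input.plan || !(PLANS as readonly string[]).includes(input.plan)) {
    throw new Error(`explicit plan required, got: ${String(input.plan)}`);
  }
  return input.plan as Plan;
}
```

Failing in application code gives a clearer error than a NOT NULL violation surfacing from the database driver.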
Renaming a Column
Deploy 1 (Expand): Add new column, write to both
Deploy 2 (Migrate): Backfill new column, switch reads to new
Deploy 3 (Contract): Drop old column
// Step 1: Add display_name alongside full_name
model User {
fullName String
displayName String? // nullable initially
}
// Application code in Step 1 — writes to both:
await db.user.update({
where: { id },
data: {
fullName: name, // old column (still being read)
displayName: name, // new column (being populated)
},
});
-- Step 2: Backfill migration (batched to keep each statement short):
DO $$
DECLARE
batch_size INTEGER := 1000;
rows_updated INTEGER;
BEGIN
LOOP
UPDATE users
SET display_name = full_name
WHERE id IN (
SELECT id FROM users
WHERE display_name IS NULL
LIMIT batch_size
);
GET DIAGNOSTICS rows_updated = ROW_COUNT;
-- Updated rows no longer match the IS NULL filter, so no OFFSET is
-- needed (an OFFSET would skip rows); stop when a batch updates nothing.
EXIT WHEN rows_updated = 0;
PERFORM pg_sleep(0.01); -- Brief pause between statements
END LOOP;
-- Caveat: a DO block runs as ONE transaction, so row locks are held
-- until it commits. For very large tables, run each batch as its own
-- transaction from a script or application code instead.
END $$;
// Step 3: Remove old column
model User {
displayName String // now required
// fullName removed
}
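Because a DO block runs as a single transaction, very large backfills are often driven from application code instead, one small transaction per batch. A sketch of that loop's logic, with an in-memory array standing in for the users table (in real code each batch would be a single UPDATE through your DB client, committed before the next one):

```typescript
// Batched backfill logic. Updated rows drop out of the
// "displayName IS NULL" filter, so the loop needs no offset:
// it repeats until a batch touches zero rows.
type Row = { id: number; fullName: string; displayName: string | null };

function backfillBatch(rows: Row[], batchSize: number): number {
  // Stands in for: UPDATE ... WHERE display_name IS NULL LIMIT batchSize
  const pending = rows.filter(r => r.displayName === null).slice(0, batchSize);
  for (const r of pending) r.displayName = r.fullName;
  return pending.length; // rows updated in this batch
}

function backfillAll(rows: Row[], batchSize = 1000): void {
  while (backfillBatch(rows, batchSize) > 0) {
    // In real code: commit here, optionally sleep to reduce contention.
  }
}
```

The termination condition is the same as the SQL version's: the work set shrinks on every batch, so the loop ends when nothing is left to update.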
Adding an Index to a Large Table
Regular index creation locks the table for writes. Use CONCURRENTLY:
-- ❌ Locks table during creation:
CREATE INDEX idx_users_email ON users(email);
-- ✅ Non-blocking (takes longer but doesn't block writes):
CREATE INDEX CONCURRENTLY idx_users_email ON users(email);
In Prisma migrations, you can't use CONCURRENTLY directly via the schema, so you need a raw SQL migration. One caveat: CREATE INDEX CONCURRENTLY cannot run inside a transaction block, and Prisma Migrate wraps each migration in a transaction on Postgres, so you may need to run this statement manually against the database (or otherwise outside a transaction):
-- prisma/migrations/xxx_add_email_index/migration.sql
-- This migration was created manually
-- CreateIndex
-- Using CONCURRENTLY to avoid locking the users table:
CREATE INDEX CONCURRENTLY "users_email_idx" ON "users"("email");
Mark it in schema.prisma after creation. Note that @unique already creates its own unique index, so use @@index only for a plain, non-unique index like this one:
model User {
email String
@@index([email])
}
Setting Up Lock Timeouts
Prevent a slow migration from blocking the entire app:
-- Add to the start of long-running migrations:
SET lock_timeout = '5s'; -- Fail fast if we can't get a lock
SET statement_timeout = '30s'; -- Fail fast if the statement takes too long
-- Now run your migration:
ALTER TABLE users ADD COLUMN last_seen_at TIMESTAMPTZ;
In Prisma, add these to the beginning of the migration file:
-- prisma/migrations/xxx/migration.sql
SET lock_timeout = '5s';
SET statement_timeout = '60s';
ALTER TABLE "users" ADD COLUMN "last_seen_at" TIMESTAMP(3);
Production Deploy Sequence
The correct order for code + schema changes:
When adding new columns/tables (backward-compatible):
1. Run migration (new column exists, old code doesn't use it yet)
2. Deploy new code (now reads/writes new column)
Migrations that only add columns are safe to run before the code deploy because the old code simply ignores columns it doesn't know about.
When removing columns/tables (requires expand/contract):
1. Deploy new code (stops reading the old column, writes to new)
2. Wait for all instances to update (rolling deploy completes)
3. Run migration (drop the old column — code no longer references it)
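If your deploy tooling scripts this sequence, the two orderings can be captured in a tiny helper (purely illustrative; the step names are placeholders for your real pipeline stages):

```typescript
// Illustrative encoding of the sequencing rule: additive changes
// migrate first; destructive changes deploy first, then migrate.
type SchemaChange = "additive" | "destructive";

function deploySequence(change: SchemaChange): string[] {
  return change === "additive"
    ? ["migrate", "deploy"]                      // old code ignores new columns
    : ["deploy", "wait-for-rollout", "migrate"]; // nothing references the column when it drops
}
```

The point is that the ordering is a property of the change, not of the pipeline: the same CI job runs both, in different orders.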
Setting Up in CI/CD (GitHub Actions + Vercel)
# .github/workflows/deploy.yml
name: Deploy
on:
push:
branches: [main]
jobs:
migrate-and-deploy:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: '20'
- name: Install dependencies
run: npm ci
- name: Run database migrations
env:
DATABASE_URL: ${{ secrets.DATABASE_URL }}
run: npx prisma migrate deploy
# migrate deploy is safe for production:
# - Only runs pending migrations
# - Never creates new migration files
# - Idempotent (safe to run multiple times)
# - Fails fast if migration is incompatible
- name: Deploy to Vercel
# Deploy runs AFTER migration — new code matches new schema
uses: amondnet/vercel-action@v25
with:
vercel-token: ${{ secrets.VERCEL_TOKEN }}
vercel-org-id: ${{ secrets.VERCEL_ORG_ID }}
vercel-project-id: ${{ secrets.VERCEL_PROJECT_ID }}
vercel-args: '--prod'
Setting Up Shadow Database
Prisma Migrate uses a shadow database when generating migration files: it replays your migration history there to detect drift and compute the diff. By default Prisma creates and drops the shadow DB automatically; if your database user can't create databases (common on hosted Postgres), point it at one you manage yourself:
# .env (development only)
DATABASE_URL="postgresql://user:pass@localhost:5432/myapp"
SHADOW_DATABASE_URL="postgresql://user:pass@localhost:5432/myapp_shadow"
-- Create shadow DB once:
CREATE DATABASE myapp_shadow;
// schema.prisma
datasource db {
provider = "postgresql"
url = env("DATABASE_URL")
shadowDatabaseUrl = env("SHADOW_DATABASE_URL") // Only for dev
}
For Supabase, you can use a separate schema instead of a separate database:
SHADOW_DATABASE_URL="postgresql://user:pass@db.xxx.supabase.co:5432/postgres?schema=shadow"
What to Do When a Migration Goes Wrong
1. Check which migrations have been applied:
$ npx prisma migrate status
2. If a migration is stuck behind a lock, find the blocking queries:
SELECT pid, query, state, wait_event_type, wait_event
FROM pg_stat_activity
WHERE state != 'idle'
ORDER BY query_start;
-- Kill a specific blocking query:
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE pid = <blocking_pid>;
3. Mark a failed migration as rolled back in Prisma:
$ npx prisma migrate resolve --rolled-back <migration_name>
4. If you fixed the schema manually and need to mark the migration as applied:
$ npx prisma migrate resolve --applied <migration_name>
Migration Checklist Before Deploying
Pre-migration:
[ ] Migration is backward-compatible with current code
[ ] New non-nullable columns have default values
[ ] Indexes use CONCURRENTLY if table has >100K rows
[ ] Lock timeout added for DDL operations
[ ] Tested in staging with production data size
Deploy sequence:
[ ] Merge code PR that still works with OLD schema
[ ] Run `prisma migrate deploy` against production DB
[ ] Verify migration completed successfully
[ ] Deploy new code to production
Post-migration:
[ ] Monitor error rates for 15 minutes post-deploy
[ ] Verify key user flows work (login, payment, core feature)
[ ] Check slow query log for new query patterns
Find SaaS boilerplates with production-ready database setup at StarterPick.