
Best Boilerplates with RAG (Retrieval-Augmented Generation) Built-In 2026

StarterPick Team
Tags: rag · ai · pgvector · nextjs · vector-database · boilerplate · 2026

RAG Is the Architecture Behind Every Useful AI Product

Retrieval-Augmented Generation (RAG) is the technique that makes AI products actually useful: instead of relying on an LLM's training data, you retrieve relevant context from your own data sources and inject it into the prompt.

A document Q&A chatbot? RAG. A customer support bot grounded in your docs? RAG. A knowledge base with semantic search? RAG.

In 2026, RAG has moved from research technique to production standard. The boilerplates that include it out of the box — or make it easy to add — give you a meaningful head start.

TL;DR

Best boilerplates for RAG in 2026:

  1. Vercel AI SDK + pgvector — The most common stack. Supabase or Neon provides pgvector. Vercel AI SDK handles embedding and retrieval.
  2. OpenSaaS + RAG pattern — Add RAG to OpenSaaS's Wasp foundation. The most complete free base.
  3. Makerkit + AI Plugin — Enterprise-grade SaaS boilerplate with AI plugin including RAG patterns.
  4. LangChain.js starter templates — More complex orchestration for multi-step RAG pipelines.
  5. Custom: Next.js + Supabase + pgvector — Roll your own with well-documented patterns.

What RAG Requires

A production RAG system has four components:

| Component | Purpose | Common tools |
| --- | --- | --- |
| Embedding model | Convert text to vectors | OpenAI text-embedding-3, Voyage AI, Cohere |
| Vector store | Store and search vectors | pgvector, Pinecone, Weaviate, Qdrant |
| Retrieval | Find relevant chunks | Cosine similarity, hybrid search |
| Generation | LLM answers using retrieved context | OpenAI, Anthropic, Gemini |

The simplest stack: OpenAI embeddings + pgvector (in Supabase/Neon) + Vercel AI SDK for generation.

Stack Options

pgvector (PostgreSQL)

The simplest approach: add the pgvector extension to your existing PostgreSQL database. Available in Supabase and Neon with zero additional infrastructure.

-- Enable pgvector in Supabase/Neon:
CREATE EXTENSION IF NOT EXISTS vector;

-- Store document chunks with embeddings:
CREATE TABLE documents (
  id BIGSERIAL PRIMARY KEY,
  content TEXT NOT NULL,
  metadata JSONB,
  embedding VECTOR(1536)  -- OpenAI text-embedding-3-small dimension
);

-- ANN index so similarity search stays fast as the table grows (pgvector >= 0.5):
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

-- Semantic search function:
CREATE OR REPLACE FUNCTION match_documents(
  query_embedding VECTOR(1536),
  match_count INT DEFAULT 5
)
RETURNS TABLE(id BIGINT, content TEXT, metadata JSONB, similarity FLOAT)
LANGUAGE SQL STABLE AS $$
  SELECT documents.id, documents.content, documents.metadata,
    1 - (documents.embedding <=> query_embedding) AS similarity
  FROM documents
  WHERE 1 - (documents.embedding <=> query_embedding) > 0.5
  ORDER BY documents.embedding <=> query_embedding
  LIMIT match_count;
$$;
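The <=> operator above is pgvector's cosine-distance operator, so 1 - (embedding <=> query_embedding) is cosine similarity. As a standalone sketch of what that score means (plain TypeScript, not tied to any library):

```typescript
// Cosine similarity between two equal-length vectors: dot(a, b) / (|a| * |b|).
// This mirrors the score match_documents returns: 1 means identical direction,
// 0 means orthogonal (unrelated) vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Identical vectors score 1 and orthogonal vectors score 0, which is why the 0.5 threshold in match_documents acts as a floor that filters out weak matches.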

Dedicated Vector Databases

For large-scale RAG with millions of vectors:

| Database | Free tier | Best for |
| --- | --- | --- |
| Pinecone | Yes (Starter) | Simplest API, fully managed |
| Weaviate | Yes (self-hosted) | Hybrid search, multi-modal |
| Qdrant | Yes (cloud) | Performance, self-hosting |
| pgvector | Yes (via Supabase/Neon) | Simplest infra (same DB as app data) |

The RAG Implementation Pattern

Step 1: Ingest Documents

// lib/ingest.ts
import { openai } from '@ai-sdk/openai';
import { embed } from 'ai';
import { supabase } from '@/lib/supabase';

// Split a document into fixed-size chunks. Note: chunkSize and overlap are
// measured in characters here; token-based splitting is more precise but
// requires a tokenizer.
function chunkText(text: string, chunkSize = 500, overlap = 50): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += chunkSize - overlap) {
    chunks.push(text.slice(i, i + chunkSize));
  }
  return chunks;
}

export async function ingestDocument(text: string, metadata: object) {
  const chunks = chunkText(text);

  for (const chunk of chunks) {
    const { embedding } = await embed({
      model: openai.embedding('text-embedding-3-small'),
      value: chunk,
    });

    const { error } = await supabase.from('documents').insert({
      content: chunk,
      metadata,
      embedding,
    });
    if (error) throw error;
  }
}
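To see how the chunking parameters interact, here is the same character-based splitter exercised on its own; the 1200-character input is purely illustrative:

```typescript
// Same character-based splitter as in lib/ingest.ts, reproduced standalone.
function chunkText(text: string, chunkSize = 500, overlap = 50): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += chunkSize - overlap) {
    chunks.push(text.slice(i, i + chunkSize));
  }
  return chunks;
}

// With the defaults, a 1200-character document yields 3 chunks covering
// [0, 500), [450, 950), [900, 1200) - each consecutive pair shares 50
// characters, so a sentence cut at a boundary still appears whole somewhere.
const chunks = chunkText('a'.repeat(1200));
```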

Step 2: Retrieve Relevant Chunks

// lib/retrieve.ts
import { embed } from 'ai';
import { openai } from '@ai-sdk/openai';
import { supabase } from '@/lib/supabase';

export async function retrieveContext(query: string, topK = 5) {
  const { embedding } = await embed({
    model: openai.embedding('text-embedding-3-small'),
    value: query,
  });

  const { data: documents, error } = await supabase.rpc('match_documents', {
    query_embedding: embedding,
    match_count: topK,
  });
  if (error) throw error;

  return documents?.map((d: { content: string }) => d.content).join('\n\n') ?? '';
}

Step 3: Generate with Context

// app/api/chat/route.ts
import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { retrieveContext } from '@/lib/retrieve';

export async function POST(req: Request) {
  const { messages } = await req.json();
  const userQuery = messages[messages.length - 1].content;

  const context = await retrieveContext(userQuery);

  const result = await streamText({
    model: openai('gpt-4o'),
    system: `You are a helpful assistant. Use the following context to answer the user's question:

${context}

If the context doesn't contain relevant information, say so.`,
    messages,
  });

  return result.toDataStreamResponse();
}

Boilerplate Evaluations

Vercel AI SDK + pgvector

The Vercel AI SDK's embed function handles embedding generation. Supabase provides pgvector. Together they form the simplest RAG stack for Next.js:

# Enable pgvector in Supabase:
# Dashboard → SQL Editor → Run: CREATE EXTENSION vector;

# Install deps:
npm install ai @ai-sdk/openai @supabase/supabase-js

No dedicated boilerplate exists for this — but the Supabase RAG quickstart and Vercel AI SDK docs together provide a complete guide.

OpenSaaS + RAG

OpenSaaS provides the SaaS foundation (auth, billing, admin). Add pgvector via Supabase (which OpenSaaS supports) for the RAG layer.

The combination gives you a complete AI SaaS with RAG capabilities without paying for a commercial boilerplate.

Makerkit AI Plugin

Makerkit's paid plugin marketplace includes an AI template with document Q&A patterns. If you are already using Makerkit ($299), the AI plugin extends it with:

  • Document upload and processing
  • Embedding generation
  • Semantic search over uploaded documents
  • Chat interface with document context

LangChain.js Starters

For complex RAG pipelines — multiple sources, re-ranking, query transformation — LangChain.js provides orchestration:

import { ChatOpenAI, OpenAIEmbeddings } from '@langchain/openai';
import { SupabaseVectorStore } from '@langchain/community/vectorstores/supabase';
import { createClient } from '@supabase/supabase-js';

const supabaseClient = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);

const embeddings = new OpenAIEmbeddings();
const vectorStore = await SupabaseVectorStore.fromExistingIndex(embeddings, {
  client: supabaseClient,
  tableName: 'documents',
  queryName: 'match_documents',
});

const retriever = vectorStore.asRetriever({ k: 5 });

LangChain adds complexity but enables advanced RAG patterns like:

  • Query transformation (HyDE, multi-query)
  • Re-ranking (Cohere rerank)
  • Multi-document summarization
  • Hybrid search (dense + sparse)

Recommended Stacks by Use Case

| Use case | Stack |
| --- | --- |
| Document Q&A | Next.js + Supabase pgvector + Vercel AI SDK |
| Knowledge base | Next.js + Supabase pgvector + Postgres full-text search (hybrid) |
| Multi-source RAG | LangChain.js + Pinecone |
| Product search | pgvector with hybrid search (vector + full-text) |
| Customer support bot | OpenSaaS + pgvector |

Performance Considerations

  • Chunk size matters. 500-1000 tokens per chunk is typical. Smaller chunks improve precision; larger chunks improve recall.
  • Overlap prevents gaps. 50-100 token overlap between chunks ensures sentences at boundaries are captured.
  • Hybrid search beats pure vector search. Combining pgvector similarity with PostgreSQL full-text search improves results significantly.
  • Reranking improves quality. After retrieval, a reranker (e.g., Cohere Rerank or ColBERT) reorders results so the most relevant chunks reach the LLM's context first.
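One common way to merge the vector and full-text result lists mentioned above is Reciprocal Rank Fusion (RRF). A minimal sketch, assuming each retriever returns an ordered list of document ids:

```typescript
// Reciprocal Rank Fusion: each result list contributes 1 / (k + rank) per
// document; summing across lists favors documents that rank well in both
// retrievers. k = 60 is the conventional smoothing constant.
function reciprocalRankFusion(resultLists: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const list of resultLists) {
    list.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```

Fusing a vector ranking ['a', 'b', 'c'] with a full-text ranking ['b', 'd', 'a'] puts 'b' and 'a' at the top, since both appear in both lists.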

Methodology

Based on publicly available information from Vercel AI SDK documentation, Supabase RAG guides, LangChain.js documentation, and community resources as of March 2026.


Building a RAG application? StarterPick helps you find the right SaaS foundation to build on top of.
