RAG Pipeline in Minutes
This tutorial builds a minimal but complete RAG pipeline:
- Read a folder of Markdown files
- Split each file into overlapping chunks
- Push chunks into RushDB (with source metadata)
- Create an embedding index on the chunk text
- Retrieve the top-K relevant chunks for a query
- Pass them to an LLM as context
Prerequisites: a running RushDB instance with RUSHDB_EMBEDDING_MODEL configured, or RushDB Cloud with AI enabled.
The docs folder
For this example, assume you have a local folder ./docs with a few Markdown files:
docs/
  architecture.md
  api-reference.md
  deployment.md
Each file is a few hundred lines. The chunking step below splits them into overlapping windows so no context is lost at boundaries.
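With CHUNK_SIZE = 400 and CHUNK_OVERLAP = 80 (the values used below), each window starts 320 characters after the previous one, so the final 80 characters of one chunk reappear at the start of the next; a sentence that straddles a boundary therefore survives intact in at least one chunk.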
Step 1: Chunk and ingest
- Python
- TypeScript
- shell
import os
import re

from rushdb import RushDB

db = RushDB("RUSHDB_API_KEY", base_url="https://api.rushdb.com/api/v1")

DOCS_DIR = "./docs"
CHUNK_SIZE = 400    # characters
CHUNK_OVERLAP = 80  # characters

def chunk_text(text: str, size: int, overlap: int) -> list[str]:
    """Slide a fixed-size window over the text, stepping size - overlap each time."""
    chunks, start = [], 0
    while start < len(text):
        end = min(start + size, len(text))
        chunks.append(text[start:end].strip())
        if end == len(text):
            break  # stop here so the tail isn't emitted twice
        start += size - overlap
    return [c for c in chunks if c]

records = []
for filename in os.listdir(DOCS_DIR):
    if not filename.endswith(".md"):
        continue
    path = os.path.join(DOCS_DIR, filename)
    with open(path) as f:
        content = f.read()
    # Extract first heading as title
    match = re.search(r"^#\s+(.+)", content, re.MULTILINE)
    title = match.group(1) if match else filename
    for i, chunk in enumerate(chunk_text(content, CHUNK_SIZE, CHUNK_OVERLAP)):
        records.append({
            "source": filename,
            "title": title,
            "chunk_index": i,
            "text": chunk,
        })

db.records.import_json({"label": "CHUNK", "data": records})
print(f"Ingested {len(records)} chunks from {DOCS_DIR}")
import RushDB from '@rushdb/javascript-sdk'
import fs from 'fs'
import path from 'path'

const db = new RushDB('RUSHDB_API_KEY')

const DOCS_DIR = './docs'
const CHUNK_SIZE = 400
const CHUNK_OVERLAP = 80

// Slide a fixed-size window over the text, stepping size - overlap each time
function chunkText(text: string, size: number, overlap: number): string[] {
  const chunks: string[] = []
  let start = 0
  while (start < text.length) {
    const end = Math.min(start + size, text.length)
    const chunk = text.slice(start, end).trim()
    if (chunk) chunks.push(chunk)
    if (end === text.length) break // stop here so the tail isn't emitted twice
    start += size - overlap
  }
  return chunks
}

const records: object[] = []
for (const filename of fs.readdirSync(DOCS_DIR)) {
  if (!filename.endsWith('.md')) continue
  const content = fs.readFileSync(path.join(DOCS_DIR, filename), 'utf-8')
  // Extract first heading as title
  const titleMatch = content.match(/^#\s+(.+)/m)
  const title = titleMatch?.[1] ?? filename
  chunkText(content, CHUNK_SIZE, CHUNK_OVERLAP).forEach((text, i) => {
    records.push({ source: filename, title, chunk_index: i, text })
  })
}

await db.records.importJson({ label: 'CHUNK', data: records })
console.log(`Ingested ${records.length} chunks from ${DOCS_DIR}`)
Assemble your chunks in any language, then POST them in a single batch:
curl -X POST https://api.rushdb.com/api/v1/records/import/json \
  -H "Authorization: Bearer $RUSHDB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "label": "CHUNK",
    "data": [
      {
        "source": "architecture.md",
        "title": "Architecture Overview",
        "chunk_index": 0,
        "text": "RushDB stores every record as a node in a property graph..."
      },
      {
        "source": "architecture.md",
        "title": "Architecture Overview",
        "chunk_index": 1,
        "text": "Relationships between nested objects are created automatically..."
      },
      {
        "source": "deployment.md",
        "title": "Deployment Guide",
        "chunk_index": 0,
        "text": "You can run RushDB with Docker using the official image..."
      }
    ]
  }'
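Before moving on, it can be worth a quick sanity check that the chunks landed. A minimal Python sketch, assuming the SDK exposes a records.find() query helper that accepts the same labels shape as ai.search below; check your SDK version for the exact return type:

# Fetch one CHUNK record back as a smoke test (records.find() shape assumed)
found = db.records.find({"labels": ["CHUNK"], "limit": 1})
print(found)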
Step 2: Create an embedding index
Index the text property so RushDB can run semantic search against it.
- Python
- TypeScript
- shell
db.ai.indexes.create({
    "label": "CHUNK",
    "propertyName": "text"
})
print("Embedding index created; RushDB is backfilling in the background")
await db.ai.indexes.create({
  label: 'CHUNK',
  propertyName: 'text'
})
console.log('Embedding index created; backfilling in background')
curl -X POST https://api.rushdb.com/api/v1/ai/indexes \
  -H "Authorization: Bearer $RUSHDB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "label": "CHUNK",
    "propertyName": "text"
  }'
Backfilling runs asynchronously. Poll GET /api/v1/ai/indexes (or db.ai.indexes.find()) and wait until status is ready before running searches. For small corpora this usually takes under a minute.
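A minimal polling sketch in Python; the find() call follows the SDK helper named above, but the status field and its ready value are assumptions to verify against a real response:

import time

# Wait for the CHUNK/text index to finish backfilling
# ("status"/"ready" field names are assumed; inspect a real response first)
while True:
    indexes = db.ai.indexes.find()
    # Depending on SDK version the list may sit under a "data" key
    items = indexes["data"] if isinstance(indexes, dict) else indexes
    target = next(
        (ix for ix in items if ix.get("label") == "CHUNK" and ix.get("propertyName") == "text"),
        None,
    )
    if target and target.get("status") == "ready":
        break
    time.sleep(2)
print("Embedding index is ready")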
Step 3: Retrieve relevant chunks
- Python
- TypeScript
- shell
query = "How does RushDB handle nested JSON objects?"

result = db.ai.search({
    "propertyName": "text",
    "query": query,
    "labels": ["CHUNK"],
    "limit": 5
})

context_chunks = [r["text"] for r in result["data"]]
print(f"Retrieved {len(context_chunks)} chunks for: {query!r}")
for i, r in enumerate(result["data"]):
    print(f"\n[{i+1}] (score {r.get('__score', 0):.3f})")
    print(r["text"][:200] + "…")
const query = 'How does RushDB handle nested JSON objects?'

const { data: chunks } = await db.ai.search({
  propertyName: 'text',
  query,
  labels: ['CHUNK'],
  limit: 5
})

console.log(`Retrieved ${chunks.length} chunks for: "${query}"`)
chunks.forEach((chunk, i) => {
  console.log(`\n[${i + 1}] score: ${chunk.__score.toFixed(3)}`)
  console.log(chunk.text.slice(0, 200) + '…')
})
# Capture the response; Step 4 reads the chunks back out of $CHUNKS
CHUNKS=$(curl -s -X POST https://api.rushdb.com/api/v1/ai/search \
  -H "Authorization: Bearer $RUSHDB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "propertyName": "text",
    "query": "How does RushDB handle nested JSON objects?",
    "labels": ["CHUNK"],
    "limit": 5
  }')
echo "$CHUNKS" | jq '.data | length'
Step 4: Generate an answer
Pass the retrieved chunks as context to any LLM. Example using the OpenAI SDK:
- Python
- TypeScript
- shell
from openai import OpenAI

client = OpenAI()

context = "\n\n---\n\n".join(context_chunks)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "system",
            "content": "Answer using only the context provided. Be concise."
        },
        {
            "role": "user",
            "content": f"Context:\n{context}\n\nQuestion: {query}"
        }
    ]
)
print(response.choices[0].message.content)
import OpenAI from 'openai'

const client = new OpenAI()

// Join the text of the chunks retrieved in Step 3
const context = chunks.map((c) => c.text).join('\n\n---\n\n')

const response = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [
    {
      role: 'system',
      content: 'Answer using only the context provided. Be concise.'
    },
    {
      role: 'user',
      content: `Context:\n${context}\n\nQuestion: ${query}`
    }
  ]
})
console.log(response.choices[0].message.content)
# $CHUNKS holds the Step 3 response; set QUERY to the same question text
QUERY="How does RushDB handle nested JSON objects?"
CONTEXT=$(echo "$CHUNKS" | jq -r '[.data[].text] | join("\n\n---\n\n")')

# Build the payload with jq so quotes and newlines in $CONTEXT stay valid JSON
curl -s https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d "$(jq -n --arg ctx "$CONTEXT" --arg q "$QUERY" '{
    model: "gpt-4o-mini",
    messages: [
      {role: "system", content: "Answer using only the context provided. Be concise."},
      {role: "user", content: "Context:\n\($ctx)\n\nQuestion: \($q)"}
    ]
  }')" | jq -r '.choices[0].message.content'
RushDB is the retrieval layer — any LLM or framework (LangChain, LlamaIndex, Vercel AI SDK) slots in at this step.
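For reuse, Steps 3 and 4 collapse into one helper. A minimal Python sketch, assuming the db and client objects from the snippets above are already constructed:

def rag_answer(question: str, top_k: int = 5) -> str:
    """Retrieve the top-K chunks from RushDB, then answer with the LLM."""
    result = db.ai.search({
        "propertyName": "text",
        "query": question,
        "labels": ["CHUNK"],
        "limit": top_k,
    })
    context = "\n\n---\n\n".join(r["text"] for r in result["data"])
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer using only the context provided. Be concise."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content

print(rag_answer("How does RushDB handle nested JSON objects?"))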
Filtering by source
If you want to scope retrieval to a specific file, add a where clause:
- Python
- TypeScript
- shell
result = db.ai.search({
    "propertyName": "text",
    "query": "docker compose environment variables",
    "labels": ["CHUNK"],
    "where": { "source": { "$endsWith": "deployment.md" } },
    "limit": 5
})
const { data } = await db.ai.search({
  propertyName: 'text',
  query: 'docker compose environment variables',
  labels: ['CHUNK'],
  where: { source: { $endsWith: 'deployment.md' } },
  limit: 5
})
curl -X POST https://api.rushdb.com/api/v1/ai/search \
  -H "Authorization: Bearer $RUSHDB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "propertyName": "text",
    "query": "docker compose environment variables",
    "labels": ["CHUNK"],
    "where": { "source": { "$endsWith": "deployment.md" } },
    "limit": 5
  }'
The where prefilter runs on the graph layer before semantic scoring — so you narrow candidates without sacrificing recall within the target file.
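The same prefilter extends to any operator the where clause supports. For example, $in (also mentioned under "What's next") scopes retrieval to a set of files; a Python sketch reusing the call above:

result = db.ai.search({
    "propertyName": "text",
    "query": "docker compose environment variables",
    "labels": ["CHUNK"],
    # Narrow candidates to two files before semantic scoring
    "where": { "source": { "$in": ["deployment.md", "architecture.md"] } },
    "limit": 5
})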
What's next
- Add more metadata fields (author, date, section heading) — they're queryable without any schema changes
- Use $startsWith or $in on source to search across a subset of files
- Combine with transactions to atomically re-ingest a file when it changes
- Replace the fixed-size chunker with a semantic splitter (split on headings, paragraphs, or sentences); a sketch follows this list
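As a starting point for that last item, a minimal heading-based splitter in Python; this is a sketch, not a drop-in replacement for chunk_text, and it assumes each heading section still fits your embedding model's input limit:

import re

def split_on_headings(text: str) -> list[str]:
    """Split a Markdown document into one chunk per heading section."""
    # Zero-width split keeps each heading line attached to the body below it
    parts = re.split(r"(?m)^(?=#{1,6}\s)", text)
    return [p.strip() for p in parts if p.strip()]

Each chunk then opens with its own heading, which also makes the per-chunk title metadata more precise.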