RAG Pipeline in Minutes

This tutorial builds a minimal but complete RAG pipeline:

  1. Read a folder of Markdown files
  2. Split each file into overlapping chunks
  3. Push chunks into RushDB (with source metadata)
  4. Create an embedding index on the chunk text
  5. Retrieve the top-K relevant chunks for a query
  6. Pass them to an LLM as context

Prerequisites: a running RushDB instance with RUSHDB_EMBEDDING_MODEL configured, or RushDB Cloud with AI enabled.


The docs folder

For this example, assume you have a local folder ./docs with a few Markdown files:

```
docs/
  architecture.md
  api-reference.md
  deployment.md
```

Each file is a few hundred lines. The chunking step below splits them into overlapping windows so no context is lost at boundaries.
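To see why overlap matters, here is a minimal, self-contained sketch of the same sliding-window idea on a short string (illustrative only; Step 1 applies the identical logic at document scale):

```python
def sliding_chunks(text: str, size: int, overlap: int) -> list[str]:
    # Advance the window by (size - overlap) so consecutive chunks
    # repeat the last `overlap` characters of the previous one.
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

chunks = sliding_chunks("The quick brown fox jumps", size=10, overlap=4)
# chunks[1] starts with the last 4 characters of chunks[0], so text
# cut at a chunk boundary still appears whole in at least one window.
```

Without the overlap, a sentence split exactly at a boundary would be recoverable from neither chunk.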


Step 1: Chunk and ingest

```python
import os
import re

from rushdb import RushDB

db = RushDB("RUSHDB_API_KEY", base_url="https://api.rushdb.com/api/v1")

DOCS_DIR = "./docs"
CHUNK_SIZE = 400    # characters
CHUNK_OVERLAP = 80  # characters


def chunk_text(text: str, size: int, overlap: int) -> list[str]:
    chunks, start = [], 0
    while start < len(text):
        end = min(start + size, len(text))
        chunks.append(text[start:end].strip())
        start += size - overlap
    return [c for c in chunks if c]


records = []
for filename in os.listdir(DOCS_DIR):
    if not filename.endswith(".md"):
        continue
    path = os.path.join(DOCS_DIR, filename)
    with open(path) as f:
        content = f.read()

    # Extract the first heading as the title
    match = re.search(r"^#\s+(.+)", content, re.MULTILINE)
    title = match.group(1) if match else filename

    for i, chunk in enumerate(chunk_text(content, CHUNK_SIZE, CHUNK_OVERLAP)):
        records.append({
            "source": filename,
            "title": title,
            "chunk_index": i,
            "text": chunk,
        })

db.records.import_json({"label": "CHUNK", "data": records})
print(f"Ingested {len(records)} chunks from {DOCS_DIR}")
```

Step 2: Create an embedding index

Index the text property so RushDB can run semantic search against it.

```python
db.ai.indexes.create({
    "label": "CHUNK",
    "propertyName": "text"
})
print("Embedding index created — RushDB is backfilling in the background")
```

Backfilling runs asynchronously. Poll GET /api/v1/ai/indexes (or db.ai.indexes.find()) and wait until status is ready before running searches. For small corpora this usually takes under a minute.
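A minimal polling helper might look like the sketch below. It takes the fetch function as a parameter (pass db.ai.indexes.find from this tutorial); the response shape, a list of dicts with label, propertyName, and status keys, is an assumption here, so check the API reference for your RushDB version:

```python
import time


def wait_for_index(fetch_indexes, label: str, prop: str,
                   timeout: float = 120.0, interval: float = 2.0) -> bool:
    """Poll until the embedding index on (label, prop) reports status 'ready'.

    `fetch_indexes` is any zero-arg callable returning the index list.
    Returns True once ready, False if the timeout elapses first.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        for idx in fetch_indexes():
            if (idx.get("label") == label
                    and idx.get("propertyName") == prop
                    and idx.get("status") == "ready"):
                return True
        time.sleep(interval)
    return False
```

Call it as wait_for_index(db.ai.indexes.find, "CHUNK", "text") before moving on to Step 3.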


Step 3: Retrieve relevant chunks

```python
query = "How does RushDB handle nested JSON objects?"

result = db.ai.search({
    "propertyName": "text",
    "query": query,
    "labels": ["CHUNK"],
    "limit": 5
})

context_chunks = [r["text"] for r in result["data"]]
print(f"Retrieved {len(context_chunks)} chunks for: {query!r}")
for i, r in enumerate(result["data"]):
    print(f"\n[{i+1}] (score {r.get('__score', 0):.3f})")
    print(r["text"][:200] + "…")
```

Step 4: Generate an answer

Pass the retrieved chunks as context to any LLM. Example using the OpenAI SDK:

```python
from openai import OpenAI

client = OpenAI()

context = "\n\n---\n\n".join(context_chunks)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "system",
            "content": "Answer using only the context provided. Be concise."
        },
        {
            "role": "user",
            "content": f"Context:\n{context}\n\nQuestion: {query}"
        }
    ]
)

print(response.choices[0].message.content)
```

RushDB is the retrieval layer — any LLM or framework (LangChain, LlamaIndex, Vercel AI SDK) slots in at this step.


Filtering by source

If you want to scope retrieval to a specific file, add a where clause:

```python
result = db.ai.search({
    "propertyName": "text",
    "query": "docker compose environment variables",
    "labels": ["CHUNK"],
    "where": { "source": { "$endsWith": "deployment.md" } },
    "limit": 5
})
```

The where prefilter runs on the graph layer before semantic scoring — so you narrow candidates without sacrificing recall within the target file.


What's next

  • Add more metadata fields (author, date, section heading) — they're queryable without any schema changes
  • Use $startsWith or $in on source to search across a subset of files
  • Combine with transactions to atomically re-ingest a file when it changes
  • Replace the fixed-size chunker with a semantic splitter (split on headings, paragraphs, or sentences)
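The last bullet needs no extra dependencies. Here is a minimal sketch of a heading-based splitter: instead of fixed-size windows, each chunk is a complete Markdown section, so boundaries never cut a thought in half. The returned dicts mirror the record shape from Step 1 (the heading would replace the generic title field):

```python
import re


def split_on_headings(markdown: str) -> list[dict]:
    """Split a Markdown document into one chunk per heading-delimited section."""
    sections, heading, buf = [], None, []
    for line in markdown.splitlines():
        m = re.match(r"^(#{1,6})\s+(.+)", line)
        if m:
            # A new heading closes the previous section.
            if buf:
                sections.append({"heading": heading, "text": "\n".join(buf).strip()})
            heading, buf = m.group(2), []
        else:
            buf.append(line)
    if buf:
        sections.append({"heading": heading, "text": "\n".join(buf).strip()})
    return [s for s in sections if s["text"]]
```

For long sections you can combine both approaches: split on headings first, then apply the fixed-size chunker from Step 1 to any section that exceeds CHUNK_SIZE.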