RAG Pipeline in Minutes
This tutorial builds a minimal but complete RAG pipeline:
- Read a folder of Markdown files
- Split each file into overlapping chunks
- Push chunks into RushDB (with source metadata)
- Create an embedding index on the chunk text
- Retrieve the top-K relevant chunks for a query
- Pass them to an LLM as context
Prerequisites: a running RushDB instance with RUSHDB_EMBEDDING_MODEL configured, or RushDB Cloud with AI enabled.
The docs folder
For this example, assume you have a local folder ./docs with a few Markdown files:
docs/
  architecture.md
  api-reference.md
  deployment.md
Each file is a few hundred lines. The chunking step below splits them into overlapping windows so no context is lost at boundaries.
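With CHUNK_SIZE = 400 and CHUNK_OVERLAP = 80 (the values used below), each window starts 320 characters after the previous one, so the final 80 characters of one chunk reappear at the start of the next; a sentence that straddles a boundary therefore survives intact in at least one chunk.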
Step 1: Chunk and ingest
- Python
- TypeScript
- shell
import os
import re

from rushdb import RushDB

db = RushDB("RUSHDB_API_KEY", base_url="https://api.rushdb.com/api/v1")

DOCS_DIR = "./docs"
CHUNK_SIZE = 400    # characters
CHUNK_OVERLAP = 80  # characters

def chunk_text(text: str, size: int, overlap: int) -> list[str]:
    """Slide a fixed-size window over the text, stepping size - overlap each time."""
    chunks, start = [], 0
    while start < len(text):
        end = min(start + size, len(text))
        chunks.append(text[start:end].strip())
        if end == len(text):
            break  # stop here so the tail isn't emitted twice
        start += size - overlap
    return [c for c in chunks if c]

records = []
for filename in os.listdir(DOCS_DIR):
    if not filename.endswith(".md"):
        continue
    path = os.path.join(DOCS_DIR, filename)
    with open(path) as f:
        content = f.read()
    # Extract first heading as title
    match = re.search(r"^#\s+(.+)", content, re.MULTILINE)
    title = match.group(1) if match else filename
    for i, chunk in enumerate(chunk_text(content, CHUNK_SIZE, CHUNK_OVERLAP)):
        records.append({
            "source": filename,
            "title": title,
            "chunk_index": i,
            "text": chunk,
        })

db.records.import_json({"label": "CHUNK", "data": records})
print(f"Ingested {len(records)} chunks from {DOCS_DIR}")
import RushDB from '@rushdb/javascript-sdk'
import fs from 'fs'
import path from 'path'

const db = new RushDB('RUSHDB_API_KEY')

const DOCS_DIR = './docs'
const CHUNK_SIZE = 400
const CHUNK_OVERLAP = 80

// Slide a fixed-size window over the text, stepping size - overlap each time
function chunkText(text: string, size: number, overlap: number): string[] {
  const chunks: string[] = []
  let start = 0
  while (start < text.length) {
    const end = Math.min(start + size, text.length)
    const chunk = text.slice(start, end).trim()
    if (chunk) chunks.push(chunk)
    if (end === text.length) break // stop here so the tail isn't emitted twice
    start += size - overlap
  }
  return chunks
}

const records: object[] = []
for (const filename of fs.readdirSync(DOCS_DIR)) {
  if (!filename.endsWith('.md')) continue
  const content = fs.readFileSync(path.join(DOCS_DIR, filename), 'utf-8')
  // Extract first heading as title
  const titleMatch = content.match(/^#\s+(.+)/m)
  const title = titleMatch?.[1] ?? filename
  chunkText(content, CHUNK_SIZE, CHUNK_OVERLAP).forEach((text, i) => {
    records.push({ source: filename, title, chunk_index: i, text })
  })
}

await db.records.importJson({ label: 'CHUNK', data: records })
console.log(`Ingested ${records.length} chunks from ${DOCS_DIR}`)
Assemble your chunks in any language, then POST them in a single batch:
curl -X POST https://api.rushdb.com/api/v1/records/import/json \
  -H "Authorization: Bearer $RUSHDB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "label": "CHUNK",
    "data": [
      {
        "source": "architecture.md",
        "title": "Architecture Overview",
        "chunk_index": 0,
        "text": "RushDB stores every record as a node in a property graph..."
      },
      {
        "source": "architecture.md",
        "title": "Architecture Overview",
        "chunk_index": 1,
        "text": "Relationships between nested objects are created automatically..."
      },
      {
        "source": "deployment.md",
        "title": "Deployment Guide",
        "chunk_index": 0,
        "text": "You can run RushDB with Docker using the official image..."
      }
    ]
  }'
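Before moving on, it can be worth a quick sanity check that the chunks landed. A minimal Python sketch, assuming the SDK exposes a records.find() query helper that accepts the same labels shape as ai.search below; check your SDK version for the exact return type:

# Fetch one CHUNK record back as a smoke test (records.find() shape assumed)
found = db.records.find({"labels": ["CHUNK"], "limit": 1})
print(found)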
Step 2: Create an embedding index
Index the text property so RushDB can run semantic search against it.
- Python
- TypeScript
- shell
db.ai.indexes.create({
    "label": "CHUNK",
    "propertyName": "text"
})
print("Embedding index created; RushDB is backfilling in the background")
await db.ai.indexes.create({
  label: 'CHUNK',
  propertyName: 'text'
})
console.log('Embedding index created; backfilling in background')
curl -X POST https://api.rushdb.com/api/v1/ai/indexes \
  -H "Authorization: Bearer $RUSHDB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "label": "CHUNK",
    "propertyName": "text"
  }'
Backfilling runs asynchronously. Poll GET /api/v1/ai/indexes (or db.ai.indexes.find()) and wait until status is ready before running searches. For small corpora this usually takes under a minute.
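A minimal polling sketch in Python; the find() call follows the SDK helper named above, but the status field and its ready value are assumptions to verify against a real response:

import time

# Wait for the CHUNK/text index to finish backfilling
# ("status"/"ready" field names are assumed; inspect a real response first)
while True:
    indexes = db.ai.indexes.find()
    # Depending on SDK version the list may sit under a "data" key
    items = indexes["data"] if isinstance(indexes, dict) else indexes
    target = next(
        (ix for ix in items if ix.get("label") == "CHUNK" and ix.get("propertyName") == "text"),
        None,
    )
    if target and target.get("status") == "ready":
        break
    time.sleep(2)
print("Embedding index is ready")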
Step 3: Retrieve relevant chunks
- Python
- TypeScript
- shell
query = "How does RushDB handle nested JSON objects?"

result = db.ai.search({
    "propertyName": "text",
    "query": query,
    "labels": ["CHUNK"],
    "limit": 5
})

context_chunks = [r["text"] for r in result["data"]]
print(f"Retrieved {len(context_chunks)} chunks for: {query!r}")
for i, r in enumerate(result["data"]):
    print(f"\n[{i+1}] (score {r.get('__score', 0):.3f})")
    print(r["text"][:200] + "…")
const query = 'How does RushDB handle nested JSON objects?'

const { data: chunks } = await db.ai.search({
  propertyName: 'text',
  query,
  labels: ['CHUNK'],
  limit: 5
})

console.log(`Retrieved ${chunks.length} chunks for: "${query}"`)
chunks.forEach((chunk, i) => {
  console.log(`\n[${i + 1}] score: ${chunk.__score.toFixed(3)}`)
  console.log(chunk.text.slice(0, 200) + '…')
})
# Capture the response; Step 4 reads the chunks back out of $CHUNKS
CHUNKS=$(curl -s -X POST https://api.rushdb.com/api/v1/ai/search \
  -H "Authorization: Bearer $RUSHDB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "propertyName": "text",
    "query": "How does RushDB handle nested JSON objects?",
    "labels": ["CHUNK"],
    "limit": 5
  }')
echo "$CHUNKS" | jq '.data | length'
Step 4: Generate an answer
Pass the retrieved chunks as context to any LLM. Example using the OpenAI SDK:
- Python
- TypeScript
- shell
from openai import OpenAI

client = OpenAI()

context = "\n\n---\n\n".join(context_chunks)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "system",
            "content": "Answer using only the context provided. Be concise."
        },
        {
            "role": "user",
            "content": f"Context:\n{context}\n\nQuestion: {query}"
        }
    ]
)
print(response.choices[0].message.content)
import OpenAI from 'openai'

const client = new OpenAI()

// Join the text of the chunks retrieved in Step 3
const context = chunks.map((c) => c.text).join('\n\n---\n\n')

const response = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [
    {
      role: 'system',
      content: 'Answer using only the context provided. Be concise.'
    },
    {
      role: 'user',
      content: `Context:\n${context}\n\nQuestion: ${query}`
    }
  ]
})
console.log(response.choices[0].message.content)
# $CHUNKS holds the Step 3 response; set QUERY to the same question text
QUERY="How does RushDB handle nested JSON objects?"
CONTEXT=$(echo "$CHUNKS" | jq -r '[.data[].text] | join("\n\n---\n\n")')

# Build the payload with jq so quotes and newlines in $CONTEXT stay valid JSON
curl -s https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d "$(jq -n --arg ctx "$CONTEXT" --arg q "$QUERY" '{
    model: "gpt-4o-mini",
    messages: [
      {role: "system", content: "Answer using only the context provided. Be concise."},
      {role: "user", content: "Context:\n\($ctx)\n\nQuestion: \($q)"}
    ]
  }')" | jq -r '.choices[0].message.content'
RushDB is the retrieval layer — any LLM or framework (LangChain, LlamaIndex, Vercel AI SDK) slots in at this step.
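For reuse, Steps 3 and 4 collapse into one helper. A minimal Python sketch, assuming the db and client objects from the snippets above are already constructed:

def rag_answer(question: str, top_k: int = 5) -> str:
    """Retrieve the top-K chunks from RushDB, then answer with the LLM."""
    result = db.ai.search({
        "propertyName": "text",
        "query": question,
        "labels": ["CHUNK"],
        "limit": top_k,
    })
    context = "\n\n---\n\n".join(r["text"] for r in result["data"])
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer using only the context provided. Be concise."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content

print(rag_answer("How does RushDB handle nested JSON objects?"))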
Filtering by source
If you want to scope retrieval to a specific file, add a where clause:
- Python
- TypeScript
- shell
result = db.ai.search({
    "propertyName": "text",
    "query": "docker compose environment variables",
    "labels": ["CHUNK"],
    "where": { "source": { "$endsWith": "deployment.md" } },
    "limit": 5
})
const { data } = await db.ai.search({
  propertyName: 'text',
  query: 'docker compose environment variables',
  labels: ['CHUNK'],
  where: { source: { $endsWith: 'deployment.md' } },
  limit: 5
})
curl -X POST https://api.rushdb.com/api/v1/ai/search \
  -H "Authorization: Bearer $RUSHDB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "propertyName": "text",
    "query": "docker compose environment variables",
    "labels": ["CHUNK"],
    "where": { "source": { "$endsWith": "deployment.md" } },
    "limit": 5
  }'
The where prefilter runs on the graph layer before semantic scoring — so you narrow candidates without sacrificing recall within the target file.
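The same prefilter extends to any operator the where clause supports. For example, $in (also mentioned under "What's next") scopes retrieval to a set of files; a Python sketch reusing the call above:

result = db.ai.search({
    "propertyName": "text",
    "query": "docker compose environment variables",
    "labels": ["CHUNK"],
    # Narrow candidates to two files before semantic scoring
    "where": { "source": { "$in": ["deployment.md", "architecture.md"] } },
    "limit": 5
})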
What's next
- Add more metadata fields (author, date, section heading) — they're queryable without any schema changes
- Use $startsWith or $in on source to search across a subset of files
- Combine with transactions to atomically re-ingest a file when it changes
- Replace the fixed-size chunker with a semantic splitter (split on headings, paragraphs, or sentences); a sketch follows this list
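As a starting point for that last item, a minimal heading-based splitter in Python; this is a sketch, not a drop-in replacement for chunk_text, and it assumes each heading section still fits your embedding model's input limit:

import re

def split_on_headings(text: str) -> list[str]:
    """Split a Markdown document into one chunk per heading section."""
    # Zero-width split keeps each heading line attached to the body below it
    parts = re.split(r"(?m)^(?=#{1,6}\s)", text)
    return [p.strip() for p in parts if p.strip()]

Each chunk then opens with its own heading, which also makes the per-chunk title metadata more precise.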