AI & Semantic Search
RushDB is a self-aware memory layer for agents, humans, and apps. It continuously understands its own structure — labels, fields, value distributions, relationships — and exposes that knowledge so agents can reason over real data without hallucinating schema details, and apps can retrieve semantically relevant context on demand.
The db.ai namespace covers three capabilities:
| Capability | Description |
|---|---|
| Graph Ontology | Self-describing schema discovery: label names, field types, value ranges, and the relationship map — always up to date |
| Embedding Indexes | Per-label vector policies that turn string properties into long-term semantic memory |
| Semantic Search | Cosine/euclidean similarity retrieval over indexed properties, for agents and apps alike |
How it fits together
┌─────────────────────────────────────────────────────┐
│ Your data (records + relationships) │
│ │
│ BOOK { title: "...", description: "..." } │
└────────────────────┬────────────────────────────────┘
│
db.ai.indexes.create()
│
▼
┌─────────────────────────────────────────────────────┐
│ Embedding index policy │
│ label: BOOK property: description dims: 1536 │
│ sourceType: managed | external │
└────────────────────┬────────────────────────────────┘
│
Backfill (managed) / inline vectors (external)
│
▼
┌─────────────────────────────────────────────────────┐
│ Vector stored on VALUE relationship │
│ rel._emb_managed_cosine_1536 = [0.1, 0.2, ...] │
└────────────────────┬────────────────────────────────┘
│
db.ai.search({ query / queryVector })
│
▼
┌─────────────────────────────────────────────────────┐
│ Records ranked by similarity score │
│ result.get('__score') == 0.94 (cosine sim.) │
└─────────────────────────────────────────────────────┘
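The pipeline above maps onto a handful of SDK calls. The sketch below shows the shape of the payloads involved; the field names (`label`, `property`, `sourceType`, `similarityFunction`, `dimensions`, `query`, `limit`) are illustrative assumptions derived from the policy attributes in the diagram, not a verified API signature — see the Indexing and Semantic Search pages for the exact parameters.

```python
# Illustrative payload shapes for the pipeline above (assumed field names).

index_policy = {
    "label": "BOOK",                # which records to index
    "property": "description",      # which string property to embed
    "sourceType": "managed",        # 'managed' (backfilled) or 'external' (BYOV)
    "similarityFunction": "cosine",
    "dimensions": 1536,
}

search_payload = {
    "label": "BOOK",
    "query": "space exploration novels",  # embedded server-side for managed indexes
    "limit": 5,
}

# With the SDK, these would be passed roughly as:
#   db.ai.indexes.create(index_policy)
#   results = db.ai.search(search_payload)
#   results.data[0].get('__score')  # similarity score, e.g. 0.94 for cosine
```

For external (BYOV) indexes, `query` would be replaced by a precomputed `queryVector` of matching dimensionality, as shown in the diagram.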
Quick links
| Topic | Description |
|---|---|
| Ontology | Schema discovery with get_ontology_markdown / get_ontology |
| Indexing | Create and manage managed embedding indexes |
| Advanced Indexing — BYOV | Bring Your Own Vectors: external indexes, inline writes |
| Semantic Search | Query by meaning with db.ai.search() |
| Writing with Vectors | Attach vectors at create / upsert / import_json time |
| Agent Skills | Installable skills that teach any compatible agent to use RushDB |
Graph Ontology
The ontology methods expose a live snapshot of your database structure — without any manual schema definitions.
Get Ontology as Markdown
db.ai.get_ontology_markdown()
Returns the full schema as compact Markdown — the recommended format for LLM context injection.
db.ai.get_ontology_markdown(
params: dict | None = None, # {"labels": ["Order"]} to scope output
# {"force": True} to bypass the 1-hour cache
transaction=None
) -> ApiResponse[str]
from rushdb import RushDB
db = RushDB("RUSHDB_API_KEY")
# Inject into LLM at session start
response = db.ai.get_ontology_markdown()
schema = response.data
messages = [
{"role": "system", "content": f"You are a data assistant.\n\n{schema}"},
{"role": "user", "content": "How many paid orders are there?"}
]
# Scope to specific labels
order_response = db.ai.get_ontology_markdown({"labels": ["Order"]})
# Bypass the 1-hour cache and force a fresh recalculation
fresh_response = db.ai.get_ontology_markdown({"force": True})
Example output
# Graph Ontology
## Labels
| Label | Count |
|-----------|------:|
| `Order` | 1840 |
| `User` | 312 |
| `Product` | 95 |
---
## `Order` (1840 records)
### Properties
| Property | Type | Values / Range | Semantic Search |
|-------------|----------|----------------------------------------|--------------------------------|
| `status` | string | `pending`, `paid`, `shipped` (+2 more) | — |
| `total` | number | `4.99`..`2499.00` | — |
| `name` | string | `Widget A`, `Widget B` (+8 more) | `managed` cosine 1536d [ready] |
| `createdAt` | datetime | `2024-01-03`..`2026-02-27` | — |
### Relationships
| Type | Direction | Other Label |
|-------------|-----------|-------------|
| `PLACED_BY` | out | `User` |
| `CONTAINS` | out | `Product` |
Get Ontology (raw)
db.ai.get_ontology()
Returns the same ontology as a structured list of dicts — useful for schema UIs, auto-complete, or looking up property IDs for db.properties.values().
db.ai.get_ontology(
params: dict | None = None, # {"labels": ["Order"]} to scope output
# {"force": True} to bypass the 1-hour cache
transaction=None
) -> ApiResponse[list[dict]]
# List all labels with counts
response = db.ai.get_ontology()
for item in response.data:
print(f"{item['label']}: {item['count']} records")
# Look up property ID for value enumeration
response = db.ai.get_ontology({"labels": ["Order"]})
order_schema = response.data[0]
status_prop = next(p for p in order_schema["properties"] if p["name"] == "status")
values_response = db.properties.values({"id": status_prop["id"]})
# ['pending', 'paid', 'shipped', 'cancelled', 'refunded']
# Identify semantically-searchable properties
indexed = [p for p in order_schema["properties"] if p.get("vectorIndexes")]
# indexed[0]["vectorIndexes"][0]["status"] == "ready" → queryable with db.ai.search()
# Bypass the 1-hour cache
fresh = db.ai.get_ontology({"force": True})
Each item in response.data:
{
"label": str,
"count": int,
"properties": [
{
"id": str, # use with db.properties.values()
"name": str,
"type": str, # 'string' | 'number' | 'boolean' | 'datetime'
"values": list, # up to 10 samples (string/boolean only)
"min": str | float | None, # number/datetime only
"max": str | float | None,
# non-empty when embedding indexes exist — property is queryable with db.ai.search()
"vectorIndexes": [
{
"id": str,
"sourceType": str, # 'managed' | 'external'
"similarityFunction": str, # 'cosine' | 'euclidean'
"dimensions": int,
"status": str, # 'pending' | 'indexing' | 'ready' | 'error'
"modelKey": str,
}
], # omitted (or empty list) when no index exists
}
],
"relationships": [
{
"label": str,
"type": str,
"direction": str, # 'in' | 'out'
}
]
}
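To make the shape concrete, here is a sketch of walking one such item. The sample data below is fabricated for illustration (including the `modelKey` value); only the structure follows the schema above.

```python
# A hand-written sample item in the documented shape (illustrative data only).
order_item = {
    "label": "Order",
    "count": 1840,
    "properties": [
        {"id": "p1", "name": "status", "type": "string",
         "values": ["pending", "paid", "shipped"], "min": None, "max": None},
        {"id": "p2", "name": "name", "type": "string",
         "values": ["Widget A", "Widget B"], "min": None, "max": None,
         "vectorIndexes": [{"id": "v1", "sourceType": "managed",
                            "similarityFunction": "cosine", "dimensions": 1536,
                            "status": "ready", "modelKey": "example-model"}]},
    ],
    "relationships": [
        {"label": "User", "type": "PLACED_BY", "direction": "out"},
    ],
}

# Properties queryable with db.ai.search(): those with at least one
# vector index in 'ready' status. Absent 'vectorIndexes' means no index.
searchable = [
    p["name"]
    for p in order_item["properties"]
    if any(v["status"] == "ready" for v in p.get("vectorIndexes", []))
]
print(searchable)  # ['name']
```

The same filter works directly on `response.data[i]["properties"]` from a real `db.ai.get_ontology()` call.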
Both methods share a 1-hour cache per project. The first call after TTL expiry triggers a full graph scan; all subsequent calls within the hour are instant. Pass {"force": True} in params to bypass the cache and trigger an immediate recalculation.
Call db.ai.get_ontology_markdown() first in every AI session. Without it, models will hallucinate field and label names.
Agent Skills
@rushdb/skills is a collection of Agent Skills — installable instructions that teach any skills-compatible AI agent (Claude, GitHub Copilot, Cursor, Windsurf, and others) to use RushDB efficiently, without manual system prompt engineering.
npx skills add rush-db/rushdb --path packages/skills
| Skill | What it teaches |
|---|---|
| rushdb-query-builder | Discovery-first workflow, SearchQuery syntax, aggregation, relationship traversal, and semantic search |
| rushdb-agent-memory | Using RushDB as persistent structured memory — store, link, and semantically recall sessions, decisions, and entities |
| rushdb-data-modeling | LMPG model design, label/property naming conventions, nested JSON import, and schema evolution |
| rushdb-faceted-search | Build faceted filter UIs — discover properties and types, enumerate distinct values, map to widgets, assemble a live where clause |
Each skill bundles a SKILL.md with concise instructions and optional reference files (like the full SearchQuery spec) that the agent loads on demand.
The MCP server gives agents direct tool access to RushDB at runtime. Agent Skills teach agents how to use those tools correctly — they complement each other.