AI & Semantic Search

RushDB is a self-aware memory layer for agents, humans, and apps. It continuously understands its own structure — labels, fields, value distributions, relationships — and exposes that knowledge so agents can reason over real data without hallucinating schema details, and apps can retrieve semantically relevant context on demand.

The db.ai namespace covers three capabilities:

| Capability | Description |
|---|---|
| Graph Ontology | Self-describing schema discovery: label names, field types, value ranges, and the relationship map — always up to date |
| Embedding Indexes | Per-label vector policies that turn string properties into long-term semantic memory |
| Semantic Search | Cosine/euclidean similarity retrieval over indexed properties, for agents and apps alike |

How it fits together

┌──────────────────────────────────────────────────────┐
│ Your data (records + relationships)                  │
│                                                      │
│   BOOK { title: "...", description: "..." }          │
└──────────────────────────┬───────────────────────────┘
                           │  db.ai.indexes.create()
                           ▼
┌──────────────────────────────────────────────────────┐
│ Embedding index policy                               │
│   label: BOOK   property: description   dims: 1536   │
│   sourceType: managed | external                     │
└──────────────────────────┬───────────────────────────┘
                           │  backfill (managed) / inline vectors (external)
                           ▼
┌──────────────────────────────────────────────────────┐
│ Vector stored on VALUE relationship                  │
│   rel._emb_managed_cosine_1536 = [0.1, 0.2, ...]     │
└──────────────────────────┬───────────────────────────┘
                           │  db.ai.search({ query / queryVector })
                           ▼
┌──────────────────────────────────────────────────────┐
│ Records ranked by similarity score                   │
│   result.get('__score') == 0.94  (cosine sim.)       │
└──────────────────────────────────────────────────────┘
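The `__score` in the final step is a plain cosine similarity between the query vector and each stored vector. A minimal, dependency-free sketch of that ranking step (illustrative only — RushDB computes this server-side; the `records` data here is made up, and real indexes use e.g. 1536 dimensions rather than 2):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical stored vectors keyed by record id (2-d for readability)
records = {
    "book_a": [0.9, 0.1],
    "book_b": [0.0, 1.0],
}
query_vector = [1.0, 0.0]

# Rank records by similarity to the query, best first
ranked = sorted(
    records.items(),
    key=lambda kv: cosine_similarity(query_vector, kv[1]),
    reverse=True,
)
# ranked[0] is "book_a" — the record closest to the query
```

A score of `1.0` means the vectors point in the same direction; `0.0` means they are orthogonal (unrelated).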

| Topic | Description |
|---|---|
| Ontology | Schema discovery with `get_ontology_markdown` / `get_ontology` |
| Indexing | Create and manage managed embedding indexes |
| Advanced Indexing — BYOV | Bring Your Own Vectors: external indexes, inline writes |
| Semantic Search | Query by meaning with `db.ai.search()` |
| Writing with Vectors | Attach vectors at `create` / `upsert` / `import_json` time |
| Agent Skills | Installable skills that teach any compatible agent to use RushDB |

Graph Ontology

The ontology methods expose a live snapshot of your database structure — without any manual schema definitions.

Get Ontology as Markdown

db.ai.get_ontology_markdown()

Returns the full schema as compact Markdown — the recommended format for LLM context injection.

db.ai.get_ontology_markdown(
    params: dict | None = None,  # {"labels": ["Order"]} to scope output
                                 # {"force": True} to bypass the 1-hour cache
    transaction=None
) -> ApiResponse[str]
from rushdb import RushDB

db = RushDB("RUSHDB_API_KEY")

# Inject into LLM at session start
response = db.ai.get_ontology_markdown()
schema = response.data

messages = [
    {"role": "system", "content": f"You are a data assistant.\n\n{schema}"},
    {"role": "user", "content": "How many paid orders are there?"},
]

# Scope to specific labels
order_response = db.ai.get_ontology_markdown({"labels": ["Order"]})

# Bypass the 1-hour cache and force a fresh recalculation
fresh_response = db.ai.get_ontology_markdown({"force": True})
Example output
# Graph Ontology

## Labels

| Label | Count |
|-----------|------:|
| `Order` | 1840 |
| `User` | 312 |
| `Product` | 95 |

---

## `Order` (1840 records)

### Properties

| Property | Type | Values / Range | Semantic Search |
|-------------|----------|----------------------------------------|--------------------------------|
| `status` | string | `pending`, `paid`, `shipped` (+2 more) | — |
| `total` | number | `4.99`..`2499.00` | — |
| `name` | string | `Widget A`, `Widget B` (+8 more) | `managed` cosine 1536d [ready] |
| `createdAt` | datetime | `2024-01-03`..`2026-02-27` | — |

### Relationships

| Type | Direction | Other Label |
|-------------|-----------|-------------|
| `PLACED_BY` | out | `User` |
| `CONTAINS` | out | `Product` |

Get Ontology (raw)

db.ai.get_ontology()

Returns the same ontology as a structured list of dicts — useful for schema UIs, auto-complete, or looking up property IDs for db.properties.values().

db.ai.get_ontology(
    params: dict | None = None,  # {"labels": ["Order"]} to scope output
                                 # {"force": True} to bypass the 1-hour cache
    transaction=None
) -> ApiResponse[list[dict]]
# List all labels with counts
response = db.ai.get_ontology()
for item in response.data:
    print(f"{item['label']}: {item['count']} records")

# Look up property ID for value enumeration
response = db.ai.get_ontology({"labels": ["Order"]})
order_schema = response.data[0]
status_prop = next(p for p in order_schema["properties"] if p["name"] == "status")

values_response = db.properties.values({"id": status_prop["id"]})
# ['pending', 'paid', 'shipped', 'cancelled', 'refunded']

# Identify semantically-searchable properties
indexed = [p for p in order_schema["properties"] if p.get("vectorIndexes")]
# indexed[0]["vectorIndexes"][0]["status"] == "ready" → queryable with db.ai.search()

# Bypass the 1-hour cache
fresh = db.ai.get_ontology({"force": True})

Each item in response.data:

{
    "label": str,
    "count": int,
    "properties": [
        {
            "id": str,                  # use with db.properties.values()
            "name": str,
            "type": str,                # 'string' | 'number' | 'boolean' | 'datetime'
            "values": list,             # up to 10 samples (string/boolean only)
            "min": str | float | None,  # number/datetime only
            "max": str | float | None,
            # non-empty when embedding indexes exist — property is queryable with db.ai.search()
            "vectorIndexes": [
                {
                    "id": str,
                    "sourceType": str,          # 'managed' | 'external'
                    "similarityFunction": str,  # 'cosine' | 'euclidean'
                    "dimensions": int,
                    "status": str,              # 'pending' | 'indexing' | 'ready' | 'error'
                    "modelKey": str,
                }
            ],  # omitted (or empty list) when no index exists
        }
    ],
    "relationships": [
        {
            "label": str,
            "type": str,
            "direction": str,  # 'in' | 'out'
        }
    ],
}
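Given that shape, a small helper can pull out every property that is ready for semantic search — useful when deciding at runtime which properties `db.ai.search()` can target. A sketch against the structure above (the helper name and sample data are ours, not part of the SDK):

```python
def searchable_properties(ontology: list[dict]) -> list[tuple[str, str]]:
    """Return (label, property) pairs whose embedding index status is 'ready'."""
    ready = []
    for item in ontology:
        for prop in item["properties"]:
            # "vectorIndexes" may be omitted or an empty list when no index exists
            for index in prop.get("vectorIndexes") or []:
                if index["status"] == "ready":
                    ready.append((item["label"], prop["name"]))
    return ready

# Sample shaped like response.data from db.ai.get_ontology()
sample = [{
    "label": "Order",
    "count": 1840,
    "properties": [
        {"id": "p1", "name": "status", "type": "string", "values": ["pending"]},
        {"id": "p2", "name": "name", "type": "string", "values": [],
         "vectorIndexes": [{"id": "v1", "sourceType": "managed",
                            "similarityFunction": "cosine", "dimensions": 1536,
                            "status": "ready", "modelKey": "..."}]},
    ],
    "relationships": [],
}]

searchable_properties(sample)  # [("Order", "name")]
```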
Caching

Both methods share a 1-hour cache per project. The first call after TTL expiry triggers a full graph scan; all subsequent calls within the hour are instant. Pass {"force": True} in params to bypass the cache and trigger an immediate recalculation.
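The same pattern is easy to mirror client-side if you want to avoid repeated network round-trips within a long session. A minimal TTL-cache sketch (our own illustration — the SDK does not expose such a class):

```python
import time

class TTLCache:
    """Cache computed values for a fixed TTL; force=True recomputes immediately."""

    def __init__(self, ttl_seconds: float = 3600.0):
        self.ttl = ttl_seconds
        self._store: dict = {}  # key -> (expires_at, value)

    def get_or_compute(self, key, compute, force=False):
        now = time.monotonic()
        entry = self._store.get(key)
        if not force and entry is not None and entry[0] > now:
            return entry[1]           # fresh hit: no recomputation
        value = compute()             # miss, expired, or forced refresh
        self._store[key] = (now + self.ttl, value)
        return value

# e.g. cache.get_or_compute("ontology", lambda: db.ai.get_ontology_markdown().data)
```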

Agent quickstart

Call db.ai.get_ontology_markdown() first in every AI session. Without it, models will hallucinate field and label names.
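To keep the injected context small, you can scope the ontology to the labels a question actually mentions, using the `{"labels": [...]}` param shown above. One naive way to pick those labels — simple substring matching; the helper is our own illustration, not part of the SDK:

```python
def relevant_labels(question: str, labels: list[str]) -> list[str]:
    """Keep labels whose (lowercased) name appears in the question text."""
    q = question.lower()
    return [label for label in labels if label.lower() in q]

labels = ["Order", "User", "Product"]  # e.g. collected from db.ai.get_ontology()
scoped = relevant_labels("How many paid orders are there?", labels)
# scoped == ["Order"] — pass as db.ai.get_ontology_markdown({"labels": scoped})
```

For anything beyond a demo you would likely replace the substring check with something sturdier, but the shape of the workflow is the same: detect labels, then request a scoped ontology.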


Agent Skills

@rushdb/skills is a collection of Agent Skills — installable instructions that teach any skills-compatible AI agent (Claude, GitHub Copilot, Cursor, Windsurf, and others) to use RushDB efficiently, without manual system prompt engineering.

npx skills add rush-db/rushdb --path packages/skills
| Skill | What it teaches |
|---|---|
| `rushdb-query-builder` | Discovery-first workflow, SearchQuery syntax, aggregation, relationship traversal, and semantic search |
| `rushdb-agent-memory` | Using RushDB as persistent structured memory — store, link, and semantically recall sessions, decisions, and entities |
| `rushdb-data-modeling` | LMPG model design, label/property naming conventions, nested JSON import, and schema evolution |
| `rushdb-faceted-search` | Build faceted filter UIs — discover properties and types, enumerate distinct values, map to widgets, assemble a live where clause |

Each skill bundles a SKILL.md with concise instructions and optional reference files (like the full SearchQuery spec) that the agent loads on demand.

MCP server vs. Agent Skills

The MCP server gives agents direct tool access to RushDB at runtime. Agent Skills teach agents how to use those tools correctly — they complement each other.