
Discover Your Schema

RushDB is self-aware — it continuously understands its own structure: labels, field types, value distributions, relationships, and semantic search readiness. The db.ai ontology methods expose this as either Markdown (for LLM injection) or structured JSON (for schema UIs, autocomplete, or property-ID lookups).


What the Ontology Contains

The ontology is RushDB's always-current answer to "what does this project's data look like?" — computed from the graph itself, never declared upfront.

| Component | Description |
| --- | --- |
| Label inventory | All label names in the project, with record counts |
| Property manifest per label | Property name, type, and either sample values (strings/booleans) or a min–max range (numbers/datetimes) |
| Relationship map | Which labels connect to which, via which relationship type, and in which direction |
| Semantic index status | Which properties have embedding indexes and whether they are ready for db.ai.search() |

Building an AI agent? Schema Self-Awareness shows what the rendered output looks like and how to inject it into LLM context.


Get Ontology as Markdown

The recommended format for LLM context injection — compact, token-efficient, and ready to paste into a system prompt.

db.ai.get_ontology_markdown()

from rushdb import RushDB

db = RushDB("RUSHDB_API_KEY")

# Full schema as a Markdown document
response = db.ai.get_ontology_markdown()
schema = response.data

# Scope to specific labels
order_response = db.ai.get_ontology_markdown({"labels": ["Order"]})

# Bypass the 1-hour cache and force a fresh recalculation
fresh_response = db.ai.get_ontology_markdown({"force": True})

See a complete rendered example — graph, Markdown, and JSON side by side — in Schema Self-Awareness.
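
As a rough illustration of the injection step, the sketch below wraps the Markdown schema in a system prompt. The helper `build_system_prompt`, the prompt layout, and the sample schema string are all hypothetical; in practice `schema_markdown` would be the `response.data` value shown above, and its real format may differ from the stand-in used here.

```python
def build_system_prompt(schema_markdown: str, instructions: str) -> str:
    """Combine task instructions with the ontology Markdown for an LLM system prompt."""
    # Hypothetical helper: the section layout below is an illustrative choice,
    # not something prescribed by RushDB.
    return (
        f"{instructions.strip()}\n\n"
        "## Database schema\n\n"
        f"{schema_markdown.strip()}\n"
    )


# Stand-in for db.ai.get_ontology_markdown().data; the real output may look different.
sample_schema = "# Order\n- total: number\n"
prompt = build_system_prompt(sample_schema, "Answer questions about the store data.")
print(prompt)
```

Keeping the schema in its own section makes it easy to swap in a label-scoped version when the full ontology is larger than the prompt budget allows.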


Get Ontology (Structured JSON)

Returns the same ontology as a structured array — useful for schema UIs, autocomplete, or looking up property IDs for db.properties.values().

db.ai.get_ontology()

# List all labels with counts
response = db.ai.get_ontology()
for item in response.data:
    print(f"{item['label']}: {item['count']} records")

# Scope to specific labels
book_response = db.ai.get_ontology({"labels": ["Book"]})
book_schema = book_response.data[0]

# Get property ID for value enumeration
genre_prop = next(p for p in book_schema["properties"] if p["name"] == "genre")
genres = db.properties.values(genre_prop["id"])

# Identify semantically searchable properties
indexed = [p for p in book_schema["properties"] if p.get("vectorIndexes")]
# indexed[0]["vectorIndexes"][0]["status"] == "ready" → queryable with db.ai.search()

# Bypass the 1-hour cache
fresh_response = db.ai.get_ontology({"force": True})
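
The check for searchable properties can be wrapped in a small helper. This is a minimal sketch that assumes each item follows the `vectorIndexes` and `status` fields used above; `searchable_properties` is a hypothetical name, and the sample data is hand-written rather than real API output.

```python
from typing import Any


def searchable_properties(label_schema: dict[str, Any]) -> list[str]:
    """Return names of properties with at least one vector index in 'ready' status."""
    names = []
    for prop in label_schema.get("properties", []):
        # vectorIndexes is optional, so treat a missing key as "no indexes".
        indexes = prop.get("vectorIndexes") or []
        if any(ix.get("status") == "ready" for ix in indexes):
            names.append(prop["name"])
    return names


# Hand-written sample shaped like one ontology item; not real API output.
book_schema = {
    "label": "Book",
    "count": 3,
    "properties": [
        {"id": "p1", "name": "title", "type": "string"},
        {"id": "p2", "name": "description", "type": "string",
         "vectorIndexes": [{"id": "v1", "status": "ready"}]},
        {"id": "p3", "name": "summary", "type": "string",
         "vectorIndexes": [{"id": "v2", "status": "indexing"}]},
    ],
}
print(searchable_properties(book_schema))  # ['description']
```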

TypeScript types

type OntologyItem = {
  label: string
  count: number
  properties: OntologyProperty[]
  relationships: OntologyRelationship[]
}

type OntologyProperty = {
  id: string // use with db.properties.values()
  name: string
  type: string // 'string' | 'number' | 'boolean' | 'datetime'
  values?: Array<string | number> // up to 10 samples (string/boolean only)
  min?: number | string // number/datetime only
  max?: number | string
  /** Non-empty when embedding indexes exist: property is queryable with db.ai.search() */
  vectorIndexes?: OntologyVectorIndex[]
}

type OntologyVectorIndex = {
  id: string
  sourceType: string // 'managed' | 'external'
  similarityFunction: string // 'cosine' | 'euclidean'
  dimensions: number
  status: string // 'pending' | 'indexing' | 'ready' | 'error'
  modelKey: string
}

type OntologyRelationship = {
  label: string
  type: string
  direction: 'in' | 'out'
  count?: number
  properties?: Array<{
    name: string
    type: string
    values?: Array<string | number | boolean>
    min?: number | string
    max?: number | string
    relationshipsCount: number
  }>
}
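
To show how the `relationships` field might be consumed from Python, the sketch below flattens ontology items into directed edges. It assumes that `label` on a relationship names the label at the other end and that `direction` is relative to the enclosing item; both are interpretations of the types above rather than confirmed behavior, and `relationship_edges` is a hypothetical helper.

```python
def relationship_edges(ontology):
    """Flatten ontology items into sorted (source, relationship_type, target) triples."""
    edges = set()
    for item in ontology:
        for rel in item.get("relationships", []):
            if rel["direction"] == "out":
                # Assumed meaning: the relationship starts at this item's label.
                edges.add((item["label"], rel["type"], rel["label"]))
            else:
                # Assumed meaning: the relationship points at this item's label.
                edges.add((rel["label"], rel["type"], item["label"]))
    # A set removes the duplicate produced when both ends report the same edge.
    return sorted(edges)


# Hand-written sample data; not real API output.
sample_ontology = [
    {"label": "Author", "relationships": [
        {"label": "Book", "type": "WROTE", "direction": "out"}]},
    {"label": "Book", "relationships": [
        {"label": "Author", "type": "WROTE", "direction": "in"}]},
]
print(relationship_edges(sample_ontology))  # [('Author', 'WROTE', 'Book')]
```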

Caching

Both methods share a 1-hour cache per project. The first call after the TTL expires triggers a full graph scan; subsequent calls within the hour are served from the cache. Pass {"force": True} to bypass the cache and trigger an immediate recalculation.
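
One possible pattern is to force a recalculation only when the caller knows the data has just changed, and to rely on the cache otherwise. The sketch below assumes `db` is a RushDB client as constructed earlier; `load_schema` and its `just_wrote` flag are hypothetical names for illustration.

```python
def load_schema(db, just_wrote: bool = False) -> str:
    """Fetch the ontology Markdown, forcing a recalculation after recent writes."""
    if just_wrote:
        # Bypass the shared 1-hour cache so new labels and properties show up.
        response = db.ai.get_ontology_markdown({"force": True})
    else:
        # Use the cached ontology when it is still within its TTL.
        response = db.ai.get_ontology_markdown()
    return response.data
```

Since a forced call triggers a full graph scan, reserving it for moments such as the end of a bulk import keeps routine reads on the cached path.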


See also