AI & Semantic Search

RushDB is a self-aware memory layer for agents, humans, and apps. It continuously understands its own structure — labels, fields, value distributions, relationships — and exposes that knowledge so agents can reason over real data without hallucinating schema details, and apps can retrieve semantically relevant context on demand.

The db.ai namespace covers three capabilities:

| Capability | Description |
|---|---|
| Graph Ontology | Self-describing schema discovery: label names, field types, value ranges, and the relationship map — always up to date |
| Embedding Indexes | Per-label vector policies that turn string properties into long-term semantic memory |
| Semantic Search | Cosine/euclidean similarity retrieval over indexed properties, for agents and apps alike |

How it fits together

┌──────────────────────────────────────────────────────┐
│ Your data (records + relationships)                  │
│                                                      │
│   BOOK { title: "...", description: "..." }          │
└──────────────────────────┬───────────────────────────┘
                           │  db.ai.indexes.create()
                           ▼
┌──────────────────────────────────────────────────────┐
│ Embedding index policy                               │
│   label: BOOK   property: description   dims: 1536   │
│   sourceType: managed | external                     │
└──────────────────────────┬───────────────────────────┘
                           │  backfill (managed) / inline vectors (external)
                           ▼
┌──────────────────────────────────────────────────────┐
│ Vector stored on VALUE relationship                  │
│   rel._emb_managed_cosine_1536 = [0.1, 0.2, ...]     │
└──────────────────────────┬───────────────────────────┘
                           │  db.ai.search({ query / queryVector })
                           ▼
┌──────────────────────────────────────────────────────┐
│ Records ranked by similarity score                   │
│   result.get('__score') == 0.94  (cosine sim.)       │
└──────────────────────────────────────────────────────┘
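The `__score` in the final step is a plain cosine similarity between the query vector and each stored vector. A minimal, dependency-free sketch of that ranking step (illustrative only — RushDB computes this server-side; the `records` data here is made up, and real indexes use e.g. 1536 dimensions rather than 2):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical stored vectors keyed by record id (2-d for readability)
records = {
    "book_a": [0.9, 0.1],
    "book_b": [0.0, 1.0],
}
query_vector = [1.0, 0.0]

# Rank records by similarity to the query, best first
ranked = sorted(
    records.items(),
    key=lambda kv: cosine_similarity(query_vector, kv[1]),
    reverse=True,
)
# ranked[0] is "book_a" — the record closest to the query
```

A score of `1.0` means the vectors point in the same direction; `0.0` means they are orthogonal (unrelated).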

| Topic | Description |
|---|---|
| Ontology | Schema discovery with `get_ontology_markdown` / `get_ontology` |
| Indexing | Create and manage managed embedding indexes |
| Advanced Indexing — BYOV | Bring Your Own Vectors: external indexes, inline writes |
| Semantic Search | Query by meaning with `db.ai.search()` |
| Writing with Vectors | Attach vectors at `create` / `upsert` / `import_json` time |
| Agent Skills | Installable skills that teach any compatible agent to use RushDB |

Graph Ontology

The ontology methods expose a live snapshot of your database structure — without any manual schema definitions.

Get Ontology as Markdown

db.ai.get_ontology_markdown()

Returns the full schema as compact Markdown — the recommended format for LLM context injection.

db.ai.get_ontology_markdown(
    params: dict | None = None,  # {"labels": ["Order"]} to scope output
                                 # {"force": True} to bypass the 1-hour cache
    transaction=None
) -> ApiResponse[str]
from rushdb import RushDB

db = RushDB("RUSHDB_API_KEY")

# Inject into LLM at session start
response = db.ai.get_ontology_markdown()
schema = response.data

messages = [
    {"role": "system", "content": f"You are a data assistant.\n\n{schema}"},
    {"role": "user", "content": "How many paid orders are there?"},
]

# Scope to specific labels
order_response = db.ai.get_ontology_markdown({"labels": ["Order"]})

# Bypass the 1-hour cache and force a fresh recalculation
fresh_response = db.ai.get_ontology_markdown({"force": True})
Example output
# Graph Ontology

## Labels

| Label | Count |
|-----------|------:|
| `Order` | 1840 |
| `User` | 312 |
| `Product` | 95 |

---

## `Order` (1840 records)

### Properties

| Property | Type | Values / Range | Semantic Search |
|-------------|----------|----------------------------------------|--------------------------------|
| `status` | string | `pending`, `paid`, `shipped` (+2 more) | — |
| `total` | number | `4.99`..`2499.00` | — |
| `name` | string | `Widget A`, `Widget B` (+8 more) | `managed` cosine 1536d [ready] |
| `createdAt` | datetime | `2024-01-03`..`2026-02-27` | — |

### Relationships

| Type | Direction | Other Label |
|-------------|-----------|-------------|
| `PLACED_BY` | out | `User` |
| `CONTAINS` | out | `Product` |

Get Ontology (raw)

db.ai.get_ontology()

Returns the same ontology as a structured list of dicts — useful for schema UIs, auto-complete, or looking up property IDs for db.properties.values().

db.ai.get_ontology(
    params: dict | None = None,  # {"labels": ["Order"]} to scope output
                                 # {"force": True} to bypass the 1-hour cache
    transaction=None
) -> ApiResponse[list[dict]]
# List all labels with counts
response = db.ai.get_ontology()
for item in response.data:
    print(f"{item['label']}: {item['count']} records")

# Look up property ID for value enumeration
response = db.ai.get_ontology({"labels": ["Order"]})
order_schema = response.data[0]
status_prop = next(p for p in order_schema["properties"] if p["name"] == "status")

values_response = db.properties.values({"id": status_prop["id"]})
# ['pending', 'paid', 'shipped', 'cancelled', 'refunded']

# Identify semantically-searchable properties
indexed = [p for p in order_schema["properties"] if p.get("vectorIndexes")]
# indexed[0]["vectorIndexes"][0]["status"] == "ready" → queryable with db.ai.search()

# Bypass the 1-hour cache
fresh = db.ai.get_ontology({"force": True})

Each item in response.data:

{
    "label": str,
    "count": int,
    "properties": [
        {
            "id": str,                  # use with db.properties.values()
            "name": str,
            "type": str,                # 'string' | 'number' | 'boolean' | 'datetime'
            "values": list,             # up to 10 samples (string/boolean only)
            "min": str | float | None,  # number/datetime only
            "max": str | float | None,
            # non-empty when embedding indexes exist — property is queryable with db.ai.search()
            "vectorIndexes": [
                {
                    "id": str,
                    "sourceType": str,          # 'managed' | 'external'
                    "similarityFunction": str,  # 'cosine' | 'euclidean'
                    "dimensions": int,
                    "status": str,              # 'pending' | 'indexing' | 'ready' | 'error'
                    "modelKey": str,
                }
            ],  # omitted (or empty list) when no index exists
        }
    ],
    "relationships": [
        {
            "label": str,
            "type": str,
            "direction": str,  # 'in' | 'out'
        }
    ],
}
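Given that shape, a small helper can pull out every property that is ready for semantic search — useful when deciding at runtime which properties `db.ai.search()` can target. A sketch against the structure above (the helper name and sample data are ours, not part of the SDK):

```python
def searchable_properties(ontology: list[dict]) -> list[tuple[str, str]]:
    """Return (label, property) pairs whose embedding index status is 'ready'."""
    ready = []
    for item in ontology:
        for prop in item["properties"]:
            # "vectorIndexes" may be omitted or an empty list when no index exists
            for index in prop.get("vectorIndexes") or []:
                if index["status"] == "ready":
                    ready.append((item["label"], prop["name"]))
    return ready

# Sample shaped like response.data from db.ai.get_ontology()
sample = [{
    "label": "Order",
    "count": 1840,
    "properties": [
        {"id": "p1", "name": "status", "type": "string", "values": ["pending"]},
        {"id": "p2", "name": "name", "type": "string", "values": [],
         "vectorIndexes": [{"id": "v1", "sourceType": "managed",
                            "similarityFunction": "cosine", "dimensions": 1536,
                            "status": "ready", "modelKey": "..."}]},
    ],
    "relationships": [],
}]

searchable_properties(sample)  # [("Order", "name")]
```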
Caching

Both methods share a 1-hour cache per project. The first call after TTL expiry triggers a full graph scan; all subsequent calls within the hour are instant. Pass {"force": True} in params to bypass the cache and trigger an immediate recalculation.
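The same pattern is easy to mirror client-side if you want to avoid repeated network round-trips within a long session. A minimal TTL-cache sketch (our own illustration — the SDK does not expose such a class):

```python
import time

class TTLCache:
    """Cache computed values for a fixed TTL; force=True recomputes immediately."""

    def __init__(self, ttl_seconds: float = 3600.0):
        self.ttl = ttl_seconds
        self._store: dict = {}  # key -> (expires_at, value)

    def get_or_compute(self, key, compute, force=False):
        now = time.monotonic()
        entry = self._store.get(key)
        if not force and entry is not None and entry[0] > now:
            return entry[1]           # fresh hit: no recomputation
        value = compute()             # miss, expired, or forced refresh
        self._store[key] = (now + self.ttl, value)
        return value

# e.g. cache.get_or_compute("ontology", lambda: db.ai.get_ontology_markdown().data)
```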

Agent quickstart

Call db.ai.get_ontology_markdown() first in every AI session. Without it, models will hallucinate field and label names.
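To keep the injected context small, you can scope the ontology to the labels a question actually mentions, using the `{"labels": [...]}` param shown above. One naive way to pick those labels — simple substring matching; the helper is our own illustration, not part of the SDK:

```python
def relevant_labels(question: str, labels: list[str]) -> list[str]:
    """Keep labels whose (lowercased) name appears in the question text."""
    q = question.lower()
    return [label for label in labels if label.lower() in q]

labels = ["Order", "User", "Product"]  # e.g. collected from db.ai.get_ontology()
scoped = relevant_labels("How many paid orders are there?", labels)
# scoped == ["Order"] — pass as db.ai.get_ontology_markdown({"labels": scoped})
```

For anything beyond a demo you would likely replace the substring check with something sturdier, but the shape of the workflow is the same: detect labels, then request a scoped ontology.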


Agent Skills

@rushdb/skills is a collection of Agent Skills — installable instructions that teach any skills-compatible AI agent (Claude, GitHub Copilot, Cursor, Windsurf, and others) to use RushDB efficiently, without manual system prompt engineering.

npx skills add rush-db/rushdb --path packages/skills
| Skill | What it teaches |
|---|---|
| `rushdb-query-builder` | Discovery-first workflow, SearchQuery syntax, aggregation, relationship traversal, and semantic search |
| `rushdb-agent-memory` | Using RushDB as persistent structured memory — store, link, and semantically recall sessions, decisions, and entities |
| `rushdb-data-modeling` | LMPG model design, label/property naming conventions, nested JSON import, and schema evolution |
| `rushdb-faceted-search` | Build faceted filter UIs — discover properties and types, enumerate distinct values, map to widgets, assemble a live where clause |

Each skill bundles a SKILL.md with concise instructions and optional reference files (like the full SearchQuery spec) that the agent loads on demand.

MCP server vs. Agent Skills

The MCP server gives agents direct tool access to RushDB at runtime. Agent Skills teach agents how to use those tools correctly — they complement each other.