Agent-Safe Query Planning with Ontology First

LLMs that interact with databases without schema grounding make mistakes that are hard to catch: they invent label names, assume property shapes, and produce queries that return zero results or subtly wrong ones.

The fix is a disciplined execution loop: load the schema first, learn the query spec second, then build queries constrained to what actually exists. This tutorial teaches that loop in code and in the MCP server.


The guarded execution loop

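Every agent session that touches RushDB should follow this shape: load the ontology, validate the labels a query names, diagnose zero-result queries instead of guessing, and run the agent under the query builder prompt. The four steps below cover each in turn.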


Step 1: Load and ground on the ontology

At the start of every agent session, fetch the ontology and store the exact label and property names observed.

import os

from rushdb import RushDB

db = RushDB(os.environ["RUSHDB_API_KEY"], base_url="https://api.rushdb.com/api/v1")


def load_ontology() -> str:
    """Returns ontology markdown for injecting into agent system prompt."""
    return db.ai.get_ontology_markdown()


schema_context = load_ontology()

Step 2: Validate label names before querying

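When the agent generates a query, validate that every label named in labels and where appears in the ontology before executing. The example below checks only the top-level labels list; labels referenced inside where need the same check.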

def extract_labels(ontology) -> set:
    """Collect label names from a list-shaped ontology response."""
    labels = set()
    if isinstance(ontology, list):
        for entry in ontology:
            if "label" in entry:
                labels.add(entry["label"])
    return labels


def safe_find(query: dict) -> object:
    """Run the query only if every requested label exists in the ontology."""
    ontology = db.ai.get_ontology()
    known_labels = extract_labels(ontology)

    requested_labels = query.get("labels", [])
    unknown = [label for label in requested_labels if label not in known_labels]
    if unknown:
        raise ValueError(
            f"Unknown labels: {unknown}. Known: {list(known_labels)}"
        )
    return db.records.find(query)


try:
    result = safe_find({"labels": ["CUSTOMER"], "where": {"status": "active"}, "limit": 10})
except ValueError as e:
    print(e)
    # Agent: re-check ontology, pick correct label, retry
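
When safe_find rejects a label, the agent needs something concrete to retry with. One possible approach, sketched here with the standard library's difflib, is to propose the closest known labels. Note that suggest_labels is a hypothetical helper and not part of the RushDB SDK, and the 0.6 cutoff is an arbitrary starting point rather than a tuned value.

```python
from difflib import get_close_matches


def suggest_labels(
    unknown: list[str], known_labels: set[str], cutoff: float = 0.6
) -> dict[str, list[str]]:
    """Map each unknown label to the closest known labels (hypothetical helper)."""
    # Compare case-insensitively, then map back to the exact ontology spelling.
    # Labels that differ only by case collapse to one entry here.
    by_lower = {label.lower(): label for label in known_labels}
    suggestions = {}
    for label in unknown:
        matches = get_close_matches(label.lower(), list(by_lower), n=3, cutoff=cutoff)
        suggestions[label] = [by_lower[m] for m in matches]
    return suggestions
```

An empty suggestion list is a useful signal in itself: the agent should go back to the ontology rather than keep guessing at variations of the same name.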

Step 3: Handle zero-result queries without hallucinating

When a query returns zero results, the agent should widen the filter — not invent records or claim they exist.

def query_with_fallback(label: str, filter_dict: dict) -> object:
    result = db.records.find({"labels": [label], "where": filter_dict, "limit": 10})

    if not result.data:
        print("Zero results. Diagnosing filter…")
        # Drop one filter key at a time to find the one that eliminates every match.
        for key in list(filter_dict.keys()):
            partial_where = {k: v for k, v in filter_dict.items() if k != key}
            partial = db.records.find({"labels": [label], "where": partial_where, "limit": 1})
            if partial.total > 0:
                print(f'Key "{key}" = "{filter_dict[key]}" eliminates all results')
                # Show the values that actually exist for this key, most common first.
                dist = db.records.find({
                    "labels": [label],
                    "select": {
                        "count": {"$count": "*"},
                        key: f"$record.{key}"
                    },
                    "groupBy": [key, "count"],
                    "orderBy": {"count": "desc"},
                    "limit": 10
                })
                print(f'Actual "{key}" values:', dist.data)
                break

    return result

Step 4: Use the MCP query builder prompt in agent sessions

RushDB's MCP server provides a getQueryBuilderPrompt tool that returns a system prompt enforcing ontology-first behavior. Inject it into your agent's system message.

In Claude or Cursor:

Use the getQueryBuilderPrompt tool to load your operating instructions before making any queries.

In code:

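# Build the system context from both sources.
# query_builder_prompt is assumed to already hold the text returned by the
# getQueryBuilderPrompt MCP tool (fetched through your MCP client).
system_prompt = "\n".join([
    query_builder_prompt,  # from getQueryBuilderPrompt MCP tool
    "",
    "## Current Schema",
    schema_context,  # from get_ontology_markdown()
])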

The five rules for agent-safe queries

These rules prevent the most common agent mistakes:

  1. Always call getOntologyMarkdown first; never assume label names from memory or conversation history
  2. Use only labels that appear in the ontology; invented labels return zero results silently
  3. Use only property names that appear for those labels; unknown properties in where are ignored without error, producing misleading results
  4. Enumerate categorical values before filtering; never guess status/type/category strings
  5. Test relationship direction before building traversal queries; a wrong direction returns zero results instead of an error
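
Rules 2 and 3 can be checked mechanically before a query is sent. The sketch below assumes the ontology has already been reduced to a plain dict mapping each label to its set of property names; that shape is an assumption made for illustration, since the structure returned by get_ontology is not shown here. It inspects only the top-level keys of where, skips keys starting with `$` on the assumption that they are operators, and does not handle filters on related records. The preflight function is a hypothetical helper, not a RushDB API.

```python
def preflight(query: dict, schema: dict[str, set[str]]) -> list[str]:
    """Return a list of problems; an empty list means only known names are used."""
    problems = []
    labels = query.get("labels", [])
    unknown_labels = [label for label in labels if label not in schema]
    if unknown_labels:
        problems.append(f"unknown labels: {sorted(unknown_labels)}")

    # Union of the properties declared for the requested labels that do exist.
    allowed = set()
    for label in labels:
        allowed |= schema.get(label, set())

    for key in query.get("where", {}):
        if key.startswith("$"):
            continue  # assumed to be an operator, not a property name
        if key not in allowed:
            problems.append(f"unknown property: {key}")
    return problems
```

Returning a list of problems rather than raising on the first one lets the agent correct every bad name in a single retry.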

Production caveat

Ontology grounding is a first-call overhead: one getOntologyMarkdown request per agent session. For high-throughput agents that execute many queries per session, cache the ontology for the session duration and invalidate it if a query returns unexpected zero results (which may indicate a schema change mid-session).
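
A session-scoped cache along these lines could look like the following minimal sketch. OntologyCache is a hypothetical class, not part of the RushDB SDK; it accepts any zero-argument loader, so a function such as load_ontology from Step 1 could be passed in directly.

```python
from typing import Callable, Optional


class OntologyCache:
    """Caches one ontology per session; loader is any zero-argument callable."""

    def __init__(self, loader: Callable[[], str]):
        self._loader = loader
        self._cached: Optional[str] = None

    def get(self) -> str:
        # Load on first use only; later calls reuse the cached copy.
        if self._cached is None:
            self._cached = self._loader()
        return self._cached

    def invalidate(self) -> None:
        # Call after an unexpected zero-result query so the next get() reloads.
        self._cached = None
```

Injecting the loader keeps the cache independent of the SDK and easy to exercise in tests with a stub function.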


Next steps