Semantic Search

RushDB lets you search records by meaning, not just exact field values. Create an embedding index on any string property and every record that carries that property becomes searchable by natural-language similarity — while still supporting all the usual field filters, pagination, and graph traversal.

How It Works

1. Create an index: Pick a label and a string property. RushDB creates a vector index policy.
2. Embed: For managed indexes, RushDB generates embeddings automatically on write and backfills existing records. For external (BYOV) indexes, your application supplies pre-computed vectors.
3. Search: Pass a natural-language query (managed) or a pre-computed vector (external). RushDB ranks candidates by cosine or Euclidean similarity and returns scored results.
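
To make the ranking step concrete, here is a minimal TypeScript sketch of the two metrics named above, computed in memory. It is not RushDB's implementation: the function names (`cosineSimilarity`, `euclideanDistance`, `rankByCosine`) and the sample vectors are illustrative assumptions, and how RushDB maps a raw metric onto the score it returns is not specified here.

```typescript
// Illustrative only: plain in-memory versions of the two metrics.
// All names and sample data are hypothetical, not part of the RushDB SDK.

function assertSameLength(a: number[], b: number[]): void {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
}

function dot(a: number[], b: number[]): number {
  assertSameLength(a, b);
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Cosine similarity: 1 = same direction, 0 = orthogonal, -1 = opposite.
function cosineSimilarity(a: number[], b: number[]): number {
  const denom = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
  if (denom === 0) {
    throw new Error("cosine similarity is undefined for a zero vector");
  }
  return dot(a, b) / denom;
}

// Euclidean distance: 0 = identical, larger = further apart.
function euclideanDistance(a: number[], b: number[]): number {
  assertSameLength(a, b);
  const diff = a.map((x, i) => x - b[i]);
  return Math.sqrt(dot(diff, diff));
}

interface Candidate {
  id: string;
  vector: number[];
}

interface Scored {
  id: string;
  score: number;
}

// Score every candidate against the query vector, best match first.
function rankByCosine(queryVector: number[], candidates: Candidate[]): Scored[] {
  return candidates
    .map((c) => ({ id: c.id, score: cosineSimilarity(queryVector, c.vector) }))
    .sort((x, y) => y.score - x.score);
}

const ranked = rankByCosine([1, 0], [
  { id: "orthogonal", vector: [0, 1] },
  { id: "same-direction", vector: [2, 0] },
  { id: "opposite", vector: [-1, 0] },
]);
// Order of ids in `ranked`: same-direction, orthogonal, opposite.
```

Note that cosine similarity ignores vector length (`[2, 0]` scores the same as `[1, 0]`), whereas Euclidean distance does not, which is one reason the choice of metric usually follows the embedding model in use.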

Managed vs. External Indexes

Aspect	Managed	External (BYOV)
Embeddings generated by	RushDB (server-side)	Your application
Write flow	Automatic on record create/update	Supply vectors via `upsertVectors` or inline on write
Search input	Natural-language `query` string	Pre-computed `queryVector` array
Model control	RushDB-managed model	Any model, any dimension

Both types store vectors on the value relationship between the property node and the record node, using Neo4j's native vector index for fast retrieval.

Combining with Field Filters

Semantic search is not an either/or choice: it composes with RushDB's structured query capabilities. Pass a `where` clause to pre-filter candidates before similarity ranking:

Search "space exploration"
  WHERE genre = "sci-fi" AND year >= 2000
  LIMIT 10

This narrows the vector search to only matching records, keeping results precise and fast.
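
The filter-then-rank order can be sketched in a few lines of TypeScript. This is an in-memory illustration under stated assumptions, not the RushDB query engine: the `Movie` type, the `semanticSearch` helper, the placeholder titles, and the toy two-dimensional vectors are all hypothetical.

```typescript
// Illustrative only: pre-filter first, then rank the survivors by similarity.
// `Movie`, `semanticSearch`, and the toy 2-D vectors are hypothetical.

interface Movie {
  title: string;
  genre: string;
  year: number;
  vector: number[]; // embedding of some text property
}

function cosine(a: number[], b: number[]): number {
  let dotProduct = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dotProduct += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dotProduct / (Math.sqrt(normA) * Math.sqrt(normB));
}

function semanticSearch(
  movies: Movie[],
  queryVector: number[],
  where: (m: Movie) => boolean,
  limit: number
): Array<{ title: string; score: number }> {
  return movies
    .filter(where) // structured pre-filter
    .map((m) => ({ title: m.title, score: cosine(queryVector, m.vector) }))
    .sort((x, y) => y.score - x.score) // best match first
    .slice(0, limit);
}

const catalog: Movie[] = [
  { title: "A", genre: "sci-fi", year: 2014, vector: [1, 0] },
  { title: "B", genre: "sci-fi", year: 1968, vector: [1, 0] },
  { title: "C", genre: "drama", year: 2010, vector: [1, 0] },
  { title: "D", genre: "sci-fi", year: 2009, vector: [0.6, 0.8] },
];

// WHERE genre = "sci-fi" AND year >= 2000, LIMIT 10
const hits = semanticSearch(
  catalog,
  [1, 0],
  (m) => m.genre === "sci-fi" && m.year >= 2000,
  10
);
// `hits` contains A then D; B and C were removed before any scoring.
```

Because B and C never reach the scoring step, their perfect vector match cannot leak into the results, which is the practical effect of filtering before ranking rather than after.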

Two Ways to Search Semantically

1. `db.ai.search()` — dedicated semantic endpoint

The simplest path. Returns records ranked by similarity score (`__score`):

- Accepts `query` (text) for managed indexes or `queryVector` for external indexes.
- Supports `where` pre-filtering, `limit`, and `skip`.
- Results are always ordered by `__score` descending (best match first).
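
As a rough illustration of the two input modes, the sketch below models a search input using only the field names listed above and checks it before sending. Everything else is an assumption: the `SemanticSearchInput` type and `validateSearchInput` helper are hypothetical, the rule that exactly one of `query` or `queryVector` is present is inferred rather than documented here, the `where` object is a placeholder shape, and the real call may require further fields (for example, which property to search). Consult the SDK reference for the actual signature.

```typescript
// Hypothetical input shape built only from the fields named above.
// Not the RushDB SDK type; the real call may require additional fields.

interface SemanticSearchInput {
  query?: string; // natural-language text, for managed indexes
  queryVector?: number[]; // pre-computed embedding, for external (BYOV) indexes
  where?: { [field: string]: unknown }; // placeholder filter shape
  limit?: number;
  skip?: number;
}

// Assumption: a request carries exactly one of `query` or `queryVector`.
function validateSearchInput(input: SemanticSearchInput): SemanticSearchInput {
  const hasQuery = typeof input.query === "string" && input.query.trim() !== "";
  const hasVector = Array.isArray(input.queryVector) && input.queryVector.length > 0;
  if (hasQuery === hasVector) {
    throw new Error("provide exactly one of `query` or `queryVector`");
  }
  if (input.limit !== undefined && (Math.floor(input.limit) !== input.limit || input.limit < 1)) {
    throw new Error("`limit` must be a positive integer");
  }
  if (input.skip !== undefined && (Math.floor(input.skip) !== input.skip || input.skip < 0)) {
    throw new Error("`skip` must be a non-negative integer");
  }
  return input;
}

// Managed index: send text and let the server embed it.
const managedRequest = validateSearchInput({
  query: "space exploration",
  where: { genre: "sci-fi" },
  limit: 10,
});

// External (BYOV) index: send a vector computed by your own model.
const externalRequest = validateSearchInput({
  queryVector: [0.12, -0.03, 0.98],
  limit: 5,
});
```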

2. `vector.similarity` aggregation in SearchQuery

For advanced use cases, add a `vector.similarity.cosine` or `vector.similarity.euclidean` aggregation to any `db.records.find()` call. This gives you the full SearchQuery feature set (`groupBy`, `collect`, multi-hop relationships) alongside similarity scoring.

→ See Search — Select Expressions for the aggregation syntax.

Index Lifecycle

State	Description
`pending`	Index created, backfill not yet started.
`indexing`	Backfill in progress — existing records are being embedded.
`ready`	All records indexed. New records are embedded on write automatically.

You can check index status at any time and list all indexes for a project.
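
One common use of the status check is to wait for `ready` before relying on complete results. The sketch below shows that loop with the three states from the table. It is a simplified, synchronous illustration: `pollUntilReady` is a hypothetical helper, and `getState` stands in for whatever status call your client exposes, so a real version would await a network request and pause between polls.

```typescript
// Illustrative only: wait for an index to reach `ready`.
// `getState` is a stand-in for a real status call; it is hypothetical.

type IndexState = "pending" | "indexing" | "ready";

// Polls up to `maxPolls` times and returns every state observed, ending in "ready".
function pollUntilReady(getState: () => IndexState, maxPolls: number): IndexState[] {
  const seen: IndexState[] = [];
  for (let i = 0; i < maxPolls; i++) {
    const state = getState();
    seen.push(state);
    if (state === "ready") return seen;
    // A real client would sleep here before polling again.
  }
  throw new Error(
    `index not ready after ${maxPolls} polls (last state: ${seen[seen.length - 1]})`
  );
}

// Fake status source that walks through the lifecycle, for demonstration.
const scripted: IndexState[] = ["pending", "indexing", "indexing", "ready"];
let cursor = 0;
const observed = pollUntilReady(
  () => scripted[Math.min(cursor++, scripted.length - 1)],
  10
);
// `observed` is ["pending", "indexing", "indexing", "ready"].
```

Capping the number of polls keeps a stalled backfill from blocking the caller indefinitely; the cap and the delay between polls would depend on how many records need embedding.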

When to Use Semantic Search

Scenario	Approach
User knows the exact value	Structured `where` filter
User describes what they want in natural language	`db.ai.search()` with `query`
Combine meaning + exact constraints	`db.ai.search()` with `where` pre-filter
Need groupBy, collect, or multi-hop alongside similarity	`db.records.find()` with `vector.similarity` aggregation

→ See also Agent Memory Model for how semantic search fits into the three-layer retrieval stack.

Implementation Reference

Each interface covers search, indexing, and BYOV — pick the one that fits your stack:

TypeScript SDK: `db.ai.search` · `db.ai.indexes` · BYOV
Python SDK: `db.ai.search` · `db.ai.indexes` · BYOV
REST API: `POST /ai/search` · `/ai/indexes` · BYOV
