Semantic Search
RushDB lets you search records by meaning, not just exact field values. Create an embedding index on any string property and every record that carries that property becomes searchable by natural-language similarity — while still supporting all the usual field filters, pagination, and graph traversal.
How It Works
- Create an index — Pick a label and a string property. RushDB creates a vector index policy.
- Embed — For managed indexes, RushDB generates embeddings automatically on write and backfills existing records. For external (BYOV) indexes, your application supplies pre-computed vectors.
- Search — Pass a natural-language query (managed) or a pre-computed vector (external). RushDB ranks candidates by cosine or euclidean similarity and returns scored results.
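
To make the ranking step concrete, here is a minimal in-memory sketch of the two metrics named above. It is not RushDB's implementation: the function names, the `Candidate` shape, and the toy 3-dimensional vectors are all assumptions made for illustration, and how RushDB converts a euclidean distance into a score is not specified on this page.

```typescript
// Illustrative only: toy 3-dimensional vectors stand in for real embeddings.
type Vector = number[];

// Cosine similarity: 1 means same direction, 0 means orthogonal.
function cosineSimilarity(a: Vector, b: Vector): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Cosine is undefined for zero vectors; this sketch treats it as 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Euclidean distance: smaller means closer.
function euclideanDistance(a: Vector, b: Vector): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// Hypothetical candidate shape: an id plus its stored vector.
interface Candidate {
  id: string;
  vector: Vector;
}

// Score every candidate against the query and return best match first.
function rankByCosine(query: Vector, candidates: Candidate[]) {
  return candidates
    .map((c) => ({ id: c.id, score: cosineSimilarity(query, c.vector) }))
    .sort((x, y) => y.score - x.score);
}

const candidates: Candidate[] = [
  { id: "c", vector: [0, 1, 0] },
  { id: "b", vector: [0.9, 0.1, 0] },
  { id: "a", vector: [1, 0, 0] },
];
const ranked = rankByCosine([1, 0, 0], candidates);
```

Cosine similarity ignores vector length and compares direction only, while euclidean distance is sensitive to magnitude, so the two can order the same candidates differently when vectors are not normalized.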
Managed vs. External Indexes
| Aspect | Managed | External (BYOV) |
|---|---|---|
| Embeddings generated by | RushDB (server-side) | Your application |
| Write flow | Automatic on record create/update | Supply vectors via upsertVectors or inline on write |
| Search input | Natural-language query string | Pre-computed queryVector array |
| Model control | RushDB-managed model | Any model, any dimension |
Both types store vectors on the value relationship between the property node and the record node, using Neo4j's native vector index for fast retrieval.
Combining with Field Filters
Semantic search composes with RushDB's structured query capabilities rather than replacing them. Pass a where clause to pre-filter candidates before similarity ranking:
Search "space exploration"
WHERE genre = "sci-fi" AND year >= 2000
LIMIT 10
This narrows the vector search to only matching records, keeping results precise and fast.
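
The order of operations (filter first, then rank, then limit) can be sketched in memory as follows. This is a simplified model under stated assumptions, not RushDB's query engine: the `Movie` shape, the `filteredSearch` helper, and the 2-dimensional unit vectors are hypothetical, and the query vector merely stands in for the embedding of "space exploration".

```typescript
// Hypothetical record shape; genre and year mirror the example query above.
interface Movie {
  title: string;
  genre: string;
  year: number;
  embedding: number[]; // toy unit vector; real embeddings have many more dimensions
}

// Dot product; equals cosine similarity when both vectors are unit length.
function dot(a: number[], b: number[]): number {
  return a.reduce((sum, value, i) => sum + value * b[i], 0);
}

function filteredSearch(
  records: Movie[],
  where: (record: Movie) => boolean,
  queryVector: number[],
  limit: number,
): { title: string; score: number }[] {
  return records
    .filter(where) // 1. structured pre-filter
    .map((r) => ({ title: r.title, score: dot(queryVector, r.embedding) })) // 2. score survivors
    .sort((a, b) => b.score - a.score) // 3. best match first
    .slice(0, limit); // 4. LIMIT
}

const movies: Movie[] = [
  { title: "Interstellar", genre: "sci-fi", year: 2014, embedding: [1, 0] },
  { title: "2001: A Space Odyssey", genre: "sci-fi", year: 1968, embedding: [1, 0] },
  { title: "Apollo 13", genre: "drama", year: 1995, embedding: [0.8, 0.6] },
  { title: "Arrival", genre: "sci-fi", year: 2016, embedding: [0.6, 0.8] },
];

const hits = filteredSearch(
  movies,
  (m) => m.genre === "sci-fi" && m.year >= 2000,
  [1, 0], // stands in for the embedding of "space exploration"
  10,
);
```

Records that fail the filter are never scored, so a close semantic match outside the constraints (here, the 1968 film) cannot appear in the results.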
Two Ways to Search Semantically
1. db.ai.search() — dedicated semantic endpoint
The simplest path. Returns records ranked by similarity score (__score):
- Accepts query (text) for managed indexes or queryVector for external indexes.
- Supports where pre-filtering, limit, and skip.
- Results always ordered by __score descending (best match first).
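
The parameters listed above can be modelled as a pair of request shapes. The types below are assumptions written for illustration, not the SDK's published typings: only `query`, `queryVector`, `where`, `limit`, `skip`, and `__score` come from this page, while the type names, the helper functions, and the `$gte` operator syntax inside `where` are hypothetical.

```typescript
// Assumed parameter shapes for illustration; not the SDK's published types.
interface CommonParams {
  where?: Record<string, unknown>; // structured pre-filter
  limit?: number;
  skip?: number;
}

// Managed indexes take text; external (BYOV) indexes take a vector.
type ManagedParams = CommonParams & { query: string; queryVector?: undefined };
type ExternalParams = CommonParams & { queryVector: number[]; query?: undefined };
type SemanticSearchParams = ManagedParams | ExternalParams;

// Which index type a request targets, judged by its search input.
function indexKind(params: SemanticSearchParams): "managed" | "external" {
  return typeof params.query === "string" ? "managed" : "external";
}

// Results carry a __score alongside the record's own fields.
interface ScoredRecord {
  __score: number;
  [field: string]: unknown;
}

// Checks the documented ordering: __score descending, best match first.
function isBestFirst(results: ScoredRecord[]): boolean {
  return results.every((r, i) => i === 0 || results[i - 1].__score >= r.__score);
}

const managed: SemanticSearchParams = {
  query: "space exploration",
  where: { genre: "sci-fi", year: { $gte: 2000 } }, // operator syntax assumed
  limit: 10,
  skip: 0,
};

const external: SemanticSearchParams = {
  queryVector: [0.12, -0.03, 0.88], // toy vector from your own embedding model
  limit: 5,
};
```

Modelling the two inputs as a union makes it a compile-time error to pass both a text query and a vector in the same request, which matches the managed/external split described above.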
2. vector.similarity aggregation in SearchQuery
For advanced use cases, add a vector.similarity.cosine or vector.similarity.euclidean aggregation to any db.records.find() call. This gives you the full SearchQuery feature set (groupBy, collect, multi-hop relationships) alongside similarity scoring.
→ See Search — Select Expressions for the aggregation syntax.
Index Lifecycle
| State | Description |
|---|---|
| pending | Index created, backfill not yet started. |
| indexing | Backfill in progress — existing records are being embedded. |
| ready | All records indexed. New records are embedded on write automatically. |
You can check index status at any time and list all indexes for a project.
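
One way to act on these states is to poll until the index reports ready before relying on complete results. The sketch below is hypothetical: `waitUntilReady` and its injected `getStatus` callback are not RushDB functions, and you would wire the callback to whatever status call your SDK or API exposes. It models only the three states in the table and assumes no other states exist.

```typescript
// The three states documented in the table above.
type IndexStatus = "pending" | "indexing" | "ready";

// True once backfill has finished and every existing record is indexed.
function isFullyIndexed(status: IndexStatus): boolean {
  return status === "ready";
}

// Polls a caller-supplied status getter until the index is ready
// or the attempt budget runs out. Returns whether it became ready.
async function waitUntilReady(
  getStatus: () => Promise<IndexStatus>, // hypothetical: wrap your own status check
  maxAttempts = 30,
  delayMs = 1000,
): Promise<boolean> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    if (isFullyIndexed(await getStatus())) return true;
    // Wait before the next check so the server is not hammered.
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  }
  return false;
}
```

Returning a boolean instead of throwing on timeout leaves the decision to the caller, for example whether to search a partially backfilled index or to retry later.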
When to Use Semantic Search
| Scenario | Approach |
|---|---|
| User knows the exact value | Structured where filter |
| User describes what they want in natural language | db.ai.search() with query |
| Combine meaning + exact constraints | db.ai.search() with where pre-filter |
| Need groupBy, collect, or multi-hop alongside similarity | db.records.find() with vector.similarity aggregation |
→ See also Agent Memory Model for how semantic search fits into the three-layer retrieval stack.
Implementation Reference
Each interface covers search, indexing, and BYOV — pick the one that fits your stack: