Advanced Indexing — Bring Your Own Vectors
External indexes (BYOV — Bring Your Own Vectors) let you supply pre-computed embedding vectors instead of having the server compute them. Use them when you need:
- A custom or private model the server cannot access
- Multimodal embeddings (image, audio, document structure)
- Vectors already produced by your ML pipeline
- Reproducible embeddings not tied to the server's active model
Creating an external index
Pass external: true (shorthand) or sourceType: 'external' (explicit). Both are equivalent:
// ── shorthand ────────────────────────────────────────────────
const { data: extIndex } = await db.ai.indexes.create({
  label: 'Article',
  propertyName: 'body',
  external: true,
  dimensions: 768,
  similarityFunction: 'cosine',
})
// extIndex.sourceType === 'external'
// extIndex.status === 'awaiting_vectors'
// ── explicit ─────────────────────────────────────────────────
// Use this form instead of the shorthand, not in addition to it
const { data: extIndex } = await db.ai.indexes.create({
  label: 'Article',
  propertyName: 'body',
  sourceType: 'external',
  dimensions: 768,
  similarityFunction: 'cosine',
})
An external index starts with status awaiting_vectors and transitions to ready once at least one vector has been written.
Because the server never calls an embedding model, dimensions is required for external indexes.
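Because an external index has a fixed dimensions value, it can help to check vector length on the client before writing. The following is a minimal sketch: assertDimensions is a hypothetical helper, not part of the RushDB SDK, and how the server itself responds to a length mismatch is not covered here.

```typescript
// Hypothetical client-side guard; not part of the RushDB SDK.
// Throws if the vector cannot belong to an index with the given dimensions.
function assertDimensions(vector: number[], dimensions: number): void {
  if (vector.length !== dimensions) {
    throw new RangeError(
      `Expected a ${dimensions}-dimensional vector, got ${vector.length}`
    )
  }
  // NaN and Infinity usually indicate a bug in the embedding pipeline.
  if (!vector.every((x) => Number.isFinite(x))) {
    throw new TypeError('Vector contains a non-finite component')
  }
}
```

Calling assertDimensions(myVec, 768) just before a write surfaces pipeline mistakes (wrong model, truncated output) on the client rather than at request time.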
External vs managed comparison
| | Managed | External |
|---|---|---|
| sourceType | 'managed' | 'external' |
| Initial status | 'pending' | 'awaiting_vectors' |
| Who computes embeddings | RushDB server (via configured model) | Your application |
| dimensions | Optional (uses server default) | Required |
| Backfill for existing records | Automatic | Manual via upsertVectors or inline writes |
Upsert Vectors
db.ai.indexes.upsertVectors()
upsertVectors() is the bulk upload API. It is ideal for seeding an index from a dataset or for syncing after a batch pipeline run.
db.ai.indexes.upsertVectors(
  indexId: string,
  payload: { items: Array<{ recordId: string; vector: number[] }> }
): Promise<ApiResponse<void>>
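// Fetch the records that need vectors
const { data: records } = await db.records.find(
  { where: { __label: 'Article' } }
)
// MyEmbeddingModel is a placeholder for your own embedding client
const myEmbedder = new MyEmbeddingModel()
// Embed every record body; all embed calls run concurrently
const items = await Promise.all(
  records.map(async record => ({
    recordId: record.__id,
    vector: await myEmbedder.embed(record.body)
  }))
)
// Upload all vectors to the external index in one request
await db.ai.indexes.upsertVectors(extIndex.id, { items })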
The request is idempotent — calling it again with the same recordId replaces the stored vector.
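For large datasets it may be preferable to split items across several requests. The sketch below assumes a batch size of 500, which is an arbitrary illustrative value and not a documented RushDB limit. chunk and upsertInBatches are hypothetical helpers, and the upsert callback stands in for a call to db.ai.indexes.upsertVectors().

```typescript
type VectorItem = { recordId: string; vector: number[] }

// Split items into consecutive batches of at most `size` entries.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer')
  }
  const batches: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size))
  }
  return batches
}

// `upsert` is injected so the helper does not depend on the SDK.
// Returns the number of items sent.
async function upsertInBatches(
  items: VectorItem[],
  upsert: (batch: VectorItem[]) => Promise<unknown>,
  batchSize = 500 // assumed value, not a documented limit
): Promise<number> {
  let sent = 0
  for (const batch of chunk(items, batchSize)) {
    await upsert(batch) // sequential, so a failure stops at a known offset
    sent += batch.length
  }
  return sent
}
```

In an application the callback might be (batch) => db.ai.indexes.upsertVectors(extIndex.id, { items: batch }). Since the request is idempotent, re-sending a batch after a failure replaces the same vectors rather than duplicating them.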
Writing vectors at record creation time
Instead of a two-step create → upsertVectors flow, you can write vectors inline using the vectors parameter on any write operation. The server resolves the correct external index automatically.
See Write Operations with Vectors for the full reference.
// One-step: create record AND write its vector
const { data: record } = await db.records.create({
  label: 'Article',
  data: { title: 'Warp drives', body: 'Alcubierre metric...' },
  vectors: [{ propertyName: 'body', vector: myVec }]
})
Disambiguation
When the same (label, propertyName) pair is covered by more than one external index (different similarityFunction or dimensions), RushDB cannot determine which index to use without extra information.
Specify similarityFunction to resolve the ambiguity:
// Two indexes on Product:embedding (cosine and euclidean)
await db.ai.indexes.create({
  label: 'Product', propertyName: 'embedding', external: true,
  similarityFunction: 'cosine', dimensions: 768,
})
await db.ai.indexes.create({
  label: 'Product', propertyName: 'embedding', external: true,
  similarityFunction: 'euclidean', dimensions: 768,
})
// ✅ explicit: writes to the cosine index only
await db.records.create({
  label: 'Product',
  data: { name: 'Widget' },
  vectors: [{
    propertyName: 'embedding',
    vector: vec,
    similarityFunction: 'cosine', // <-- required when ambiguous
  }]
})
// ✅ explicit: searches the euclidean index only
await db.ai.search({
  label: 'Product',
  propertyName: 'embedding',
  queryVector: vec,
  similarityFunction: 'euclidean', // <-- required when ambiguous
})
// ❌ omitting similarityFunction when two indexes exist → 422 Unprocessable Entity
await db.records.create({
  label: 'Product',
  data: { name: 'Gadget' },
  vectors: [{ propertyName: 'embedding', vector: vec }],
})
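The resolution rule described above can be modelled in a few lines. This is only an illustrative sketch of the documented behaviour, not RushDB's server code: the ExternalIndex and VectorWrite types and the resolveIndex function are hypothetical, and the model covers only disambiguation by similarityFunction (indexes that differ only in dimensions are not handled).

```typescript
interface ExternalIndex {
  id: string
  label: string
  propertyName: string
  similarityFunction: string
  dimensions: number
}

interface VectorWrite {
  propertyName: string
  vector: number[]
  similarityFunction?: string
}

// Illustrative model: pick the single external index a write should target.
function resolveIndex(
  indexes: ExternalIndex[],
  label: string,
  write: VectorWrite
): ExternalIndex {
  const candidates = indexes.filter(
    (idx) =>
      idx.label === label &&
      idx.propertyName === write.propertyName &&
      // An omitted similarityFunction matches every index on the pair.
      (write.similarityFunction === undefined ||
        idx.similarityFunction === write.similarityFunction)
  )
  const [match, ...rest] = candidates
  if (match === undefined) {
    throw new Error(`No external index for ${label}:${write.propertyName}`)
  }
  if (rest.length > 0) {
    // Corresponds to the 422 case: more than one index still matches.
    throw new Error(
      `Ambiguous index for ${label}:${write.propertyName}; specify similarityFunction`
    )
  }
  return match
}
```

With the two Product:embedding indexes from the example, a write that names 'cosine' resolves to exactly one index, while a write with no similarityFunction leaves two candidates and is rejected.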
Index signature uniqueness
Two index policies are considered identical (and a second create returns 409 Conflict) when all five fields match:
| Field | Effect on uniqueness |
|---|---|
| label | ✅ |
| propertyName | ✅ |
| sourceType | ✅ |
| similarityFunction | ✅ |
| dimensions | ✅ |
Changing any one field produces a distinct index and both are allowed to coexist.
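One way to picture the uniqueness rule is as a composite key over the five fields. The sketch below is a client-side illustration only: IndexPolicy, signatureOf, and wouldConflict are hypothetical names, and the server's actual comparison logic is not shown on this page.

```typescript
interface IndexPolicy {
  label: string
  propertyName: string
  sourceType: 'managed' | 'external'
  similarityFunction: string
  dimensions: number
}

// Build a composite key from the five fields that define uniqueness.
// JSON.stringify on an array avoids collisions from separator characters.
function signatureOf(p: IndexPolicy): string {
  return JSON.stringify([
    p.label,
    p.propertyName,
    p.sourceType,
    p.similarityFunction,
    p.dimensions,
  ])
}

// True when `candidate` matches an existing policy on all five fields,
// i.e. the case in which a second create would return 409 Conflict.
function wouldConflict(existing: IndexPolicy[], candidate: IndexPolicy): boolean {
  const seen = new Set(existing.map(signatureOf))
  return seen.has(signatureOf(candidate))
}
```

A policy that differs in just one field, for example dimensions 512 instead of 768, produces a different key and so would not conflict.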
Complete BYOV worked example
import RushDB from '@rushdb/javascript-sdk'
const db = new RushDB('your-api-key')
// 1. Create the external index
const { data: idx } = await db.ai.indexes.create({
  label: 'Doc',
  propertyName: 'content',
  external: true,
  dimensions: 3,
  similarityFunction: 'cosine',
})
// 2. Create records + write inline vectors (one round trip per record)
const articles = [
  { title: 'Alpha', content: 'First article', vector: [1, 0, 0] },
  { title: 'Beta', content: 'Second article', vector: [0, 1, 0] },
  { title: 'Gamma', content: 'Third article', vector: [0, 0, 1] },
]
for (const { title, content, vector } of articles) {
  await db.records.create({
    label: 'Doc',
    data: { title, content },
    vectors: [{ propertyName: 'content', vector }],
  })
}
// 3. Search using a pre-computed query vector
const { data: results } = await db.ai.search({
  label: 'Doc',
  propertyName: 'content',
  queryVector: [1, 0, 0], // closest to Alpha
  limit: 3,
})
console.log(results[0].title) // 'Alpha'
console.log(results[0].__score) // ~1.0
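To see why Alpha ranks first, the same three vectors can be scored locally. This sketch assumes that __score for a cosine index corresponds to plain cosine similarity, which the example suggests but this page does not spell out. cosineSimilarity is a hypothetical helper written for illustration.

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|)
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new RangeError('Vectors must have the same length')
  }
  let dot = 0
  let normA = 0
  let normB = 0
  a.forEach((x, i) => {
    const y = b[i] as number // safe: lengths were checked above
    dot += x * y
    normA += x * x
    normB += y * y
  })
  if (normA === 0 || normB === 0) {
    throw new RangeError('Cosine similarity is undefined for a zero vector')
  }
  return dot / Math.sqrt(normA * normB)
}

const docs = [
  { title: 'Alpha', vector: [1, 0, 0] },
  { title: 'Beta', vector: [0, 1, 0] },
  { title: 'Gamma', vector: [0, 0, 1] },
]
const query = [1, 0, 0]

// Score every document against the query and sort best-first.
const ranked = docs
  .map((d) => ({ title: d.title, score: cosineSimilarity(query, d.vector) }))
  .sort((x, y) => y.score - x.score)
// Alpha scores 1; Beta and Gamma are orthogonal to the query and score 0.
```

The query vector is identical to Alpha's vector, so their similarity is exactly 1, while the other two documents share no non-zero component with the query.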
Batch import with createMany
For bulk seeding with flat rows, use records.createMany() with the top-level vectors parameter:
await db.records.createMany({
  label: "Doc",
  data: [
    { title: "Alpha", content: "First article" },
    { title: "Beta", content: "Second article" },
    { title: "Gamma", content: "Third article" },
  ],
  vectors: [
    [{ propertyName: "content", vector: [1, 0, 0] }],
    [{ propertyName: "content", vector: [0, 1, 0] }],
    [{ propertyName: "content", vector: [0, 0, 1] }],
  ],
})
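When rows already carry their vectors, the two parallel arrays can be derived from a single source so they cannot drift out of step. The sketch below assumes, as the layout of the example suggests, that vectors[i] belongs to data[i]. RowWithVector and toCreateManyPayload are hypothetical names introduced for illustration.

```typescript
interface RowWithVector {
  title: string
  content: string
  vector: number[]
}

// Split combined rows into the parallel `data` and `vectors` arrays.
// Both arrays are built from the same input, so index i in one always
// corresponds to index i in the other.
function toCreateManyPayload(
  label: string,
  propertyName: string,
  rows: RowWithVector[]
) {
  return {
    label,
    data: rows.map(({ title, content }) => ({ title, content })),
    vectors: rows.map(({ vector }) => [{ propertyName, vector }]),
  }
}
```

The result has the same shape as the object literal in the example, so it could be passed as db.records.createMany(toCreateManyPayload("Doc", "content", rows)).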
For nested JSON payloads, use importJson to create records and then call db.ai.indexes.upsertVectors() to seed the vectors separately.
Mixing managed and external indexes
You can have both a managed index and an external index on the same property simultaneously:
// Managed: server embeds for full-text search
await db.ai.indexes.create({ label: 'Product', propertyName: 'description' })
// External: your custom multimodal model
await db.ai.indexes.create({
  label: 'Product', propertyName: 'description',
  external: true, dimensions: 512, similarityFunction: 'cosine',
})
Specifying similarityFunction in db.ai.search() routes the query to the intended index.