Bring Your Own Vectors (BYOV)

External indexes let you supply pre-computed embedding vectors instead of having the server compute them. Use them when you need:

A custom or private model the server cannot access
Multimodal embeddings (image, audio, document structure)
Vectors already produced by your ML pipeline
Reproducible embeddings not tied to the server's active model

External vs Managed Comparison

Aspect	Managed	External
`sourceType`	`"managed"`	`"external"`
Initial status	`"pending"`	`"awaiting_vectors"`
Who computes embeddings	RushDB server (configured model)	Your application
`dimensions` required	No (uses server default)	Yes
Backfill for existing records	Automatic	Manual via `upsertVectors` / `upsert_vectors` or inline writes

Create an External Index

An external index starts with status awaiting_vectors and transitions to ready once at least one vector has been written.

Python
TypeScript
shell

db.ai.indexes.create()

response = db.ai.indexes.create({
    "label": "Article",
    "propertyName": "body",
    "sourceType": "external",
    "dimensions": 768,
    "similarityFunction": "cosine",
})
print(response.data["status"])  # 'awaiting_vectors'

dimensions is required for external indexes — the server cannot infer it without an embedding model.

db.ai.indexes.create()

// Shorthand: external: true
const { data: extIndex } = await db.ai.indexes.create({
  label: 'Article',
  propertyName: 'body',
  external: true,
  dimensions: 768,
  similarityFunction: 'cosine'
})
// extIndex.sourceType === 'external'
// extIndex.status    === 'awaiting_vectors'

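// Explicit: sourceType: 'external'
// Alternative to the shorthand above. Use one form or the other: running both
// would redeclare the variable, and the second create would return 409 Conflict.
const { data: extIndexExplicit } = await db.ai.indexes.create({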
  label: 'Article',
  propertyName: 'body',
  sourceType: 'external',
  dimensions: 768,
  similarityFunction: 'cosine'
})

POST /api/v1/ai/indexes

curl -X POST https://api.rushdb.com/api/v1/ai/indexes \
  -H "Authorization: Bearer $RUSHDB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "label": "Article",
    "propertyName": "body",
    "sourceType": "external",
    "dimensions": 768,
    "similarityFunction": "cosine"
  }'

{
  "data": {
    "id": "idx_abc",
    "label": "Article",
    "propertyName": "body",
    "sourceType": "external",
    "status": "awaiting_vectors",
    "dimensions": 768,
    "similarityFunction": "cosine"
  },
  "success": true
}
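
Because an external index fixes dimensions at creation time, it can help to check vector length on the client before writing. The sketch below is a hypothetical helper, not part of the RushDB SDK: validate_vector and its error messages are illustrative names, and how the server responds to a mismatched length is not covered on this page.

```python
import math
from typing import Sequence


def validate_vector(vector: Sequence[float], dimensions: int) -> list[float]:
    """Return the vector as a list of floats, or raise ValueError.

    Hypothetical client-side check; `dimensions` should be the value
    the external index was created with (768 in the example above).
    """
    if len(vector) != dimensions:
        raise ValueError(f"expected {dimensions} dimensions, got {len(vector)}")
    values = [float(x) for x in vector]
    if not all(math.isfinite(x) for x in values):
        raise ValueError("vector contains NaN or infinite values")
    return values
```

Calling this just before upsert_vectors or an inline write surfaces a wrong-sized embedding in your own pipeline rather than at request time.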

Bulk-Upsert Vectors

Use upsertVectors / upsert_vectors to seed an external index from an existing dataset or batch pipeline. The request is idempotent — calling it again with the same recordId replaces the stored vector.

Python
TypeScript
shell

db.ai.indexes.upsert_vectors(index_id, params)

# Fetch records and embed with your own model
records_response = db.records.find({"where": {"__label": "Article"}})

items = []
for record in records_response.data:
    vector = my_embedder.embed(record["body"])
    items.append({"recordId": record["__id"], "vector": vector})

db.ai.indexes.upsert_vectors(ext_index_id, {"items": items})

db.ai.indexes.upsertVectors(indexId, payload)

const { data: records } = await db.records.find({
  where: { __label: 'Article' }
})

const myEmbedder = new MyEmbeddingModel()
const items = await Promise.all(
  records.map(async (record) => ({
    recordId: record.__id,
    vector: await myEmbedder.embed(record.body)
  }))
)

await db.ai.indexes.upsertVectors(extIndex.id, { items })

POST /api/v1/ai/indexes/:id/vectors/upsert

curl -X POST https://api.rushdb.com/api/v1/ai/indexes/$INDEX_ID/vectors/upsert \
  -H "Authorization: Bearer $RUSHDB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "items": [
      { "recordId": "rec_abc", "vector": [0.1, 0.2, 0.3] },
      { "recordId": "rec_def", "vector": [0.4, 0.5, 0.6] }
    ]
  }'
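
For large datasets, one way to prepare the items list is to collapse duplicates and send it in chunks. The following is a minimal sketch under assumptions: dedupe_last_wins and batched are hypothetical helpers, and the batch size is an arbitrary illustration, since this page does not state a request size limit. The last-wins rule mirrors the replace behaviour described above for repeated recordId values.

```python
from typing import Iterator


def dedupe_last_wins(items: list[dict]) -> list[dict]:
    """Keep one item per recordId; a later entry replaces an earlier one."""
    latest: dict[str, dict] = {}
    for item in items:
        latest[item["recordId"]] = item
    return list(latest.values())


def batched(items: list[dict], batch_size: int) -> Iterator[list[dict]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Usage sketch (assumes `db`, `ext_index_id`, and `items` from the example above):
# for batch in batched(dedupe_last_wins(items), 500):  # 500 is an arbitrary size
#     db.ai.indexes.upsert_vectors(ext_index_id, {"items": batch})
```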

Inline Write (Preferred for New Records)

Instead of a two-step create → upsert_vectors flow, write vectors inline with any record write operation. See Write Records with Vectors for the full reference.

Python
TypeScript
shell

# One step: create record AND write its vector
record = db.records.create(
    label="Article",
    data={"title": "Warp drives", "body": "Alcubierre metric..."},
    vectors=[{"propertyName": "body", "vector": my_embedder.embed("Alcubierre metric...")}],
)

// One step: create record AND write its vector
const { data: record } = await db.records.create({
  label: 'Article',
  data: { title: 'Warp drives', body: 'Alcubierre metric...' },
  vectors: [{ propertyName: 'body', vector: await myEmbedder.embed('Alcubierre metric...') }]
})

curl -X POST https://api.rushdb.com/api/v1/records \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $RUSHDB_API_KEY" \
  -d '{
    "label": "Article",
    "data": { "title": "Warp drives", "body": "Alcubierre metric..." },
    "vectors": [{ "propertyName": "body", "vector": [0.1, 0.2, 0.3] }]
  }'

Disambiguation

When the same (label, propertyName) pair has multiple external indexes (e.g. cosine and euclidean), specify similarityFunction to resolve which index to use.

Python
TypeScript
shell

# Create two indexes on the same property
db.ai.indexes.create({
    "label": "Product", "propertyName": "embedding",
    "sourceType": "external", "similarityFunction": "cosine", "dimensions": 768,
})
db.ai.indexes.create({
    "label": "Product", "propertyName": "embedding",
    "sourceType": "external", "similarityFunction": "euclidean", "dimensions": 768,
})

# ✅ Write to the cosine index only
db.records.create(
    label="Product",
    data={"name": "Widget"},
    vectors=[{
        "propertyName": "embedding",
        "vector": vec,
        "similarityFunction": "cosine",   # required when ambiguous
    }],
)

# ✅ Search the euclidean index only
db.ai.search({
    "labels": ["Product"],
    "propertyName": "embedding",
    "queryVector": vec,
    "similarityFunction": "euclidean",
})

# ❌ Missing similarityFunction → 422 Unprocessable Entity
db.records.create(
    label="Product",
    data={"name": "Gadget"},
    vectors=[{"propertyName": "embedding", "vector": vec}],  # ambiguous!
)

// Create two indexes on the same property
await db.ai.indexes.create({
  label: 'Product',
  propertyName: 'embedding',
  external: true,
  similarityFunction: 'cosine',
  dimensions: 768
})
await db.ai.indexes.create({
  label: 'Product',
  propertyName: 'embedding',
  external: true,
  similarityFunction: 'euclidean',
  dimensions: 768
})

// ✅ Write to cosine index only
await db.records.create({
  label: 'Product',
  data: { name: 'Widget' },
  vectors: [
    {
      propertyName: 'embedding',
      vector: vec,
      similarityFunction: 'cosine' // required when ambiguous
    }
  ]
})

// ✅ Search euclidean index only
await db.ai.search({
  labels: ['Product'],
  propertyName: 'embedding',
  queryVector: vec,
  similarityFunction: 'euclidean' // required when ambiguous
})

// ❌ Omitting similarityFunction when two indexes exist → 422 Unprocessable Entity
await db.records.create({
  label: 'Product',
  data: { name: 'Gadget' },
  vectors: [{ propertyName: 'embedding', vector: vec }] // ambiguous!
})

# ✅ Write to cosine index only
curl -X POST https://api.rushdb.com/api/v1/records \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $RUSHDB_API_KEY" \
  -d '{
    "label": "Product",
    "data": { "name": "Widget" },
    "vectors": [
      { "propertyName": "embedding", "vector": [0.1, 0.9], "similarityFunction": "cosine" }
    ]
  }'
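
The resolution rule described in this section can be modelled locally as follows. This is an illustrative sketch only, not the server's implementation: IndexInfo, AmbiguousIndexError, and resolve_index are hypothetical names, and the sketch raises a Python exception where the API returns 422.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IndexInfo:
    label: str
    property_name: str
    similarity_function: str


class AmbiguousIndexError(Exception):
    """More than one index matches and no similarityFunction was given."""


def resolve_index(
    indexes: list[IndexInfo],
    label: str,
    property_name: str,
    similarity_function: Optional[str] = None,
) -> IndexInfo:
    # Start from every index on the (label, propertyName) pair.
    candidates = [
        i for i in indexes
        if i.label == label and i.property_name == property_name
    ]
    # Narrow by similarityFunction only when the caller supplied one.
    if similarity_function is not None:
        candidates = [
            i for i in candidates
            if i.similarity_function == similarity_function
        ]
    if not candidates:
        raise LookupError("no matching index")
    if len(candidates) > 1:
        raise AmbiguousIndexError("specify similarityFunction")
    return candidates[0]
```

With one index on the pair, the similarity function can be omitted; with two, omitting it leaves two candidates and the call fails, which matches the behaviour shown in the examples above.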

Index signature uniqueness

Two index policies are considered identical (and a second create returns 409 Conflict) when all five fields match:

Field	Effect on uniqueness
`label`	✅
`propertyName`	✅
`sourceType`	✅
`similarityFunction`	✅
`dimensions`	✅
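
One way to picture the uniqueness rule is as a five-field tuple compared for equality. The sketch below is a local model under that assumption; IndexSignature and would_conflict are hypothetical names and do not exist in the SDK.

```python
from typing import NamedTuple


class IndexSignature(NamedTuple):
    label: str
    property_name: str
    source_type: str
    similarity_function: str
    dimensions: int


def would_conflict(existing: set[IndexSignature], candidate: IndexSignature) -> bool:
    """True when all five fields match an existing index (the 409 case)."""
    return candidate in existing


existing = {IndexSignature("Product", "embedding", "external", "cosine", 768)}

# Same five fields: a second create would conflict.
same = IndexSignature("Product", "embedding", "external", "cosine", 768)
# Only similarityFunction differs: treated as a distinct index.
other = IndexSignature("Product", "embedding", "external", "euclidean", 768)
```

This is why the cosine and euclidean indexes in the Disambiguation examples can coexist on the same property.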

Complete BYOV Worked Example

Python
TypeScript
shell

from rushdb import RushDB

db = RushDB("your-api-key")

# 1. Create the external index
idx_response = db.ai.indexes.create({
    "label": "Doc",
    "propertyName": "content",
    "sourceType": "external",
    "dimensions": 3,
    "similarityFunction": "cosine",
})
ext_index_id = idx_response.data["id"]

# 2. Create records with inline vectors (one round trip per record)
articles = [
    {"title": "Alpha", "content": "First article",  "vector": [1, 0, 0]},
    {"title": "Beta",  "content": "Second article", "vector": [0, 1, 0]},
    {"title": "Gamma", "content": "Third article",  "vector": [0, 0, 1]},
]

for article in articles:
    db.records.create(
        label="Doc",
        data={"title": article["title"], "content": article["content"]},
        vectors=[{"propertyName": "content", "vector": article["vector"]}],
    )

# 3. Search using a pre-computed query vector
response = db.ai.search({
    "labels": ["Doc"],
    "propertyName": "content",
    "queryVector": [1, 0, 0],   # closest to Alpha
    "limit": 3,
})

print(response.data[0].get("title"))    # "Alpha"
print(response.data[0].get("__score")) # ~1.0

import RushDB from '@rushdb/javascript-sdk'

const db = new RushDB('your-api-key')

// 1. Create the external index
const { data: idx } = await db.ai.indexes.create({
  label: 'Doc',
  propertyName: 'content',
  external: true,
  dimensions: 3,
  similarityFunction: 'cosine'
})

// 2. Create records + write inline vectors (one round trip per record)
const articles = [
  { title: 'Alpha', content: 'First article', vector: [1, 0, 0] },
  { title: 'Beta', content: 'Second article', vector: [0, 1, 0] },
  { title: 'Gamma', content: 'Third article', vector: [0, 0, 1] }
]

for (const { title, content, vector } of articles) {
  await db.records.create({
    label: 'Doc',
    data: { title, content },
    vectors: [{ propertyName: 'content', vector }]
  })
}

// 3. Search using a pre-computed query vector
const { data: results } = await db.ai.search({
  labels: ['Doc'],
  propertyName: 'content',
  queryVector: [1, 0, 0], // closest to Alpha
  limit: 3
})

console.log(results[0].data.title) // "Alpha"
console.log(results[0].data.__score) // ~1.0

# 1. Create the external index
curl -X POST https://api.rushdb.com/api/v1/ai/indexes \
  -H "Authorization: Bearer $RUSHDB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"label":"Doc","propertyName":"content","sourceType":"external","dimensions":3,"similarityFunction":"cosine"}'

# 2. Create a record with inline vectors
curl -X POST https://api.rushdb.com/api/v1/records \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $RUSHDB_API_KEY" \
  -d '{"label":"Doc","data":{"title":"Alpha","content":"First article"},"vectors":[{"propertyName":"content","vector":[1,0,0]}]}'

# 3. Search using a pre-computed query vector
curl -X POST https://api.rushdb.com/api/v1/ai/search \
  -H "Authorization: Bearer $RUSHDB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"labels":["Doc"],"propertyName":"content","queryVector":[1,0,0],"limit":3}'
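
To see why the query vector [1, 0, 0] ranks Alpha first, the cosine similarity can be computed by hand. This is a plain-Python sketch of the standard formula; how the server maps similarity to the returned score is not specified on this page, so the values here illustrate the ranking only.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Standard cosine similarity: dot(a, b) / (|a| * |b|)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


docs = {"Alpha": [1, 0, 0], "Beta": [0, 1, 0], "Gamma": [0, 0, 1]}
query = [1, 0, 0]

# Rank titles by similarity to the query, highest first.
ranked = sorted(docs, key=lambda t: cosine_similarity(query, docs[t]), reverse=True)
print(ranked[0])  # Alpha
```

The three example vectors are mutually orthogonal, so the query matches Alpha exactly and has zero similarity to Beta and Gamma.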

Batch Import with createMany

For bulk seeding with flat rows, use createMany / create_many with the top-level vectors parameter. It is matched to rows by position: vectors[i] holds the vector entries for data[i].

Python
TypeScript
shell

db.records.create_many(
    label="Doc",
    data=[
        {"title": "Alpha", "content": "First article"},
        {"title": "Beta",  "content": "Second article"},
        {"title": "Gamma", "content": "Third article"},
    ],
    vectors=[
        [{"propertyName": "content", "vector": [1, 0, 0]}],  # row 0
        [{"propertyName": "content", "vector": [0, 1, 0]}],  # row 1
        [{"propertyName": "content", "vector": [0, 0, 1]}],  # row 2
    ],
)

await db.records.createMany({
  label: 'Doc',
  data: [
    { title: 'Alpha', content: 'First article' },
    { title: 'Beta', content: 'Second article' },
    { title: 'Gamma', content: 'Third article' }
  ],
  vectors: [
    [{ propertyName: 'content', vector: [1, 0, 0] }],
    [{ propertyName: 'content', vector: [0, 1, 0] }],
    [{ propertyName: 'content', vector: [0, 0, 1] }]
  ]
})

curl -X POST https://api.rushdb.com/api/v1/records/import/json \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $RUSHDB_API_KEY" \
  -d '{
    "label": "Doc",
    "data": [
      {"title":"Alpha","content":"First article"},
      {"title":"Beta","content":"Second article"},
      {"title":"Gamma","content":"Third article"}
    ],
    "vectors": [
      [{"propertyName":"content","vector":[1,0,0]}],
      [{"propertyName":"content","vector":[0,1,0]}],
      [{"propertyName":"content","vector":[0,0,1]}]
    ]
  }'

For nested JSON payloads (importJson), create records first then call upsertVectors separately to seed the index.
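
If your pipeline produces rows that carry their own vector, the two position-aligned lists for create_many can be derived in one pass. A minimal sketch, assuming each row stores its embedding under a "vector" key; split_rows is a hypothetical helper, not an SDK function.

```python
def split_rows(
    rows: list[dict],
    property_name: str,
    vector_key: str = "vector",
) -> tuple[list[dict], list[list[dict]]]:
    """Split rows into position-aligned `data` and `vectors` lists."""
    data: list[dict] = []
    vectors: list[list[dict]] = []
    for row in rows:
        # Everything except the embedding becomes the record's fields.
        data.append({k: v for k, v in row.items() if k != vector_key})
        # One vector entry per row, at the same position as its record.
        vectors.append([{"propertyName": property_name, "vector": row[vector_key]}])
    return data, vectors


rows = [
    {"title": "Alpha", "content": "First article", "vector": [1, 0, 0]},
    {"title": "Beta", "content": "Second article", "vector": [0, 1, 0]},
]
data, vectors = split_rows(rows, "content")

# Usage sketch (assumes `db` from the examples above):
# db.records.create_many(label="Doc", data=data, vectors=vectors)
```

Building both lists in the same loop keeps them the same length and in the same order, which is what the positional matching relies on.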

Mixing Managed and External Indexes

You can have both a managed index and an external index on the same property simultaneously:

Python
TypeScript
shell

# Managed — server embeds for full-text semantic search
db.ai.indexes.create({"label": "Product", "propertyName": "description"})

# External — your custom multimodal model
db.ai.indexes.create({
    "label": "Product",
    "propertyName": "description",
    "sourceType": "external",
    "dimensions": 512,
    "similarityFunction": "cosine",
})

// Managed — server embeds for full-text semantic search
await db.ai.indexes.create({ label: 'Product', propertyName: 'description' })

// External — your custom multimodal model
await db.ai.indexes.create({
  label: 'Product',
  propertyName: 'description',
  external: true,
  dimensions: 512,
  similarityFunction: 'cosine'
})

# Managed index
curl -X POST https://api.rushdb.com/api/v1/ai/indexes \
  -H "Authorization: Bearer $RUSHDB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"label":"Product","propertyName":"description"}'

# External index
curl -X POST https://api.rushdb.com/api/v1/ai/indexes \
  -H "Authorization: Bearer $RUSHDB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"label":"Product","propertyName":"description","sourceType":"external","dimensions":512,"similarityFunction":"cosine"}'

When searching against a property with both types, specify similarityFunction (and optionally sourceType) to select the target index.
