Writing Records with Vectors

RushDB lets you attach pre-computed embedding vectors to records at write time, eliminating the need for a separate upsertVectors call. Any operation that creates or modifies records supports this through the vectors parameter.

This feature requires at least one external index to exist for the target (label, propertyName).


vectors parameter

All write operations accept a vectors array:

type VectorEntry = {
  /** Property name this vector is associated with. */
  propertyName: string
  /** Pre-computed embedding vector. */
  vector: number[]
  /** Required when multiple indexes exist on the same property. */
  similarityFunction?: 'cosine' | 'euclidean'
}

Create a Record with Vectors

records.create()

const { data: record } = await db.records.create({
  label: 'Article',
  data: {
    title: 'How transformers work',
    body: 'Attention is all you need ...',
  },
  vectors: [
    { propertyName: 'body', vector: myEmbed('Attention is all you need ...') }
  ],
})

console.log(record.__id) // record is created AND vector is written atomically

Upsert with Vectors

records.upsert()

upsert is idempotent on the record's slug (natural key). Passing vectors writes (or replaces) the stored vector for each propertyName in the same call:

// First call — creates the record + writes vector
const { data: r1 } = await db.records.upsert({
  label: 'Article',
  data: { slug: 'transformers-101', title: 'Transformers 101', body: '...' },
  vectors: [{ propertyName: 'body', vector: v1 }],
})

// Second call — same slug → updates the title/body + replaces the vector
const { data: r2 } = await db.records.upsert({
  label: 'Article',
  data: { slug: 'transformers-101', title: 'Transformers 101 (revised)', body: 'Updated ...' },
  vectors: [{ propertyName: 'body', vector: v2 }],
})

console.log(r1.__id === r2.__id) // true — same record

Set with Vectors

records.set()

set replaces all properties of a record with new values. Including vectors writes those vectors at the same time:

// Find or create the record first
const { data: rec } = await db.records.create({
  label: 'Product',
  data: { name: 'Widget', price: 9.99 },
})

// Full replace — data AND vector updated together
await db.records.set(rec.__id, {
  data: { name: 'Widget Pro', price: 19.99 },
  vectors: [{ propertyName: 'description', vector: newVec }],
})

Create Multiple Records with Vectors

records.createMany()

createMany is optimised for flat (CSV-like) rows. Use the top-level vectors parameter — an array indexed by row position — to attach a vector to each record without nesting arrays inside your flat data:

await db.records.createMany({
  label: 'Product',
  data: [
    { name: 'Alpha', description: 'First product' },
    { name: 'Beta', description: 'Second product' },
    { name: 'Gamma', description: 'Third product' },
  ],
  vectors: [
    [{ propertyName: 'description', vector: [1, 0, 0] }], // row 0
    [{ propertyName: 'description', vector: [0, 1, 0] }], // row 1
    [{ propertyName: 'description', vector: [0, 0, 1] }], // row 2
  ],
  options: { returnResult: true },
})
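In practice, the indexed `vectors` array is easiest to keep aligned with `data` by deriving both from the same row array. The sketch below illustrates that pattern; `buildRowVectors` and `fakeEmbed` are illustrative helpers, not SDK exports, and you would substitute your real embedding model for `fakeEmbed`:

```typescript
// Hypothetical helper (not part of the SDK): build the indexed vectors
// array from the same rows you pass as `data`, so that vectors[i]
// always lines up with data[i].
type VectorEntry = {
  propertyName: string
  vector: number[]
  similarityFunction?: 'cosine' | 'euclidean'
}

function buildRowVectors(
  rows: Array<Record<string, string>>,
  propertyName: string,
  embed: (text: string) => number[]
): VectorEntry[][] {
  return rows.map((row) => [{ propertyName, vector: embed(row[propertyName]) }])
}

// Stand-in embedder for illustration only.
const fakeEmbed = (text: string): number[] => [text.length, 0, 0]

const rows = [
  { name: 'Alpha', description: 'First product' },
  { name: 'Beta', description: 'Second product' },
]
const vectors = buildRowVectors(rows, 'description', fakeEmbed)
console.log(vectors.length) // 2 — one vector entry per data row
```

Both `rows` and `vectors` can then be passed to `createMany` as `data` and `vectors` respectively, and positional alignment is guaranteed by construction.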

Sparse vectors

Leave rows without vectors by providing a shorter vectors array (any unspecified trailing rows are skipped):

await db.records.createMany({
  label: 'Product',
  data: [{ name: 'Alpha' }, { name: 'Beta' }, { name: 'Gamma' }],
  // only row 0 gets a vector; rows 1 and 2 are skipped
  vectors: [[{ propertyName: 'description', vector: myVec }]],
})

Validation

The SDK throws synchronously if vectors.length > data.length:

// ❌ Throws: "vectors length (3) exceeds the number of data rows (2)"
db.records.createMany({
  label: 'Product',
  data: [{ name: 'A' }, { name: 'B' }],
  vectors: [
    [{ propertyName: 'description', vector: [1, 0, 0] }],
    [{ propertyName: 'description', vector: [0, 1, 0] }],
    [{ propertyName: 'description', vector: [0, 0, 1] }], // no data row 2
  ],
})
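The check itself is simple length arithmetic, so it can also be run ahead of time in your own pipeline. The guard below is an illustrative sketch of the same rule, not an SDK export (`assertVectorsFit` is a hypothetical name):

```typescript
// Illustrative guard (not part of the SDK) mirroring the synchronous
// length check: a shorter vectors array is fine (sparse), a longer
// one is an error.
function assertVectorsFit(dataLength: number, vectorsLength: number): void {
  if (vectorsLength > dataLength) {
    throw new Error(
      `vectors length (${vectorsLength}) exceeds the number of data rows (${dataLength})`
    )
  }
}

assertVectorsFit(3, 2) // ok — sparse vectors are allowed

try {
  assertVectorsFit(2, 3)
} catch (e) {
  console.log((e as Error).message)
  // vectors length (3) exceeds the number of data rows (2)
}
```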

Import CSV with Vectors

records.importCsv()

CSV data is a raw string, so per-row vectors are supplied as a separate vectors parameter using the same indexed-array format as createMany. Row indices are 0-based and refer to data rows after the header is consumed.

const csv = `name,description
Alpha,First product
Beta,Second product
Gamma,Third product`

await db.records.importCsv({
  label: 'Product',
  data: csv,
  vectors: [
    [{ propertyName: 'description', vector: [1, 0, 0] }], // csv row 0
    [{ propertyName: 'description', vector: [0, 1, 0] }], // csv row 1
    [{ propertyName: 'description', vector: [0, 0, 1] }], // csv row 2
  ],
  options: { returnResult: true },
})

Sparse vectors

Same sparse pattern as createMany — any rows beyond vectors.length get no vector:

await db.records.importCsv({
  label: 'Product',
  data: csv,
  // only the first row gets a vector
  vectors: [[{ propertyName: 'description', vector: myVec }]],
})

Validation

The server returns 400 Bad Request if vectors.length exceeds the number of data rows (validated after CSV parsing). Because the CSV payload is a raw string, the client cannot perform this check before sending.

400 Bad Request: vectors length (5) exceeds the number of CSV data rows (3)
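One way to avoid this round trip is to count the data rows yourself before building the vectors array. The sketch below does this for simple CSV; `countCsvDataRows` is an illustrative helper, not an SDK export, and it assumes newline-delimited rows with no quoted fields containing line breaks:

```typescript
// Count CSV data rows the way the server does: 0-based rows after the
// header is consumed. Assumes no quoted fields contain line breaks.
function countCsvDataRows(csv: string): number {
  const lines = csv.split('\n').filter((line) => line.trim().length > 0)
  return Math.max(lines.length - 1, 0) // subtract the header row
}

const csv = `name,description
Alpha,First product
Beta,Second product
Gamma,Third product`

console.log(countCsvDataRows(csv)) // 3
```

Comparing this count against your vectors array length client-side catches the mismatch before the request is sent.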

Specifying similarityFunction for disambiguation

When a single (label, propertyName) has multiple external indexes registered (e.g. one cosine and one euclidean), you must include similarityFunction in each VectorEntry so the server can route the write to the correct index:

// Write to the cosine index
await db.records.create({
  label: 'Product',
  data: { name: 'Widget' },
  vectors: [
    { propertyName: 'embedding', vector: vec, similarityFunction: 'cosine' }
  ],
})

Omitting similarityFunction when multiple indexes match returns 422 Unprocessable Entity.


Multiple vectors in one call

You can write vectors for multiple properties or indexes in a single operation:

await db.records.create({
  label: 'Document',
  data: { title: 'Multi-modal doc', abstract: '...', fullText: '...' },
  vectors: [
    { propertyName: 'abstract', vector: abstractVec },
    { propertyName: 'fullText', vector: fullTextVec },
  ],
})

Each entry is matched independently against the available external indexes.


Complete worked example

import RushDB from '@rushdb/javascript-sdk'

const db = new RushDB('your-api-key')
const emb = new YourEmbeddingModel()

// 1. Create an external index once (idempotent via 409 Conflict)
const { data: idx } = await db.ai.indexes.create({
  label: 'Article',
  propertyName: 'body',
  external: true,
  dimensions: 768,
  similarityFunction: 'cosine',
}).catch(e => e.status === 409 ? db.ai.indexes.find() : Promise.reject(e))

// 2. Create records from your pipeline, embedding as you go
const docs = [
  { title: 'Alpha', body: 'First doc' },
  { title: 'Beta', body: 'Second doc' },
]

for (const doc of docs) {
  await db.records.create({
    label: 'Article',
    data: doc,
    vectors: [{ propertyName: 'body', vector: await emb.embed(doc.body) }],
  })
}

// 3. Search
const queryVec = await emb.embed('first document')
const { data } = await db.ai.search({
  label: 'Article',
  propertyName: 'body',
  queryVector: queryVec,
  limit: 3,
})
console.log(data[0].title) // 'Alpha'

Inline vectors vs. upsertVectors

| | Inline vectors | db.ai.indexes.upsertVectors() |
| --- | --- | --- |
| Round trips | 1 (write + vector together) | 2+ (write, then upload) |
| Use case | Streaming ingestion, real-time pipelines | Batch backfill, dataset migration |
| Idempotency | Depends on the write operation used | Always idempotent per recordId |
| Availability | create, upsert, set, createMany, importCsv | Standalone call on any existing records |
| Multi-record | createMany or importCsv with indexed vectors[][] | Single bulk payload |

For streaming pipelines that produce records one-by-one or in small batches, inline vectors are simpler and more efficient. For seeding an index from a large existing dataset, upsertVectors is the right choice.


Vector format by method

| Method | Vector syntax | Notes |
| --- | --- | --- |
| create | vectors: VectorEntry[] | single record |
| upsert | vectors: VectorEntry[] | single record, idempotent |
| set | vectors: VectorEntry[] | single record, full replace |
| createMany | vectors: VectorEntry[][] (indexed) | vectors[i] → data[i] |
| importCsv | vectors: VectorEntry[][] (indexed) | vectors[i] → CSV row i |

createMany and importCsv use an external indexed array so that each row's vector is unambiguously matched by position. For nested JSON imports use importJson to create the records, then call db.ai.indexes.upsertVectors() to seed the vectors separately.


Error conditions

| Error | Cause | Method |
| --- | --- | --- |
| 404 Not Found | No external index exists for (label, propertyName) | all |
| 422 Unprocessable Entity | vector.length does not match index.dimensions | all |
| 422 Unprocessable Entity | Multiple indexes match and similarityFunction was not specified | all |
| 400 Bad Request | vectors.length exceeds the number of CSV data rows | importCsv |
| Client Error | vectors.length exceeds data.length | createMany (thrown synchronously) |

importJson does not accept a vectors parameter. Use createMany for flat rows with inline vectors, or use importJson followed by db.ai.indexes.upsertVectors() for nested JSON payloads.