Importing data from external sources

RushDB provides a comprehensive toolkit for importing data. Most data sources hold flat, tabular data and leave relationships to be worked out at query time; RushDB is a graph database, so relationships are stored as part of the graph itself.

This guide shows how to import your data and bring it to life: records, inferred types, and relationships are all created with a few calls.

What you'll use:

  • records.createMany (JSON and CSV import)
  • relationships.createMany (bulk linking by key match)

Tip: You can do the same via REST. See REST docs: Records Import and Relationships API.

Most external systems already have stable identifiers (MongoDB's ObjectId, HubSpot record IDs, SQL primary/foreign keys). When importing to RushDB, store those external IDs on your records (e.g., mongoId, hubspotId, pgId). Then create relationships by matching those keys using relationships.createMany:

  1. Import data (keep external IDs as properties).
  2. Create relationships by matching the source record's key against the target record's key (an equality join on the preserved IDs).
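
To make step 2 concrete, here is a minimal local sketch of what a key-equality join produces. It is an illustration of the matching rule only, not RushDB's implementation; the helper name match_by_key is hypothetical, and skipping records whose key is missing is an assumption.

```python
from collections import defaultdict


def match_by_key(sources, targets, source_key, target_key):
    """Return (source, target) pairs whose key values are equal."""
    # Index targets by their key value so each source is matched in O(1).
    index = defaultdict(list)
    for t in targets:
        value = t.get(target_key)
        if value is not None:
            index[value].append(t)

    pairs = []
    for s in sources:
        value = s.get(source_key)
        if value is None:
            continue  # assumption: records without a key are never linked
        for t in index.get(value, []):
            pairs.append((s, t))
    return pairs


users = [{"mongoId": "u1", "name": "Ada"}, {"mongoId": "u2", "name": "Lin"}]
orders = [
    {"mongoId": "o1", "userMongoId": "u1"},
    {"mongoId": "o2", "userMongoId": "u1"},
    {"mongoId": "o3", "userMongoId": "u9"},  # no matching user
]

pairs = match_by_key(users, orders, "mongoId", "userMongoId")
print([(s["mongoId"], t["mongoId"]) for s, t in pairs])
# → [('u1', 'o1'), ('u1', 'o2')]
```

Running a preview like this on a sample of your export is a cheap way to confirm that the two key fields really line up before you create relationships in bulk.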

Safeguards and notes

  • You control the relationship type and direction (default direction is out).
  • For key-based creation, provide both source.key and target.key.
  • Only use manyToMany when you explicitly want a cartesian link across filtered sets.
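
The difference between key-based and cartesian linking is easiest to see as a count. The sketch below estimates how many links each mode would create for the same two sets; both helper names are hypothetical, and the counts describe the matching rule as this guide explains it rather than measured RushDB behavior.

```python
from collections import Counter


def cartesian_link_count(sources, targets):
    """Every source linked to every target."""
    return len(sources) * len(targets)


def keyed_link_count(sources, targets, source_key, target_key):
    """Only pairs whose key values are equal."""
    counts = Counter(
        t[target_key] for t in targets if t.get(target_key) is not None
    )
    return sum(
        counts[s[source_key]] for s in sources if s.get(source_key) is not None
    )


users = [{"pgId": "1"}, {"pgId": "2"}, {"pgId": "3"}]
orders = [
    {"userPgId": "1"},
    {"userPgId": "1"},
    {"userPgId": "2"},
    {"userPgId": "9"},
]

print(cartesian_link_count(users, orders))                # → 12
print(keyed_link_count(users, orders, "pgId", "userPgId"))  # → 3
```

Because the cartesian count grows with the product of both set sizes, tight where filters matter far more in that mode.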

1) MongoDB → RushDB

Goal: Import MongoDB collections (e.g., users, orders) and connect them using Mongo's ObjectId values.

Recommended mapping

  • Persist the original _id as a string field on the RushDB record: mongoId.
  • For references (e.g., orders.userId), persist as userMongoId so you can join users.mongoId = orders.userMongoId.

Example

from pymongo import MongoClient
from bson import ObjectId
from rushdb import RushDB
import os

db = RushDB(os.environ["RUSHDB_API_KEY"])
mongo = MongoClient(os.environ["MONGO_URI"])
mdb = mongo["acme"]

# 1) Extract from Mongo
users = list(mdb["users"].find({"tenantId": "ACME"}))
orders = list(mdb["orders"].find({"tenantId": "ACME"}))

# 2) Normalize docs for RushDB
users_payload = [
    {"mongoId": str(u["_id"]), "tenantId": u["tenantId"], "name": u["name"], "email": u["email"]}
    for u in users
]
orders_payload = [
    {"mongoId": str(o["_id"]), "tenantId": o["tenantId"], "total": o["total"],
     "userMongoId": str(o["userId"])}
    for o in orders
]

# 3) Import into RushDB
db.records.create_many(label="USER", data=users_payload)
db.records.create_many(label="ORDER", data=orders_payload)

# 4) Link: USER -[:ORDERED]-> ORDER using mongo ids
db.relationships.create_many(
    source={"label": "USER", "key": "mongoId", "where": {"tenantId": "ACME"}},
    target={"label": "ORDER", "key": "userMongoId", "where": {"tenantId": "ACME"}},
    type="ORDERED",
    direction="out",
)

mongo.close()

Example: REST (create-many)

POST /api/v1/relationships/create-many
{
  "source": { "label": "USER", "key": "mongoId", "where": { "tenantId": "ACME" } },
  "target": { "label": "ORDER", "key": "userMongoId", "where": { "tenantId": "ACME" } },
  "type": "ORDERED",
  "direction": "out"
}

Common pitfalls

  • Ensure you convert ObjectId to string when storing in RushDB; the join is string equality.
  • Keep tenant/workspace scoping in your where filters to avoid cross-tenant links.
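
Since the join is plain string equality, an order whose userMongoId matches no imported user would presumably end up without a relationship. A small preflight check on the normalized payloads can surface such rows before you link. This is a sketch; find_orphans is a hypothetical helper, not part of the RushDB SDK.

```python
def find_orphans(parents, children, parent_key, child_key):
    """Return children whose reference does not match any parent key."""
    known = {p[parent_key] for p in parents if p.get(parent_key) is not None}
    return [c for c in children if c.get(child_key) not in known]


users_payload = [
    {"mongoId": "64f0c0a1", "tenantId": "ACME"},
    {"mongoId": "64f0c0a2", "tenantId": "ACME"},
]
orders_payload = [
    {"mongoId": "o1", "userMongoId": "64f0c0a1", "tenantId": "ACME"},
    {"mongoId": "o2", "userMongoId": "None", "tenantId": "ACME"},  # str(None) slipped in
    {"mongoId": "o3", "userMongoId": "64f0c0ff", "tenantId": "ACME"},  # user not exported
]

orphans = find_orphans(users_payload, orders_payload, "mongoId", "userMongoId")
print([o["mongoId"] for o in orphans])
# → ['o2', 'o3']
```

The second order shows a common side effect of wrapping a missing reference in str(): the literal string "None" becomes the join key, so it is worth checking for it explicitly.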

2) HubSpot → RushDB

Goal: Import HubSpot objects (Contacts, Companies, Deals) and connect them using HubSpot IDs.

Recommended mapping

  • Store the HubSpot object ID on the record (e.g., hubspotId).
  • For associations, store the associated object’s HubSpot ID on the related record (e.g., a Deal with companyHubspotId).

Example

from hubspot import HubSpot
from rushdb import RushDB
import os

db = RushDB(os.environ["RUSHDB_API_KEY"])
hubspot = HubSpot(access_token=os.environ["HUBSPOT_TOKEN"])

# 1) Fetch Contacts and Companies (first page only)
# Assumption: the contact's primary company is exposed via the
# "associatedcompanyid" contact property; it supplies the join key for step 3.
contacts_res = hubspot.crm.contacts.basic_api.get_page(limit=100, properties=["email", "associatedcompanyid"])
companies_res = hubspot.crm.companies.basic_api.get_page(limit=100, properties=["name", "domain"])

contacts = [
    {
        "hubspotId": c.id,
        "email": c.properties.get("email"),
        "companyHubspotId": c.properties.get("associatedcompanyid"),
        "tenantId": "ACME",
    }
    for c in contacts_res.results
]
companies = [
    {"hubspotId": co.id, "name": co.properties.get("name"), "domain": co.properties.get("domain"), "tenantId": "ACME"}
    for co in companies_res.results
]

# 2) Import
db.records.create_many(label="HS_CONTACT", data=contacts)
db.records.create_many(label="HS_COMPANY", data=companies)

# 3) Associate Contacts to Companies
db.relationships.create_many(
    source={"label": "HS_CONTACT", "key": "companyHubspotId", "where": {"tenantId": "ACME"}},
    target={"label": "HS_COMPANY", "key": "hubspotId", "where": {"tenantId": "ACME"}},
    type="WORKS_AT",
    direction="out",
)

Alternative: Deals to Companies

db.relationships.create_many(
    source={"label": "HS_DEAL", "key": "companyHubspotId", "where": {"tenantId": "ACME"}},
    target={"label": "HS_COMPANY", "key": "hubspotId", "where": {"tenantId": "ACME"}},
    type="RELATED_TO",
    direction="out",
)

Notes

  • HubSpot v3 uses string IDs; storing them verbatim is fine for equality joins.
  • If you rely on HubSpot association APIs, mirror those association IDs onto one side to enable the key match.
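
Mirroring association IDs onto one side could look like the sketch below. It assumes you have already collected associations into a plain dict that maps a contact's HubSpot ID to a list of company IDs; that shape and the helper name mirror_primary_company are hypothetical, and fetching the associations from HubSpot is not shown.

```python
def mirror_primary_company(contacts, associations):
    """Copy the first associated company ID onto each contact payload."""
    enriched = []
    for contact in contacts:
        company_ids = associations.get(contact["hubspotId"], [])
        row = dict(contact)  # do not mutate the caller's payload
        row["companyHubspotId"] = company_ids[0] if company_ids else None
        enriched.append(row)
    return enriched


contacts = [
    {"hubspotId": "101", "email": "ada@example.com", "tenantId": "ACME"},
    {"hubspotId": "102", "email": "lin@example.com", "tenantId": "ACME"},
]
# Hypothetical association map: contact ID -> company IDs
associations = {"101": ["9001", "9002"]}

contacts_with_company = mirror_primary_company(contacts, associations)
print([c["companyHubspotId"] for c in contacts_with_company])
# → ['9001', None]
```

Taking only the first company keeps one join key per contact; contacts tied to several companies need one row or one explicit link per company instead.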

3) PostgreSQL → RushDB

Goal: Import relational tables (e.g., users, orders) and connect them using primary/foreign keys.

Recommended mapping

  • Store the SQL primary key as pgId (for users) and the foreign key as userPgId (for orders). Then join USER.pgId = ORDER.userPgId.

Example

import psycopg2
from rushdb import RushDB
import os

db = RushDB(os.environ["RUSHDB_API_KEY"])
conn = psycopg2.connect(os.environ["PG_URI"])
cur = conn.cursor()

# 1) Extract
cur.execute("SELECT id, name, email, tenant_id FROM users WHERE tenant_id = %s", ("ACME",))
users_rows = cur.fetchall()
cur.execute("SELECT id, user_id, total, tenant_id FROM orders WHERE tenant_id = %s", ("ACME",))
orders_rows = cur.fetchall()

# 2) Normalize
users = [{"pgId": str(r[0]), "name": r[1], "email": r[2], "tenantId": r[3]} for r in users_rows]
orders = [{"pgId": str(r[0]), "userPgId": str(r[1]), "total": r[2], "tenantId": r[3]} for r in orders_rows]

# 3) Import
db.records.create_many(label="USER", data=users)
db.records.create_many(label="ORDER", data=orders)

# 4) Link
db.relationships.create_many(
    source={"label": "USER", "key": "pgId", "where": {"tenantId": "ACME"}},
    target={"label": "ORDER", "key": "userPgId", "where": {"tenantId": "ACME"}},
    type="ORDERED",
    direction="out",
)

cur.close()
conn.close()

CSV path (no code runtime)

If you export tables to CSV, you can import with REST POST /api/v1/records/import/csv or SDK records.createMany, then run the same relationships.createMany call as above by joining the columns you preserved (e.g., pgId and userPgId).
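
If the exported column names differ from the keys you want to join on, a small preprocessing pass can rename them before upload. The sketch below uses only the standard csv module; load_csv_rows and the rename maps are hypothetical, and all values stay strings exactly as the csv module yields them.

```python
import csv
import io


def load_csv_rows(text, rename):
    """Parse CSV text into dicts, renaming columns listed in `rename`."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {rename.get(column, column): value for column, value in row.items()}
        for row in reader
    ]


users_csv = "id,name,tenant_id\n1,Ada,ACME\n2,Lin,ACME\n"
orders_csv = "id,user_id,total,tenant_id\n10,1,99.50,ACME\n"

users = load_csv_rows(users_csv, {"id": "pgId", "tenant_id": "tenantId"})
orders = load_csv_rows(
    orders_csv, {"id": "pgId", "user_id": "userPgId", "tenant_id": "tenantId"}
)

print(users[0])
# → {'pgId': '1', 'name': 'Ada', 'tenantId': 'ACME'}
print(orders[0]["userPgId"])
# → 1
```

Because both pgId and userPgId come out as strings here, the equality join lines up without further conversion.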


4) Supabase → RushDB

Supabase uses PostgreSQL under the hood, so the mapping mirrors the PostgreSQL example. If you prefer the Supabase client:

from supabase import create_client
from rushdb import RushDB
import os

db = RushDB(os.environ["RUSHDB_API_KEY"])
supabase = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_SERVICE_ROLE_KEY"])

# 1) Extract
users = supabase.table("users").select("id,name,email,tenant_id").eq("tenant_id", "ACME").execute().data
orders = supabase.table("orders").select("id,user_id,total,tenant_id").eq("tenant_id", "ACME").execute().data

# 2) Normalize
users_payload = [{"pgId": str(r["id"]), "name": r["name"], "email": r["email"], "tenantId": r["tenant_id"]} for r in users]
orders_payload = [{"pgId": str(r["id"]), "userPgId": str(r["user_id"]), "total": r["total"], "tenantId": r["tenant_id"]} for r in orders]

# 3) Import and link
db.records.create_many(label="USER", data=users_payload)
db.records.create_many(label="ORDER", data=orders_payload)
db.relationships.create_many(
    source={"label": "USER", "key": "pgId", "where": {"tenantId": "ACME"}},
    target={"label": "ORDER", "key": "userPgId", "where": {"tenantId": "ACME"}},
    type="ORDERED",
    direction="out",
)

5) Firebase (Firestore) → RushDB

Map Firestore document IDs to a stable key. The example uses two collections, users and orders, where each order has a userId field equal to a user document ID:

import firebase_admin
from firebase_admin import credentials, firestore
from rushdb import RushDB
import os

db = RushDB(os.environ["RUSHDB_API_KEY"])
firebase_admin.initialize_app(credentials.ApplicationDefault(), {"projectId": os.environ["GCLOUD_PROJECT"]})
fs = firestore.client()

# 1) Fetch
users_snap = fs.collection("users").where("tenantId", "==", "ACME").get()
orders_snap = fs.collection("orders").where("tenantId", "==", "ACME").get()

# 2) Normalize
users = [{"firebaseId": d.id, "tenantId": d.get("tenantId"), "name": d.get("name"), "email": d.get("email")} for d in users_snap]
orders = [{"firebaseId": d.id, "tenantId": d.get("tenantId"), "total": d.get("total"), "userFirebaseId": str(d.get("userId"))} for d in orders_snap]

# 3) Import and link
db.records.create_many(label="USER", data=users)
db.records.create_many(label="ORDER", data=orders)
db.relationships.create_many(
    source={"label": "USER", "key": "firebaseId", "where": {"tenantId": "ACME"}},
    target={"label": "ORDER", "key": "userFirebaseId", "where": {"tenantId": "ACME"}},
    type="ORDERED",
    direction="out",
)

Notes

  • For multi-tenant Firestore, include a tenantId field and filter where accordingly.
  • If orders reference users via DocumentReference objects, resolve to ref.id when building the payload.
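
Resolving a reference to its ID can be handled by one small helper that accepts either a plain string ID or a reference object. This is a sketch that relies only on the reference exposing an id attribute, as Firestore's DocumentReference does; to_join_key is a hypothetical name and FakeRef is a stand-in so the example runs without Firebase.

```python
def to_join_key(value):
    """Return a string join key from a plain ID or a reference-like object."""
    if value is None:
        return None  # keep missing references as None, not the string "None"
    ref_id = getattr(value, "id", None)
    return str(ref_id) if ref_id is not None else str(value)


class FakeRef:
    """Stand-in for a Firestore DocumentReference (illustration only)."""

    def __init__(self, path):
        self.path = path
        self.id = path.rsplit("/", 1)[-1]


print(to_join_key(FakeRef("users/abc123")))  # → abc123
print(to_join_key("abc123"))                 # → abc123
print(to_join_key(None))                     # → None
```

In the normalize step above, this would replace the bare str(...) around the userId value, which would otherwise stringify the whole reference object rather than its ID.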

6) Airtable → RushDB

Use Airtable record IDs for joins. Example: link Contacts to Companies where each Contact has a Company linked-record field holding a single linked record ID.

from pyairtable import Api
from rushdb import RushDB
import os

db = RushDB(os.environ["RUSHDB_API_KEY"])
api = Api(os.environ["AIRTABLE_TOKEN"])
base = api.base(os.environ["AIRTABLE_BASE_ID"])

# 1) Fetch
companies = base.table("Companies").all()
contacts = base.table("Contacts").all()

# 2) Normalize
companies_payload = [
    {"airtableId": r["id"], "tenantId": "ACME", "name": r["fields"].get("Name"), "domain": r["fields"].get("Domain")}
    for r in companies
]
contacts_payload = [
    {
        "airtableId": r["id"], "tenantId": "ACME",
        "name": r["fields"].get("Name"), "email": r["fields"].get("Email"),
        # Linked-record fields are lists of record IDs; take the first, or None if absent/empty
        "companyAirtableId": (r["fields"].get("Company") or [None])[0]
    }
    for r in contacts
]

# 3) Import and link
db.records.create_many(label="AT_COMPANY", data=companies_payload)
db.records.create_many(label="AT_CONTACT", data=contacts_payload)

db.relationships.create_many(
    source={"label": "AT_CONTACT", "key": "companyAirtableId", "where": {"tenantId": "ACME"}},
    target={"label": "AT_COMPANY", "key": "airtableId", "where": {"tenantId": "ACME"}},
    type="WORKS_AT",
    direction="out",
)

Notes

  • If a contact can link to multiple companies, iterate those IDs and use records.attach per contact, or pre-expand into multiple joinable rows.
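
The pre-expansion option can be sketched as follows: one contact with several linked companies becomes several rows, each with a single companyAirtableId. The helper name expand_contact is hypothetical. Note the trade-off, stated here as an assumption: each expanded row would be imported as its own record, so a contact linked to two companies would appear twice.

```python
def expand_contact(record, tenant_id="ACME"):
    """Turn one Airtable contact into one row per linked company."""
    fields = record["fields"]
    base = {
        "airtableId": record["id"],
        "tenantId": tenant_id,
        "name": fields.get("Name"),
        "email": fields.get("Email"),
    }
    # A missing or empty link list still yields one row, with no company.
    company_ids = fields.get("Company") or [None]
    return [{**base, "companyAirtableId": cid} for cid in company_ids]


contact = {
    "id": "recC1",
    "fields": {"Name": "Ada", "Email": "ada@example.com", "Company": ["recA", "recB"]},
}
rows = expand_contact(contact)
print([r["companyAirtableId"] for r in rows])
# → ['recA', 'recB']
```

If duplicate contact records are not acceptable, the per-contact records.attach route mentioned above avoids them at the cost of one call per contact.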

7) Notion → RushDB

Use Notion page IDs for joins. Example: People and Tasks databases; each Task has a single-person relation stored in assignee.

from notion_client import Client as NotionClient
from rushdb import RushDB
import os

db = RushDB(os.environ["RUSHDB_API_KEY"])
notion = NotionClient(auth=os.environ["NOTION_TOKEN"])

people_db_id = os.environ["NOTION_PEOPLE_DB_ID"]
tasks_db_id = os.environ["NOTION_TASKS_DB_ID"]

# 1) Fetch
people_res = notion.databases.query(database_id=people_db_id)
tasks_res = notion.databases.query(database_id=tasks_db_id)

# 2) Normalize
people = [
    {
        "notionId": p["id"], "tenantId": "ACME",
        # "or [{}]" guards against an empty title list
        "name": (p.get("properties", {}).get("Name", {}).get("title") or [{}])[0].get("plain_text", "Unknown")
    }
    for p in people_res["results"]
]

tasks = []
for t in tasks_res["results"]:
    props = t.get("properties", {})
    assignees = props.get("assignee", {}).get("relation", [])
    tasks.append({
        "notionId": t["id"], "tenantId": "ACME",
        "title": (props.get("Name", {}).get("title") or [{}])[0].get("plain_text", "Untitled"),
        "assigneeNotionId": assignees[0]["id"] if assignees else None
    })

# 3) Import and link
db.records.create_many(label="NT_PERSON", data=people)
db.records.create_many(label="NT_TASK", data=tasks)

db.relationships.create_many(
    source={"label": "NT_TASK", "key": "assigneeNotionId", "where": {"tenantId": "ACME"}},
    target={"label": "NT_PERSON", "key": "notionId", "where": {"tenantId": "ACME"}},
    type="ASSIGNED_TO",
    direction="out",
)

Notes

  • If a Task can have multiple assignees, either:
    • iterate assignee IDs and call records.attach per Task, or
    • pre-expand into multiple Task rows (each with a single assigneeNotionId) before import to keep createMany-by-key workflow.
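
The Notion example above reads a single page of results from each database query. Notion responses carry has_more and next_cursor, and the next request takes start_cursor, so a generic loop can collect everything. The sketch below is written against any callable with that shape; query_all and fake_query are hypothetical names, and fake_query only simulates two pages so the loop can be run offline.

```python
def query_all(query, **params):
    """Call a cursor-paginated query until no pages remain."""
    results = []
    cursor = None
    while True:
        if cursor:
            params["start_cursor"] = cursor
        page = query(**params)
        results.extend(page["results"])
        cursor = page.get("next_cursor")
        # Stop when the API reports no more pages or gives no cursor.
        if not page.get("has_more") or not cursor:
            return results


def fake_query(database_id, start_cursor=None):
    """Offline stand-in that serves two pages (illustration only)."""
    pages = {
        None: {"results": [{"id": "a"}, {"id": "b"}], "has_more": True, "next_cursor": "c2"},
        "c2": {"results": [{"id": "c"}], "has_more": False, "next_cursor": None},
    }
    return pages[start_cursor]


all_rows = query_all(fake_query, database_id="people-db")
print([r["id"] for r in all_rows])
# → ['a', 'b', 'c']
```

With the client used in the example, the call would presumably be query_all(notion.databases.query, database_id=people_db_id), with the normalize step then iterating over the returned list instead of people_res["results"].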

Quick reference: core RushDB calls

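import os
from rushdb import RushDB

# Read the API key from the environment, as in the examples above.
# users and orders are lists of dicts prepared as in the preceding sections.
db = RushDB(os.environ["RUSHDB_API_KEY"])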

# Import
db.records.create_many(label="USER", data=users)
db.records.create_many(label="ORDER", data=orders)

# Link by key equality
db.relationships.create_many(
    source={"label": "USER", "key": "mongoId", "where": {"tenantId": "ACME"}},
    target={"label": "ORDER", "key": "userMongoId", "where": {"tenantId": "ACME"}},
    type="ORDERED",
    direction="out",
)

Troubleshooting

  • Mismatched types: Ensure the join keys are the same type (strings are safest). Convert DB-specific IDs to strings before import.
  • Missing keys: Key-based mode requires both source.key and target.key. If you truly need cartesian linking, set manyToMany: true and provide non-empty where on both sides.
  • Scope filters: Always restrict with where (e.g., tenantId) to avoid unintended cross-linking.
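
The first two checks can be automated on the payloads before import. The sketch below reports records whose join key is missing or is not a string; check_join_keys is a hypothetical helper, and treating strings as the only acceptable key type follows the "strings are safest" advice above rather than a documented RushDB requirement.

```python
def check_join_keys(records, key):
    """Return (index, problem) for records whose join key is unusable."""
    problems = []
    for index, record in enumerate(records):
        value = record.get(key)
        if value is None:
            problems.append((index, "missing"))
        elif not isinstance(value, str):
            problems.append((index, type(value).__name__))
    return problems


orders = [
    {"pgId": "10", "userPgId": "1"},
    {"pgId": "11", "userPgId": 1},  # int would not equal the string "1"
    {"pgId": "12"},                 # no foreign key at all
]

print(check_join_keys(orders, "userPgId"))
# → [(1, 'int'), (2, 'missing')]
```

An empty result on both sides of the join is a reasonable gate before calling relationships.createMany.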

See also