# Performance
GraphDB is an in-memory database, so every operation avoids disk and network I/O. Understanding the complexity characteristics of each operation still matters: it helps you design your data model and decide when indexes are worth the memory cost.
## Complexity reference
### Read operations
| Operation | Complexity | Notes |
|---|---|---|
| `read(id)` | O(1) | `Map.get()` |
| `exists(id)` | O(1) | `Map.has()` |
| `count()` (no filter) | O(1) | `Map.size` |
| `count(where)` | O(n) | Delegates to `query()` |
| `findOne(where)` | O(n) worst case | Stops at first match |
| `query()` (no index) | O(n) | Full scan of all documents |
| `query()` (indexed, equality/`eq`) | O(1) + O(k) | O(1) index lookup, O(k) candidate evaluation |
| `query()` (indexed, `in` operator) | O(m) + O(k) | O(m) lookups (m = array size), O(k) candidate evaluation |

Where n = total documents, k = matching candidates, m = size of the `in` array.
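For example, the constant-time and linear counting paths are distinguished only by whether you pass a filter. A minimal sketch, assuming the options argument to `collection()` can be omitted when no indexes are needed:

```ts
import { GraphDB } from '@graphdb/core';

type User = { name: string; email: string; age: number };

const db = GraphDB();
const users = db.collection<User>('users');

// O(1): reads Map.size directly, no scan.
const total = users.count();

// O(n): delegates to query(), which scans every document.
const adultCount = users.count({ age: { gte: 18 } });

// O(n) worst case, but stops at the first match.
const alice = users.findOne({ email: 'alice@example.com' });
```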
### Write operations
| Operation | Complexity | Notes |
|---|---|---|
| `create(doc)` | O(1) + O(i) | O(1) insert, O(i) index updates |
| `update(id, patch)` | O(1) + O(i) | O(1) lookup + merge, O(i) index updates |
| `remove(id)` | O(1) + O(i) | O(1) delete, O(i) index updates |
| `updateMany(where, patch)` | O(n) + O(k) | O(n) query, O(k) sequential writes |
| `removeMany(where)` | O(n) + O(k) | O(n) query, O(k) sequential writes |
| `populate(docs)` | O(n) + O(n*i) | O(n) inserts, O(n*i) index rebuild |
| `clear()` | O(1) | `Map.clear()` on docs and each index bucket |

Where i = number of indexed fields.
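To make the bulk-write shape concrete: `updateMany()` and `removeMany()` first pay one O(n) query to find the k matches, then perform k writes, each of which also pays O(i) index updates. A sketch, assuming the write methods return promises (as in the batch example later in this section); the filters are illustrative:

```ts
import { GraphDB } from '@graphdb/core';

type User = { name: string; email: string; age: number };

const db = GraphDB();
const users = db.collection<User>('users', { indexes: ['email', 'age'] });

// O(n) to find matches, then O(k) sequential writes,
// each paying O(i) index updates.
await users.updateMany({ age: { lt: 18 } }, { age: 18 });

// Same shape: O(n) to find, O(k) to delete.
await users.removeMany({ email: { contains: '@spam.example' } });
```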
### Listener operations
| Operation | Complexity | Notes |
|---|---|---|
| `on(event, handler)` | O(1) | `Set.add()` |
| `listen(id, handler)` | O(1) | `Set.add()` |
| Unsubscribe (cancel function) | O(1) | `Set.delete()` |
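Both subscription methods hand back a cancel function, so subscribing and unsubscribing stay O(1). A sketch of the pattern; the `'change'` event name and the handler signature are assumptions, not documented API:

```ts
import { GraphDB } from '@graphdb/core';

type User = { name: string; email: string; age: number };

const db = GraphDB();
const users = db.collection<User>('users', { indexes: ['email'] });

// Collection-wide subscription; 'change' is a hypothetical event name.
const cancel = users.on('change', (doc) => {
  console.log('collection changed:', doc);
});

// Per-document subscription: fires only for this _id.
const cancelDoc = users.listen('user-1', (doc) => {
  console.log('user-1 changed:', doc);
});

// Each cancel function is a single Set.delete(), O(1).
cancel();
cancelDoc();
```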
### Sorting
| Operation | Complexity | Notes |
|---|---|---|
| `sortDocuments()` | O(n log n) | Standard comparison sort |
### Memory
| Structure | Memory | Notes |
|---|---|---|
| Document storage | O(n) | One entry per document |
| Index storage | O(n * i) | n = documents, i = indexed fields |
| Per-index bucket | O(distinct values) | One `Set` per distinct value per field |
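The shape behind these numbers is one hash map per indexed field, mapping each distinct value to the set of matching document ids. The following is an illustrative model of that layout, not GraphDB's actual internals:

```ts
// One bucket map per indexed field: value -> set of document _ids.
type IndexBuckets = Map<unknown, Set<string>>;

// High-cardinality field: many buckets, each holding few ids.
const emailIndex: IndexBuckets = new Map([
  ['alice@example.com', new Set(['u1'])],
  ['bob@example.com', new Set(['u2'])],
]);

// Low-cardinality field: few buckets, each holding many ids.
const roleIndex: IndexBuckets = new Map([
  ['admin', new Set(['u1'])],
  ['user', new Set(['u2', 'u3', 'u4'])],
]);

// Either way, every document contributes one (value, _id) entry
// per indexed field, which is where O(n * i) comes from.
```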
## When indexes help
Indexes accelerate queries that use equality-based lookups on the indexed field. They are most effective when:
- You query frequently by a specific field (e.g., looking up users by `email`).
- The field has high cardinality (many distinct values), so each index bucket is small.
- You combine indexed fields in a query, because the query planner intersects candidate sets smallest-first.
```ts
import { GraphDB } from '@graphdb/core';

type User = { name: string; email: string; age: number };

const db = GraphDB();
const users = db.collection<User>('users', {
  indexes: ['email', 'age'],
});

// Fast: "email" is indexed, O(1) lookup
const results = users.query({ email: 'alice@example.com' });

// Fast: "age" is indexed, O(1) lookup per value in the array
const youngUsers = users.query({ age: { in: [25, 26, 27] } });

// Fast: both fields indexed, candidate sets are intersected
const specific = users.query({ email: 'alice@example.com', age: 25 });
```

## When indexes do not help
Indexes provide no benefit for:
- Range operators (`gt`, `gte`, `lt`, `lte`): These require scanning values, not exact lookups. The query planner falls through to evaluating candidates or a full scan.
- `contains` and `regex` operators: These cannot use hash-based lookups.
- Low-cardinality fields: If a field only has a few distinct values (e.g., a `role` field with "admin" and "user"), the index buckets are large, reducing the filtering benefit.
- Write-heavy workloads with rarely-queried fields: Every `create`, `update`, and `remove` must update all indexes. If you index a field you never filter by, you pay the write cost for no query benefit.
```ts
// NOT accelerated by index: range operator requires scan
const adults = users.query({ age: { gt: 18 } });

// NOT accelerated by index: regex cannot use hash lookup
const matched = users.query({ name: { regex: /^Ali/ } });
```

## Index memory cost
Each index maintains a `Map<value, Set<docId>>` for one field. The memory cost is proportional to the number of documents multiplied by the number of indexed fields.
For a collection of 10,000 users with 2 indexed fields (`email` and `age`):

- The `email` index stores up to 10,000 entries (one per unique email), each containing a `Set` with one `_id`.
- The `age` index stores entries for each distinct age value. If ages range from 18 to 80, that is roughly 62 entries, each containing a `Set` with the `_id`s of users of that age.
The overhead is modest for most in-memory use cases. If memory is a concern, only index fields you actually query by.
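Putting rough numbers on that example (entry counts only; actual byte sizes depend on the runtime and are not estimated here):

```ts
const documents = 10_000;

// email: high cardinality, roughly one bucket per document,
// each Set holding a single _id.
const emailBuckets = 10_000;

// age: low cardinality, ~one bucket per distinct value (18..80).
const ageBuckets = 62;
const idsPerAgeBucket = Math.round(documents / ageBuckets); // ~161 _ids per Set

// Total index entries is O(n * i): every document appears once
// in every index, however the buckets are distributed.
const totalIndexEntries = documents * 2; // 20,000 (value, _id) pairs
```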
## Practical guidance
### Choosing what to index
Ask yourself these questions:
- Do I query this field frequently? If you only filter by `email` once during app initialization, an index is unnecessary.
- Is the filter an equality check? Indexes only accelerate `eq`, primitive equality, and `in`. Range queries get no benefit.
- Am I willing to pay the write overhead? Each indexed field adds O(1) work per write. For 2-3 indexes this is negligible. For 10+ indexes on a write-heavy collection, measure first (see the sketch below).
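One way to "measure first" is a quick timing comparison between an otherwise identical indexed and unindexed collection. A rough sketch, not a rigorous benchmark: the `Event` shape and field names are made up, and passing an empty `indexes` array is assumed to be allowed:

```ts
import { GraphDB } from '@graphdb/core';

type Event = { type: string; source: string; payload: string };

const db = GraphDB();
const plain = db.collection<Event>('plain', { indexes: [] });
const indexed = db.collection<Event>('indexed', {
  indexes: ['type', 'source'],
});

const docs = Array.from({ length: 10_000 }, (_, i) => ({
  type: `type-${i % 10}`,
  source: `source-${i % 100}`,
  payload: 'x'.repeat(100),
}));

console.time('writes, no indexes');
for (const doc of docs) await plain.create(doc);
console.timeEnd('writes, no indexes');

console.time('writes, 2 indexes');
for (const doc of docs) await indexed.create(doc);
console.timeEnd('writes, 2 indexes');
```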
### Batch operations
Use `populate()` for loading initial data rather than calling `create()` in a loop. `populate()` inserts all documents first and rebuilds indexes once, rather than updating indexes per insert.
```ts
type User = { name: string; email: string; age: number };

// Good: single populate call, indexes built once
const apiData = await fetchUsers();
users.populate(
  apiData.map((u) => ({
    ...u,
    _id: u.id,
    _createdAt: u.createdAt,
    _updatedAt: u.updatedAt,
  })),
);

// Less efficient: N individual creates, N rounds of index updates
for (const u of apiData) {
  await users.create(u);
}
```

### Query optimization
When combining filters, place indexed fields alongside non-indexed ones. The query planner will use the index to narrow down candidates before evaluating expensive checks:
// If "age" is indexed but "name" is not:// The planner uses the age index to find candidates,// then evaluates the regex only on those candidates.const results = users.query({ age: 25, name: { regex: /^Ali/ },});When to use GraphDB
GraphDB is a strong fit for:
- Prototyping and MVPs. Zero configuration, no server, no schema files. Define a type and start storing data.
- Client-side caching. Cache API responses in a queryable store with indexes for fast lookups.
- Offline-first applications. Use `populate()` to hydrate from an API on startup, and syncers to push writes back when connectivity returns.
- Small to medium datasets. Hundreds to low tens of thousands of documents per collection work comfortably in memory.
- Testing. Create a fresh database per test with no setup or teardown. No mocking needed for synchronous read operations (see the sketch after this list).
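A fresh database per test might look like the following; the test runner (`vitest` here) is an arbitrary choice, and the assumption that `query()` returns a plain array is inferred from the examples above:

```ts
import { describe, it, expect, beforeEach } from 'vitest';
import { GraphDB } from '@graphdb/core';

type User = { name: string; email: string; age: number };

// A new GraphDB() per test means no teardown: the old
// instance is simply garbage-collected.
const makeUsers = () =>
  GraphDB().collection<User>('users', { indexes: ['email'] });

describe('user lookups', () => {
  let users = makeUsers();

  beforeEach(() => {
    users = makeUsers();
  });

  it('finds a user by indexed email', async () => {
    await users.create({ name: 'Alice', email: 'alice@example.com', age: 30 });
    const [alice] = users.query({ email: 'alice@example.com' });
    expect(alice?.name).toBe('Alice');
  });
});
```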
## When to consider alternatives
GraphDB is not designed for:
- Large datasets (100k+ documents). All data lives in memory. There is no disk-backed storage, no pagination at the storage layer, and no streaming.
- Persistence requirements. GraphDB is ephemeral. If the process restarts, data is gone unless you re-populate from an external source.
- Multi-process or multi-server. There is no built-in replication, clustering, or shared memory. Each process gets its own independent in-memory store.
- Complex relational queries. There are no joins, aggregations, or transactions. If your data model requires these, a relational database is more appropriate.
- Full-text search. The `contains` and `regex` operators work but are O(n) scans. Dedicated search engines are better for large-scale text search.