Glossary

Core Concepts

Property Graph

A graph data model where data is organized as nodes (entities) and edges (relationships). Both nodes and edges can have properties (key-value pairs) and labels/types.

Node

The fundamental unit of data in a property graph, representing an entity (e.g., Person, Product). Nodes can have multiple Labels and Properties.

Edge

A directed connection between two nodes, representing a relationship (e.g., KNOWS, BOUGHT). Edges always have a single Edge Type and can have Properties.

Label

A tag applied to a node to categorize it (e.g., :Person, :Vehicle). Nodes can have multiple labels. Used for indexing and query filtering.

Properties

Key-value pairs attached to nodes or edges. Keys are strings, and values can be strings, integers, floats, booleans, vectors, etc.
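
The four definitions above fit together as a small data structure. The following is a minimal sketch in Python, not Graphmind's internal representation; the class and field names (Node, Edge, labels, edge_type, properties) are chosen here for illustration only.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Node:
    """An entity with zero or more labels and a property map."""
    id: int
    labels: set[str] = field(default_factory=set)
    properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    """A directed relationship with exactly one type and a property map."""
    source: int      # id of the node the edge starts from
    target: int      # id of the node the edge points to
    edge_type: str
    properties: dict[str, Any] = field(default_factory=dict)


# Two labelled nodes connected by one typed, directed edge.
alice = Node(1, {"Person", "Employee"}, {"name": "Alice", "age": 30})
bob = Node(2, {"Person"}, {"name": "Bob"})
knows = Edge(alice.id, bob.id, "KNOWS", {"since": 2020})
```

Note the asymmetry: a node carries a set of labels, while an edge carries a single type string.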

Architecture

Raft Consensus

A distributed consensus algorithm used by Graphmind to ensure data consistency and fault tolerance across the cluster. It manages leader election and log replication.

WAL (Write-Ahead Log)

A persistence technique where modifications are written to a log file before they are applied to the database. Ensures durability and crash recovery.
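
The log-before-apply ordering can be sketched in a few lines of Python. This is an illustrative toy, not Graphmind's WAL format: the TinyWAL class, the JSON-lines record layout, and the file name are all assumptions made for the example.

```python
import json
import os
import tempfile


class TinyWAL:
    """Toy key-value store that logs each write before applying it."""

    def __init__(self, path):
        self.path = path
        self.state = {}
        self._replay()  # crash recovery: rebuild state from the log
        self._log = open(path, "a", encoding="utf-8")

    def _replay(self):
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                if line.strip():
                    record = json.loads(line)
                    self.state[record["key"]] = record["value"]

    def put(self, key, value):
        # 1. Append the modification to the log and force it to disk.
        self._log.write(json.dumps({"key": key, "value": value}) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())
        # 2. Only then apply it to the in-memory state.
        self.state[key] = value

    def close(self):
        self._log.close()


wal_path = os.path.join(tempfile.mkdtemp(), "demo.wal")
db = TinyWAL(wal_path)
db.put("node:1", {"name": "Alice"})
db.close()

# Reopening the same file simulates a restart: the log is replayed.
recovered = TinyWAL(wal_path)
recovered_state = dict(recovered.state)
recovered.close()
```

A production log would also checksum each record so that a torn final write can be detected and discarded during replay; the sketch omits that.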

RocksDB

An embeddable persistent key-value store used by Graphmind as the underlying storage engine for graph data.

Multi-Tenancy

Tenant

A logical isolation boundary within the database. Each tenant corresponds to a graph namespace and has its own data and indices, isolated from those of other tenants.

Graph Namespace

A named container for graph data. The default namespace is named "default". To create an additional namespace, specify a different graph name in queries.

Vector Embedding

A list of floating-point numbers (e.g., [0.1, 0.5, -0.9]) representing the semantic meaning of text, images, or other data. Used for similarity search.
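
As an illustration of similarity search over embeddings, the sketch below ranks candidates by cosine similarity. Cosine similarity is one common metric and is an assumption here; the glossary does not state which metric Graphmind uses. The candidate vectors and their names are made up.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


query = [0.1, 0.5, -0.9]
candidates = {
    "doc_a": [0.1, 0.4, -0.8],   # points in nearly the same direction
    "doc_b": [-0.7, 0.2, 0.3],   # points in a different direction
}

# Brute-force nearest neighbour: score every candidate, keep the best.
best = max(candidates, key=lambda name: cosine_similarity(query, candidates[name]))
```

This brute-force scan compares the query against every stored vector; an approximate index such as HNSW exists to avoid exactly that cost on large collections.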

HNSW (Hierarchical Navigable Small World)

A graph-based index structure used for efficient Approximate Nearest Neighbor (ANN) search on vector embeddings. Provides sub-millisecond search latency.

NLQ (Natural Language Querying)

A feature allowing users to query the graph using plain language. The system uses an LLM to translate the request into an executable Cypher query.

GAK (Generation-Augmented Knowledge)

The inverse of RAG (Retrieval-Augmented Generation): LLMs are used to build the database rather than to query it. The database acts as an agent that fetches, structures, and persists missing information on demand.

Query Languages

OpenCypher

A declarative graph query language that uses ASCII-art style patterns (e.g., (a)-[:KNOWS]->(b)) to describe and query graph structures.

Protocols

RESP (Redis Serialization Protocol)

The wire protocol used by Redis. Graphmind implements RESP3, allowing any Redis client to connect and issue graph commands.
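
To make the wire format concrete, the sketch below encodes a client command the way RESP clients send requests: as an array of bulk strings. The helper name encode_resp_command is invented for this example, and it uses the standard Redis commands PING and ECHO rather than any Graphmind-specific command, since the command names are not given here.

```python
def encode_resp_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    # Array header: '*' followed by the number of elements.
    parts = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode("utf-8")
        # Bulk string: '$' + byte length, CRLF, the bytes, CRLF.
        parts.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(parts)


ping_frame = encode_resp_command("PING")
echo_frame = encode_resp_command("ECHO", "hi")
```

Lengths are counted in bytes, not characters, which is why each argument is encoded to UTF-8 before its length is taken.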

Performance & Internals

Late Materialization

An optimization where operators pass lightweight references (NodeRef(NodeId)) through the pipeline instead of full node copies. Properties are resolved lazily only when needed. Yields 4-5x improvement in multi-hop query latency.
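
The idea can be sketched as a two-hop traversal that carries only references and touches properties at the very end. This is an illustrative toy: NodeRef mirrors the name used above, but the Store class, its methods, and the property_reads counter are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeRef:
    """Lightweight handle: just an id, no properties."""
    node_id: int


class Store:
    def __init__(self, nodes, adjacency):
        self.nodes = nodes            # node_id -> property dict
        self.adjacency = adjacency    # node_id -> list of neighbour ids
        self.property_reads = 0       # counts property lookups

    def neighbors(self, ref):
        # Expansion yields references only; no property data is copied.
        return [NodeRef(n) for n in self.adjacency.get(ref.node_id, [])]

    def get_property(self, ref, key):
        # Lazy resolution: properties are fetched only on request.
        self.property_reads += 1
        return self.nodes[ref.node_id].get(key)


store = Store(
    nodes={1: {"name": "Alice"}, 2: {"name": "Bob"}, 3: {"name": "Carol"}},
    adjacency={1: [2], 2: [3]},
)

# Two-hop expansion from node 1, passing NodeRefs through each step.
frontier = [NodeRef(1)]
for _ in range(2):
    frontier = [nxt for ref in frontier for nxt in store.neighbors(ref)]
reads_during_traversal = store.property_reads

# Properties are materialized only for the final result rows.
names = [store.get_property(ref, "name") for ref in frontier]
```

In this run the traversal itself performs no property lookups; only the single surviving row is resolved when the result is projected.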

Volcano Iterator Model

A query execution model where each operator (scan, filter, expand, project) implements a next() method. Operators pull records one at a time from their children, avoiding large intermediate materializations.
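
A minimal sketch of the pull-based model, assuming three toy operators (Scan, Filter, Project) and using None as the end-of-stream signal. The class names and the sentinel convention are choices made for this example, not Graphmind's actual operator interface.

```python
class Scan:
    """Leaf operator: yields stored rows one at a time."""
    def __init__(self, rows):
        self._it = iter(rows)

    def next(self):
        return next(self._it, None)  # None signals end of stream


class Filter:
    """Pulls from its child until a row satisfies the predicate."""
    def __init__(self, child, predicate):
        self.child = child
        self.predicate = predicate

    def next(self):
        while (row := self.child.next()) is not None:
            if self.predicate(row):
                return row
        return None


class Project:
    """Transforms each row pulled from its child."""
    def __init__(self, child, fn):
        self.child = child
        self.fn = fn

    def next(self):
        row = self.child.next()
        return None if row is None else self.fn(row)


people = [
    {"name": "Alice", "age": 30},
    {"name": "Bob", "age": 17},
    {"name": "Carol", "age": 42},
]

# Operators compose into a tree; the root pulls rows through the pipeline.
plan = Project(Filter(Scan(people), lambda r: r["age"] >= 18), lambda r: r["name"])

results = []
while (row := plan.next()) is not None:
    results.append(row)
```

Each call to plan.next() moves exactly one row through the pipeline, so no operator ever holds more than the row it is currently processing.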

MVCC (Multi-Version Concurrency Control)

A concurrency control method where each node/edge maintains version history. Readers access consistent snapshots via get_node_at_version() without blocking writers.
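
The snapshot-read behaviour can be sketched with a per-node version list. The method name get_node_at_version comes from the definition above, but its signature here, along with the VersionedStore class and write_node, are assumptions for illustration; Graphmind's real API may differ.

```python
import bisect


class VersionedStore:
    """Keeps every written version of a node, keyed by a global counter."""

    def __init__(self):
        self.current_version = 0
        self._history = {}  # node_id -> (sorted versions, matching values)

    def write_node(self, node_id, properties):
        # Writers append a new version; older versions stay readable.
        self.current_version += 1
        versions, values = self._history.setdefault(node_id, ([], []))
        versions.append(self.current_version)
        values.append(dict(properties))
        return self.current_version

    def get_node_at_version(self, node_id, version):
        # Return the newest value written at or before `version`.
        versions, values = self._history.get(node_id, ([], []))
        idx = bisect.bisect_right(versions, version) - 1
        return values[idx] if idx >= 0 else None


mvcc = VersionedStore()
mvcc.write_node(1, {"name": "Alice"})
snapshot = mvcc.current_version          # a reader pins this version

mvcc.write_node(1, {"name": "Alice", "age": 30})  # a later write

seen_by_reader = mvcc.get_node_at_version(1, snapshot)
seen_latest = mvcc.get_node_at_version(1, mvcc.current_version)
```

The reader that pinned the earlier version keeps seeing the old value, because the later write added a new version instead of overwriting in place.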

CSR (Compressed Sparse Row)

A memory-efficient representation for graph adjacency. Stores all edge targets in a single contiguous array with an offset index per node, improving cache locality for traversals.
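
The layout can be illustrated by building the two arrays from an edge list. This is a generic CSR construction, not Graphmind's implementation; the function names build_csr and neighbors and the sample edges are made up for the example.

```python
def build_csr(num_nodes, edges):
    """Build (offsets, targets) from a list of (source, target) pairs."""
    # Count outgoing edges per node, shifted by one slot.
    offsets = [0] * (num_nodes + 1)
    for src, _ in edges:
        offsets[src + 1] += 1
    # Prefix sum: offsets[n] becomes the start of node n's slice.
    for i in range(num_nodes):
        offsets[i + 1] += offsets[i]
    # Place each target into its source node's slice.
    targets = [0] * len(edges)
    cursor = offsets[:-1]  # slicing copies; next free slot per node
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets


def neighbors(offsets, targets, node):
    """Outgoing neighbours of `node`: one contiguous slice."""
    return targets[offsets[node]:offsets[node + 1]]


edges = [(0, 1), (0, 2), (1, 2), (3, 0)]
offsets, targets = build_csr(4, edges)
```

For these edges, offsets is [0, 2, 3, 3, 4] and targets is [1, 2, 2, 0]. Node 2 has no outgoing edges, so its slice is empty (offsets[2] equals offsets[3]).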