
graphmemory

Build and query embedded GraphRAG knowledge graphs with DuckDB-backed vector, full-text, and hybrid search. Use when the user wants to store entities and relations, run RAG over a graph, extract knowledge graphs from text with DSPy, run graph algorithms (PageRank, centrality, components), merge/upsert nodes and edges, fuzzy-dedupe an existing graph, or visualize interactively in a browser. Trigger phrases include "knowledge graph", "GraphRAG", "graph database", "hybrid search", "extract entities and relations", "DuckDB graph", "embedded graph store", "dedupe graph nodes".

npx skills add https://github.com/bradAGI/GraphMemory --skill graphmemory
SKILL.md

GraphMemory

Embedded GraphRAG database built on DuckDB. Single Python package — no server, no external services. Ships vector (HNSW), full-text (BM25), hybrid search, fluent query builder, multi-hop traversal, fuzzy dedup, DSPy extraction, NetworkX algorithms, and a zero-dep D3.js visualizer.

When to reach for this

  • Knowledge graph with semantic search (not just a vector DB, not just a graph DB).
  • RAG where graph traversal is part of retrieval.
  • Extract entities/relations from text and store them durably with dedup.
  • Prototyping — file-backed or in-memory graph without Neo4j/Postgres/pgvector.

Do not use when: the user already runs Neo4j/Neptune/ArangoDB, or the graph is on the order of hundreds of millions of nodes. GraphMemory is embedded in DuckDB and single-writer.

Install

pip install graphmemory
pip install graphmemory[extraction]   # DSPy entity/relation extraction
pip install graphmemory[algorithms]   # NetworkX algorithms
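
Because the two extras are optional, calling code may want to check for them before importing anything that depends on them. A minimal sketch, assuming the extras install packages importable as `dspy` and `networkx` (inferred from the comments above, not checked against the package metadata); `has_module` is a hypothetical helper, not part of GraphMemory:

```python
from importlib.util import find_spec


def has_module(name: str) -> bool:
    """Return True if a top-level module is importable, without importing it."""
    return find_spec(name) is not None


# Assumption: graphmemory[extraction] provides `dspy`,
# graphmemory[algorithms] provides `networkx`.
EXTRACTION_AVAILABLE = has_module("dspy")
ALGORITHMS_AVAILABLE = has_module("networkx")

if not EXTRACTION_AVAILABLE:
    print("DSPy not found; install with: pip install graphmemory[extraction]")
if not ALGORITHMS_AVAILABLE:
    print("NetworkX not found; install with: pip install graphmemory[algorithms]")
```

Checking with `find_spec` avoids paying the import cost of a large dependency just to learn whether it is present.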

Decision table

User intent | Method
Insert one node | graph.insert_node(node)
Bulk insert | graph.bulk_insert_nodes(nodes)
Insert-or-update by property | graph.merge_node(node, match_keys=["name"])
Fuzzy insert-or-update | graph.merge_node(node, match_keys=["name"], similarity_threshold=0.9)
Dedupe edges on (src, tgt, relation) | graph.merge_edge(edge)
Clean up existing duplicates | graph.resolve_duplicates(match_keys=["name"], similarity_threshold=0.9)
Pure vector kNN | graph.nearest_nodes(vector, limit)
Pure BM25 text | graph.search_nodes(query, limit)
Combined text + vector | graph.hybrid_search(query, query_vector, text_weight, vector_weight)
Lookup by property | graph.nodes_by_attribute("name", "Alice")
Direct neighbors | graph.connected_nodes(node_id)
Multi-hop traversal | graph.query().traverse(source_id=id, depth=2).execute()
Filtered query | graph.query().match(type="Person").where(role="eng").execute()
GraphRAG context assembly | graph.retrieve(query, query_vector, max_hops, max_tokens)
End-to-end Q&A | graph.ask(query, query_vector, llm_callable=fn)
Extract + store from text | extract_and_merge(graph, text, match_keys=["name"])
Extract in parallel across chunks | extract_and_merge_parallel(graph, chunks, max_workers=8)
PageRank / centrality | pagerank(graph), betweenness_centrality(graph)
Atomic block | with graph.transaction(): ...
Browser visualization | graph.visualize()

Canonical snippets

Init

from graphmemory import GraphMemory, Node, Edge, MergeStrategy

# database=None is in-memory; pass a path for persistence.
# vector_length and distance_metric are fixed at init time.
graph = GraphMemory(
    database="graph.db",
    vector_length=1536,              # must match your embedding model
    distance_metric="cosine",        # "l2" | "cosine" | "inner_product"
    hnsw_ef_construction=128,
    hnsw_ef_search=64,
    hnsw_m=16,
    auto_index=True,                 # HNSW auto-built on init
    max_retries=3,                   # transient IO error retry
)

Insert + merge

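# embed() is a placeholder for your own embedding function;
# it must return a list of vector_length floats.
alice = Node(type="Person", properties={"name": "Alice"}, vector=embed("Alice"))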
bob = Node(type="Person", properties={"name": "Bob"}, vector=embed("Bob"))
graph.insert_node(alice)
graph.insert_node(bob)
graph.insert_edge(Edge(source_id=alice.id, target_id=bob.id, relation="reports_to"))

# Idempotent re-ingest on a natural key
graph.merge_node(alice, match_keys=["name"])

# Fuzzy merge — tolerates "Alice Smith" vs "alice smith"
graph.merge_node(
    alice,
    match_keys=["name"],
    similarity_threshold=0.9,        # Jaro-Winkler threshold (1.0 = exact)
    vector_threshold=0.2,            # optional cosine distance cap
    match_type=True,                 # also require same `type`
    strategy=MergeStrategy.UPDATE,   # UPDATE | REPLACE | KEEP
)
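
To build intuition for what a `similarity_threshold` of 0.9 lets through, here is a pure-Python sketch of the textbook Jaro-Winkler similarity (prefix scale 0.1, prefix capped at 4 characters). It is not GraphMemory's implementation, and it compares raw strings as given; whether the library normalizes case or whitespace before comparing is not stated here, so treat that as an open assumption.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this sliding window
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo, hi = max(0, i - window), min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for strings that share a common prefix."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:max_prefix], s2[:max_prefix]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1.0 - base)


# Try a few candidate pairs against the threshold you plan to use
threshold = 0.9
for left, right in [("Alice Smith", "Alice Smyth"), ("Alice", "Bob")]:
    score = jaro_winkler(left, right)
    print(left, "|", right, "|", round(score, 3), "| merge:", score >= threshold)
```

Running candidate name pairs from your own data through a function like this is a cheap way to pick a threshold before ingesting anything.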

Hybrid search

results = graph.hybrid_search(
    query_text="who leads ML?",
    query_vector=embed("who leads ML?"),
    text_weight=0.5,
    vector_weight=0.5,
    limit=10,
)
for r in results:
    print(r.score, r.node.properties)
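
For a feel of what `text_weight` and `vector_weight` trade off, the sketch below blends two score lists with min-max normalization and a weighted sum. This is one common fusion scheme chosen for illustration; the formula GraphMemory uses internally is not documented here, and `min_max` and `blend_scores` are hypothetical names.

```python
def min_max(scores):
    """Scale a {id: score} mapping into [0, 1]; constant inputs map to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {key: 1.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def blend_scores(text_scores, vector_distances, text_weight=0.5, vector_weight=0.5):
    """Weighted sum of normalized text scores and inverted vector distances."""
    text_norm = min_max(text_scores)
    # Distances are "lower = closer", so negate them before normalizing
    vector_norm = min_max({key: -dist for key, dist in vector_distances.items()})
    ids = set(text_norm) | set(vector_norm)
    blended = {
        i: text_weight * text_norm.get(i, 0.0) + vector_weight * vector_norm.get(i, 0.0)
        for i in ids
    }
    # Highest blended score first
    return sorted(blended.items(), key=lambda item: item[1], reverse=True)


bm25 = {"a": 3.2, "b": 1.1, "c": 0.4}          # made-up text scores
distances = {"a": 0.30, "b": 0.05, "d": 0.60}  # made-up vector distances
ranked = blend_scores(bm25, distances, text_weight=0.5, vector_weight=0.5)
for node_id, score in ranked:
    print(node_id, round(score, 3))
```

Shifting weight toward text favors exact keyword hits; shifting it toward the vector side favors paraphrases that share no terms with the query.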

GraphRAG

# Context-only (own the prompt)
result = graph.retrieve(
    query=q, query_vector=qv,
    max_hops=2, max_tokens=4000, search_limit=10,
)
print(result.context_text, result.token_estimate, result.seed_node_count, result.total_node_count)

# End-to-end: llm_callable signature is (system_prompt, user_prompt) -> str
from openai import OpenAI

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment

def my_llm(system, user):
    return openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "system", "content": system}, {"role": "user", "content": user}],
    ).choices[0].message.content

answer = graph.ask(query=q, query_vector=qv, llm_callable=my_llm)
print(answer["answer"])

Pass llm_callable=None to get retrieval-only output — useful to inspect the context before wiring an LLM.
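
When testing the wiring without paying for model calls, a stub with the same `(system_prompt, user_prompt) -> str` signature can stand in for a real model. A minimal sketch; `make_recording_llm` is a hypothetical helper, and only the two-argument signature comes from the text above.

```python
def make_recording_llm(canned_answer="(stub answer)"):
    """Build a fake LLM callable that records prompts and returns a fixed string."""
    calls = []

    def llm(system_prompt: str, user_prompt: str) -> str:
        # Keep both prompts so a test can inspect what would have been sent
        calls.append({"system": system_prompt, "user": user_prompt})
        return canned_answer

    return llm, calls


stub_llm, recorded = make_recording_llm("42")
reply = stub_llm("You answer from context only.", "Context: ...\nQuestion: ...")
print(reply, len(recorded))
```

Passing `stub_llm` as `llm_callable` and then reading `recorded` shows the prompts the library builds, without any network access.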

Query builder

# Filter by type + property
engineers = graph.query().match(type="Person").where(role="engineer").execute()

# Multi-hop traversal — returns TraversalResult with depth + path
two_hop = graph.query().traverse(source_id=alice.id, depth=2).execute()

# Paginate + order
page = graph.query().match(type="Person").order_by("name").limit(20).offset(40).execute()

# Return edges instead of nodes
edges = graph.query().match(type="Person").edges().execute()
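
To make the `depth` and `path` fields of a traversal result concrete, here is a breadth-first sketch over a plain adjacency dict. It is an illustration, not the library's traversal code. Two assumptions are baked in: edges are followed from source to target only, and each path includes the starting node. The library may differ on both.

```python
from collections import deque


def traverse(adjacency, source_id, depth):
    """Breadth-first walk up to `depth` hops; returns (node, depth, path) tuples."""
    results = []
    seen = {source_id}
    queue = deque([(source_id, 0, [source_id])])
    while queue:
        node, hops, path = queue.popleft()
        if hops > 0:
            results.append((node, hops, path))  # skip the start node itself
        if hops == depth:
            continue  # do not expand beyond the requested depth
        for neighbor in adjacency.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops + 1, path + [neighbor]))
    return results


adjacency = {"alice": ["bob"], "bob": ["carol"], "carol": ["dave"]}
hops = traverse(adjacency, "alice", depth=2)
for node, distance, path in hops:
    print(node, distance, " -> ".join(path))
```

With `depth=2`, the walk reaches `bob` and `carol` but stops before `dave`, which is three hops out.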

DSPy extraction

import dspy
from graphmemory.extraction import extract_and_merge, extract_and_merge_parallel

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

# Single pass
node_results, edge_results = extract_and_merge(
    graph, text, match_keys=["name"], similarity_threshold=0.88,
)

# Parallel across chunks — two phases: nodes first (all chunks), then edges
# with the full node context. Saturates your RPM.
node_results, edge_results = extract_and_merge_parallel(
    graph,
    chunks=paragraph_chunks,
    match_keys=["name"],
    similarity_threshold=0.88,
    max_workers=8,                   # match your provider's RPM headroom
    on_progress=lambda phase, done, total: print(f"{phase}: {done}/{total}"),
)

Transactions

with graph.transaction():
    graph.insert_node(a)
    graph.insert_node(b)
    graph.insert_edge(Edge(source_id=a.id, target_id=b.id, relation="x"))
# Exception inside the block → ROLLBACK. Clean exit → COMMIT.

Advanced patterns

Two-pass dedup (idiomatic)

Extract with a loose threshold, then clean up with a tighter one. This is the pattern in examples/test_ingest.py.

# Pass 1 — during ingest, be permissive to avoid fragmenting entities
extract_and_merge_parallel(graph, chunks, similarity_threshold=0.88, max_workers=50)

# Pass 2 — after ingest, resolve residual duplicates more strictly
clusters = graph.resolve_duplicates(
    match_keys=["name"],
    match_type=True,
    similarity_threshold=0.9,
    vector_threshold=0.15,
)
for c in clusters:
    print(f"Kept {c.survivor.properties['name']}, merged {len(c.merged)} dups")

resolve_duplicates picks the first-seen node as survivor, reassigns all incoming/outgoing edges to it, and deletes the rest. Self-loops from the reassignment are dropped.
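
The survivor rule described above can be sketched on plain tuples. This covers only the rewiring step (first-seen survivor, edge reassignment, dropping self-loops that the reassignment creates); duplicate detection is assumed to have happened already, and `rewire_edges` is a hypothetical name, not a GraphMemory function.

```python
def rewire_edges(cluster_ids, edges):
    """Point edges at the cluster's first-seen node and drop new self-loops.

    cluster_ids: node ids judged to be duplicates, in first-seen order.
    edges: list of (source_id, target_id, relation) tuples.
    """
    survivor = cluster_ids[0]
    duplicates = set(cluster_ids[1:])
    rewired = []
    for source, target, relation in edges:
        new_source = survivor if source in duplicates else source
        new_target = survivor if target in duplicates else target
        changed = (new_source, new_target) != (source, target)
        if changed and new_source == new_target:
            continue  # self-loop introduced by the merge
        rewired.append((new_source, new_target, relation))
    return survivor, rewired


edges = [
    ("n1", "n9", "reports_to"),
    ("n2", "n9", "works_with"),  # n2 is a duplicate of n1
    ("n1", "n2", "same_as"),     # collapses into a self-loop after merging
]
survivor, rewired = rewire_edges(["n1", "n2"], edges)
print(survivor, rewired)
```

After the rewrite, `n2` no longer appears in any edge, so deleting it leaves nothing dangling.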

Custom chunking + sequential linking

Pattern from examples/lexical_graph.py:

prev = None
for chunk in chunks:
    node = Node(type="Chunk", properties={"text": chunk}, vector=embed(chunk))
    graph.insert_node(node)
    if prev is not None:
        graph.insert_edge(Edge(source_id=prev.id, target_id=node.id, relation="followed_by"))
    prev = node
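
The loop above assumes a `chunks` list already exists. One minimal way to produce it is to split on blank lines and pack consecutive paragraphs up to a character budget. The helper name `paragraph_chunks` and the `max_chars` budget are hypothetical choices for this sketch, not taken from examples/lexical_graph.py.

```python
import re


def paragraph_chunks(text, max_chars=1000):
    """Split on blank lines, then pack consecutive paragraphs up to max_chars.

    A single paragraph longer than max_chars is kept whole as its own chunk.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            chunks.append(current)  # budget exceeded: close the current chunk
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


sample = "First paragraph.\n\nSecond paragraph.\n\n\nThird paragraph."
chunks = paragraph_chunks(sample, max_chars=40)
for index, chunk in enumerate(chunks):
    print(index, repr(chunk))
```

Keeping paragraph boundaries intact tends to give each `Chunk` node a coherent unit of text to embed.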

Inspect before asking

result = graph.retrieve(query=q, query_vector=qv, max_hops=2, max_tokens=4000)
print(result.context_text)   # See exactly what the LLM would receive
# Tune max_hops / max_tokens / search_limit before wiring ask()

Gotchas

  • vector_length and distance_metric are locked at init. Swapping embedding models means a new database. Valid metrics: "l2", "cosine", "inner_product".
  • Missing vectors are silently zero-filled in insert_node. bulk_insert_nodes skips nodes whose vectors don't match vector_length and logs a warning. Validate upstream if correctness matters.
  • HNSW is auto-built on init (auto_index=True). Tune via hnsw_ef_construction, hnsw_ef_search, hnsw_m. Call graph.compact_index() after heavy deletes to reclaim space (also called automatically by delete_node).
  • FTS index is lazy — first search_nodes/hybrid_search call after writes rebuilds it. Expect first-search latency. Force a rebuild with graph.reindex() if you want it warm before traffic.
  • Edge dedup key is (source_id, target_id, relation). Relations are normalized (lowercased, underscored) before comparison — "Reports To" and "reports_to" collide. Edge properties are NOT part of the key.
  • delete_node cascades edges in both directions (as source AND as target). No orphan-edge safety net.
  • merge_node strategies: UPDATE shallow-merges dicts (incoming wins on collision), REPLACE overwrites wholesale, KEEP only inserts if new. Pick intentionally.
  • similarity_threshold=1.0 is exact match (the default). Lower it to enable Jaro-Winkler fuzzy matching on string properties. Non-string properties always use JSON equality.
  • match_type=True (default) requires same type for merge. Set False to merge across types — rarely what you want.
  • resolve_duplicates is roughly O(n²) in fuzzy mode, since candidate nodes are compared pairwise. For large graphs, narrow with match_type and a tight vector_threshold first.
  • extraction and algorithms are optional extras. Wrap imports in try/except or check pip show before recommending code that depends on them.
  • Single-writer DuckDB. Connection pooling and @with_retry (exponential backoff on transient IO errors) are built in, but don't open the same file from multiple processes for concurrent writes.
  • cursor() returns independent cursors for concurrent reads; the main connection is RLock-guarded for writes.
  • ask() with llm_callable=None returns retrieval only — no generation. Always use this first to validate context before paying for LLM calls.
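
Two of the gotchas above, the edge dedup key and the three merge strategies, are easier to see on plain data. The functions below are hypothetical stand-ins working on tuples and dicts, not GraphMemory code. The exact normalization rule is an assumption that goes beyond "lowercased, underscored": here, surrounding whitespace is stripped and runs of spaces or hyphens become one underscore.

```python
import re


def edge_key(source_id, target_id, relation):
    """Dedup key for an edge; edge properties are deliberately left out."""
    normalized = re.sub(r"[\s\-]+", "_", relation.strip().lower())
    return (source_id, target_id, normalized)


def merge_properties(existing, incoming, strategy="UPDATE"):
    """Resulting properties when an incoming node matches an existing one."""
    if strategy == "UPDATE":
        return {**existing, **incoming}  # shallow merge, incoming wins
    if strategy == "REPLACE":
        return dict(incoming)            # overwrite wholesale
    if strategy == "KEEP":
        return dict(existing)            # a match exists, so nothing changes
    raise ValueError(f"unknown strategy: {strategy}")


# "Reports To" and "reports_to" produce the same key, so they collide
same = edge_key("a", "b", "Reports To") == edge_key("a", "b", "reports_to")

old = {"name": "Alice", "role": "eng"}
new = {"name": "Alice", "team": "ML"}
updated = merge_properties(old, new, "UPDATE")
replaced = merge_properties(old, new, "REPLACE")
kept = merge_properties(old, new, "KEEP")
print(same, updated, replaced, kept)
```

Note that REPLACE drops `role` entirely, which is the usual surprise when re-ingesting partial records.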

Data models

Model | Key fields
Node | id: UUID, type: str | None, properties: dict, vector: list[float]
Edge | id, source_id, target_id, relation: str, weight: float | None
SearchResult | node, score (higher = better for both BM25 and hybrid)
NearestNode | node, distance (lower = closer)
TraversalResult | node, depth, path: list[UUID]
RetrievalContext | node, relationships: list[dict], hop_distance: int
RetrievalResult | query, contexts, context_text, token_estimate, seed_node_count, total_node_count
MergeResult | node, created: bool (True = inserted, False = updated)
EdgeMergeResult | edge, created: bool
DuplicateCluster | survivor: Node, merged: list[Node]

All models are Pydantic. IDs auto-generate as UUIDs.

Examples in the repo

  • examples/openai_example.py — OpenAI embeddings, similarity search, attribute lookup
  • examples/lexical_graph.py — chunked Wikipedia text with SentenceTransformer, sequential followed_by edges
  • examples/dspy_example_typed_pred.py — DSPy typed-predictor extraction
  • examples/test_ingest.py — parallel extraction (50 workers, 0.88 threshold) + post-pass resolve_duplicates at 0.90

Read examples/test_ingest.py before building a real ingest pipeline — it's the template.

Testing

python3 -m pytest tests/tests.py -v

296 tests cover the public API. Run them when modifying the library.

Installs: 0
GitHub Stars: 150
Language: Python
Added: May 25, 2026