Research

Compile-time semantics vs query-time memory

Most memory systems build understanding from conversations — and pay for it on every query. Enzyme pre-computes thematic questions (catalysts) from existing content at init time, so queries are a database lookup of pre-scored relationships. No LLM calls, no external embedding API, no outside dependency at query time.

8ms query latency (vs 200ms–2s for competitors)
0 external API calls per query (fully local, no outside dependency)
~3K tokens per query (vs 7–50K for competitors)
15s cold start from existing content (others need conversation history)

Most systems trade one for the other. Enzyme breaks the tradeoff by moving compute to init time.

[Chart: systems plotted by query efficiency (↑) against context depth (fact recall → cross-timeline patterns). Quadrants: efficient but shallow, deep + efficient, expensive + shallow, deep but expensive. Enzyme sits in the deep + efficient quadrant; Mem0, Honcho, Zep, Letta, and LLM Wiki fall in the other three.]

What happens every time your agent asks a question.

Tokens per query: Enzyme ~3.1K, Mem0 ~7.0K, Honcho ~13K, Zep ~15K, Letta ~10K
Query latency: Enzyme 8ms, Mem0 200ms, Honcho 200ms, Zep 500ms, Letta 2.0s
External API calls per query: Enzyme 0, Mem0 1–9, Honcho 2–4, Zep 1, Letta 3+

Most memory systems start empty. They need 10, 20, 50 conversations before they know anything useful. Enzyme reads what the user already brought in.

Enzyme: 15s (from 1,000 existing docs)
Mem0: needs conversation history
Honcho: needs conversation history
Zep: hours (graph construction)
Letta: agent must self-build memory
LLM Wiki: ~10m (manual curation required)

Numbers from published docs, papers, and benchmarks: Mem0 ECAI 2025 paper (arXiv:2504.19413), Honcho benchmark blog (evals.honcho.dev), Zep paper (arXiv:2501.13956), Letta docs. Enzyme numbers from internal benchmarks on 10K+ chunk vaults. Read the full analysis →

Where the understanding gets built

Query-time systems

Mem0, Honcho, Letta, and Supermemory build understanding from conversation messages. They require LLM inference at ingestion, query time, or both — and understanding accumulates from conversations, not from existing content. Excellent for fact recall from chat history. Structurally limited on cross-source patterns across a content corpus.

Compile-time semantics

Enzyme reads the full corpus at init and generates catalysts: thematic questions derived from the user's own structure — their tags, links, and folders. A catalyst might ask “the team revisited caching three times. What changed between each return?” — a question that cuts across 18 months of content. The engine pre-computes similarity between every catalyst and every document chunk. At query time, finding the right content is a database lookup of relationships that were already identified.
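The shape of this split can be sketched in a few lines. Everything below is illustrative: the function names, the SQLite schema, and plain cosine scoring are assumptions for the sketch, not the actual Enzyme internals.

```python
import math
import sqlite3

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def compile_catalysts(db, catalysts, chunks, top_k=3):
    """Init time: score every catalyst against every chunk once,
    persist the strongest relationships. This is the expensive pass."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS catalyst_chunks "
        "(catalyst_id TEXT, chunk_id TEXT, score REAL)"
    )
    for cid, cvec in catalysts.items():
        scored = sorted(
            ((cosine(cvec, vec), chid) for chid, vec in chunks.items()),
            reverse=True,
        )
        for score, chid in scored[:top_k]:
            db.execute(
                "INSERT INTO catalyst_chunks VALUES (?, ?, ?)",
                (cid, chid, score),
            )
    db.commit()

def query(db, query_vec, catalysts):
    """Query time: match the query to its nearest catalyst, then the
    relevant chunks come back as a plain table lookup. No LLM involved."""
    best = max(catalysts, key=lambda cid: cosine(query_vec, catalysts[cid]))
    rows = db.execute(
        "SELECT chunk_id, score FROM catalyst_chunks "
        "WHERE catalyst_id = ? ORDER BY score DESC",
        (best,),
    ).fetchall()
    return best, rows
```

The point of the sketch is the asymmetry: all pairwise scoring happens in `compile_catalysts`, so `query` reduces to one local embedding comparison plus an indexed `SELECT`.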

The indexing cost amortizes differently too. Mem0 runs an LLM call on every message ingested. Enzyme generates catalysts per entity — and a user's conceptual lens is stable even as their content grows. New documents slot into existing catalyst relationships without regeneration. The expensive work is one-time and sublinear: it scales with the number of entities (tags, links), not the number of documents.

What happens when a query arrives

Enzyme: embed query (1ms) → match catalysts (0.5ms) → lookup chunks (3ms) → rank + return (2ms). Total ~8ms, 0 LLM calls.
Mem0: embed query → vector + BM25 → embed entities → score + return. Total ~200ms, 0 LLM calls at query time, but an LLM call per ingested message plus an embedding API call per query.
Honcho: query → dialectic agent → reasoning → return. Total ~200ms+, 1–3 LLM calls.
Letta: agent decides → tool calls → retrieve → synthesize. Total ~2s+, 3+ LLM calls.

Every LLM inference and embedding API call adds latency and an external dependency.

Beyond personal vaults

The numbers above come from personal knowledge bases, but the architecture applies to any accumulated content. A user who imports reading highlights, saved recipes, design explorations, or annotated research has the same structural property: content that exists before the first conversation, with patterns that span months of accumulation.

enzyme apply projects one corpus's catalysts onto another — the user's intellectual framework becomes a lens for unfamiliar content. For product teams, this means the concept graph is a portable artifact: compile it from what the user imports, and every agent session starts with understanding rather than building it from scratch.
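Mechanically, projection can be thought of as re-running the init-time scoring pass with one corpus's catalyst vectors against another corpus's chunks. A minimal sketch, assuming cosine scoring over precomputed embeddings (the function name and shape are hypothetical, not the `enzyme apply` implementation):

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def apply_catalysts(source_catalysts, target_chunks, top_k=3):
    """Project corpus A's catalysts (a fixed set of question vectors)
    onto corpus B's chunks. The user's lens transfers; only the
    content it is applied to changes."""
    projection = {}
    for cid, cvec in source_catalysts.items():
        scored = sorted(
            ((cosine(cvec, vec), chid) for chid, vec in target_chunks.items()),
            reverse=True,
        )
        projection[cid] = [chid for _, chid in scored[:top_k]]
    return projection
```

Because the catalyst set is the portable artifact, the projection costs one scoring pass over the new corpus and nothing else.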

The SDK is in private beta for product teams with accumulated user histories. Talk about your corpus →