Compile-time semantics vs query-time memory
Most memory systems build understanding from conversations, and they pay for it on every query. Enzyme pre-computes thematic questions (catalysts) from existing content at init time, so each query is a database lookup of pre-scored relationships. No LLM call, no embedding API, no external dependency.
Understanding depth vs query efficiency
Most systems trade one for the other. Enzyme breaks the tradeoff by moving compute to init time.
Time to first useful response
Most memory systems start empty. They need 10, 20, 50 conversations before they know anything useful. Enzyme reads what the user already brought in.
Numbers from published docs, papers, and benchmarks: Mem0 ECAI 2025 paper (arXiv:2504.19413), Honcho benchmark blog (evals.honcho.dev), Zep paper (arXiv:2501.13956), Letta docs. Enzyme numbers are from internal benchmarks on vaults of 10K+ chunks. Read the full analysis →
Where the understanding gets built
Query-time systems
Mem0, Honcho, Letta, and Supermemory build understanding from conversation messages. They require LLM inference at ingestion, at query time, or both, and their understanding accumulates from conversations rather than from existing content. Excellent for fact recall from chat history; structurally limited for cross-source patterns across a content corpus.
Compile-time semantics
Enzyme reads the full corpus at init and generates catalysts: thematic questions derived from the user's own structure (tags, links, and folders). A catalyst might ask, “The team revisited caching three times. What changed between each return?” That single question cuts across 18 months of content. The engine pre-computes similarity between every catalyst and every document chunk. At query time, finding the right content is a database lookup of relationships that were already identified.
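A minimal sketch of what that compile step could look like, assuming a SQLite store and a deliberately simple bag-of-words similarity as a stand-in for Enzyme's actual scoring (which is not published here). Every name below is illustrative; the point is the shape: every (catalyst, chunk) pair is scored once, at init, with no model in the loop.

```python
import math
import sqlite3
from collections import Counter

def bow_cosine(a: str, b: str) -> float:
    """Bag-of-words cosine similarity: a local, dependency-free
    stand-in for whatever scoring Enzyme actually uses."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

def compile_index(db: sqlite3.Connection,
                  catalysts: list[str], chunks: list[str]) -> None:
    """One-time init pass: pre-score every catalyst against every
    chunk and persist the relationships. No model in the loop."""
    db.execute("CREATE TABLE IF NOT EXISTS relation "
               "(catalyst_id INT, chunk_id INT, score REAL)")
    db.executemany(
        "INSERT INTO relation VALUES (?, ?, ?)",
        ((ci, ki, bow_cosine(cat, chunk))
         for ci, cat in enumerate(catalysts)
         for ki, chunk in enumerate(chunks)),
    )
    db.commit()
```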
The indexing cost amortizes differently too. Mem0 runs an LLM call on every message ingested. Enzyme generates catalysts per entity — and a user's conceptual lens is stable even as their content grows. New documents slot into existing catalyst relationships without regeneration. The expensive work is one-time and sublinear: it scales with the number of entities (tags, links), not the number of documents.
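Continuing the same sketch, incremental ingestion is the cheap path: a new document's chunks are scored against the catalysts that already exist, and nothing is regenerated.

```python
def ingest_chunks(db: sqlite3.Connection, catalysts: list[str],
                  new_chunks: list[str], next_chunk_id: int) -> None:
    """Incremental ingestion: new chunks slot into the existing
    catalyst relationships. Cost per chunk is O(len(catalysts)),
    independent of how many documents are already indexed."""
    db.executemany(
        "INSERT INTO relation VALUES (?, ?, ?)",
        ((ci, next_chunk_id + ki, bow_cosine(cat, chunk))
         for ci, cat in enumerate(catalysts)
         for ki, chunk in enumerate(new_chunks)),
    )
    db.commit()
```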
What happens when a query arrives
[Diagram: query-path comparison. Orange = LLM inference; yellow = embedding API call. Each adds latency and an external dependency.]
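In the sketch's terms, the entire query path stays local. How Enzyme routes an incoming question to a catalyst is an assumption here; the structural claim is that neither step needs inference or a network hop.

```python
def answer_query(db: sqlite3.Connection, catalysts: list[str],
                 user_query: str, k: int = 5) -> list[tuple[int, float]]:
    """Query path: pick the closest catalyst locally, then run one
    SELECT over relationships that were computed at init time."""
    best = max(range(len(catalysts)),
               key=lambda i: bow_cosine(user_query, catalysts[i]))
    return db.execute(
        "SELECT chunk_id, score FROM relation "
        "WHERE catalyst_id = ? ORDER BY score DESC LIMIT ?",
        (best, k),
    ).fetchall()
```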
Beyond personal vaults
The numbers above come from personal knowledge bases, but the architecture applies to any accumulated content. A user who imports reading highlights, saved recipes, design explorations, or annotated research has the same structural property: content that exists before the first conversation, with patterns that span months of accumulation.
enzyme apply projects one corpus's catalysts onto another — the user's intellectual framework becomes a lens for unfamiliar content. For product teams, this means the concept graph is a portable artifact: compile it from what the user imports, and every agent session starts with understanding rather than building it from scratch.
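In the sketch's terms, projection is recompilation with borrowed catalysts: corpus A's questions scored against corpus B's chunks. The variables below are hypothetical stand-ins, and the real enzyme apply interface is not shown on this page.

```python
# Hypothetical data: one catalyst compiled from vault A, applied to
# chunks from an unfamiliar corpus B.
vault_a_catalysts = ["The team revisited caching three times. "
                     "What changed between each return?"]
corpus_b_chunks = ["Notes on cache invalidation strategy for the API layer.",
                   "Meeting summary: moved session state out of the cache."]

lens = sqlite3.connect(":memory:")
compile_index(lens, vault_a_catalysts, corpus_b_chunks)
print(answer_query(lens, vault_a_catalysts,
                   "how has our caching approach changed?"))
```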
The SDK is in private beta for product teams with accumulated user histories. Talk about your corpus →