Compile-time semantics vs query-time memory
Most memory systems build understanding from conversations, and they pay for it on every query. Enzyme pre-computes thematic questions (catalysts) from existing content at init time, so each query is a database lookup of pre-scored relationships. No LLM call, no embedding API, no external dependency.
Understanding depth vs query efficiency
Most systems trade one for the other. Enzyme breaks the tradeoff by moving compute to init time.
Time to first useful response
Most memory systems start empty. They need 10, 20, 50 conversations before they know anything useful. Enzyme reads what the user already brought in.
Numbers from published docs, papers, and benchmarks: Mem0 ECAI 2025 paper (arXiv:2504.19413), Honcho benchmark blog (evals.honcho.dev), Zep paper (arXiv:2501.13956), Letta docs. Enzyme numbers are from internal benchmarks on vaults of 10K+ chunks. Read the full analysis →
Where the understanding gets built
Query-time systems
Mem0, Honcho, Letta, and Supermemory build understanding from conversation messages. They require LLM inference at ingestion, at query time, or both, and their understanding accumulates from conversations rather than from existing content. Excellent for fact recall from chat history; structurally limited for cross-source patterns across a content corpus.
Compile-time semantics
Enzyme reads the full corpus at init and generates catalysts: thematic questions derived from the user's own structure (tags, links, and folders). A catalyst might ask, “The team revisited caching three times. What changed between each return?” That single question cuts across 18 months of content. The engine pre-computes similarity between every catalyst and every document chunk. At query time, finding the right content is a database lookup of relationships that were already identified.
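A minimal sketch of what that compile step could look like, assuming a SQLite store and a deliberately simple bag-of-words similarity as a stand-in for Enzyme's actual scoring (which is not published here). Every name below is illustrative; the point is the shape: every (catalyst, chunk) pair is scored once, at init, with no model in the loop.

```python
import math
import sqlite3
from collections import Counter

def bow_cosine(a: str, b: str) -> float:
    """Bag-of-words cosine similarity: a local, dependency-free
    stand-in for whatever scoring Enzyme actually uses."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

def compile_index(db: sqlite3.Connection,
                  catalysts: list[str], chunks: list[str]) -> None:
    """One-time init pass: pre-score every catalyst against every
    chunk and persist the relationships. No model in the loop."""
    db.execute("CREATE TABLE IF NOT EXISTS relation "
               "(catalyst_id INT, chunk_id INT, score REAL)")
    db.executemany(
        "INSERT INTO relation VALUES (?, ?, ?)",
        ((ci, ki, bow_cosine(cat, chunk))
         for ci, cat in enumerate(catalysts)
         for ki, chunk in enumerate(chunks)),
    )
    db.commit()
```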
The indexing cost amortizes differently too. Mem0 runs an LLM call on every message ingested. Enzyme generates catalysts per entity — and a user's conceptual lens is stable even as their content grows. New documents slot into existing catalyst relationships without regeneration. The expensive work is one-time and sublinear: it scales with the number of entities (tags, links), not the number of documents.
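Continuing the same sketch, incremental ingestion is the cheap path: a new document's chunks are scored against the catalysts that already exist, and nothing is regenerated.

```python
def ingest_chunks(db: sqlite3.Connection, catalysts: list[str],
                  new_chunks: list[str], next_chunk_id: int) -> None:
    """Incremental ingestion: new chunks slot into the existing
    catalyst relationships. Cost per chunk is O(len(catalysts)),
    independent of how many documents are already indexed."""
    db.executemany(
        "INSERT INTO relation VALUES (?, ?, ?)",
        ((ci, next_chunk_id + ki, bow_cosine(cat, chunk))
         for ci, cat in enumerate(catalysts)
         for ki, chunk in enumerate(new_chunks)),
    )
    db.commit()
```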
What happens when a query arrives
[Diagram: query-path comparison. Orange = LLM inference; yellow = embedding API call. Each adds latency and an external dependency.]
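In the sketch's terms, the entire query path stays local. How Enzyme routes an incoming question to a catalyst is an assumption here; the structural claim is that neither step needs inference or a network hop.

```python
def answer_query(db: sqlite3.Connection, catalysts: list[str],
                 user_query: str, k: int = 5) -> list[tuple[int, float]]:
    """Query path: pick the closest catalyst locally, then run one
    SELECT over relationships that were computed at init time."""
    best = max(range(len(catalysts)),
               key=lambda i: bow_cosine(user_query, catalysts[i]))
    return db.execute(
        "SELECT chunk_id, score FROM relation "
        "WHERE catalyst_id = ? ORDER BY score DESC LIMIT ?",
        (best, k),
    ).fetchall()
```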
Beyond personal vaults
The numbers above come from personal knowledge bases, but the architecture applies to any accumulated content. A user who imports reading highlights, saved recipes, design explorations, or annotated research has the same structural property: content that exists before the first conversation, with patterns that span months of accumulation.
enzyme apply projects one corpus's catalysts onto another — the user's intellectual framework becomes a lens for unfamiliar content. For product teams, this means the concept graph is a portable artifact: compile it from what the user imports, and every agent session starts with understanding rather than building it from scratch.
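In the sketch's terms, projection is recompilation with borrowed catalysts: corpus A's questions scored against corpus B's chunks. The variables below are hypothetical stand-ins, and the real enzyme apply interface is not shown on this page.

```python
# Hypothetical data: one catalyst compiled from vault A, applied to
# chunks from an unfamiliar corpus B.
vault_a_catalysts = ["The team revisited caching three times. "
                     "What changed between each return?"]
corpus_b_chunks = ["Notes on cache invalidation strategy for the API layer.",
                   "Meeting summary: moved session state out of the cache."]

lens = sqlite3.connect(":memory:")
compile_index(lens, vault_a_catalysts, corpus_b_chunks)
print(answer_query(lens, vault_a_catalysts,
                   "how has our caching approach changed?"))
```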
The SDK is in private beta for product teams with accumulated user histories. Talk about your corpus →