About Enzyme

Compile-time intelligence for knowledge bases. Built by an ML engineer who shipped recommendation systems to hundreds of millions of users.

At Spotify, I built agentic playlist curation — systems that had to take imprecise, personal, half-formed signals seriously. "Sunday morning cleaning music" is a mood and a tempo that someone can't quite articulate, and the recommendation engine had to understand it anyway. That work shipped to hundreds of millions of users.

Enzyme is the same problem from a different direction. Instead of understanding taste from listening behavior, it extracts conceptual structure from accumulated writing: notes, highlights, transcripts, agent outputs. The engine compiles a knowledge base into a concept graph in under 20 seconds, then serves semantic queries in 8 ms on device. No conversation history needed. No cold start.

I started building it in 2023 and left Spotify in 2025 to work on it full-time. By then it had been my daily driver for two years: one of the first MCP servers, integrated with Claude Code since early 2025, and pressure-tested against thousands of documents.

The problem

Every knowledge base has the same gap. The content is there — documents, highlights, transcripts, saves, agent outputs — but the conceptual structure is implicit. An agent can search by keyword. It can't see the recurring tensions, the themes that span months, the connections between captures that don't share vocabulary.

Most memory and personalization tools try to build this understanding at runtime, through conversation. They start empty and accumulate over time. That works if the user is willing to have dozens of conversations before the system knows them. It doesn't work when the content already exists and the intelligence layer needs to be ready on day one.

The approach

Enzyme treats understanding as a compile step, not a runtime process. It reads the structure of a knowledge base — tags, links, folders, timestamps — and generates catalysts: questions that cut across the content and name what's latent in it. Those catalysts become the search layer. Queries run through them instead of through the user's words.

The pipeline is the same whether the input is a personal notes vault, a product corpus of user saves, or a research collection. The structure changes. The engine doesn't. And because catalysts are precomputed, not generated at query time, queries are fast (8 ms), local, and free: no per-query API cost, no data leaving the device.
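
To make the compile-then-query split concrete, here is a toy sketch in Python. It is not Enzyme's implementation: the `Doc` and `Catalyst` types, the question template, and the bag-of-words scoring are all assumptions chosen to keep the example self-contained. It shows only the shape of the idea: the expensive work happens once, up front, and a query is a cheap local comparison against what was already computed.

```python
import math
import re
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    text: str
    tags: list[str]


@dataclass
class Catalyst:
    question: str        # hypothetical: a question naming a latent theme
    doc_ids: list[str]   # documents the question cuts across
    vector: Counter      # term counts, precomputed at compile time


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def compile_catalysts(docs: list[Doc]) -> list[Catalyst]:
    """Compile step: runs once, before any query arrives."""
    by_tag = defaultdict(list)
    for doc in docs:
        for tag in doc.tags:
            by_tag[tag].append(doc)

    catalysts = []
    for tag, group in sorted(by_tag.items()):
        # Assumption: a fixed template stands in for real catalyst generation.
        question = f"What keeps recurring in the notes tagged '{tag}'?"
        vector = Counter(tokenize(question))
        for doc in group:
            vector.update(tokenize(doc.text))
        catalysts.append(Catalyst(question, [d.doc_id for d in group], vector))
    return catalysts


def query(catalysts: list[Catalyst], text: str, top_k: int = 1) -> list[Catalyst]:
    """Query step: compares against precomputed vectors only, no generation."""
    q = Counter(tokenize(text))
    ranked = sorted(catalysts, key=lambda c: cosine(q, c.vector), reverse=True)
    return ranked[:top_k]


docs = [
    Doc("n1", "Burnout after the launch; I keep postponing rest.", ["work"]),
    Doc("n2", "Sleep was poor again. Rest feels like something I must earn.", ["health"]),
    Doc("n3", "Quarterly planning notes and roadmap for the launch.", ["work"]),
]
catalysts = compile_catalysts(docs)       # done once
best = query(catalysts, "roadmap planning")[0]  # cheap, local lookup
print(best.question, best.doc_ids)
```

In this sketch the query never touches the raw documents; it is scored against the catalysts, and each catalyst points back to the documents it spans. A real system would presumably use richer structure than tags alone and something stronger than term counts.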

The design philosophy

Enzyme is built for corpora that grew before anyone planned them. Half-finished tags, abandoned folders, links that meant something at 2am — these are real signals, not noise. The engine treats incomplete structure as intent, not failure. A knowledge base doesn't need to be organized before Enzyme can read it. The conceptual shape is already there.

Let's talk

If you've built something over years and you know there's more in it than you can currently reach, I'd like to hear what you're working with.

Built by Joshua Pham in New York. Founding members are shaping what comes next on Discord.