AI memory system
AI-NATIVE SAAS · SEED STAGE · B2B SaaS

A long-term memory layer that cut the agent’s hallucinations and made users feel remembered from one conversation to the next.

Drop in hallucination rate

~70%

4 weeks · 1 PM + 1 engineer

Memory lookup latency

< 200ms

Episodic, semantic, procedural

3 layers

Kick-off to production

4 weeks

“Memory used to be the thing that broke first when we scaled. Now it is the thing users compliment.”

CTO, AI-native SaaS

Before

Where the team was when we picked this up.

  • The agent forgot users between sessions. New conversations felt like cold starts.
  • Pulling the entire history into context worked at first, then broke as the user base grew.
  • No way to tell when memory was the cause of a bad answer versus the model itself.

What we built

Three-layer store

Episodic (what happened, and when), semantic (who the user is and what they care about), and procedural (how this user likes to be handled). Each layer is written and retrieved differently.
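To make the three layers concrete, here is a minimal sketch of how they might be modelled in TypeScript. The record shapes, field names and the `InMemoryStore` class are assumptions made for illustration. The production store described here sits on Postgres, pgvector and Redis, and its schema is not shown in this write-up.

```typescript
// Hypothetical record shapes, one per layer. Field names are illustrative.
type EpisodicMemory = { kind: "episodic"; userId: string; event: string; occurredAt: Date };
type SemanticMemory = { kind: "semantic"; userId: string; attribute: string; value: string };
type ProceduralMemory = { kind: "procedural"; userId: string; rule: string };
type Memory = EpisodicMemory | SemanticMemory | ProceduralMemory;

// In-memory stand-in for the real store, showing that each layer
// has its own write and read behaviour.
class InMemoryStore {
  private episodes: EpisodicMemory[] = [];
  private facts: SemanticMemory[] = [];
  private rules: ProceduralMemory[] = [];

  write(m: Memory): void {
    switch (m.kind) {
      case "episodic":
        // Episodes are an append-only log of what happened.
        this.episodes.push(m);
        break;
      case "semantic": {
        // Facts are upserted: a newer value replaces the older one.
        const { userId, attribute } = m;
        this.facts = this.facts.filter(
          (f) => !(f.userId === userId && f.attribute === attribute)
        );
        this.facts.push(m);
        break;
      }
      case "procedural": {
        // Rules are de-duplicated per user.
        const { userId, rule } = m;
        const exists = this.rules.some((r) => r.userId === userId && r.rule === rule);
        if (!exists) this.rules.push(m);
        break;
      }
    }
  }

  // Episodic reads are time-ordered, newest first.
  recentEpisodes(userId: string, limit: number): EpisodicMemory[] {
    return this.episodes
      .filter((e) => e.userId === userId)
      .sort((a, b) => b.occurredAt.getTime() - a.occurredAt.getTime())
      .slice(0, limit);
  }

  // Semantic reads return the current fact set for a user.
  factsFor(userId: string): SemanticMemory[] {
    return this.facts.filter((f) => f.userId === userId);
  }

  // Procedural reads return every handling rule for a user.
  rulesFor(userId: string): ProceduralMemory[] {
    return this.rules.filter((r) => r.userId === userId);
  }
}
```

The discriminated union keeps every write going through one entry point while still letting each layer apply its own policy: append, upsert or de-duplicate.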

Selective recall

A small retrieval model decides what to pull into context for each turn. Cheap, fast, and trained on the team’s real conversations.
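As a rough illustration of selective recall, the sketch below ranks candidate memories against the current turn and keeps only what fits a token budget. It uses plain cosine similarity as a stand-in for the trained retrieval model, whose internals are not described here. The `Candidate` shape, the `selectMemories` function and the `minScore` default are all assumptions.

```typescript
// Hypothetical candidate shape: a memory, its embedding, and its prompt cost.
type Candidate = { text: string; embedding: number[]; tokens: number };

function dot(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
}

// Cosine similarity, returning 0 when either vector has zero length.
function cosine(a: number[], b: number[]): number {
  const denom = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
  return denom === 0 ? 0 : dot(a, b) / denom;
}

// Rank candidates by similarity to the current turn, drop weak matches,
// then greedily pack the best ones into the token budget.
function selectMemories(
  turnEmbedding: number[],
  candidates: Candidate[],
  tokenBudget: number,
  minScore = 0.3 // assumed relevance threshold, not a tuned value
): Candidate[] {
  const ranked = candidates
    .map((c) => ({ c, score: cosine(turnEmbedding, c.embedding) }))
    .filter((x) => x.score >= minScore)
    .sort((x, y) => y.score - x.score);

  const picked: Candidate[] = [];
  let used = 0;
  for (const { c } of ranked) {
    if (used + c.tokens > tokenBudget) continue; // skip anything that would overflow
    picked.push(c);
    used += c.tokens;
  }
  return picked;
}
```

The budget is what keeps prompts small. Whatever does the scoring, a trained model or a similarity function, the agent only pays for the memories that clear the threshold and fit.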

Memory evals

A test set of multi-session conversations with expected recall behaviour. Catches regressions before the next release.
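A memory eval of this kind can be sketched as a small harness: each case replays earlier sessions, sends a new message, and checks what was recalled. The `EvalCase` shape, the `runMemoryEvals` function and the toy `keywordRecall` function below are hypothetical and exist only to exercise the harness. In a real run, the recall function would be the system under test.

```typescript
// Hypothetical shape of one multi-session test case.
type EvalCase = {
  name: string;
  priorSessions: string[][]; // user messages from earlier sessions
  probe: string;             // first message of a new session
  mustRecall: string[];      // substrings that should appear in recalled memories
  mustNotRecall: string[];   // substrings that should stay out of context
};

// The system under test: given history and a new message, return recalled memories.
type RecallFn = (priorSessions: string[][], probe: string) => string[];

type EvalReport = { passed: number; failed: string[] };

function runMemoryEvals(cases: EvalCase[], recall: RecallFn): EvalReport {
  const report: EvalReport = { passed: 0, failed: [] };
  for (const c of cases) {
    const recalled = recall(c.priorSessions, c.probe).join("\n").toLowerCase();
    const contains = (s: string) => recalled.indexOf(s.toLowerCase()) !== -1;
    const ok = c.mustRecall.every(contains) && !c.mustNotRecall.some(contains);
    if (ok) report.passed += 1;
    else report.failed.push(c.name);
  }
  return report;
}

// Toy recall function: returns every earlier message that shares a word
// of four or more letters with the probe.
const keywordRecall: RecallFn = (priorSessions, probe) => {
  const words = probe.toLowerCase().split(/\W+/).filter((w) => w.length >= 4);
  const messages = priorSessions.reduce<string[]>((all, s) => all.concat(s), []);
  return messages.filter((m) => words.some((w) => m.toLowerCase().indexOf(w) !== -1));
};

const cases: EvalCase[] = [
  {
    name: "remembers dietary preference across sessions",
    priorSessions: [
      ["I am vegetarian, keep that in mind for recipes."],
      ["My invoice number is 1042."],
    ],
    probe: "Any recipes for dinner tonight?",
    mustRecall: ["vegetarian"],
    mustNotRecall: ["invoice"],
  },
];

const report = runMemoryEvals(cases, keywordRecall);
```

Checking both what must be recalled and what must stay out catches the two failure modes separately: forgetting the user, and flooding the prompt with irrelevant history.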

What changed

  • Users describe the agent as feeling like it knows them.
  • Costs went down because the agent no longer drags entire transcripts into every prompt.
  • The team can ship model upgrades without holding their breath.

After

Same team. Same week. Different shape of work.

Stack

Anthropic Claude · pgvector · Postgres · Redis · Custom evals · TypeScript

Timeline & team

4 weeks · 1 PM + 1 engineer

Got a workflow like this one?

Book a working session. We will tell you whether this is a four-week build or something bigger, and what it would take to ship it.

