Threadline MCP server is live — install in Cursor in 60 seconds
v0.1.8 — MCP-native, OpenAI · Anthropic · LangChain

Your agents forget. Threadline remembers.

A persistent memory layer for AI agents. Two lines of code, one context object that travels with your user across every model, session, and product.

200 tokens of signal, not 2,000 tokens of noise — relevance-scored injection.

Free to start · No credit card · ~38ms p50 retrieval
agent.ts · live

import { Threadline } from "threadline-sdk";

const tl = new Threadline({ apiKey: "tl_live_..." });

// Inject relevant user context before your LLM call
const { injectedPrompt } = await tl.inject(userId, basePrompt);

// After your LLM responds, update the user's memory
await tl.update({ userId, userMessage, agentResponse });

Compatible with everything you already use

OpenAI
Claude
LangChain
Cursor
Vercel AI SDK
Mistral
Ollama
MCP
01 / The problem

Why agents forget.

Three architectural failures every team hits once they go past a demo. Threadline eliminates all three.

Context windows blow up

Stuffing every chat into the prompt burns tokens and degrades quality after ~8 turns.

Threadline

Threadline distills sessions into compact, recallable facts.

RAG retrievals go stale

Vector search returns yesterday's answer, not what the user said five minutes ago.

Threadline

Real-time memory updates with versioning and recency scoring.

Sessions die at logout

Switch models, switch products, restart the app — your agent forgets the user.

Threadline

One context object, portable across every LLM and surface.

~38ms retrieval
7 context scopes
2 lines to integrate

~38ms retrieval means near-instant injection before every LLM call. Your agent barely waits.
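Here's a minimal sketch of what that looks like in an existing call path. The OpenAI client and model name are illustrative stand-ins; only tl.inject comes from the snippet above.

import OpenAI from "openai";
import { Threadline } from "threadline-sdk";

const openai = new OpenAI();
const tl = new Threadline({ apiKey: "tl_live_..." });

async function answer(userId: string, basePrompt: string): Promise<string | null> {
  // ~38ms lookup: enrich the prompt with this user's stored context
  const { injectedPrompt } = await tl.inject(userId, basePrompt);

  // The model call itself is unchanged; only the prompt is richer
  const completion = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: injectedPrompt }],
  });

  return completion.choices[0].message.content;
}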

[Diagram: your-app.com → Your Agent sends prompts + history to the Threadline context layer (threadline.to); Threadline returns enriched context to your custom agent, so Your Product already knows the user.]

One context object · every agent · no repeated conversation

1. inject() enriches your prompt → 2. LLM call with context → 3. update() captures new facts
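Because the injected prompt is plain text, the same context object travels to any provider unchanged. A minimal sketch, assuming the official OpenAI and Anthropic SDKs; the clients and model names are illustrative, only tl.inject is Threadline's.

import OpenAI from "openai";
import Anthropic from "@anthropic-ai/sdk";
import { Threadline } from "threadline-sdk";

const tl = new Threadline({ apiKey: "tl_live_..." });
const openai = new OpenAI();
const anthropic = new Anthropic();

const userId = "user_123";
const basePrompt = "Plan my week around the payments API deadline.";

// One inject() call; the enriched prompt travels to any provider unchanged
const { injectedPrompt } = await tl.inject(userId, basePrompt);

const fromOpenAI = await openai.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: injectedPrompt }],
});

const fromClaude = await anthropic.messages.create({
  model: "claude-3-5-sonnet-latest",
  max_tokens: 1024,
  messages: [{ role: "user", content: injectedPrompt }],
});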

Two lines. That's it.

Choose your integration method and start giving your agents memory in under a minute.

TypeScript
npm install threadline-sdk

import { Threadline } from "threadline-sdk"

const tl = new Threadline({ apiKey: "tl_live_..." })

const { injectedPrompt } = await tl.inject(userId, basePrompt)

await tl.update({ userId, userMessage, agentResponse })

Everything your agent needs to know.

Seven scopes of context, automatically extracted and maintained. A sketch of the combined context object follows the list.

communication_style

How they like to be spoken to

e.g. "Prefers concise, bullet‑point answers"

ongoing_tasks

What they're currently working on

e.g. "Building a Stripe payments API, deadline Q2"

key_relationships

People and roles they mention

e.g. "Co‑founder Sarah handles design"

domain_expertise

Their technical background

e.g. "5 years TypeScript, Next.js, Supabase"

preferences

How they want the agent to behave

e.g. "No code examples unless asked"

emotional_state

Current mood and urgency signals

e.g. "High urgency, deadline pressure this week"

general

Core identity and context

e.g. "Vidur, Delhi, pre‑revenue founder"

User‑owned context. Not yours. Not ours. Theirs.

Built for the world where users demand control over their AI data.

OAuth‑style grants

Users approve what each agent can see

Hard delete

Users can permanently erase their context

Full audit trail

Every read and write is logged

Context Dashboard

Manage what agents can access

communication_style

Granted

ongoing_tasks

Granted

emotional_state

Revoked

domain_expertise

Granted

Auth0 solved identity. Threadline solves context — because if your memory layer can't forget, it shouldn't remember.
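A sketch of what scoped, user-approved access could look like from the agent side. The tl.grants, tl.audit, and tl.delete calls below are hypothetical names for illustration; only the scope names and the grant / revoke / audit / hard-delete concepts come from this section.

// Hypothetical grants API; method names are illustrative, not the shipped SDK surface.
import { Threadline } from "threadline-sdk";

const tl = new Threadline({ apiKey: "tl_live_..." });

// User approves what this agent may read (OAuth-style grant)
await tl.grants.request({
  userId: "user_123",
  agentId: "support-bot",
  scopes: ["communication_style", "ongoing_tasks", "domain_expertise"],
});

// User revokes a sensitive scope later
await tl.grants.revoke({
  userId: "user_123",
  agentId: "support-bot",
  scopes: ["emotional_state"],
});

// Every read and write is logged; hard delete erases the context entirely
const trail = await tl.audit.list({ userId: "user_123" });
await tl.delete({ userId: "user_123" });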

Feature comparison: Threadline vs. Mem0, Supermemory, Zep, Letta
Persistent memory
Works with any LLM
MCP compatible
User‑owned context
OAuth‑style grant system
Scoped agent access
Idempotent grants / scope expansion
Full audit trail
Hard delete by user
Retrieval latency: Threadline ~38ms*, Mem0 ~200ms, Supermemory <300ms, Zep ~300ms, Letta N/A
Free tier: Threadline 10K memories/mo, Mem0 1K/mo, Supermemory limited, Zep none, Letta none

* Threadline's inject() is a direct database lookup, not vector search. Intelligence happens at update() time, not retrieval time — which is why retrieval is this fast. Competitors perform full semantic search at retrieval time.
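In practice that split keeps the heavy work off the user-facing path. A minimal sketch, assuming your app can let the memory write finish in the background; callYourModel is a placeholder for your existing LLM call.

import { Threadline } from "threadline-sdk";

const tl = new Threadline({ apiKey: "tl_live_..." });

// callYourModel stands in for however you already call your model.
async function handleTurn(
  userId: string,
  userMessage: string,
  callYourModel: (prompt: string) => Promise<string>
) {
  // Fast path: inject() is a direct lookup, so it stays in the request path
  const { injectedPrompt } = await tl.inject(userId, userMessage);
  const agentResponse = await callYourModel(injectedPrompt);

  // Heavy path: update() does the extraction work, so let it finish in the
  // background instead of blocking the user on the memory write
  void tl.update({ userId, userMessage, agentResponse }).catch(console.error);

  return agentResponse;
}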

Start free. Scale when you're ready.

Free up to 10,000 memories a month — no credit card. Builder and Scale tiers when your agent grows up.

See full pricing

Your agents are forgetting your users. Fix it in two lines.

Free to start. No credit card. Works with everything you already use.

Get API Key →