threadline-sdk v0.1.9 — drop-in memory for any LLM agent. See quickstart
v0.1.9 — Developer-direct SDK · OpenAI · Anthropic · Local models

Agent memory that doesn't lock you to a model.

Threadline is the memory layer for developers building AI agents. Inject context before your LLM call, update after. Two lines of code, one context object that travels with your user across every model, session, and product you ship.

200 tokens of signal, not 2,000 tokens of noise — relevance-scored injection.
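
One way to picture relevance-scored injection is a greedy selection under a token budget. The sketch below is illustrative only: the `ScoredFact` shape, the `selectWithinBudget` helper, and the rough four-characters-per-token estimate are all assumptions made for this example, not part of the Threadline API.

```typescript
// Hypothetical shape for a stored fact with a precomputed relevance score.
interface ScoredFact {
  text: string;
  relevance: number; // higher means more relevant to the current prompt
}

// Rough token estimate (assumption: about 4 characters per token).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Greedily keep the most relevant facts that fit within the token budget.
function selectWithinBudget(facts: ScoredFact[], budget: number): ScoredFact[] {
  const ranked = [...facts].sort((a, b) => b.relevance - a.relevance);
  const selected: ScoredFact[] = [];
  let used = 0;
  for (const fact of ranked) {
    const cost = estimateTokens(fact.text);
    if (used + cost > budget) continue; // skip facts that would overflow the budget
    selected.push(fact);
    used += cost;
  }
  return selected;
}
```

With a budget of 200, a handful of short, high-scoring facts would be injected and long, low-scoring ones would be left out.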

Free to start · No credit card · ~38ms p50 retrieval
agent.ts

import { Threadline } from "threadline-sdk";

const tl = new Threadline({
  apiKey: process.env.THREADLINE_API_KEY
});

// Inject user context before your LLM call
const { injectedPrompt } = await tl.inject(userId, basePrompt);

// After your LLM responds, persist new facts
await tl.update({ userId, userMessage, agentResponse });

Compatible with everything you already use

OpenAI
Claude
LangChain
Cursor
Vercel AI SDK
Mistral
Ollama
MCP
01 / The problem

Why agents forget.

Three architectural failures every team hits once they go past a demo. Threadline eliminates all three.

Context windows blow up

Stuffing every chat into the prompt burns tokens and degrades quality after ~8 turns.

Threadline

Threadline distills sessions into compact, recallable facts.

RAG retrievals go stale

Vector search returns yesterday's answer, not what the user said five minutes ago.

Threadline

Real-time memory updates with versioning and recency scoring.

Sessions die at logout

Switch models, switch products, restart the app — your agent forgets the user.

Threadline

One context object, portable across every LLM and surface.

~38ms p50 retrieval
7 context scopes
2 lines to integrate

~38ms p50 retrieval keeps injection overhead low before every LLM call, so your agent rarely waits on memory.

[Diagram: Your Agent (your-app.com, user interaction) → prompts + history → Threadline (threadline.to, context layer) → enriched context → Your Product (custom agent, already knows)]

One context object · every agent · no repeated conversation

1. inject() enriches your prompt → 2. LLM call with context → 3. update() captures new facts
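
The three steps above can be sketched as a single agent turn. This is a minimal sketch, not the SDK's own code: `MemoryClient` is a structural type that mirrors only the two calls shown on this page, `LlmCall` and `runTurn` are hypothetical names, and passing `injectedPrompt` as the system prompt is an assumption about how you wire your model.

```typescript
// Structural type mirroring only the two calls shown on this page.
// The return value of update() is not documented here, so it is left as unknown.
interface MemoryClient {
  inject(userId: string, basePrompt: string): Promise<{ injectedPrompt: string }>;
  update(args: { userId: string; userMessage: string; agentResponse: string }): Promise<unknown>;
}

// Hypothetical LLM call: takes a system prompt and a user message, returns text.
type LlmCall = (systemPrompt: string, userMessage: string) => Promise<string>;

// One agent turn: inject context, call the model, then persist new facts.
async function runTurn(
  memory: MemoryClient,
  callLlm: LlmCall,
  userId: string,
  basePrompt: string,
  userMessage: string,
): Promise<string> {
  // 1. Enrich the base prompt with what is already known about the user.
  const { injectedPrompt } = await memory.inject(userId, basePrompt);
  // 2. Call whichever model you use with the enriched prompt.
  const agentResponse = await callLlm(injectedPrompt, userMessage);
  // 3. Hand the exchange back so new facts can be captured.
  await memory.update({ userId, userMessage, agentResponse });
  return agentResponse;
}
```

Because `runTurn` depends only on the interface, the same loop can be exercised with an in-memory fake in tests and with a real client in production, whatever model sits behind `callLlm`.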

Two lines. That's it.

Choose your integration method and start giving your agents memory in under a minute.

TypeScript

npm install threadline-sdk

import { Threadline } from "threadline-sdk"

const tl = new Threadline({
  apiKey: process.env.THREADLINE_API_KEY
})

const { injectedPrompt } = await tl.inject(userId, basePrompt)

await tl.update({ userId, userMessage, agentResponse })

Everything your agent needs to know.

Seven scopes of context, automatically extracted and maintained.

preferences

How they want the agent to behave

e.g. "No code examples unless asked"

goals

What they're working toward

e.g. "Launching a SaaS product, targeting developers"

knowledge

Their technical background and expertise

e.g. "5 years TypeScript, Next.js, Supabase"

history

What's happened across past sessions

e.g. "Last session: deploying to Cloudflare Workers"

relationships

People and roles they mention

e.g. "Co-founder Sarah handles design"

communication_style

How they like to be spoken to

e.g. "Prefers concise, bullet-point answers"

general

Core identity and context

e.g. "Maya, San Francisco, building a fintech app"
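
If you want to reason about the seven scopes in your own code, a string-literal union keeps the names honest at compile time. The scope names below come from the list above; the `UserContext` shape (each scope mapping to a list of short facts) and the `emptyContext` helper are assumptions for illustration, not the SDK's actual data model.

```typescript
// The seven scope names listed above.
const SCOPES = [
  "preferences",
  "goals",
  "knowledge",
  "history",
  "relationships",
  "communication_style",
  "general",
] as const;

type Scope = (typeof SCOPES)[number];

// Hypothetical context object: each scope maps to a list of short facts.
type UserContext = Record<Scope, string[]>;

// Build an empty context with every scope present.
function emptyContext(): UserContext {
  const ctx = {} as UserContext;
  for (const scope of SCOPES) ctx[scope] = [];
  return ctx;
}

// Example values taken from the scope descriptions above.
const maya = emptyContext();
maya.general.push("Maya, San Francisco, building a fintech app");
maya.preferences.push("No code examples unless asked");
```

A misspelled scope such as `maya.preference` would fail to compile, which is the main reason to prefer the union over plain strings.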

User‑owned context. Not yours. Not ours. Theirs.

Built for the world where users demand control over their AI data.

OAuth‑style grants

Users approve what each agent can see

Hard delete

Users can permanently erase their context

Full audit trail

Every read and write is logged

Context Dashboard

Manage what agents can access

preferences: Granted
goals: Granted
history: Granted
relationships: Revoked
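
Scoped access of this kind can be pictured as a filter applied before any context reaches an agent. The sketch below is a toy model under stated assumptions: the `Grants` type, the `visibleContext` helper, and the deny-by-default rule for scopes without a grant are hypothetical, and nothing here describes how Threadline enforces grants internally.

```typescript
// Hypothetical grant record: which scopes a given agent may read for a user.
type GrantStatus = "granted" | "revoked";
type Grants = Record<string, GrantStatus>;

// Return only the scopes whose grant is currently "granted".
// Scopes with no grant entry are treated as not granted (deny by default).
function visibleContext(
  context: Record<string, string[]>,
  grants: Grants,
): Record<string, string[]> {
  const visible: Record<string, string[]> = {};
  for (const scope of Object.keys(context)) {
    if (grants[scope] === "granted") visible[scope] = context[scope];
  }
  return visible;
}

// Mirrors the dashboard state shown above.
const dashboardGrants: Grants = {
  preferences: "granted",
  goals: "granted",
  history: "granted",
  relationships: "revoked",
};
```

With these grants, an agent asking for `preferences` and `relationships` would receive only `preferences`.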

Auth0 solved identity. Threadline solves context.

Feature | Threadline | Mem0 | Supermemory | Zep | Letta
Persistent memory
Works with any LLM
MCP compatible
User‑owned context
OAuth‑style grant system
Scoped agent access
Idempotent grants / scope expansion
Full audit trail
Hard delete by user
Retrieval latency | ~38ms p50* | ~200ms | <300ms | ~300ms | N/A
Free tier | 10K mem/mo | 1K/mo | Limited | None | None

* Threadline's inject() is a direct database lookup, not vector search. Intelligence happens at update() time, not retrieval time — which is why retrieval is this fast. Competitors perform full semantic search at retrieval time.
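
The trade-off in that footnote, doing the work at write time so reads are a plain lookup, can be shown with a toy in-memory store. This is an illustration of the general pattern only: `KeyedMemory`, its method signatures, and the caller-supplied `extract` function are hypothetical and do not describe Threadline's storage or extraction logic.

```typescript
// Toy in-memory store: facts are extracted and keyed when written,
// so reading is a single Map lookup with no search step.
class KeyedMemory {
  private store = new Map<string, string[]>();

  // Write path: the (potentially expensive) extraction happens here.
  // `extract` is a stand-in for whatever distillation step is used.
  update(userId: string, message: string, extract: (m: string) => string[]): void {
    const facts = this.store.get(userId) ?? [];
    for (const fact of extract(message)) {
      if (facts.indexOf(fact) === -1) facts.push(fact); // skip exact duplicates
    }
    this.store.set(userId, facts);
  }

  // Read path: direct lookup by user id, then string concatenation.
  inject(userId: string, basePrompt: string): string {
    const facts = this.store.get(userId) ?? [];
    if (facts.length === 0) return basePrompt;
    return `${basePrompt}\n\nKnown about this user:\n- ${facts.join("\n- ")}`;
  }
}
```

The read path never scans or ranks anything, so its cost stays flat as the extraction logic on the write path grows more sophisticated.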

Start free. Scale when you're ready.

Free up to 10,000 memories a month — no credit card. Builder and Scale tiers when your agent grows up.

See full pricing

Ship agent memory in production. Two lines, any LLM, free for 10K memories.

Free to start. No credit card. Works with everything you already use.

Get API Key →