Acervo

Semantic compression layer for AI agents.
Your agent's context window is finite. Acervo makes it infinite.

21x efficiency vs a tool-using agent · 100% ground truth · 79 turns tested · 3 domains

v0.5.0 · April 2026 · Apache 2.0 · Open Source

The Problem

Every AI chat sends the entire conversation history on every turn. Turn 1 costs 200 tokens. Turn 50 costs 9,000. Turn 100 hits the context window limit and starts losing information.

Acervo replaces growing history with a constant ~350 tokens of compressed knowledge.
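A rough linear model makes the gap concrete. The ~180 tokens-per-turn average below is an assumption picked to match the turn-50 figure quoted above; real turns vary in length.

```python
# Rough linear model of the token costs described above.
def naive_cost(turn, avg_tokens_per_turn=180):
    # Resending the whole history: cost grows with the turn number.
    return turn * avg_tokens_per_turn

def acervo_cost(turn, compressed_tokens=350):
    # Compressed knowledge: cost is constant regardless of turn number.
    return compressed_tokens

print(naive_cost(50))    # 9000, the figure quoted above
print(acervo_cost(50))   # 350
```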

How It Works

S1: Extract entities & topics
S2: Search graph by topic (BFS)
S3: Compress to ~350 tokens
LLM: Responds with graph knowledge

After the LLM responds, S1.5 extracts new knowledge back into the graph. The graph grows. The context doesn't.
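The loop above can be sketched end to end. Every function name and data structure here is an illustrative stand-in, not Acervo's actual API.

```python
# Minimal runnable sketch of the S1 -> S2 -> S3 pipeline stages.

def extract_entities(text, known):
    # S1 stand-in: keyword spotting against known entity names.
    return [e for e in known if e.lower() in text.lower()]

def bfs_search(graph, seeds, depth=1):
    # S2 stand-in: one-hop BFS from the topic seeds.
    found = set(seeds)
    for _ in range(depth):
        found |= {n for s in list(found) for n in graph.get(s, [])}
    return found

def compress(subgraph):
    # S3 stand-in: serialize the subgraph into a compact context string.
    return "; ".join(sorted(subgraph))

graph = {"Supabase": ["Checkear", "Walletfy"]}
known = ["Supabase", "Firebase"]
seeds = extract_entities("Which projects use Supabase?", known)
context = compress(bfs_search(graph, seeds))
# context now holds only Supabase-related nodes for the LLM prompt
```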

Live Example

Real data from the C1 test scenario. The graph grows as the user talks, and retrieval uses BFS traversal from the topic seed.

👤 "We have 4 projects: Butaco with Angular, Checkear with React and Supabase, Walletfy with Next.js and Supabase, Altovallestudio with Next.js and Firebase"
S1 extracts 9 entities, 18 edges · Graph: 9 nodes
👤 "Which projects use Supabase?"
S2 BFS from Supabase → HOT: Supabase · WARM: Checkear, Walletfy
🤖 "Checkear and Walletfy use Supabase." — 81 tokens of context
👤 "And which ones use Firebase?"
S2 BFS from Firebase → HOT: Firebase · WARM: Butaco, Altovallestudio
🤖 "Butaco and Altovallestudio." — same graph, different traversal

The graph grows. The context doesn't. Turn 50 costs the same as turn 2.
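A minimal sketch of the traversal behind this example, assuming HOT means the seed node and WARM means its direct neighbors. The edge data mirrors the C1 scenario; the code is illustrative, not Acervo's implementation.

```python
from collections import deque

edges = {
    "Supabase": ["Checkear", "Walletfy"],
    "Firebase": ["Butaco", "Altovallestudio"],
}

def layers(graph, seed):
    # Standard BFS, recording distance from the seed node.
    dist = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        for nb in graph.get(node, []):
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    hot = [n for n, d in dist.items() if d == 0]   # the seed itself
    warm = [n for n, d in dist.items() if d == 1]  # direct neighbors
    return hot, warm

hot, warm = layers(edges, "Supabase")
# hot  == ["Supabase"]
# warm == ["Checkear", "Walletfy"]
```

Asking about Firebase reruns the same traversal from a different seed, which is why the context switches without touching the rest of the graph.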

The Efficiency Story

An agent with tools makes 2–3 tool calls per question, consuming 7,000+ tokens. Acervo answers the same questions with ~350 tokens and zero tool calls.

21x fewer tokens than an agent with tools · same questions, same correct answers
12x in v0.4 → 21x in v0.5 (format compression)
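The headline ratio is a back-of-envelope division of the figures above:

```python
# Back-of-envelope check of the headline ratio, using the figures
# quoted in the text.
agent_tokens = 7000     # lower bound from "7,000+ tokens"
acervo_tokens = 350
ratio = agent_tokens / acervo_tokens
print(ratio)            # 20.0, in line with the measured 21.3x
```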

What We Test

RESOLVE 85% · GROUND 100% · RECALL 67% · FOCUS 100% · ADAPT 89%

RESOLVE (85%) — Can Acervo answer questions that need project knowledge? A stateless LLM can't. An agent needs 3 tool calls. Acervo has it in the graph.

GROUND (100%) — Does Acervo prevent hallucination? "Does this project use GraphQL?" — Acervo checks the graph: no GraphQL node.

RECALL (67%) — Can Acervo remember facts from earlier? Our weakest category. Improving extraction = improving recall.

FOCUS (100%) — Does Acervo send only relevant context? When you ask about auth, only auth nodes arrive.

ADAPT (89%) — Can Acervo handle topic changes? BFS starts from a different seed node — context switches instantly.

Graph Quality

v0.5 introduced automated quality specs: verify the graph contains what it SHOULD and doesn't contain what it SHOULDN'T. No more phantom entities.

Project                          Checks  Entities  Nodes  Edges
P1 Code (Todo App)               28/28   7         231    1,109
P2 Literature (Sherlock Holmes)  21/21   5         40     307
P3 PM Docs                       32/32   6         108    331
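A quality spec in the SHOULD / SHOULDN'T style reduces to plain set checks. Node names and the function shape here are illustrative, not Acervo's spec format.

```python
# Sketch of a SHOULD / SHOULDN'T graph quality spec.

def check_graph(nodes, should, should_not):
    missing = [n for n in should if n not in nodes]      # SHOULD-haves absent
    phantom = [n for n in should_not if n in nodes]      # SHOULDN'T-haves present
    return missing, phantom

nodes = {"Checkear", "React", "Supabase"}
missing, phantom = check_graph(
    nodes,
    should=["Checkear", "Supabase"],   # the graph must contain these
    should_not=["GraphQL"],            # ...and must not contain phantoms
)
# the spec passes when both lists come back empty
```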

Conversation Scenarios

v0.4 only worked with pre-indexed projects. v0.5 adds real-time conversation memory: as you chat, the knowledge graph grows. Every fact becomes a node. Every relationship becomes an edge.

Scenario                     Turns  Passed  Graph      Entity Acc
C1: Multi-project Portfolio  10     7/10    13n / 27e  72%
C2: Personal Knowledge       6      3/6     5n / 4e    60%
C3: Progressive Building     8      7/8     6n / 5e    83%
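The real-time growth described above reduces to a simple invariant: facts add nodes, relationships add edges. A minimal sketch with illustrative structures, not Acervo's internal model:

```python
# Every fact becomes a node, every relationship an edge.
graph = {"nodes": set(), "edges": set()}

def learn(subject, relation, obj):
    graph["nodes"] |= {subject, obj}
    graph["edges"].add((subject, relation, obj))

learn("Walletfy", "uses", "Next.js")
learn("Walletfy", "uses", "Supabase")
# graph now holds 3 nodes and 2 edges; the per-turn context sent to the
# LLM stays constant while the graph keeps growing
```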

v0.4 → v0.5

Metric                  v0.4.0                  v0.5.0                Change
Architecture            God module (1,848 LOC)  Hexagonal (~200 LOC)  Refactor
GROUND accuracy         92%                     100%                  +8%
Efficiency ratio        12.1x                   21.3x                 +76%
Conversation pipeline   Not working             71% pass rate         NEW
BFS semantic layers     Not working             HOT/WARM/COLD         NEW
Graph quality specs     None                    85/85 checks          NEW
warm_tokens > 0 (conv)  0%                      80%+
RESOLVE accuracy        100%                    85%                   -15%

Known Gaps

v0.5 is not perfect. Here's what we know doesn't work well yet.

RECALL: 67%
S1.5 doesn't always extract facts from the assistant's response. Fix: more S1.5 training examples in fine-tune v3.
RESOLVE dropped 100% → 85%
BFS-based S2 is more precise but sometimes misses the right seed node. Fix: combine BFS with keyword fallback in v0.6.
Person extraction in non-technical context
The model misses person names outside code discussions. Fix: more diverse training data.
S1 intent: 78% accuracy
Over-classifies as "overview" when it should be "specific." Fix: more intent training examples.
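The keyword fallback proposed for the RESOLVE gap could look something like this sketch; the names and the substring matching rule are assumptions, not the planned v0.6 implementation.

```python
# Hypothetical BFS seed selection with a keyword fallback: prefer an
# exact node match, else fall back to substring matching over node names.

def pick_seed(topic, node_names):
    if topic in node_names:
        return topic                       # exact match: BFS seeds here
    t = topic.lower()
    for name in node_names:                # keyword fallback
        if t in name.lower() or name.lower() in t:
            return name
    return None                            # no seed found

print(pick_seed("supabase auth", ["Supabase", "Firebase"]))  # Supabase
```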