Acervo Benchmark Report

Indexed project benchmarks: Acervo answers questions about codebases, books, and docs using a knowledge graph — zero tool calls, constant token cost.
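
A rough sketch of what "zero tool calls, constant token cost" means in practice: the knowledge graph is queried once for a bounded context block, and that block plus the question goes to the model in a single call. All names below (GraphContext, graph.context, llm) are illustrative assumptions, not Acervo's actual API.

    // Illustrative only: these names are assumptions, not Acervo's real API.
    interface GraphContext {
      text: string;       // pre-synthesized context pulled from the knowledge graph
      tokenCount: number; // capped by a fixed budget, hence "constant token cost"
    }

    declare function llm(prompt: string): Promise<string>;

    // One graph lookup, one model call: no grep/read tool loop.
    async function answer(
      question: string,
      graph: { context(q: string, budget: number): GraphContext },
    ): Promise<string> {
      const ctx = graph.context(question, 650); // ~616 tokens avg on RESOLVE turns
      return llm(`${ctx.text}\n\nQ: ${question}`);
    }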

• Avg Score: 92%
• Turns Tested: 55
• Efficiency vs Agent: 12.1x
• Projects: 3
• Entities Extracted: 33

2026-04-01 — v0.4.0 — 3 fixture projects — 5 categories

Category Scores

Five categories test different aspects of graph-based context retrieval.

• Resolve: 100% (can it answer project-specific questions?)
• Ground: 92% (are answers grounded in the graph, not hallucinated?)
• Recall: 67% (does it remember facts from earlier turns?)
• Focus: 100% (does it ignore irrelevant context?)
• Adapt: 100% (does it handle graph updates mid-conversation?)
Token efficiency: 12.1x fewer tokens than an agent with tools. Acervo answers RESOLVE questions with ~616 tokens of warm context; an agent needs ~7,462 tokens across an average of 2.8 tool-call steps to reach the same answer (7,462 / 616 ≈ 12.1).

Approach Comparison

How Acervo compares to a stateless LLM and an agent with tools on the same questions.

RESOLVE — 13 turns

Questions that require project-specific knowledge to answer.

Approach        Can Answer   Avg Input Tokens   Avg Steps   Notes
Stateless LLM   8%           n/a                n/a         baseline
Agent + Tools   100%         7,462              2.8         multi-step
Acervo          100%         616                0           12.1x fewer tokens
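
For contrast, the 2.8-step average reflects an agent loop that re-sends a growing transcript on every tool call, which is how input tokens compound to ~7,462. A minimal sketch of such a loop follows; runTool, the ANSWER: convention, and the step cap are all assumptions, not the actual benchmark harness.

    // Sketch of a generic tool-calling agent, not the benchmark harness itself.
    declare function llm(prompt: string): Promise<string>;
    declare function runTool(request: string): Promise<string>; // e.g. grep or file read

    async function agentAnswer(question: string, maxSteps = 5): Promise<string> {
      let transcript = question;
      for (let step = 0; step < maxSteps; step++) { // RESOLVE turns averaged 2.8 steps
        const move = await llm(transcript);         // tool request or final answer
        if (move.startsWith("ANSWER:")) return move.slice("ANSWER:".length).trim();
        // Each iteration appends the tool output and re-sends everything,
        // so input tokens grow with every step.
        transcript += `\n${move}\n${await runTool(move)}`;
      }
      return "no answer within step budget";
    }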

GROUND — 11 turns

Questions where the answer must be grounded in actual project data, not general knowledge.

Approach        Can Answer   Avg Input Tokens   Avg Steps   Notes
Stateless LLM   27%          n/a                n/a         baseline
Agent + Tools   100%         5,500              2.3         multi-step
Acervo          91%          600                0           9.2x fewer tokens

Per-Turn Efficiency

Token cost comparison for every RESOLVE and GROUND turn across all three projects.

[Chart: Agent Tokens vs Acervo Tokens. Each bar pair shows one question; red = agent with tools, green = Acervo graph context.]

Component Health

Internal diagnostic scores for each pipeline stage.

• S1 Intent: 78%
• S2 Activation: 56%
• S3 Budget: 32%
• S3 Quality: 81%
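
These percentages read like per-turn diagnostics recorded at each stage and averaged across the run. The record shape below is a guess at how that scoring might look; every field name is assumed.

    // Assumed shape of a per-turn diagnostic record; all field names are guesses.
    interface TurnDiagnostics {
      s1IntentCorrect: boolean; // stage 1: was the user's intent classified right?
      s2Activation: number;     // stage 2: 0..1, did the right graph nodes activate?
      s3Budget: number;         // stage 3: 0..1, how well did context fit the budget?
      s3Quality: number;        // stage 3: 0..1, how relevant was the final context?
    }

    // A component's health score is just its mean over all scored turns.
    const mean = (xs: number[]): number =>
      xs.reduce((a, b) => a + b, 0) / xs.length;

    const s2Health = (turns: TurnDiagnostics[]): number =>
      mean(turns.map((t) => t.s2Activation)); // e.g. 0.56 for S2 Activation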

Category × Component Matrix

Where each category is strong or weak across pipeline components.

Category   S1 Intent   S2 Activation   S3 Budget   S3 Quality   Final Score
RESOLVE    73%         50%             0%          100%         100%
GROUND     80%         50%             0%          77%          92%
RECALL     n/a         n/a             n/a         33%          67%
FOCUS      73%         38%             40%         89%          100%
ADAPT      89%         100%            100%        78%          100%

S1 Intent Misclassifications

9 turns where the model classified user intent incorrectly.

  • Turn 2: "What technologies does this project use?" (expected: overview, got: specific)
  • Turn 6: "Interesting, this is a well-structured project" (expected: chat, got: overview)
  • Turn 19: "Ok, I think I understand the project now" (expected: chat, got: specific)
  • Turn 1: "What is this book about?" (expected: overview, got: specific)
  • Turn 8: "These are great detective stories" (expected: chat, got: specific)
  • Turn 14: "Overall, what themes run through these stories?" (expected: overview, got: specific)
  • Turn 1: "What project is documented here?" (expected: overview, got: specific)
  • Turn 2: "What documents are available?" (expected: overview, got: specific)
  • Turn 9: "This project seems well-documented" (expected: chat, got: overview)
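
All nine misses lean the same way: conversational or broad turns get routed to a more content-focused label (chat read as overview or specific, overview read as specific). A three-way classifier over these labels might be prompted like the sketch below; the prompt wording and the fallback are assumptions, and only the three labels come from the data above.

    // The labels overview/specific/chat come from the list above;
    // the classifier itself is a hypothetical sketch.
    type Intent = "overview" | "specific" | "chat";

    declare function llm(prompt: string): Promise<string>;

    async function classifyIntent(utterance: string): Promise<Intent> {
      const raw = await llm(
        "Classify the user's intent as exactly one of:\n" +
        "overview - broad questions about the whole project or document\n" +
        "specific - questions about a particular fact, file, or detail\n" +
        "chat - small talk, reactions, acknowledgements\n" +
        `User: ${utterance}\nIntent:`,
      );
      const label = raw.trim().toLowerCase();
      return label === "overview" || label === "chat" ? label : "specific";
    }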

Pipeline Validation

Index → Curate → Synthesize results for each fixture project.
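
The per-project counts below are what the three stages emit. A compact sketch of those stage outputs follows, with interface and function names assumed to mirror the metrics reported for each project.

    // Assumed stage outputs; names mirror the per-project metrics listed below.
    interface IndexResult {
      files: number; sections: number; chunks: number;
      symbols?: number; folders?: number; // present for code and docs projects
    }
    interface CurateResult { graphNodes: number; graphEdges: number; entities: string[]; }
    interface SynthesizeResult { synthesisNodes: number; }

    declare function index(projectPath: string): Promise<IndexResult>;
    declare function curate(idx: IndexResult): Promise<CurateResult>;
    declare function synthesize(graph: CurateResult): Promise<SynthesizeResult>;

    // Index -> Curate -> Synthesize, run over each fixture project.
    async function validate(projectPath: string) {
      const idx = await index(projectPath);  // parse files into sections/symbols/chunks
      const graph = await curate(idx);       // build graph nodes, edges, entities
      const synth = await synthesize(graph); // pre-compute summary nodes for warm context
      return { idx, graph, synth };
    }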

P1: Todo App (TypeScript/React)

• Files indexed: 31
• Sections: 38
• Symbols: 134
• Folders: 17
• Chunks: 350
• Graph nodes: 235
• Graph edges: 1,163
• Entities: 11
• Synthesis nodes: 4
• Sample entities: Express.js, SQLite, JWT, Nuxt, Tailwind CSS, React, MongoDB, Docker

P2: Literature (Sherlock Holmes EPUB)

• Files indexed: 1
• Sections: 33
• Chunks: 2,056
• Graph nodes: 43
• Graph edges: 126
• Entities: 8
• Synthesis nodes: 1
• Sample entities: Sherlock Holmes, John Watson, Baker Street, Irene Adler, Professor Moriarty, A Scandal in Bohemia

P3: Project Docs (ADRs, Issues, Sprints)

• Files indexed: 11
• Sections: 84
• Folders: 3
• Chunks: 160
• Graph nodes: 116
• Graph edges: 383
• Entities: 14
• Synthesis nodes: 4
• Sample entities: Todo App, Svelte, Express.js, SQLite, PostgreSQL, JWT, Docker Compose, GitHub Actions