

What 1,600+ AI Learnings Reveal

I analyzed the 1,604 learnings and 54 meta-learnings captured by Via's agents. Here are the patterns, the most valuable errors, and what the system is telling me to build next.

ai · data-analysis · learnings · meta-learning

TL;DR

Via's learnings database holds 1,604 learnings and 54 meta-learnings captured from real agent work. The most valuable entries are the 360 errors — institutional memory that prevents repeated mistakes. The 54 meta-learnings are a roadmap: 15 persona gaps tell me which specialists to add, and 9 mismatch entries tell me where routing is failing.


The Dataset

Via's learnings system has been capturing knowledge from every agent run since the system went live. The database currently holds 1,604 learnings and 54 meta-learnings. This isn't a massive dataset by ML standards, but it's a dense one — every entry was generated by an agent doing real work, not synthetic benchmarks.

Here's the breakdown by category:

| Learning Type | Count | % | What it captures |
|---|---|---|---|
| Insights | 703 | 43.8% | Techniques that worked, patterns worth reusing |
| Errors | 360 | 22.4% | Mistakes, gotchas, things that broke |
| Sources | 195 | 12.2% | Useful APIs, documentation, reference material |
| Decisions | 185 | 11.5% | Architectural choices with rationale |
| Patterns | 148 | 9.2% | Reusable approaches observed across multiple runs |

The distribution itself tells a story. Nearly half the learnings are insights — "this works well, do more of it." A quarter are errors — "this broke, don't do it again." The remaining third is roughly split between reference material, design decisions, and patterns.
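The post doesn't show the schema of the learnings database, so the table and column names below are assumptions. But if the learnings live in a SQLite table with a `type` column, a breakdown like the one above is a single query away:

```python
import sqlite3

# Hypothetical schema: a `learnings` table with a `type` column holding
# values like 'insight', 'error', 'source', 'decision', 'pattern'.
conn = sqlite3.connect("learnings.db")

rows = conn.execute(
    """
    SELECT type,
           COUNT(*) AS count,
           ROUND(100.0 * COUNT(*) / (SELECT COUNT(*) FROM learnings), 1) AS pct
    FROM learnings
    GROUP BY type
    ORDER BY count DESC
    """
).fetchall()

for learning_type, count, pct in rows:
    print(f"{learning_type:10} {count:5} {pct:5.1f}%")
```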

Why Errors Are the Most Valuable Category

The 360 errors are the crown jewels of the database. Each one represents a mistake that an agent made, diagnosed, and documented. When a future agent working on a similar task receives the learning "Avoid: go build fails silently when the embedding directive references a missing file," it sidesteps that debugging session entirely.

This is institutional memory for software agents. The same concept exists in human organizations — senior engineers who say "don't do that, we tried it in 2019 and it broke production." The difference is that Via captures this automatically as a side effect of doing work.
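Via's actual retrieval pipeline isn't described in this post, so treat the following as a sketch of the general idea rather than the real implementation: pull error learnings whose text overlaps with the upcoming task and prepend them to the agent's prompt. The table, columns, and function names are hypothetical.

```python
import sqlite3

def error_learnings_for_task(conn: sqlite3.Connection, task: str, limit: int = 5) -> list[str]:
    """Naive keyword overlap; a real system could use FTS5 or embeddings instead."""
    keywords = [w for w in task.lower().split() if len(w) > 3]
    if not keywords:
        return []
    clause = " OR ".join("content LIKE ?" for _ in keywords)
    rows = conn.execute(
        f"SELECT content FROM learnings WHERE type = 'error' AND ({clause}) LIMIT ?",
        [f"%{w}%" for w in keywords] + [limit],
    ).fetchall()
    return [row[0] for row in rows]

def build_prompt(conn: sqlite3.Connection, task: str) -> str:
    """Prepend known pitfalls to the task so the next agent doesn't rediscover them."""
    pitfalls = "\n".join(f"- Avoid: {w}" for w in error_learnings_for_task(conn, task))
    return f"Known pitfalls from previous runs:\n{pitfalls}\n\nTask: {task}"
```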

The most common error categories:

1. API misuse (32% of errors). Incorrect parameters, deprecated endpoints, missing headers. These are the kind of mistakes every developer makes the first time they use an API. With the learnings system, the first agent figures it out, and no subsequent agent repeats it.

2. Configuration issues (24% of errors). Environment variables missing or malformed, config files in the wrong format, path resolution bugs. These are frustrating because they're not code bugs — they're environment bugs. The learnings system turns each one into a warning for future agents.

3. Tool constraints (18% of errors). Trying to use a tool in a way it wasn't designed for. Trying to grep a binary file. Trying to use SQLite's json_extract on malformed JSON. Trying to run tsc without a tsconfig.json in scope. Each constraint, once discovered, becomes permanent knowledge.

4. Order-of-operations errors (14% of errors). FTS5 triggers created before the main table. Migrations run before the schema exists. Tests executed before the build. These are the most insidious because the error messages often point to the wrong cause; the FTS5 case is sketched after this list.

5. Other (12% of errors). Encoding issues, permission problems, rate limits, and edge cases.
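The FTS5 case from item 4 is easy to reproduce: the base table, the FTS5 virtual table, and the sync triggers have to be created in that order, because each object refers to the one before it. A minimal sketch using Python's sqlite3; the table names are illustrative, not Via's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Order matters: base table first, then the FTS5 index that mirrors it,
# then the trigger that keeps the index in sync on insert.
conn.executescript("""
CREATE TABLE learnings (
    id INTEGER PRIMARY KEY,
    content TEXT NOT NULL
);

CREATE VIRTUAL TABLE learnings_fts USING fts5(
    content,
    content='learnings',
    content_rowid='id'
);

CREATE TRIGGER learnings_ai AFTER INSERT ON learnings BEGIN
    INSERT INTO learnings_fts(rowid, content) VALUES (new.id, new.content);
END;
""")

conn.execute("INSERT INTO learnings(content) VALUES ('hybrid search beats pure FTS5')")
hits = conn.execute(
    "SELECT rowid FROM learnings_fts WHERE learnings_fts MATCH 'hybrid'"
).fetchall()
print(hits)  # [(1,)]
```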

Meta-Learnings: The System Watching Itself

Beyond task-level learnings, Via captures meta-learnings — observations about the orchestration system itself. These come from agents tagging observations about the workflow rather than the task:

| Meta-Learning Type | Count | What it means |
|---|---|---|
| meta_gap | 15 | Missing persona — no specialist for this task type |
| meta_issue | 12 | System behavior problem — agent told to plan but task requires writing |
| meta_observation | 11 | Workflow insight — "research phases produce better output with file path constraints" |
| meta_mismatch | 9 | Wrong persona assigned — writer agent doing review work |

These 54 meta-learnings are my development roadmap.

The 15 meta_gap entries tell me exactly which personas are missing from the system. "No DBA specialist for database migration tasks." "No DevOps persona for CI/CD pipeline work." "No UX specialist for frontend interaction patterns." Each gap entry is a signal to add a new persona to the configuration.

The 9 meta_mismatch entries tell me where the keyword-based persona selector is failing. A "writer agent assigned to review task" means the selector matched "write" in "write a review" and picked the writer instead of the reviewer. These are the clearest argument for upgrading from keyword matching to semantic persona selection.
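To make the mismatch concrete, here is a toy version of a keyword selector tripping over "write a review", with a comment sketching the shape a semantic selector might take. None of this is Via's actual code; the persona list and keywords are illustrative.

```python
PERSONA_KEYWORDS = {
    "writer":   ["write", "draft", "blog", "document"],
    "reviewer": ["review", "critique", "feedback"],
}

def keyword_select(task: str) -> str:
    """First keyword hit wins, so 'write a review' matches 'write' and returns writer."""
    lowered = task.lower()
    for persona, keywords in PERSONA_KEYWORDS.items():
        if any(kw in lowered for kw in keywords):
            return persona
    return "generalist"

print(keyword_select("write a review of the latest sprint"))  # writer (should be reviewer)

# A semantic selector would instead embed the task and each persona description
# and pick the closest match, roughly:
#   scores = {name: cosine(embed(task), embed(desc)) for name, desc in PERSONAS.items()}
#   return max(scores, key=scores.get)
```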

The 11 meta_observation entries are workflow optimizations hiding in plain sight. "Research phases consistently produce better output with explicit file path constraints" led me to always include target directory paths in research phase descriptions. Small change, measurable improvement.

Patterns Across Domains

Looking at the 148 pattern entries reveals which insights generalize across different types of work:

Hybrid search outperforms pure approaches. This pattern appears in variations across research tasks, code search, and knowledge retrieval. FTS5 alone misses semantic matches. Embeddings alone miss exact terminology. The combination consistently outperforms either approach individually. This validated the 30/70 keyword/semantic scoring used across all of Via's search systems.
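The 30/70 split is stated above; the min-max normalization and the way candidate sets are merged below are my assumptions, since the actual scoring code isn't shown. A minimal sketch of the blend:

```python
def hybrid_score(keyword_scores: dict[str, float],
                 semantic_scores: dict[str, float],
                 keyword_weight: float = 0.3,
                 semantic_weight: float = 0.7) -> dict[str, float]:
    """Blend normalized keyword scores (e.g. BM25) with semantic similarities."""
    def normalize(scores: dict[str, float]) -> dict[str, float]:
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {doc: (s - lo) / span for doc, s in scores.items()}

    kw, sem = normalize(keyword_scores), normalize(semantic_scores)
    return {
        doc: keyword_weight * kw.get(doc, 0.0) + semantic_weight * sem.get(doc, 0.0)
        for doc in kw.keys() | sem.keys()
    }

ranked = hybrid_score(
    {"doc-a": 12.0, "doc-b": 3.0},    # FTS5/BM25-style scores
    {"doc-b": 0.91, "doc-c": 0.87},   # cosine similarities
)
print(sorted(ranked, key=ranked.get, reverse=True))  # ['doc-b', 'doc-a', 'doc-c']
```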

Smaller, focused phases produce better output than broad ones. Agents given a tight scope — "research OAuth providers that support PKCE" — produce more actionable output than agents given a broad scope — "research authentication." This pattern drove the decomposer to produce more phases with narrower scopes.

Documentation-before-implementation reduces rework. This mirrors my own experience during the documentation binge at the start of the sprint. Agents that plan before coding produce fewer errors and require fewer iterations.

What the Data Doesn't Show

The learnings database also reveals gaps in the system itself:

No learning decay. Old learnings persist forever. A workaround for a bug that was patched six weeks ago still gets injected into agent prompts. I need TTL (time-to-live) or confidence decay on learnings; a sketch of one option follows this list.

No quality feedback loop. The system tracks how often a learning is seen (via the deduplication mechanism), but not whether it was helpful. An agent might receive 10 learnings, use 2, and ignore 8. There's no signal for which ones mattered.

No cross-domain analysis. The learnings are stored with a domain tag, but there's no mechanism to identify patterns that span domains. A financial modeling insight that applies to data pipeline design won't surface in a development context.
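For the decay gap, one option is an exponential half-life on a per-learning confidence score, with a cutoff below which the learning is no longer injected. The half-life and threshold values here are placeholders, not a decision:

```python
import math
from datetime import datetime, timezone

HALF_LIFE_DAYS = 30.0   # assumption: a learning's relevance halves every month
MIN_CONFIDENCE = 0.2    # below this, stop injecting the learning into prompts

def decayed_confidence(created_at: datetime, base: float = 1.0,
                       now: datetime | None = None) -> float:
    """Exponential decay: confidence halves every HALF_LIFE_DAYS."""
    now = now or datetime.now(timezone.utc)
    age_days = (now - created_at).total_seconds() / 86400
    return base * math.pow(0.5, age_days / HALF_LIFE_DAYS)

def still_relevant(created_at: datetime) -> bool:
    return decayed_confidence(created_at) >= MIN_CONFIDENCE
```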

Despite these gaps, the signal is strong enough to matter. Agents running today are measurably better than agents running two weeks ago. The feedback loop works — it just needs refinement.

Next: Why I Route Research to Gemini

