

What 1,600+ AI Learnings Reveal

I analyzed the 1,604 learnings and 54 meta-learnings captured by Via's agents. Here are the patterns, the most valuable errors, and what the system is telling me to build next.

ai · data-analysis · learnings · meta-learning

TL;DR

Via's learnings database holds 1,604 learnings and 54 meta-learnings captured from real agent work. The most valuable entries are the 360 errors — institutional memory that prevents repeated mistakes. The 54 meta-learnings are a roadmap: 15 persona gaps tell me which specialists to add, and 9 mismatch entries tell me where routing is failing.


The Dataset

Via's learnings system has been capturing knowledge from every agent run since the system went live. The database currently holds 1,604 learnings and 54 meta-learnings. This isn't a massive dataset by ML standards, but it's a dense one — every entry was generated by an agent doing real work, not synthetic benchmarks.

Here's the breakdown by category:

| Learning Type | Count | % | What it captures |
|---|---|---|---|
| Insights | 703 | 43.8% | Techniques that worked, patterns worth reusing |
| Errors | 360 | 22.4% | Mistakes, gotchas, things that broke |
| Sources | 195 | 12.2% | Useful APIs, documentation, reference material |
| Decisions | 185 | 11.5% | Architectural choices with rationale |
| Patterns | 148 | 9.2% | Reusable approaches observed across multiple runs |

The distribution itself tells a story. Nearly half the learnings are insights — "this works well, do more of it." A quarter are errors — "this broke, don't do it again." The remaining third is roughly split between reference material, design decisions, and patterns.
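The post doesn't show the schema of the learnings database, so the table and column names below are assumptions. But if the learnings live in a SQLite table with a `type` column, a breakdown like the one above is a single query away:

```python
import sqlite3

# Hypothetical schema: a `learnings` table with a `type` column holding
# values like 'insight', 'error', 'source', 'decision', 'pattern'.
conn = sqlite3.connect("learnings.db")

rows = conn.execute(
    """
    SELECT type,
           COUNT(*) AS count,
           ROUND(100.0 * COUNT(*) / (SELECT COUNT(*) FROM learnings), 1) AS pct
    FROM learnings
    GROUP BY type
    ORDER BY count DESC
    """
).fetchall()

for learning_type, count, pct in rows:
    print(f"{learning_type:10} {count:5} {pct:5.1f}%")
```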

Why Errors Are the Most Valuable Category

The 360 errors are the crown jewels of the database. Each one represents a mistake that an agent made, diagnosed, and documented. When a future agent working on a similar task receives the learning "Avoid: go build fails silently when the embedding directive references a missing file," it sidesteps that debugging session entirely.

This is institutional memory for software agents. The same concept exists in human organizations — senior engineers who say "don't do that, we tried it in 2019 and it broke production." The difference is that Via captures this automatically as a side effect of doing work.
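Via's actual retrieval pipeline isn't described in this post, so treat the following as a sketch of the general idea rather than the real implementation: pull error learnings whose text overlaps with the upcoming task and prepend them to the agent's prompt. The table, columns, and function names are hypothetical.

```python
import sqlite3

def error_learnings_for_task(conn: sqlite3.Connection, task: str, limit: int = 5) -> list[str]:
    """Naive keyword overlap; a real system could use FTS5 or embeddings instead."""
    keywords = [w for w in task.lower().split() if len(w) > 3]
    if not keywords:
        return []
    clause = " OR ".join("content LIKE ?" for _ in keywords)
    rows = conn.execute(
        f"SELECT content FROM learnings WHERE type = 'error' AND ({clause}) LIMIT ?",
        [f"%{w}%" for w in keywords] + [limit],
    ).fetchall()
    return [row[0] for row in rows]

def build_prompt(conn: sqlite3.Connection, task: str) -> str:
    """Prepend known pitfalls to the task so the next agent doesn't rediscover them."""
    pitfalls = "\n".join(f"- Avoid: {w}" for w in error_learnings_for_task(conn, task))
    return f"Known pitfalls from previous runs:\n{pitfalls}\n\nTask: {task}"
```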

The most common error categories:

1. API misuse (32% of errors). Incorrect parameters, deprecated endpoints, missing headers. These are the kind of mistakes every developer makes the first time they use an API. With the learnings system, the first agent figures it out, and no subsequent agent repeats it.

2. Configuration issues (24% of errors). Environment variables missing or malformed, config files in the wrong format, path resolution bugs. These are frustrating because they're not code bugs — they're environment bugs. The learnings system turns each one into a warning for future agents.

3. Tool constraints (18% of errors). Trying to use a tool in a way it wasn't designed for. Trying to grep a binary file. Trying to use SQLite's json_extract on malformed JSON. Trying to run tsc without a tsconfig.json in scope. Each constraint, once discovered, becomes permanent knowledge.

4. Order-of-operations errors (14% of errors). FTS5 triggers created before the main table. Migrations run before the schema exists. Tests executed before the build. These are the most insidious because the error messages often point to the wrong cause; the FTS5 case is sketched after this list.

5. Other (12% of errors). Encoding issues, permission problems, rate limits, and edge cases.
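The FTS5 case from item 4 is easy to reproduce: the base table, the FTS5 virtual table, and the sync triggers have to be created in that order, because each object refers to the one before it. A minimal sketch using Python's sqlite3; the table names are illustrative, not Via's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Order matters: base table first, then the FTS5 index that mirrors it,
# then the trigger that keeps the index in sync on insert.
conn.executescript("""
CREATE TABLE learnings (
    id INTEGER PRIMARY KEY,
    content TEXT NOT NULL
);

CREATE VIRTUAL TABLE learnings_fts USING fts5(
    content,
    content='learnings',
    content_rowid='id'
);

CREATE TRIGGER learnings_ai AFTER INSERT ON learnings BEGIN
    INSERT INTO learnings_fts(rowid, content) VALUES (new.id, new.content);
END;
""")

conn.execute("INSERT INTO learnings(content) VALUES ('hybrid search beats pure FTS5')")
hits = conn.execute(
    "SELECT rowid FROM learnings_fts WHERE learnings_fts MATCH 'hybrid'"
).fetchall()
print(hits)  # [(1,)]
```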

Meta-Learnings: The System Watching Itself

Beyond task-level learnings, Via captures meta-learnings — observations about the orchestration system itself. These come from agents tagging observations about the workflow rather than the task:

| Meta-Learning Type | Count | What it means |
|---|---|---|
| meta_gap | 15 | Missing persona — no specialist for this task type |
| meta_issue | 12 | System behavior problem — agent told to plan but task requires writing |
| meta_observation | 11 | Workflow insight — "research phases produce better output with file path constraints" |
| meta_mismatch | 9 | Wrong persona assigned — writer agent doing review work |

These 54 meta-learnings are my development roadmap.

The 15 meta_gap entries tell me exactly which personas are missing from the system. "No DBA specialist for database migration tasks." "No DevOps persona for CI/CD pipeline work." "No UX specialist for frontend interaction patterns." Each gap entry is a signal to add a new persona to the configuration.

The 9 meta_mismatch entries tell me where the keyword-based persona selector is failing. A "writer agent assigned to review task" means the selector matched "write" in "write a review" and picked the writer instead of the reviewer. These are the clearest argument for upgrading from keyword matching to semantic persona selection.
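To make the mismatch concrete, here is a toy version of a keyword selector tripping over "write a review", with a comment sketching the shape a semantic selector might take. None of this is Via's actual code; the persona list and keywords are illustrative.

```python
PERSONA_KEYWORDS = {
    "writer":   ["write", "draft", "blog", "document"],
    "reviewer": ["review", "critique", "feedback"],
}

def keyword_select(task: str) -> str:
    """First keyword hit wins, so 'write a review' matches 'write' and returns writer."""
    lowered = task.lower()
    for persona, keywords in PERSONA_KEYWORDS.items():
        if any(kw in lowered for kw in keywords):
            return persona
    return "generalist"

print(keyword_select("write a review of the latest sprint"))  # writer (should be reviewer)

# A semantic selector would instead embed the task and each persona description
# and pick the closest match, roughly:
#   scores = {name: cosine(embed(task), embed(desc)) for name, desc in PERSONAS.items()}
#   return max(scores, key=scores.get)
```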

The 11 meta_observation entries are workflow optimizations hiding in plain sight. "Research phases consistently produce better output with explicit file path constraints" led me to always include target directory paths in research phase descriptions. Small change, measurable improvement.

Patterns Across Domains

Looking at the 148 pattern entries reveals which insights generalize across different types of work:

Hybrid search outperforms pure approaches. This pattern appears in variations across research tasks, code search, and knowledge retrieval. FTS5 alone misses semantic matches. Embeddings alone miss exact terminology. The combination consistently outperforms either approach individually. This validated the 30/70 keyword/semantic scoring used across all of Via's search systems.
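The 30/70 split is stated above; the min-max normalization and the way candidate sets are merged below are my assumptions, since the actual scoring code isn't shown. A minimal sketch of the blend:

```python
def hybrid_score(keyword_scores: dict[str, float],
                 semantic_scores: dict[str, float],
                 keyword_weight: float = 0.3,
                 semantic_weight: float = 0.7) -> dict[str, float]:
    """Blend normalized keyword scores (e.g. BM25) with semantic similarities."""
    def normalize(scores: dict[str, float]) -> dict[str, float]:
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {doc: (s - lo) / span for doc, s in scores.items()}

    kw, sem = normalize(keyword_scores), normalize(semantic_scores)
    return {
        doc: keyword_weight * kw.get(doc, 0.0) + semantic_weight * sem.get(doc, 0.0)
        for doc in kw.keys() | sem.keys()
    }

ranked = hybrid_score(
    {"doc-a": 12.0, "doc-b": 3.0},    # FTS5/BM25-style scores
    {"doc-b": 0.91, "doc-c": 0.87},   # cosine similarities
)
print(sorted(ranked, key=ranked.get, reverse=True))  # ['doc-b', 'doc-a', 'doc-c']
```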

Smaller, focused phases produce better output than broad ones. Agents given a tight scope — "research OAuth providers that support PKCE" — produce more actionable output than agents given a broad scope — "research authentication." This pattern drove the decomposer to produce more phases with narrower scopes.

Documentation-before-implementation reduces rework. This mirrors my own experience during the documentation binge at the start of the sprint. Agents that plan before coding produce fewer errors and require fewer iterations.

What the Data Doesn't Show

The learnings database also reveals gaps in the system itself:

No learning decay. Old learnings persist forever. A workaround for a bug that was patched six weeks ago still gets injected into agent prompts. I need TTL (time-to-live) or confidence decay on learnings; a sketch of one option follows this list.

No quality feedback loop. The system tracks how often a learning is seen (via the deduplication mechanism), but not whether it was helpful. An agent might receive 10 learnings, use 2, and ignore 8. There's no signal for which ones mattered.

No cross-domain analysis. The learnings are stored with a domain tag, but there's no mechanism to identify patterns that span domains. A financial modeling insight that applies to data pipeline design won't surface in a development context.
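For the decay gap, one option is an exponential half-life on a per-learning confidence score, with a cutoff below which the learning is no longer injected. The half-life and threshold values here are placeholders, not a decision:

```python
import math
from datetime import datetime, timezone

HALF_LIFE_DAYS = 30.0   # assumption: a learning's relevance halves every month
MIN_CONFIDENCE = 0.2    # below this, stop injecting the learning into prompts

def decayed_confidence(created_at: datetime, base: float = 1.0,
                       now: datetime | None = None) -> float:
    """Exponential decay: confidence halves every HALF_LIFE_DAYS."""
    now = now or datetime.now(timezone.utc)
    age_days = (now - created_at).total_seconds() / 86400
    return base * math.pow(0.5, age_days / HALF_LIFE_DAYS)

def still_relevant(created_at: datetime) -> bool:
    return decayed_confidence(created_at) >= MIN_CONFIDENCE
```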

Despite these gaps, the signal is strong enough to matter. Agents running today are measurably better than agents running two weeks ago. The feedback loop works — it just needs refinement.

Next: Why I Route Research to Gemini

