
17 min read

From 7,415 Lines of Bash to a Personal Intelligence OS

How Via evolved across four generations in 87 days — from Redis polling loops to a 40-skill orchestration engine with 55,000 lines of Go.

showcase · orchestration · architecture · via · personal-intelligence-os

On December 22, 2025, I watched two AI agents review each other's code in a terminal window. Claude would analyze the bash orchestration scripts, propose an optimization, then pass the turn to Gemini. Gemini would validate the change, suggest a refinement, pass it back. Nine rounds of this — the entire cycle compressed into a single extended session.

By round six, Claude replaced the polling loop with a Redis BLPOP call. Turn latency dropped from 2.5 seconds to under a second. The agents had optimized away the bottleneck I'd spent two days ignoring.

That was the first mission. Eighty-seven days later, Via is a 40-skill Personal Intelligence OS backed by 55,413 lines of Go, 18 internal packages, and 982 tests. It manages my email, publishes my articles, tracks my budget, schedules my social posts, and orchestrates multi-phase missions across two AI runtimes. This is the story of how it got there — and why every clever idea I had along the way turned out to be wrong.

Four generations of Via architecture shown as a left-to-right progression: a bash script shack, a chaotic swarm sprawl, a clean Go CLI building, and a gleaming 40-skill tower — tiny mascot at the top looking back across 87 days of compounding work.

The Bash Orchestrator (December 2025)

The first version of Via was 7,415 lines of bash. Not a prototype — a production system. mission.sh at 1,064 lines was the largest file. redis.sh at 503 lines handled all state. sync-watcher.sh at 405 lines coordinated agent turns through a polling loop that checked Redis every few seconds.

The architecture was deliberately primitive:

┌────────────────────┐     ┌───────────┐     ┌──────────────┐
│     Claude CLI     │◄───►│   Redis   │◄───►│  Gemini CLI  │
│  (--dangerously-   │     │ (strings, │     │   (--yolo)   │
│  skip-permissions) │     │  lists)   │     │              │
└─────────┬──────────┘     └─────┬─────┘     └──────┬───────┘
          │                      │                  │
          └──────────┬───────────┴───────────┬──────┘
                     │                       │
              ┌──────┴──────┐        ┌──────┴───────┐
              │  start.sh   │        │ sync-watcher │
              │ (launcher)  │        │  (polling)   │
              └─────────────┘        └──────────────┘

No compile step. No type system. No tests. Every agent communication was a human-readable Redis operation — RPUSH to send, LPOP to receive. I could debug the entire system by running redis-cli MONITOR in a second terminal.

The self-improvement mission — the orchestrator reviewing and improving itself — produced nine rounds of optimization in that single December 22 session. Rounds 1-5 were incremental: temp file elimination saved 0.1-0.2 seconds per turn, Redis batching collapsed 4 calls into a single Lua script for another 0.3-0.5 seconds. Useful but unremarkable.

Round 6 was the inflection point. Claude proposed replacing sleep $POLL_INTERVAL with Redis BLPOP — a blocking pop that returns instantly when data arrives instead of checking every few seconds. The watcher went from 0-2 seconds of random delay per turn to under 5 milliseconds of event notification. On a 20-round mission, that was roughly 120 seconds of accumulated polling overhead dropping to about 100 milliseconds.

By round 9, the agents declared V1.0 "Event-Horizon" production-ready. Turn latency overhead was down 65% from baseline.

I ran three real missions that week: an EA documentation cleanup (20 rounds), a voice interface prototype, and a 45-round Instagram seeding campaign. The bash orchestrator worked. It was ugly, but it worked.

The Swarm Experiment (January 2026)

Bird's-eye view of a command table: twelve identical agent robots orbiting the mascot at center, eight already fading out mid-deletion — the moment of reckoning when 12-agent chaos contracts to 4.

The bash orchestrator worked, so naturally I tried to make it better.

January 2026 was the most architecturally ambitious month of the project. I built autonomous agent loops, ran a three-language competition, and explored 12-agent swarm patterns. Almost none of it survived.

It started with Ralph — named after the Simpsons character — an autonomous loop that let agents iterate on self-improvement without human intervention. Automatic crash recovery, graceful shutdown handlers, config validation. Ralph was elegant. Ralph was also unpredictable. The agents would drift from objectives three rounds into a mission, spending tokens on tangential explorations that had nothing to do with the task.

Then came the swarms. On January 8, I ran a 6-phase self-improvement mission using a Claude swarm architecture. On January 9 — the highest commit density day in the entire project history, roughly 30 commits — I added multi-model task delegation, a web dashboard, hybrid Claude+Gemini swarms, and achieved 81% cost savings through smart model routing. The Gemini Pro delegation alone was 3x cheaper than routing everything through Sonnet.

The language competition started January 23. Three implementations running in parallel:

| Implementation | Language | Key Feature | Outcome |
|---|---|---|---|
| Go-Swarm | Go | Server delegation, SSE streaming | Won |
| Rust-Swarm | Rust | Async streaming, WebSocket, consensus | Explored deeply, abandoned |
| Ruby-Swarm | Ruby | Quick prototype | Abandoned within days |

Rust-Swarm went through two full versions. V1 had streaming, consensus, and parallel execution. V2 added a tool system and competed head-to-head against Go-Swarm V2. I was convinced Rust would win on performance.

The speed-budget analysis killed that conviction permanently. I measured where Via actually spent its time: 85-95% was network I/O — waiting for LLM API responses. The remaining 5-15% was local computation. Optimizing local computation by 3x with Rust — even if I achieved that — would be invisible against the network-bound bottleneck. Go's simpler concurrency model and faster compile times won on developer velocity, not runtime performance.
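That conclusion is just Amdahl's law applied to the measured budget. A quick sketch of the arithmetic (the 10% local-compute figure is taken from the middle of the 5-15% range above):

```go
package main

import "fmt"

// overallSpeedup applies Amdahl's law: only the local fraction p of
// total time benefits from a factor-s speedup; the rest is network I/O.
func overallSpeedup(p, s float64) float64 {
	return 1.0 / ((1.0 - p) + p/s)
}

func main() {
	// Even a generous 3x Rust speedup on 10% local compute moves
	// end-to-end mission time by only about 7%.
	fmt.Printf("%.2fx\n", overallSpeedup(0.10, 3.0)) // ≈1.07x
}
```

A 7% end-to-end gain is below the noise floor of LLM response-time variance, which is why the rewrite could never pay for itself.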

By January 24, I ran a major cleanup: removed the Rust and Ruby swarms, the AION UI, the Todo app prototype, the archive directory. The codebase contracted. On January 25, I attempted to decompose what remained into 13 standalone packages with a CLI+skills pattern — voice conversation with wake word detection, a macOS Electron app, a QA agent, a speculative execution engine. None of these packages survived into February.

The swarm era produced four insights that shaped everything after:

Mission size has a cliff at 28KB. Missions between 5-8KB succeed at 73%. Missions above 28KB succeed at 0%. Not a gradual degradation — a cliff.

3-4 agents is the sweet spot. More agents means more file overlap, which means more serial phases, which negates the parallelism that justified more agents in the first place.

70.7% of friction is the LLM's behavior, not the orchestrator. I spent a month optimizing the 29.3%.

The boring architecture works. Filesystem IPC, subprocess spawning, JSON files. Every time I tried something clever — 12-agent swarms, custom serialization, Rust for the hot path — it failed. Every time I returned to simple tools, it succeeded.

The Go CLI (January 26 — February 20, 2026)

On January 26, I committed the first line of Via as a standalone Go application. Not a port of the bash orchestrator — a ground-up rebuild. The initial commit included task detection, mission orchestration, a learnings system, parallel execution with a worker pool, and skill installation for Claude Code.

The architectural changes were total:

| Aspect | Gen 0-1 (Bash) | Gen 2 (Go CLI) |
|---|---|---|
| State | Redis | Filesystem (JSON) |
| Communication | Redis pub/sub + BLPOP | Filesystem IPC |
| Execution model | Turn-based polling | DAG-based parallel |
| Configuration | .orchestration.env | CLAUDE.md + skill files |
| Tests | 0 | Growing |

Redis was gone. Completely. Filesystem IPC replaced it — every message a human-readable JSON file with atomic writes. Per the evolution learnings: "Trades throughput for debuggability — every message is a human-readable JSON file." At personal scale, I never needed the throughput.

Then the feature explosion hit. Between January 30 and February 2, Via accumulated embeddings, semantic search, pattern detection, P2P integration, Telegram integration, token optimization, YNAB skill wrapping, multi-agent configuration, 34 documented features, 91 mission files, and 62 research documents totaling 1.9 megabytes. This was the maximum feature sprawl point. Most of it was scaffolded but never completed.

On February 3, I made the architectural decision that defined everything after: Via pivoted from a standalone CLI to a Claude Code plugin system. Documentation was consolidated into 5 canonical files. Via stopped trying to be an independent product and became an extension layer. Claude Code already handled agent execution, context management, and tool access. Duplicating that was wasted effort. Via would add orchestration, learnings, personas, and skill routing on top.

Plugins grew fast. Obsidian and Todoist by February 7. Auto-generated routing indexes with generate-agents-md.sh. Substack and Medium by February 13.

On February 4, the Go orchestrator CLI was born in a separate repository — ~/skills/orchestrator. This was the orchestration engine, distinct from Via the platform. DAG-based parallel execution. Smart model detection with capability-based routing. An event-loop scheduler. A persona pool with a skill selector.

By mid-February, the orchestrator had file locking on state.json, learning retrieval gates, checkpoint/resume, retry logic, and a persona system where each persona carried goals, anti-patterns, methodology, examples, and a self-check. The learnings store evolved from flat files to SQLite with Gemini semantic search. Meta-learnings — learnings about how the system learns — appeared on February 10.

On February 20, I rewrote the Go orchestrator from scratch. The second major rewrite in two months. The first rewrite (bash to Go) was driven by capability. The second (Go V1 to Go V2) was driven by reliability — the DAG executor had race conditions that corrupted phase outputs during parallel execution. V2 was 70.6% faster than V1 on average mission duration.

The Skill Engine (February 24 — March 17, 2026)

Let's get into the current architecture, because this is where the system stops being a project and starts being infrastructure.

On March 5, I killed the plugin abstraction. Plugins became skills — directories with SKILL.md files living directly in .claude/skills/. No registry. No installation mechanism. No version management. The filesystem is the registry. The routing index auto-generates from what's there.

Now, you might hear "no registry, no version management" and think this is less sophisticated. That would be wrong. The simplicity is intentional. For a solo developer maintaining 40 skills, a package registry is overhead that buys nothing. I know what's installed because I installed it. The filesystem-as-registry pattern means adding a skill is mkdir plus writing a markdown file. Removing one is rm -rf. No dependency resolution, no version conflicts, no publish step.

How the Engine Works

The orchestrator decomposes every task before executing anything. internal/decompose/decompose.go — 1,411 lines — takes a natural language task, calls an LLM to break it into 3-8 phases with explicit dependencies, then falls back to keyword matching if the LLM is unavailable. Each phase gets a persona, a model tier, and a working directory.

The engine (internal/engine/engine.go, 1,100 lines) resolves the resulting DAG, dispatches phases in topological order, and runs independent phases in parallel. Three retries per phase with exponential backoff. Checkpoint after every phase completion. If the process crashes at phase 5 of 8, it resumes from phase 5 — not from scratch.

Task: "Research golang error handling and write a blog post"

┌──────────────┐
│  Decomposer  │ ── Breaks into phases with dependencies
└──────┬───────┘
       │
       ▼
┌──────────────────────────────────────────┐
│           DAG Execution Engine           │
│                                          │
│  Phase 1: Research  ──┐                  │
│  (Sonnet, researcher) │                  │
│                       ├──► Phase 3: Write│
│  Phase 2: Gather     ─┘   (Opus, writer) │
│  (Haiku, scout)                          │
│                                          │
│  [Checkpoint after each phase]           │
└──────────────────────────────────────────┘

No event bus. No WebSocket protocol. No message queues. Just a DAG, a loop, and JSON checkpoints on the filesystem. That is it.

Model Routing

The task classifier in internal/routing/tasktype.go sorts every task into one of 8 types — implementation, bugfix, refactor, research, deployment, docs, test, writing — using keyword matching with a priority hierarchy. Bugfix outranks everything. Writing is near the bottom. Each type maps to a model tier: "think" routes to Opus, "medium" to Sonnet, "fast" to Haiku.
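A priority-ordered keyword classifier is only a few lines. This sketch uses made-up keyword lists, not Via's actual tables in tasktype.go, but the structure (first matching type in priority order wins) is the same idea:

```go
package main

import (
	"fmt"
	"strings"
)

// taskTypes in priority order: earlier entries outrank later ones,
// mirroring the "bugfix outranks everything" hierarchy. The keyword
// lists here are illustrative only.
var taskTypes = []struct {
	Name     string
	Keywords []string
}{
	{"bugfix", []string{"fix", "bug", "crash", "regression"}},
	{"implementation", []string{"implement", "add", "build"}},
	{"refactor", []string{"refactor", "clean up", "restructure"}},
	{"research", []string{"research", "investigate", "compare"}},
	{"writing", []string{"write", "blog", "article"}},
}

func classify(task string) string {
	lower := strings.ToLower(task)
	for _, t := range taskTypes {
		for _, kw := range t.Keywords {
			if strings.Contains(lower, kw) {
				return t.Name
			}
		}
	}
	return "implementation" // fallback type when nothing matches
}

func main() {
	// "fix" outranks "write" in the priority order, so this is a bugfix.
	fmt.Println(classify("Fix the crash, then write release notes"))
}
```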

Model resolution follows a cascade: explicit assignment in the phase definition beats the persona's default preference, which beats the routing table lookup, which beats the global fallback. The routing database (internal/routing/db.go) stores patterns with confidence scores between 0.0 and 1.0, decaying scores for persona/model combinations that produce poor audit results. The system learns which expert handles which task type — not by training weights, but by tracking what worked.
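The cascade itself is a chain of first-non-empty checks. A minimal sketch, with hypothetical parameter names standing in for the real config fields:

```go
package main

import "fmt"

// resolveModel walks the cascade: explicit phase assignment beats the
// persona default, which beats the routing-table lookup, which beats
// the global fallback. Names are simplifications of Via's config.
func resolveModel(phaseModel, personaDefault string, routing map[string]string, taskType string) string {
	if phaseModel != "" {
		return phaseModel // explicit assignment in the phase definition
	}
	if personaDefault != "" {
		return personaDefault // the persona's preferred model
	}
	if m, ok := routing[taskType]; ok {
		return m // learned routing-table entry
	}
	return "sonnet" // global fallback tier
}

func main() {
	routing := map[string]string{"bugfix": "opus", "writing": "opus"}
	fmt.Println(resolveModel("", "", routing, "bugfix"))    // table lookup wins: opus
	fmt.Println(resolveModel("", "haiku", routing, "docs")) // persona default wins: haiku
}
```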

The Learning Loop

Workers emit markers during execution — LEARNING:, PATTERN:, DECISION:, GOTCHA:, FINDING:. The learning system in internal/learning/db.go (888 lines) extracts these markers, generates vector embeddings, and stores them in SQLite. Future phases receive the top-N most relevant learnings by cosine similarity against the current task.
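The extraction step is line-oriented prefix matching. The marker prefixes below are from the article; the parsing itself is a simplified sketch (the real system also embeds and persists each learning in SQLite):

```go
package main

import (
	"fmt"
	"strings"
)

// markers that workers emit during execution.
var markers = []string{"LEARNING:", "PATTERN:", "DECISION:", "GOTCHA:", "FINDING:"}

// extractLearnings scans worker output line by line and groups
// marker-prefixed lines by kind.
func extractLearnings(output string) map[string][]string {
	found := make(map[string][]string)
	for _, line := range strings.Split(output, "\n") {
		line = strings.TrimSpace(line)
		for _, m := range markers {
			if strings.HasPrefix(line, m) {
				kind := strings.TrimSuffix(m, ":")
				body := strings.TrimSpace(strings.TrimPrefix(line, m))
				found[kind] = append(found[kind], body)
			}
		}
	}
	return found
}

func main() {
	out := `Refactored the retry loop.
LEARNING: exponential backoff needs jitter under parallel phases
GOTCHA: state.json writes race without the file lock`
	for kind, items := range extractLearnings(out) {
		fmt.Println(kind, "->", items)
	}
}
```

Note what this implies: any line the worker does not prefix is invisible to the learning store, which is exactly the passive-learning vulnerability discussed later.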

The system doesn't just store learnings — it tracks compliance. When a learning is injected into a phase's context, the system checks whether the worker's output reflects it. If the output ignores an injected learning, the compliance score drops. This is a feedback signal that neither OpenClaw's frequency-based promotion nor Hermes' skill nudges provide. Via measures whether its own memory actually influences behavior.

The outer loop closes after each mission. internal/audit/ runs a post-mission scorecard across 5 axes on a 1-5 scale. Phase-level evaluation compares actual outputs against the decomposer's original plan. Routing confidence scores update based on outcomes. If a persona consistently underperforms on a task type, future tasks get routed elsewhere. The audit changes the routing, the routing changes the results, the results change the next audit.

26 Events Across Three Transports

The event system defines 26 typed events across 10 categories: mission lifecycle, phase lifecycle, worker lifecycle, decomposition, learning, DAG progress, role handoffs, contract validation, review loops, and git operations. Each event carries an ID, type, timestamp, sequence number, mission ID, phase ID, and a structured data payload.

Three transport layers serve different consumers:

  1. JSONL file log (~/.via/events/{mission_id}.jsonl) — append-only, one event per line, survives crashes
  2. Unix domain socket (~/.via/daemon.sock) — local IPC for CLI tools, under 1 ms latency
  3. SSE over HTTP (127.0.0.1:8390/events) — Server-Sent Events for browser frontends, auto-reconnect with Last-Event-ID

A ring buffer of 1,000 events enables replay on reconnect. The daemon is optional — if the socket doesn't exist, the engine writes file logs only. Debugging a failed mission means jq over a JSONL file. No OpenTelemetry. No Jaeger. No distributed tracing infrastructure. For a system with one user, jq is the observability platform.

Multi-Runtime (March 2026)

On March 14, Via became runtime-agnostic. The orchestrator can dispatch work to Claude Code or Codex based on task characteristics. Runtime auto-selection heuristics — added March 15 — choose the runtime per phase. This is the architectural equivalent of Gen 1's multi-model delegation, but one layer up: not which model handles the task, but which entire runtime environment.

The same week, the event system matured into event-first live state. CLI status is projected from the event log, not read from a state file. Cancellation, delivery drop surfacing, and sequence integrity — all derived from the event stream. The principle from Gen 0 Round 6 — events over polling — returned two generations later as a proper state projection system.

Git integration arrived March 16: worktree isolation for parallel missions, PR creation with a --pr flag, Codex review integration, and an advisory file-claim registry that prevents two parallel missions from modifying the same file.

What the Numbers Say

Eighty-seven days. Four generations. Two complete rewrites.

| Metric | Gen 0 (Dec) | Gen 1 (Jan) | Gen 2 (Feb) | Gen 4 (Mar) |
|---|---|---|---|---|
| Language | Bash | Go/Rust/Ruby | Go | Go |
| Lines of code | 7,415 | ~5,000 mixed | ~10,000 | 55,413 |
| State management | Redis | Redis | JSON files | SQLite + JSON |
| Agent sweet spot | 2 | 3-12 (explored) | 3-4 (validated) | 3-4 (enforced) |
| Tests | 0 | 89 | growing | 982 |
| Skills | 0 | 0 | 9 plugins | 40 skills |
| Internal packages | 0 | 0 | emerging | 18 |

The 40-skill count represents the current working set: orchestrator, gmail, linkedin, reddit, substack, obsidian, todoist, ynab, scout, engage, publish, scheduler, elevenlabs, contentkit, ai-analyzer, article-pipeline, youtube-pipeline, and 23 others covering everything from PDF generation to systematic debugging to Tailwind CSS patterns.

Each skill is a directory. Each directory contains a SKILL.md. The orchestrator reads those files at spawn time, inlines the relevant ones into each worker's context, and the worker gains that capability for its phase. No dynamic loading. No runtime plugin system. Just markdown files and filesystem reads.

What Died Along the Way

A developer graveyard at night: nine tombstones with carved icons for each failed approach — Rust gear, swarm nodes, loop arrow, broken package box — while the purple-analytical mascot takes notes, and teal sprouts grow from the graves of the best failures.

The dead-end list is longer than the feature list:

| Dead End | Generation | Why It Failed |
|---|---|---|
| Rust rewrite | Gen 1 | Speed budget: 3x on 5% = invisible |
| 12-agent swarms | Gen 1 | Diminishing returns past 3-4 agents |
| Ralph autonomous loops | Gen 1 | Agents drift from objectives |
| 13-package extraction | Gen 1 | Premature decomposition |
| TOON custom format | Gen 0-2 | Landed on the "do not build" list |
| Standalone CLI product | Gen 2 | Claude Code is the runtime; Via is the layer |
| Plugin abstraction | Gen 2-3 | Skills-as-directories is simpler |
| Web dashboard | Gen 0-1 | TUI sufficient; jq sufficient-er |
| Signal/iMessage bots | Gen 2 | Archived and never reopened |

Every dead end taught something. The Rust rewrite taught me to measure before optimizing. The 12-agent swarms taught me that coordination overhead compounds faster than parallel throughput. The plugin abstraction taught me that the simplest registry is no registry. Ralph taught me that autonomous loops need objectives with gravity — something that pulls agents back when they drift.

The most expensive lesson: the standalone CLI product. I spent weeks building Via as a self-contained system — its own agent execution, its own context management, its own tool access. Then I realized Claude Code already does all of that. Via's value isn't in running agents. It's in deciding which agent runs, with what knowledge, in what order, and learning from the result.

The Decomposer Problem

Via's architecture has a single point of failure, and I haven't solved it.

The decomposer — the component that breaks tasks into phase DAGs — determines everything downstream. If it misclassifies a task, the wrong model handles it. If it creates bad dependencies, phases execute in the wrong order. If it scopes phases too broadly, they hit the 28KB mission-size cliff from Gen 1.

OpenClaw and Hermes don't have this problem because they don't plan ahead. Their reactive architectures adapt mid-execution. If the first approach fails, they pivot. Via commits to a plan before running a single phase. If the plan is bad, every phase is suboptimal.

The review loop — added in Gen 4 — partially mitigates this. After a worker completes, the engine can re-invoke it with review feedback. But the review loop operates within the plan. It can fix a phase's output. It cannot restructure the plan itself.

The decomposer also creates a provider lock-in risk. Task classification and decomposition assume Claude's capabilities. Switching providers means revalidating every routing pattern, every keyword hierarchy, every confidence score. The speed-budget analysis killed the Rust rewrite. The decomposer dependency could kill a provider migration the same way — not because it's technically impossible, but because the validation cost exceeds the benefit.

Passive learning is another open vulnerability. Via's learning extraction requires agents to emit markers. If a worker doesn't output LEARNING: or PATTERN:, the system captures nothing from that phase. Hermes solves this with active skill nudges — periodic prompts that ask the agent to persist what it learned. Via relies on cooperation. When the agent cooperates, the learning loop works. When it doesn't, the mission produces output but no institutional memory.

87 Days of Compound Architecture

Worm's-eye view of a towering generational column with four stacked rings — bash, swarm, Go CLI, Skill Engine — each containing the same lightning bolt symbol at its center. Tiny yellow-state mascot at the base, looking up at the principle that survived every rewrite.

Via is not the system I set out to build. The bash orchestrator was supposed to be a weekend experiment. The swarm architecture was supposed to be the production system. The standalone CLI was supposed to be the product.

Every generation replaced the previous one's assumptions. Redis was foundational until filesystem IPC made it unnecessary. Plugins were the extension model until skills-as-directories made them redundant. The Rust rewrite was inevitable until the speed budget proved it invisible.

The pattern that survived all four generations: events over polling, DAG over turn-taking, learnings captured at execution time, and the constraint that boring architecture outperforms clever architecture at personal scale. Gen 0's BLPOP is Gen 4's event-first state projection. Gen 1's 3-4 agent sweet spot is Gen 4's enforced parallelism limit. The principles held. The implementations were disposable.

Fifty-five thousand lines of Go, 40 skills, 982 tests, 18 packages — and the system that coordinates all of it still resolves to the same insight Claude proposed in round 6 of that December session: stop polling, start listening.
