
Three Tools I Built Because Someone on the Team Was Stuck

A CS team that couldn't reproduce bugs, QAs blocked by stale cache, and translations that wouldn't update. Three frustrations, three tools.

Developer Tools · Internal Tooling · Sentry · DevOps · Problem Solving · Case Study

Company: TitanFX LTD
Role: Senior Frontend Developer
Duration: 8 months (Nov 2024 - June 2025)
Team: Solo developer (weekends)

Tech stack: React, Node.js, TypeScript, Redis, Sentry, Slack API, Docker

"The user clicked something and it broke"

Monday morning. You open Jira. There's a bug ticket from CS: "The user did something and it stopped working." A screenshot of a blank screen. No steps. No context. Just a sentence and a white rectangle.

So you open the page. You click around. You try to guess what the user might have done, in what order, with what data, on what kind of account. You can't reproduce it. You ask CS for more details. They tell you what they already told you. They describe what they saw. The problem is that what they saw was the end state -- an error message, a spinner that never stopped, a screen that went white. They weren't the ones clicking through the app. The user was. And the user is gone.

This happened constantly. Not because CS was lazy -- they reported exactly what was in front of them. But a bug report without the user's actual path through the app is a guessing game. You're reverse-engineering a journey from its last frame.

I built a user journey tracker on top of Sentry. It records the user's navigation path, their clicks, and key state transitions as they move through the app. When something breaks, that journey is already captured. CS attaches the journey trace to the ticket. The developer opens it and sees the whole sequence -- page by page, action by action, right up to the moment it failed.
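The core of a tracker like this is small: a capped, in-memory log of navigation, clicks, and state transitions that can be serialized and attached to an error report. Here is a minimal sketch in TypeScript. The class and names (`JourneyTracker`, `record`, `trace`) are hypothetical, not the actual implementation; in a real setup each recorded event would also be forwarded to Sentry as a breadcrumb so it travels with the error automatically.

```typescript
// A hypothetical sketch of a journey tracker: records navigation,
// clicks, and state transitions as a capped in-memory trace.
type JourneyEvent = {
  kind: "navigation" | "click" | "state";
  detail: string;
  at: number;
};

class JourneyTracker {
  private events: JourneyEvent[] = [];
  constructor(private readonly maxEvents = 100) {}

  record(kind: JourneyEvent["kind"], detail: string): void {
    this.events.push({ kind, detail, at: Date.now() });
    // Drop the oldest entries so the trace stays small enough
    // to attach to a ticket.
    if (this.events.length > this.maxEvents) {
      this.events.splice(0, this.events.length - this.maxEvents);
    }
  }

  // Serialize the whole sequence -- page by page, action by action --
  // for attachment to a bug report.
  trace(): string {
    return this.events.map((e) => `${e.kind}: ${e.detail}`).join("\n");
  }
}
```

The cap matters: without it, a long session produces a trace too large to attach, and the oldest events are the least relevant to the failure anyway.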

Bug reports went from "it broke" to "here's exactly how it broke." Developers stopped guessing. CS stopped getting follow-up questions they couldn't answer.

"QA can't work because the cache is lying"

QAs on lower environments would run a test, see stale data, and fail it. Then they'd check the database. The data was correct. The cache was just serving old values.

The fix was simple: clear the cache. The problem was who could do it.

Only developers had SSH access. So the QA would message a developer, explain the situation, wait for them to connect, run a command, and confirm it was done. Sometimes the developer was in a meeting. Sometimes they were heads-down on something else. A five-second operation became a thirty-minute interruption for two people.

But here's the thing -- nobody was doing anything wrong. The QA followed the process. The developer responded when they could. The workflow was broken, not the people. A mechanical task -- flushing a cache key, something that requires zero judgment -- was gated behind access that only developers had.

I built a cache management dashboard. It shows what's cached, lets you inspect specific keys, clear individual entries, or flush everything. QAs can see what the cache is holding and decide for themselves whether it's the problem. No SSH. No waiting. No interrupting someone else's work.
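The dashboard's backend boils down to four operations: list, inspect, clear one, flush all. The sketch below is hypothetical and uses an in-memory `Map` as a stand-in for the Redis connection; in practice each method would wrap the corresponding Redis command (SCAN/GET/DEL/FLUSHDB) behind the dashboard's API.

```typescript
// Hypothetical sketch of the cache dashboard's backend operations.
// A Map stands in for the real Redis client.
class CacheAdmin {
  constructor(private readonly store: Map<string, string>) {}

  // List keys, optionally filtered by prefix (e.g. "user:"),
  // so QA can see what the cache is holding.
  listKeys(prefix = ""): string[] {
    return [...this.store.keys()].filter((k) => k.startsWith(prefix));
  }

  // Inspect one entry to check whether it is serving stale data.
  inspect(key: string): string | undefined {
    return this.store.get(key);
  }

  // Clear a single entry...
  clear(key: string): boolean {
    return this.store.delete(key);
  }

  // ...or flush everything on the environment. Returns how many
  // entries were dropped.
  flushAll(): number {
    const count = this.store.size;
    this.store.clear();
    return count;
  }
}
```

The design choice worth noting is the read path: `listKeys` and `inspect` are what let QA decide for themselves whether the cache is the problem, instead of flushing blindly.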

The cache tool is the one I think about most. Not because it was technically interesting -- it wasn't. Because the principle behind it was so clean: if someone regularly needs to ask a developer to do something mechanical, that's a missing tool. Not a workflow.

"The translations updated but the app didn't notice"

Translation files at TitanFX are generated during the CI pipeline. On production, this works fine -- you update a translation in Loco, the pipeline runs, the new strings get baked into the build, the app shows the updated text.

On lower environments, it doesn't work like that. QAs would update a translation in Loco to test how a new string looks in Japanese or Spanish. Then they'd refresh the app. Old text. Clear the browser cache. Still old. Wait. Nothing.

The translation files only update on the next deploy. On lower environments, deploys don't happen on a translation change -- they happen when code changes. So the QA is stuck. The translation is correct in Loco. The app is showing the old one. And there's nothing they can do about it except ask for a deploy or wait for one that has nothing to do with their work.

And here's where it gets subtle. QAs reasonably expect a direct relationship: I changed the translation, the app shows the new text. That expectation is correct. The architecture just didn't support it outside of production's pipeline. The gap wasn't in anyone's understanding. It was in the system.

I built a tool that pulls the latest translations from Loco and replaces them in the running application. It handles both server-rendered and client-rendered pages. It only runs on lower environments -- production still goes through the pipeline, untouched. This was specifically for unblocking QA so they could verify translations without waiting for a deploy.
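The refresh flow has three parts: a hard guard so it can never run on production, a pull of the latest strings per locale, and a swap into the live translation store. A hedged TypeScript sketch, with the fetcher injected so the Loco call stays abstract (`refreshTranslations`, `fetchLocale`, and the store shape are all illustrative names, not the real code):

```typescript
// Hypothetical sketch of the lower-environment translation refresh.
type Translations = Record<string, string>;

async function refreshTranslations(
  env: string,
  // In practice this would call Loco's export API for the locale.
  fetchLocale: (locale: string) => Promise<Translations>,
  // The running app's in-memory translation store.
  store: Map<string, Translations>,
  locales: string[],
): Promise<number> {
  // Hard guard: production keeps going through the CI pipeline, untouched.
  if (env === "production") {
    throw new Error("translation refresh is disabled on production");
  }
  let updated = 0;
  for (const locale of locales) {
    // Replace the locale's strings in the running application.
    store.set(locale, await fetchLocale(locale));
    updated++;
  }
  return updated;
}
```

Handling both server-rendered and client-rendered pages is the part this sketch omits: after the swap, client pages need a re-render trigger and server pages need their rendered output invalidated, which is where most of the real work lives.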

The common thread

These aren't a suite. They weren't planned together or built on a shared platform. They're three separate responses to three separate moments where someone on the team was stuck and the existing tools didn't help them.

The instinct is simple: when someone is blocked by a missing tool, build the tool.

I still believe that. But I've started to wonder about the other side of it. Every tool you build is a tool you now maintain. The journey tracker needs updating when the app's routing changes. The cache dashboard needs to stay in sync with new cache layers. The translation tool breaks if Loco changes their API. You solve one person's Thursday afternoon and create your own quiet obligation that stretches out indefinitely.

That Monday morning ticket from CS -- the blank screen, the one-line description, the guessing game. I think about it sometimes. Not because the journey tracker didn't fix it. It did. But because there's always another version of that ticket. Another moment where someone is stuck and the tooling doesn't reach far enough. You build the tool. Then you maintain the tool. Then someone finds a new gap. The question isn't whether to build it. The question is whether you've accepted what building it actually costs.

