tl;dr — I built a full-stack Kubernetes desktop application — Go backend, React/TypeScript frontend, brochure website — in 21 days using AI-assisted “vibe coding.” 74,675 lines of source code. 461 commits. 129 pull requests. 100% merge rate. I never read a single line of the source code. Here’s exactly how it worked, what I actually typed, and what I learned.
What I Built
Clusterfudge is a native macOS desktop app for managing Kubernetes clusters. Think of it as a Lens competitor: cluster overview, pod management, log streaming, exec terminals, Helm chart management, YAML editing, resource wizards, a troubleshooting page, and an AI debugging terminal that calls Claude Code to diagnose live pods. Plus a brochure website.
The stack is Go (backend, via the Wails framework), React with TypeScript (frontend), and Astro (brochure site). It’s a monorepo with 654 files.
I didn’t write any of it. I described what I wanted in plain English and the AI wrote the code, ran the tests, committed to branches, opened PRs, and iterated until everything worked.
By The Numbers
| Metric | Value |
|---|---|
| Development span | 21 days |
| Source lines of code | 74,675 |
| Total commits | 461 |
| Pull requests (all merged) | 129 |
| AI sessions | 135+ |
| Human prompts typed | 1,025+ |
| AI responses generated | 14,346+ |
| Tokens processed | 1.27 billion+ |
| API calls | 14,145+ |
| Conversation log data | 130 MB+ |
Source lines of code counts Go and TypeScript source files only. The total codebase — including configuration, generated code, tests, and the brochure site — grew to 128K lines; see the Codebase Growth chart in the raw data below.
Some derived metrics that tell a sharper story:
| Metric | Value |
|---|---|
| Lines of code per day | 3,556 |
| Commits per day | 22 |
| PRs per day | 6.1 |
| Git insertions per human prompt | 167+ |
| AI responses per human prompt | 14.0 |
That last number means for every prompt I typed, the AI generated fourteen responses: reading files, editing code, running tests, committing, pushing. I wasn’t having a conversation. I was issuing work orders.
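The derived figures are plain division over the headline numbers. A minimal sketch, assuming the “+” lower bounds are treated as exact values; the constant and function names are made up for illustration:

```python
# Headline figures copied from the tables above (the "+" lower bounds are
# treated as exact values for this illustration).
DAYS = 21
SOURCE_LINES = 74_675
COMMITS = 461
PULL_REQUESTS = 129
HUMAN_PROMPTS = 1_025
AI_RESPONSES = 14_346


def per_unit(total: float, units: float, digits: int = 1) -> float:
    """Return total / units rounded to the given number of decimal places."""
    return round(total / units, digits)


derived = {
    "lines_per_day": per_unit(SOURCE_LINES, DAYS, 0),
    "commits_per_day": per_unit(COMMITS, DAYS, 0),
    "prs_per_day": per_unit(PULL_REQUESTS, DAYS),
    "responses_per_prompt": per_unit(AI_RESPONSES, HUMAN_PROMPTS),
}

for name, value in derived.items():
    print(f"{name}: {value:,}")
```

The insertions-per-prompt figure is left out because the total insertion count is not quoted in this post.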
Note: Development happened across two machines. Machine 1 logged 77 sessions (396 prompts, 584M tokens, 63.7 MB). Machine 2 logged 58+ sessions (629+ prompts, 684M+ tokens, 66.3 MB). The “+” markers indicate conservative lower bounds — some sessions were too short to instrument, and the two machines’ Claude Code Desktop app sessions aren’t included.
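The combined totals in the first table are just the two machines added together. A small sketch, again treating the “+” lower bounds as exact; the dictionary layout is an assumption for illustration, not how the logs were actually stored:

```python
# Per-machine figures as quoted in the note above; "+" lower bounds are
# treated as exact values for this illustration.
MACHINES = {
    "machine_1": {"sessions": 77, "prompts": 396, "tokens_millions": 584, "log_mb": 63.7},
    "machine_2": {"sessions": 58, "prompts": 629, "tokens_millions": 684, "log_mb": 66.3},
}


def combined(field: str) -> float:
    """Sum one field across both machines, rounded to one decimal place."""
    return round(sum(machine[field] for machine in MACHINES.values()), 1)


totals = {field: combined(field) for field in MACHINES["machine_1"]}

for field, value in totals.items():
    print(f"{field}: {value}")
```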
The Real Workflow
Here’s what my day actually looked like:
- Morning: Open Claude Code, point it at a task list in a markdown file, say “/dev-loop”
- Walk away: Let it work autonomously. Some sessions ran 8-10 hours
- Check in: “PR open?” / “is the CI passing?”
- Test the app: Build and run the actual application, find UX issues by using it
- Report bugs: Describe what’s wrong in plain English, no code references
- Repeat
I was a product manager who could QA. The AI was the engineering team.
The first 10 days followed a phased architecture approach. I wrote nine phase documents (or rather, described what each phase should contain and had the AI write them), then pointed the AI at each one sequentially. PRs #6 through #16 implemented the entire application in order: Go backend, frontend shell, resource views, log streaming, Helm management, polish, and competitive feature parity. Each phase was a “fire and forget” — one prompt, hours of autonomous work.
What I Actually Typed
1,025+ human prompts across 135+ sessions on two machines. The median prompt was 12 words. 24% were five words or fewer. Here are the patterns that emerged.
Fire and Forget
The dominant pattern. Point at a task list, say “go”:
“please /lenny-dev-loop tier 1, tier 2 and then tier 3.”
“please /dev-loop for docs/AUDIT.md”
A single prompt could trigger the AI to read task files, implement features across dozens of files, run tests, fix failures, commit, push, open PRs, and work through an entire checklist. One prompt. Hours of work.
Delegate the Delegation
In the later days, the fire-and-forget pattern evolved. Instead of pointing at a task list, I told the AI to figure out its own parallelism:
“let’s go the whole hog and do it all, feel free to spin up a team.”
“Please spin up a team to check every single element and page for light mode as it seems like there are lots of bits that are suitable for light mode.”
“please go ahead and fix everything in the order that you think makes the most sense. Feel free to spin up a team as you see fit.”
The AI would spawn multiple sub-agents, divide the work, and coordinate. I wasn’t just delegating tasks — I was delegating the delegation itself.
Skill-Based Delegation
As the project matured, I stopped writing raw prompts for repetitive workflows and started invoking specialized skills:
“please use /frontend-design to review and see how we can improve things”
“maybe it’s worth using the /frontend-design skill to get a list of ideas”
“can you create me a new /lenny-blog-post”
Each skill encapsulates a multi-step workflow — research, plan, implement, test, commit, PR — behind a single invocation. The evolution from raw prompts to skill-based delegation mirrors how you’d build internal tools for a growing team.
QA by Using the App
I tested the running application and reported bugs the way a user would:
“the sidebar on the right that appears when click on a pod, for example. Can we make the size of it draggable (so a user can increase/decrease) the width and remembered when it’s opened in future again.”
“Deployments->Right hand panel, History table row has cursor pointer but doesn’t do anything?”
No file names. No component names. No line numbers. Just “this thing is broken” and the AI figured out the rest.
Status Pings
Brief check-ins. Often one or two words:
“PR open?”
“committed and pushed?”
“is the CI passing?”
“passed yet?”
I managed the AI like a junior developer — periodic check-ins without micromanaging.
Course Corrections
Quick redirects when the AI went the wrong way:
“No, I want to keep shortcuts. Just not make them editable.”
“yes revert, then come up with a plan to tackle H8, H10 and H15. Then /dev-loop them”
Correct once, then immediately delegate back. I rarely needed to correct the same thing twice.
Fact-Checking the AI
The AI occasionally made claims about the product that weren’t true. Catching these required actually knowing the product — another reason the human-as-QA role matters:
“wait, the app has been tested for 5k resources?”
It hadn’t. The AI had fabricated a marketing claim for a blog post. I caught it because I’d been using the app daily. Similarly:
“are we 100% confident that everything in the /features page is accurate and it’s working and live in the desktop app?”
Trust but verify. The AI is good enough that you stop reading the code — but you never stop testing the product.
Strategic Product Decisions
The highest-leverage prompts. A single sentence could trigger thousands of lines of change:
“can you move all the other files into desktop/ folder so we can turn this into a mono repo.”
That one prompt produced PR #82: 8,769 additions across 607 files. A complete monorepo restructure from a single sentence.
Ops Debugging by Pasting Errors
Days 18–21 introduced a completely different prompt style. Instead of describing UX bugs, I was pasting CI/CD failures, build logs, and GitHub Actions output directly into the chat:
“on my mac runner Build (macos-arm64) the follow step takes ten+ minutes.
actions/setup-go@v5 Setup go version spec 1.25.0”
“What secrets do I need to add to the repo for the release actions”
“ok, secreet have all been added. They are the same as the ../iddio-mono project. Just want to check that is ok?”
One prompt was a 60-line Cloudflare build log, pasted raw. The AI parsed it and fixed the misconfigured build command. Release engineering, CI debugging, and infrastructure work followed the same pattern as feature development: describe the problem, let the AI fix it. The domain shifted from product to ops, but the workflow didn’t.
Typos Everywhere, and It Didn’t Matter
The prompts are littered with typos — “brqnch”, “commmit”, “truely”, “writen”, “packge”, “remainnig thins”, “fone size”, “nuch movement”, “Triangel”, “termology”, “stevedor”, “cna” — and the AI understood every single one without asking for clarification. I typed fast and sloppy because it didn’t matter. This is a genuine productivity feature.
The Build-Then-Fix Loop
Here’s something the numbers reveal clearly: ~30% of PRs were bug fixes or review follow-ups. The workflow wasn’t “build it perfectly the first time.” It was:
- Build it fast
- Test it by using it
- Report what’s broken
- Fix it
- Repeat
This is the “70% problem” in practice. The AI gets the broad strokes right on the first pass. The remaining 30% is iterative refinement — the human QA-ing, redirecting, and fine-tuning.
One session captures this perfectly. I typed 37 prompts. The AI generated 653 responses. Each fix revealed the next bug, like peeling layers of an onion:
“the app doesn’t work. The homescreen just has the spinner going round and not stopping.”
[AI fixes it]
“ok, there are no more errors, but it is showing zero for everything. Also I can’t copy any text or right hand click.”
[AI fixes it]
“still getting the error undefined is not an object”
[AI fixes it]
“ok, ive merged the PR. Let’s fix some other stuff. I can’t drag or resize the window”
That session generated 128,000 output tokens — roughly 96,000 words of code and reasoning — from a human who typed maybe 500 words total.
The Other Extreme: Rapid-Fire Design Sessions
Not every session was fire-and-forget. The welcome screen redesign on Day 19 was the opposite: 24 prompts in 80 minutes, each one a micro-adjustment:
“can we split it into two columns”
“ok, I think we need to move to a horizontal tabs with a slide in/out for each page”
“great stuff. Let’s centre it and set a max width”
“let’s centre align the nav items as well. Also let’s reduce the max width for just AI Assistant and Kubeconfig to 50% of what it currently is.”
“the green is a little too light, can we go a bit darker”
“can we move the toast down slightly, so it doesn’t sit above the very top title bar.”
This was standing over the AI’s shoulder, directing every decision in real time. One prompt every three minutes, each building on the visual result of the last. The contrast is stark — some sessions produce 8,769 lines from a single sentence, others produce fine-grained tweaks from a rapid stream of feedback. Both are vibe coding. The mode switches based on what the work needs.
The Moments That Made Me Laugh
The Naming Rabbit Hole
I spent an entire 30-minute session brainstorming product names. It went from licensing strategy to nautical terms to pirate words to foreign languages to shipping container terminology:
“what does kubernetes mean?”
“what other Nautical terms are there?”
“what about pirate words?”
“ok, let try foreign words”
“are there any other shipping container termology?”
“What about clusterfudge”
The name stayed as KubeViewer at the time. The same brainstorm surfaced again in a later session (“what about stevedore”) before being rejected again. Some decisions just keep circling back. Eventually, Clusterfudge won out.
Pixel Surgery
The most precise prompts were UI adjustments measured in individual pixels:
“Close icon move: 1px up, 2px left. Minus icon move: 1px up. Triangel icons move: 1px up, 1px right”
Then immediately:
“ok, that was way too nuch movement, can we go back and do half as much?”
Then:
“Almost perfect. Can we move the x a tidy bit to the right, and triangles to the left?”
Three rounds of pixel nudging to position window control icons. Vibe coding’s equivalent of standing behind a designer’s shoulder saying “a bit to the left… no, back a bit.” It ended with: “icons are a perfect size now. Please commit and push.”
AI Building AI (Then Debugging the AI Debugger)
The app has a built-in AI debugging terminal that calls Claude Code to analyze Kubernetes pods. Eight sessions were spawned not by me, but by the application itself — code I’d never read, invoking AI to debug live clusters. Genuinely recursive: I used AI to build a feature that autonomously calls AI.
Then Days 18–19 added another layer: I had to QA the AI feature itself. The prompts read like debugging inception:
“without the ability to type anything at the end. Also seems like I don’t see the claude code welcome screen, it seems like it skips that bit”
“The js library we are using for the terminal doesnt seem great when entering interactive shells like gemini and claude”
“flashing, text in wrong order, repeated text”
I was using AI (Claude Code in my terminal) to fix the AI feature (the in-app debugger) that calls AI (Claude/Gemini/Codex) to diagnose Kubernetes pods. Three levels deep. At one point I had the desktop app open with a live AI debugging session, Claude Code fixing the terminal rendering in a split pane, and the AI inside the app re-running to verify the fix worked. Turtles all the way down.
Accidental Sessions
One session contains exactly one human message: “ccx”. Another includes the prompt “ta]”. Both were accidental keypresses that spawned AI responses. Even typos get logged — and cost tokens.
The Token Economy
1.27 billion tokens sounds absurd. It is. Here’s the breakdown across both machines:
| Metric | Value |
|---|---|
| Total tokens | ~1,268,000,000 |
| Cache read tokens | ~1,228,000,000 (97%) |
| Cache write tokens | ~36,500,000 (3%) |
| Output tokens (actual code) | ~2,600,000 (0.2%) |
| API calls | 14,145+ |
97% of tokens were cache reads — the AI re-reading the same codebase context thousands of times as it iterated. Only 2.6 million tokens were actual generated output. The rest was the AI maintaining its mental model of a growing codebase across 14,000+ API calls.
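The percentages in the breakdown can be reproduced from the approximate counts. A rough sketch using the rounded figures from the table, so the results are approximate too; the variable names are illustrative:

```python
# Approximate token counts from the breakdown table above.
TOTAL_TOKENS = 1_268_000_000
TOKENS = {
    "cache_read": 1_228_000_000,
    "cache_write": 36_500_000,
    "output": 2_600_000,
}

# Share of the total for each category, as a percentage.
shares = {kind: round(100 * count / TOTAL_TOKENS, 1) for kind, count in TOKENS.items()}

# How many cached tokens were re-read for every token the AI generated.
reads_per_output_token = round(TOKENS["cache_read"] / TOKENS["output"])

for kind, pct in shares.items():
    print(f"{kind}: {pct}%")
print(f"cache reads per output token: {reads_per_output_token}")
```

The last figure is the one I find most striking: on these numbers the AI re-read several hundred cached tokens for each token it wrote.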
To put 1.27 billion tokens in perspective, that’s roughly equivalent to reading 3,800 novels. The AI effectively read the codebase cover-to-cover thousands of times during development.
What This Would Have Cost on the API
At pay-as-you-go API pricing (Claude Opus 4.6 on OpenRouter: $5 per million input tokens, $25 per million output tokens), 1.27 billion tokens would have run up a serious bill. In an agentic coding workflow, input tokens dominate heavily, because the AI reads far more code than it writes. The table shows three assumed input/output splits, with 80/20 as the central case:
| Split (in/out) | Input Cost | Output Cost | Total |
|---|---|---|---|
| 90/10 | $5,715 | $3,175 | ~$8,890 |
| 80/20 | $5,080 | $6,350 | ~$11,430 |
| 70/30 | $4,445 | $9,525 | ~$13,970 |
The most likely estimate: ~$11,000 for 21 days of development.
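The cost table is straightforward to recompute. A minimal sketch using the prices quoted above; the input/output splits are assumptions rather than measured values, and `api_cost` is a made-up helper name:

```python
# Prices quoted in the text (USD per million tokens).
INPUT_PRICE_PER_MILLION = 5
OUTPUT_PRICE_PER_MILLION = 25
TOTAL_TOKENS = 1_270_000_000


def api_cost(total_tokens: int, input_percent: int) -> dict:
    """Estimate cost for an assumed input/output split (hypothetical helper)."""
    input_tokens = total_tokens * input_percent // 100
    output_tokens = total_tokens - input_tokens
    input_cost = input_tokens * INPUT_PRICE_PER_MILLION / 1_000_000
    output_cost = output_tokens * OUTPUT_PRICE_PER_MILLION / 1_000_000
    return {"input": input_cost, "output": output_cost, "total": input_cost + output_cost}


# The three assumed splits shown in the table.
estimates = {pct: api_cost(TOTAL_TOKENS, pct) for pct in (90, 80, 70)}

for pct, cost in estimates.items():
    print(f"{pct}/{100 - pct}: ${cost['input']:,.0f} + ${cost['output']:,.0f} = ${cost['total']:,.0f}")
```

Because output tokens cost five times as much as input tokens here, the total is very sensitive to the assumed split.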
Several factors could shift this significantly. Prompt caching — if sessions reused large system prompts — can drop input costs by 90%. The batch API cuts everything by 50% for non-interactive work. And if any of those tokens included extended thinking (billed as output at $25/M), the output share climbs and the bill could push toward $15–20K+.
I did this on a Max Plan — a flat monthly subscription. The entire 21-day project cost a single month’s fee. At API rates, the same work would have cost roughly 50–70x more. The Max Plan doesn’t just change the economics of vibe coding — it makes this style of development viable in the first place. Without flat-rate pricing, you’d self-censor every “let it run for 8 hours” session. The moment you start watching the meter, you stop letting the AI work autonomously, and that’s where most of the value is.
Context Window Pressure
The AI context window peaked at 168K tokens — approaching the ~200K limit. Seven sessions hit this ceiling. As the codebase grew, the conversation history competed with the source code for context space. The following data is from Machine 2 through Day 17:
| Context Size | API Calls | Share |
|---|---|---|
| Under 50K tokens | 1,163 | 25% |
| 50-100K tokens | 1,635 | 35% |
| 100-150K tokens | 1,339 | 29% |
| Over 150K tokens (near ceiling) | 475 | 10% |
Most calls used 50-150K of context. The sessions that hit the ceiling were all long feature-implementation sessions where the AI was working with both the full codebase and a deep conversation history.
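The shares follow directly from the call counts. A small sketch, assuming the four buckets in the table cover every logged call on Machine 2; the bucket keys are illustrative names:

```python
# API call counts per context-size bucket, from the table above.
CALLS_BY_CONTEXT = {
    "under_50k": 1_163,
    "50k_to_100k": 1_635,
    "100k_to_150k": 1_339,
    "over_150k": 475,
}

total_calls = sum(CALLS_BY_CONTEXT.values())

# Percentage of calls in each bucket, to one decimal place.
shares = {bucket: round(100 * calls / total_calls, 1) for bucket, calls in CALLS_BY_CONTEXT.items()}

# Combined share of the two middle buckets (50K to 150K tokens).
mid_range_calls = CALLS_BY_CONTEXT["50k_to_100k"] + CALLS_BY_CONTEXT["100k_to_150k"]
mid_range_share = round(100 * mid_range_calls / total_calls, 1)

print(total_calls, shares, mid_range_share)
```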
On Day 19 (March 18 at ~10am), Claude’s context window expanded from ~200K to 1 million tokens. The timing was remarkable — the rename to Clusterfudge (PR #90) landed just before the switch, and then the floodgates opened: blog posts, multi-AI provider support, MIT license, OSS sync, CalVer release workflow, welcome screen redesign, website audit, and mobile fixes all shipped in the hours after. The ceiling that seven sessions had hit in the first 17 days simply stopped existing.
The efficiency gain shows up clearly in the data. Commits per API call — a rough measure of how much work each AI round-trip produced — jumped dramatically:
| Day | Commits | API Calls | Commits/Call |
|---|---|---|---|
| Mar 6 (pre-1M) | 43 | 801 | 0.054 |
| Mar 8 (pre-1M) | 18 | 532 | 0.034 |
| Mar 10 (pre-1M) | 26 | 1,285 | 0.020 |
| Mar 18 (1M arrives) | 52 | 768 | 0.068 |
| Mar 19 (post-1M) | 49 | 233 | 0.210 |
| Mar 20 (post-1M) | 70 | 123 | 0.569 |
March 20 was roughly 28x more efficient than March 10 in commits per API call (0.569 versus 0.020): 70 commits and 19 merged PRs from just 123 API calls, with a 128K-line codebase. The nature of the work shifted too (big features → smaller polish PRs), so it’s not a clean comparison, but the trend is hard to ignore. With 5x the context headroom, the AI spent less time re-reading and more time shipping.
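The commits-per-call column and the March 10 to March 20 comparison can be recomputed from the table. A short sketch; the tuple layout is just a convenient assumption for illustration:

```python
# (day, commits, API calls) rows from the table above.
DAILY = [
    ("Mar 6", 43, 801),
    ("Mar 8", 18, 532),
    ("Mar 10", 26, 1_285),
    ("Mar 18", 52, 768),
    ("Mar 19", 49, 233),
    ("Mar 20", 70, 123),
]

# Commits produced per API call on each day.
commits_per_call = {day: commits / calls for day, commits, calls in DAILY}

# How many times more commits per call March 20 produced than March 10.
speedup = commits_per_call["Mar 20"] / commits_per_call["Mar 10"]

for day, ratio in commits_per_call.items():
    print(f"{day}: {ratio:.3f}")
print(f"Mar 20 vs Mar 10: {speedup:.1f}x")
```

Commits per call is a crude proxy, since a commit on a polish day is much smaller than a commit on a foundation day.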
What I’d Do Differently (Tips for Vibe Coding at Scale)
Write phase documents first
The phased approach was the single best decision. Nine documents describing what each phase should contain, implemented sequentially. The AI had clear scope, clear deliverables, and could work autonomously for hours. Without phase docs, I’d have been micromanaging every feature.
Use audit documents as task lists
Three times during development I had the AI audit its own work — scanning for placeholder data, broken routes, dead code, security issues. Each audit produced a markdown checklist. Then I pointed the AI at the audit document and said “fix everything.” Self-auditing is one of the most powerful vibe coding patterns.
Build, then QA by actually using the app
Don’t try to get it right the first time. Build fast, then test the running application yourself. The prompts that produced the best results were the ones describing real UX problems I found by using the thing: “the spinner never stops”, “I can’t drag the window”, “this dropdown needs styling.” The AI is better at fixing problems you can describe than predicting problems you can’t.
Keep prompts short
My median prompt was 12 words. The most effective prompts were one-sentence directives. The AI doesn’t need context — it has the codebase. It needs direction. “Make every table column sortable. Commit to a branch and open a PR” is a perfect prompt. It’s a complete work order in two sentences.
Let it run
Some of my sessions ran 8+ hours. The temptation is to check in constantly. Don’t. Let it work. Check when you see a PR notification. The “fire and forget” pattern produced the most code per prompt of any approach.
Use the interrupt
When the AI starts heading the wrong direction, interrupt immediately. Don’t wait for it to finish a wrong approach. I interrupted 21 times on one machine alone. It’s not rude — it’s efficient.
Embrace the build-then-fix loop
30% of PRs were bug fixes. That’s not a failure rate — it’s the process. Build at 70%, QA it yourself, fix the remaining 30%. The cost of iteration has collapsed. Perfectionism on the first pass is waste.
Don’t read the code
This sounds counterintuitive. But the moment you start reading source code, you’re doing the AI’s job. Your job is to use the application, describe what’s wrong, and make product decisions. If you can test it, you can fix it — without ever opening a file.
Days 18–21: From App to Product
The first 17 days built an application. The next four turned it into a product. 176 commits. 40 pull requests. 21 additional AI sessions logging 111 million tokens across 1,124 API calls. The work shifted from features to everything around features — naming, licensing, release engineering, website polish, and launch readiness.
The Rename (Day 19)
KubeViewer officially became Clusterfudge. One prompt. PR #90 touched 140 files — every import path, every config reference, every UI string. The AI did a clean rename across the entire monorepo without breaking a single test. The brainstorming session from Day 12 finally paid off.
Going Open Source (Day 19)
Three PRs in rapid succession: add an MIT license (#93), build an OSS sync workflow to push a clean public repo (#94), and move internal GTM documents out of the public directory. The AI separated private strategy docs from public source code, set up an orphan branch sync to a separate GitHub repo, and wired it all into CI. Open-sourcing a project is exactly the kind of tedious, error-prone work that AI handles perfectly — lots of file moves, config changes, and workflow YAML that needs to be exactly right.
Release Engineering (Days 19–21)
This was the most surprising productivity gain. In three days, the AI:
- Built a CalVer release workflow that auto-generates changelogs by prompting itself to summarise the diff (#95)
- Configured self-hosted macOS runners and fixed Go toolchain PATH issues
- Added Linux cross-compilation and APT repository publishing to the release pipeline
- Wired an update checker into the title bar so users see new versions (#127, #128)
- Cut four tagged releases (v2026.0319.1034, v2026.0319.1610, v2026.0319.1825, v2026.0320.2328)
Release engineering is traditionally the work nobody wants to do. Workflow YAML, signing, packaging, distribution. The AI treated it the same as any other task — read the docs, write the config, iterate until CI passes.
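The release tags listed above look like they encode the build time as vYYYY.MMDD.HHMM. That reading is an inference from the four tag names, not something taken from the release workflow itself, and the helper names below are hypothetical. Under that assumption, generating and ordering tags might look like this:

```python
from datetime import datetime

# Assumed tag layout, inferred from tags such as v2026.0319.1034:
# "v" + year + "." + month and day + "." + hour and minute.
TAG_FORMAT = "v%Y.%m%d.%H%M"


def calver_tag(moment: datetime) -> str:
    """Format a timestamp as a release tag (hypothetical helper)."""
    return moment.strftime(TAG_FORMAT)


def parse_calver_tag(tag: str) -> datetime:
    """Recover the timestamp encoded in a release tag (hypothetical helper)."""
    return datetime.strptime(tag, TAG_FORMAT)


# The four tags from the list above, deliberately out of order.
tags = ["v2026.0320.2328", "v2026.0319.1610", "v2026.0319.1034", "v2026.0319.1825"]
ordered = sorted(tags, key=parse_calver_tag)

print(calver_tag(datetime(2026, 3, 19, 10, 34)))
print(ordered)
```

Because every field is zero-padded to a fixed width, a plain string sort would give the same order, which is one practical reason to prefer this kind of tag.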
Website Polish (Days 19–21)
The brochure site got dark/light mode with system theme inheritance, mobile responsiveness fixes, a demo GIF, copy-to-clipboard on the install command, and an interactive canvas hero background. Eleven PRs of pure front-end polish across the Astro site — the kind of work that makes a product feel real.
The Biggest Day (Day 21)
March 20 was the single busiest day of the entire project: 70 commits and 19 merged PRs. Launch prep creates a long tail of small fixes — a broken dark mode icon, a missing cursor pointer on a button, a namespace default that should be “all” instead of “default.” Each fix was its own branch, its own PR, its own merge. The AI handled the volume without slowing down.
The day ended with the start of the next feature: port forwarding. PR #126 planned the architecture. PR #130 built the dialog and wired it into the pod list. Even on launch day, the AI was already building the next thing.
Multi-AI Provider Support (Day 19)
The built-in AI debugging terminal was hard-coded to one provider. One prompt turned it into a pluggable system supporting multiple AI backends, plus added a local terminal mode for users who want to run their own tools. PR #92 — 573 insertions across 19 files. The AI refactored its own AI integration.
Blog Posts Written by AI (Days 19–20)
The meta moment: I pointed the AI at the codebase and asked it to write blog posts about the features it had built. PR #91 produced “Building a Pod Security Scanner You Actually Use.” PR #107 produced “How the Troubleshoot Engine Turns Status into Diagnosis.” The AI wrote marketing content about code it had written, for a product it had built. Turtles all the way down.
The Uncomfortable Truth
I built a 74,000-line, full-stack Kubernetes desktop application with a brochure website, release pipeline, and open-source distribution in 21 days. Every line AI-generated. 129 pull requests, all merged. Three self-audits. Security hardening. Unit tests for 44 components. Dead code cleanup. Four tagged releases. An APT repository. Auto-update notifications.
The uncomfortable truth isn’t that this was possible. It’s that this was easy. The hardest parts were product decisions — what to build, what to cut, what to name it. The engineering was the cheap part.
A year ago, this project would have taken a small team several months. Today, one person with AI tools and product instinct does it between checking emails. The leverage is absurd.
If you’re an engineer, the implication is clear: your value is in knowing what to build, not how to build it. Taste, product thinking, and the willingness to QA your own work are the skills that matter. The code writes itself.
If you’re a founder, the implication is bigger: the cost of building just dropped by an order of magnitude. The bottleneck is no longer “can we afford to build this?” It’s “should we build this?” That’s a taste question, not an engineering one.
1,025+ prompts. 21 days. 74,000 lines. 129 pull requests. 1.27 billion tokens. Four production releases. And the name? Clusterfudge. It made the cut after all.
Raw Data
Codebase Growth
How the codebase grew day by day. The initial import was templates and design system files; the real build started March 4.
Net Lines of Code
0 25K 50K 75K 100K 125K
|---------|---------|---------|---------|---------|
Feb 28 ██████████▏ 27K
Mar 1 ████████████▎ 35K
Mar 2 █████████████▊ 39K
Mar 3 ███████████████▏ 43K
Mar 4 ████████████████████████████████▊ 80K ← Phase 1-9
Mar 5 ████████████████████████████████▉ 81K
Mar 6 ███████████████████████████████████▍ 87K
Mar 7 █████████████████████████████████████████▊ 102K ← Peak commits, first 17 days (48)
Mar 8 ██████████████████████████████████████████▏ 103K
Mar 9 ██████████████████████████████████████████▍ 104K
Mar 10 ██████████████████████████████████████████▏ 103K ← Dead code cleanup
Mar 11 ██████████████████████████████████████████▋ 105K
Mar 12 ██████████████████████████████████████████▊ 105K
Mar 15 █████████████████████████████████████████████████▏ 114K ← Brochure site
Mar 16 █████████████████████████████████████████████████▍ 116K
Mar 17 █████████████████████████████████████████████████▎ 117K
Mar 18 █████████████████████████████████████████████████▎ 117K ← Rename + OSS
Mar 19 ████████████████████████████████████████████████████▉ 125K
Mar 20 ██████████████████████████████████████████████████████ 128K ← Peak PRs (19)
Mar 21 ██████████████████████████████████████████████████████ 128K
Context Window Pressure Over Time
As the codebase grew, the AI needed more context to hold it in memory. Average context per API call tracked codebase size closely. The ~200K token limit created a hard ceiling that sessions started hitting once the codebase crossed ~80K lines.
Avg Context (tokens) Codebase
0 25K 50K 75K 100K 125K 150K 175K
|------|------|------|------|------|------|------|
Mar 2 ██████████████▍ 39K lines
Mar 5 ██████████████████▊ 81K lines
Mar 6 █████████████████████▎ 87K lines
Mar 7 █████████████████▍ 102K lines
Mar 8 █████████████████████▎ 103K lines
Mar 9 ████████████████████▉ 104K lines
Mar 10 ██████████████████████▉ ← peak avg context 103K lines
Mar 11 ████████████▋ ← short sessions only 105K lines
Mar 16 ███████████████████▋ 116K lines
|------|------|------|------|------|------|------|
▲ peak: 168K tokens (7 sessions hit ceiling)
Daily Activity
Commits, API calls (per machine), and output tokens by day. M1 and M2 refer to the two development machines.
| Date | Commits | API Calls (M2) | API Calls (M1) | Output Tokens | Codebase |
|---|---|---|---|---|---|
| Feb 28 | 1 | — | — | — | 27K |
| Mar 1 | 7 | — | — | — | 35K |
| Mar 2 | 7 | 106 | — | 12K | 39K |
| Mar 3 | 5 | — | — | — | 43K |
| Mar 4 | 35 | — | — | — | 80K |
| Mar 5 | 24 | 343 | — | 47K | 81K |
| Mar 6 | 43 | 801 | — | 163K | 87K |
| Mar 7 | 48 | 295 | — | 54K | 102K |
| Mar 8 | 18 | 532 | — | 104K | 103K |
| Mar 9 | 11 | 324 | — | 83K | 104K |
| Mar 10 | 26 | 1,285 | — | 214K | 103K |
| Mar 11 | 9 | 159 | — | 25K | 105K |
| Mar 12 | 3 | 4 | — | 1K | 105K |
| Mar 15 | 1 | — | — | — | 114K |
| Mar 16 | 43 | 794 | 1,506 | 117K | 116K |
| Mar 17 | 5 | — | — | — | 117K |
| Mar 18 | 52 | — | 768 | — | 117K |
| Mar 19 | 49 | — | 233 | — | 125K |
| Mar 20 | 70 | — | 123 | — | 128K |
| Mar 21 | 4 | — | — | — | 128K |
API calls and token data are from Machine 2 through Mar 16, and Machine 1 from Mar 16 onward. Machine 1 logged 40 sessions across Days 17–21 with 2,630 API calls and 243 million tokens. Some days have commits but no API call data because those sessions ran on a machine without detailed logging.
The Full PR List
Every pull request, in order. All 129 merged. Zero rejected. PR numbers run to #133 because four of them (#110, #117, #131, #132) were either closed or still open; the rest shipped.
| PR | Lines Changed | Files | What It Did |
|---|---|---|---|
| #1 | +4,204 | 9 | UI templates with demo data |
| #2 | +1,790/-55 | 40 | Collapsible sidebar, 18 resource pages |
| #3 | +681/-12 | 9 | Table header hover, hex grid gaps |
| #4 | +1,224 | 1 | Phase 9 spec document |
| #5 | +139/-139 | 5 | Rename frontend/ to ui/ |
| #6-#8 | +5,942/-22 | 44 | Phase 1: Go backend foundation |
| #9 | +9,479/-1,901 | 27 | Phase 2: Project setup |
| #10 | +7,254/-71 | 35 | Phase 3: Core Kubernetes backend |
| #11 | +5,569/-369 | 50 | Phase 4: Frontend shell |
| #12 | +5,956/-205 | 99 | Phase 5: Resource views |
| #13 | +2,456/-120 | 28 | Phase 6: Log streaming, exec terminal |
| #14 | +2,950/-59 | 32 | Phase 7: Helm, YAML editor |
| #15 | +3,050/-713 | 46 | Phase 8: Polish, packaging |
| #16 | +4,968 | 64 | Phase 9: Competitive features |
| #17-#20 | +1,352/-531 | 64 | Review feedback, CI fixes |
| #21 | +3,190/-2,441 | 79 | Replace all placeholder data |
| #22-#28 | +2,237/-1,541 | 90 | Audit fixes, security hardening |
| #29-#35 | +2,763/-477 | 83 | ESLint, settings wiring, build |
| #36 | +151/-157 | 33 | Make all columns sortable |
| #37-#43 | +2,270/-247 | 76 | Error handling, hex grid, no-explicit-any |
| #44-#45 | +4,818/-6 | 50 | Unit tests (44 components + 4 handlers) |
| #46-#55 | +10,139/-583 | 98 | Feature parity: secrets, CRDs, port forwarding, Helm repos |
| #56-#60 | +2,652/-245 | 52 | Competitor analysis, resizable panels, welcome redesign |
| #61-#67 | +1,363/-3,677 | 45 | Cursor fixes, beta toggle, dead code removal |
| #68-#74 | +993/-357 | 40 | Documentation, audit items, backups |
| #75 | +0/-0 | 1 | Fix Cmd+Tab icon size |
| #76-#79 | +2,110/-54 | 33 | AI debugging terminal |
| #80-#86 | +10,685/-419 | 681 | Wire placeholder pages, brochure site, monorepo |
| #87 | +2,796/-2,738 | 23 | Vibe code stats analysis |
| #88 | +664/-397 | 19 | Bottom tray redesign as dock with pod picker |
| #89 | +169/-50 | 10 | Fix terminal resize, exec pipe error, log timestamps |
| #90 | +724/-707 | 140 | Rename KubeViewer → Clusterfudge |
| #91 | +325/-238 | 13 | Blog post: Pod Security Scanner |
| #92 | +573/-262 | 19 | Multi-AI provider support, local terminal |
| #93-#94 | +472/-693 | 49 | MIT license, OSS sync workflow |
| #95 | +938/-4 | 7 | CalVer release workflow |
| #96 | +808/-161 | 11 | Welcome screen redesign |
| #97-#99 | +4,762/-7,024 | 26 | Website audit fixes, mobile responsiveness |
| #100-#101 | +8,286/-8,415 | 23 | Demo hero image, website cleanup |
| #102 | +208/-66 | 27 | Light mode audit across full UI |
| #103-#106 | +66/-34 | 12 | PATH fix, binary size copy, version refs, blog date |
| #107-#108 | +129/-234 | 7 | Blog post: Troubleshoot Engine, docs review |
| #109, #111-#116 | +1,227/-1,786 | 52 | Restore blog posts, README overhaul, Lens comparison, mobile fixes |
| #118-#119 | +12/-12 | 3 | Dark mode icon fix, cursor theme toggle |
| #120 | +1,132/-3 | 12 | Demo cluster setup with Kind + Podman |
| #121-#123 | +36/-1,156 | 19 | Beta nav fix, default all namespaces, hero text |
| #124-#125 | +108/-25 | 6 | Update asset names, AWS CLI PATH fix |
| #126 | +390/-57 | 6 | Port forwarding architecture plan |
| #127-#129 | +185/-366 | 8 | Update checker fix, title bar notification, release v2026.0320 |
| #130 | +338/-2 | 4 | Port forward dialog |
| #133 | +1/-1 | 1 | Lint rule strengthening |
The “s-curve” is visible. Early PRs were massive foundation work (+9,479 lines). Middle PRs were targeted fixes (+151 lines to make columns sortable). Late PRs swung big again for the monorepo restructure and brochure site (+8,769 lines in PR #82 alone). The final stretch (Days 18–21) shifted to product work (rename, OSS, releases, website polish) with bursts of small, focused PRs. March 20 alone produced 19 merged PRs and 70 commits, the single busiest day of the project.