Now It's Easier to Fix the Problem Than Talk Yourself Out of Caring

A GitHub screenshot with a button marked "Heck yeah you should file that PR"

My Rust bona fides are nil. I hosted a Rustaceans meetup at the old Replicated HQ in 2019, have politely kiboshed every “maybe we should rewrite this in Rust” pitch I’ve heard as a manager & lead, and have never shipped a line of Rust in my life. But last night I filed a feature request on a Rust project, and this morning I submitted an 824-line PR to fix it.

There are at least five moments along the way where, in the before times, I would have closed the tab and just moved on with my life.

The socks

It started on Friday in Pasadena. The SoCal Linux Expo is my high school reunion, but instead of my high school classmates, it’s every LA ex-colleague who has ever had an Ops in their title. I caught some talks, saw some of my favorite people, then hit the expo hall. I wasn’t looking for swag, but the black socks with rainbow stripes caught my eye. Any-LLM… I had literally just heard of this project the previous week, on the excellent Tool Use podcast. These folks might have some cool tips on wrangling AI agents, I thought, and indeed they did.

Any-LLM is one of several Mozilla AI projects they were showing off. The driving concern of their org: support the open options in every AI tool category. CEO John Dickerson explained that of course he’s using Claude Code; they can’t NOT use the proprietary frontier models and harnesses. Opus and Codex are just too good. But they are working hard to keep open-weight models and open-source tools viable. Any-LLM is an SDK adapter layer that lets apps avoid lock-in on any single model. Any-Agent decouples agent frameworks. Octonous looks like an open alternative to n8n. See also: any-guardrail, mcpd, llamafile, and encoderfile.

All very cool. But the thing that got my attention was the Agent of Empires tool for juggling CLI sessions. Not an official Mozilla project, but a side project from Nathan Brake, who would be at the booth later.

I have been multi-tabling Claude Code instances for a few months now. I am not yet ready to go full Gastown, or to build my own dark factory, but I am definitely in the market for whatever comes right after nine tabs of Ghostty plus a Glass.aiff ding when a tab needs my input. I went home, brew-installed ‘aoe’, and went to bed.

Saturday, I returned to SCALE and found Nathan at the booth, and he explained what he’s doing with Agent of Empires. It’s not an opinionated orchestrator, but rather a tool that makes it feasible for one person to directly oversee and manage a screen full of CLI sessions. It has built-in support for launching agents in Docker sandboxes, and uses tmux (and its familiar keybindings) to manage the sessions.

Good enough for me. Lots of smart ergonomics throughout — especially the feature that lets you toggle any session slot between the Claude/agent session and a bare terminal in that same folder.

The confession

Claude isn’t magic until you run claude --dangerously-skip-permissions. It’s not an agent loop if you’re getting prompted all the time, and if you stop and read the prompts, the requests are often dumb dead ends. It’s only when you let your agent get relentless that you start seeing impressive outcomes. But you really ought to keep your relentless agents far from your GitHub org-owner privileges and AWS access keys.

I will confess that I had not been 100% buttoned up when it came to keeping Claude on a leash.

I’ve been lucky. No disasters. A few embarrassments where Claude pushed straight to main, with code I hadn’t yet tested. Nothing catastrophic, but a growing sense of dread. Fine-grained tokens and sandboxes have been on the to-do list for a while, and this was the serendipitous nudge to get it done.

Agent of Empires would make sandboxing convenient enough to actually do. Each session can launch in a Docker sandbox container with its own environment variables. I can use fine-grained GitHub PATs scoped to individual orgs, managed through 1Password’s op CLI, injected into sandboxed sessions through aoe profiles.

Time to start YOLOing responsibly.

The wall

After setting up four org-scoped profiles (three client projects, and one of my own projects), I launched a session and saw nothing. The session was created, but it was nowhere onscreen.

This is nope-ramp number 1 (an off-ramp from this whole journey, where I could easily have said nah, taken the easy exit, and gone on with everything else in my life). 2024 Frank might have skimmed the docs, googled a bit, maybe filed an issue, and most likely just uninstalled and moved on. Instead, I pointed Claude at the aoe repo and asked it to debug.

Claude traced the problem to app.rs, line 383:

let instance = match self.home.get_instance(session_id) {
    Some(inst) => inst.clone(),
    None => return Ok(()),  // silently drops cross-profile sessions
};

The TUI only loads instances from the active profile. When you create a session in profile B while viewing profile A, the creation succeeds but the attach silently fails. Your session is invisible.

That’s a bug. But the deeper problem is a design assumption: profiles couple configuration isolation with view isolation. If you have four profiles, you have four separate screens. You toggle between them with Shift-P. There’s no single-pane view of all your sessions.
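The silent attach failure described above can be reproduced in miniature. This is only a sketch: Instance, Home, Home::load, attach_silently, and attach_or_report are hypothetical stand-ins, not aoe's real types or functions. It assumes, as described, that the view loads only the active profile's sessions, and it contrasts the quoted early return with a lookup that reports the miss.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a session record; not aoe's real type.
#[derive(Clone, Debug, PartialEq)]
struct Instance {
    id: String,
    profile: String,
}

// Hypothetical view state that holds only the active profile's sessions.
struct Home {
    active_profile: String,
    instances: HashMap<String, Instance>,
}

impl Home {
    // Load sessions given as (session_id, profile) pairs, keeping only
    // those that belong to the active profile.
    fn load(active_profile: &str, all_sessions: &[(&str, &str)]) -> Self {
        let instances = all_sessions
            .iter()
            .filter(|(_, profile)| *profile == active_profile)
            .map(|(id, profile)| {
                let instance = Instance {
                    id: id.to_string(),
                    profile: profile.to_string(),
                };
                (id.to_string(), instance)
            })
            .collect();
        Home {
            active_profile: active_profile.to_string(),
            instances,
        }
    }

    fn get_instance(&self, session_id: &str) -> Option<&Instance> {
        self.instances.get(session_id)
    }
}

// Same shape as the quoted snippet: a missing session is reported as
// success, so the caller cannot tell that nothing was attached.
fn attach_silently(home: &Home, session_id: &str) -> Result<Option<Instance>, String> {
    match home.get_instance(session_id) {
        Some(inst) => Ok(Some(inst.clone())),
        None => Ok(None),
    }
}

// One possible alternative: turn the miss into an error the UI can show.
fn attach_or_report(home: &Home, session_id: &str) -> Result<Instance, String> {
    home.get_instance(session_id).cloned().ok_or_else(|| {
        format!(
            "session {session_id} is not in the active profile {}",
            home.active_profile
        )
    })
}
```

With s1 in profile client-a and s2 in client-b, loading client-a and attaching s2 yields Ok(None) from attach_silently and an Err from attach_or_report.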

I asked Claude to confirm: is there any way to see all sessions from all profiles on one screen?

No, but here are some workarounds:

Stuff everything into one profile and manage env vars manually? No.
Stick aoe config files with personal settings into every project? Hell no.
Use the ‘P’ shortcut to cycle between profile screens constantly, with no single overview of all the sessions? Why would I ever.

A fork in the road

Nope-ramp number 2: It’s a bug, not a misunderstanding. Do I fork and fix this for myself, file an issue and see what happens, or just walk away from this promising-looking tool that’s just not yet ready for me?

Let’s have Claude file an issue and see what happens! I prompted Claude to include lots of specific details to make things easily reproducible, and filed one for the silent attach bug, and another for the missing unified view. I included details about my use case: least-privilege credential isolation across orgs, but all sessions visible from one screen. Without an agent-helper, this would have taken time and effort.

I went to bed.

When I woke up, Nathan had already responded. “That’s a great idea. Would you like to file a PR or shall I address it?”

Nope-ramp number 3: In another, not-long-ago era, I could have gotten hung up on this point. Why shouldn’t I dive in and fix it? I have always wanted to learn Rust, and this might be that moment. But this is a distraction, and perhaps I should just let the maintainers decide whether and how to fix it. Decisions, decisions! In 2026, the cognitive load is so much lighter: just sic Claude on the problem and come back in an hour. If Claude has a fix, the PR is the answer; if not, I say “thanks, I’ll let you handle it” and move on. Knowing Rust was not even a consideration.

The build

I used a suite of Claude Code ‘superpowers’ skills to drive the work. Brainstorming presented three design approaches. I picked the one that makes the unified view the default rather than hiding it behind a flag. Settings and toggles are for the mild-at-heart.

The implementation plan came out to 12 tasks. Claude worked through it in batches of three tasks with checkpoints. TDD throughout: write the failing test, implement the fix, verify green. The core change replaced a single Storage with a HashMap<String, Storage> and added collapsible profile headers for visual grouping.
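To make the shape of that change concrete, here is a minimal sketch of a per-profile HashMap<String, Storage> rendered as one list with collapsible headers. Everything beyond the HashMap<String, Storage> type itself is an assumption for illustration: the Storage fields, UnifiedHome, its methods, and the "+"/"-" markers are hypothetical, not the code in the PR.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical per-profile store; the real Storage holds far more.
#[derive(Default)]
struct Storage {
    sessions: Vec<String>,
}

// Hypothetical unified view: one Storage per profile name, plus the
// set of profile headers that are currently collapsed.
struct UnifiedHome {
    storages: HashMap<String, Storage>,
    collapsed: HashSet<String>,
}

impl UnifiedHome {
    fn new() -> Self {
        UnifiedHome {
            storages: HashMap::new(),
            collapsed: HashSet::new(),
        }
    }

    // Add a session under its profile, creating the profile's Storage
    // on first use.
    fn add_session(&mut self, profile: &str, title: &str) {
        self.storages
            .entry(profile.to_string())
            .or_default()
            .sessions
            .push(title.to_string());
    }

    // Flip a profile header between collapsed and expanded.
    fn toggle_collapsed(&mut self, profile: &str) {
        if !self.collapsed.remove(profile) {
            self.collapsed.insert(profile.to_string());
        }
    }

    // Flatten every profile into display rows: a header per profile
    // (sorted by name, since HashMap order is unspecified), followed by
    // that profile's sessions unless the header is collapsed.
    fn rows(&self) -> Vec<String> {
        let mut profiles: Vec<&String> = self.storages.keys().collect();
        profiles.sort();

        let mut rows = Vec::new();
        for profile in profiles {
            let is_collapsed = self.collapsed.contains(profile);
            let marker = if is_collapsed { "+" } else { "-" };
            rows.push(format!("{marker} {profile}"));
            if !is_collapsed {
                for session in &self.storages[profile].sessions {
                    rows.push(format!("    {session}"));
                }
            }
        }
        rows
    }
}
```

The point of the sketch is the decoupling: each profile keeps its own Storage (and so its own configuration), while rows() gives a single screen's worth of sessions across all of them.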

Nope-ramp number 4: the Rust toolchain. My installed version was rusty and needed an update. Cargo wasn’t even on my PATH because of a stale Homebrew symlink. None of these sorts of issues mean anything anymore when Claude can just chew through them and fix them without comment.

Final count: 824 additions, 185 deletions. 841 unit tests, 7 integration tests, 21 end-to-end tests. All passing.

The wave

I’m not the only one doing this on this repo.

Agent of Empires is two months old. 1,077 stars. In the ten days before my PR, the project absorbed 50-plus pull requests from 15-plus external contributors.

alepar hit a tmux scope bug. Filed an issue. PR’d the fix the same day. A second PR fixed a bash 3.2 compatibility issue on macOS. One-line change. Found by Claude Code.
jerome-benoit has 12 merged PRs. His bug reports include truth tables and impact assessments across every affected file.
nirok80 wanted support for a new AI agent (pi.dev). Added the full integration. 263 additions and 5 new e2e tests.
hansonkim submitted a PR that the maintainer closed in favor of a different approach. His response: “Writing code isn’t the hard part. Setting the right direction and approaching strategically is what matters.”

The project’s PR template includes an explicit AI disclosure section with checkboxes: “No AI was used,” “AI was used for drafting/refactoring,” and “This is fully AI-generated.” Multiple external PRs check the last box. Claude Code, Opus 4.6. The PR descriptions are delightfully clear and thorough.

What changed

The open source contribution funnel used to be so leaky. You find a tool, but have to consider your feelings about the language it’s written in. You hit a problem. You check if there’s already an issue. How long do issues stay open in this repo? Is it worth filing a PR? Will the PR ever get merged? At each of these steps, you have to wonder if it’s worth the effort.

Reasonable people have reasonable reasons to drop off.

AI agents can handle the coding now, but they have also changed the decision math at every point along the path. The mere presence of a mostly-working new tool inspired me to dive in and set up my 1Password CLI integration (which is delightfully op), to finally set up a dotfile repo with chezmoi (not fully convinced yet, but what’s the worst that can happen), to file an issue AND a PR, and then finally, the last nope-ramp: to write a blog post about this experience.

Time for another confession. I built a Claude ‘blog-drafter’ skill to parse my session transcripts and draft a narrative. The output was so dreadful, I had to rewrite it by hand, but it was motivating! I doubt I would have taken the time to write this if I hadn’t first imagined I could lean on an agent to carry me.

I wrote zero blog posts last year and made zero OSS contributions. This week I did one of each. What’s changed? Now I have a mental prosthesis that keeps pushing me to say “yes and” instead of agonizing over “why not”.

The setup

Agent of Empires (aoe): manages multiple agent-coding sessions via tmux, with per-session Docker sandboxes and profile-based configuration. Installed from njbrake/agent-of-empires.

Profile structure (in ~/.agent-of-empires/):

Four org-scoped profiles, each with sandbox_enabled = true, yolo mode on, and an org-specific GH_PAT_* environment variable
Global config uses aoe-dev-sandbox as the default Docker image

Credential management:

Fine-grained GitHub PATs stored in 1Password, one per org, scoped to minimum required permissions
gh-pat-rotate script pulls fresh tokens from 1Password via op read and writes them to ~/.bashrc.d/sandbox-env.sh
Chezmoi manages all dotfiles and scripts through a fshot/dotfiles repo
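The gh-pat-rotate script itself is not shown here, so the following is only a Rust sketch of the same flow under stated assumptions: fetch each token with op read, then render export lines for a sourced env file. The function names, the shell-quoting approach, and any secret references or variable names passed in are hypothetical; only the op read subcommand and the GH_PAT_* naming pattern come from the setup above.

```rust
use std::io;
use std::process::Command;

// Read one secret with the 1Password CLI (`op read <reference>`).
// Requires `op` on PATH and a signed-in session; the reference string
// is supplied by the caller.
fn op_read(reference: &str) -> io::Result<String> {
    let output = Command::new("op").args(["read", reference]).output()?;
    if !output.status.success() {
        let message = String::from_utf8_lossy(&output.stderr).into_owned();
        return Err(io::Error::new(io::ErrorKind::Other, message));
    }
    Ok(String::from_utf8_lossy(&output.stdout).trim().to_string())
}

// Wrap a value in single quotes for a POSIX shell, escaping any
// embedded single quote as '\''.
fn shell_quote(value: &str) -> String {
    format!("'{}'", value.replace('\'', r"'\''"))
}

// Render (variable name, token) pairs as `export NAME='token'` lines,
// one per line, ready to be written to a sourced env file.
fn render_env_file(tokens: &[(&str, &str)]) -> String {
    tokens
        .iter()
        .map(|(name, token)| format!("export {name}={}\n", shell_quote(token)))
        .collect()
}
```

A rotate step would call op_read once per org, pass the results to render_env_file, and write the returned string to the env file with std::fs::write. Keeping the rendering separate from the op call means the file format can be checked without touching real credentials.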

Claude Code skills used during the PR:

superpowers:systematic-debugging for the initial bug investigation
superpowers:brainstorming for the feature design
superpowers:writing-plans for the TDD implementation plan
superpowers:executing-plans for the actual build

CLAUDE.md: The aoe project has thorough agent instructions covering build commands, testing guidelines, commit conventions, and a migration system for breaking data changes. Having good CLAUDE.md files in a repo is the best thing a maintainer can do to improve AI-assisted contributions.

The PR: #427 - Unified all-profiles TUI view