You evaluate agents for a living. So I brought receipts, not adjectives.
You just shipped progressive disclosure for ~63% token savings — which tells me you care about the thing most people hand-wave past: what these models actually cost to run, per task, in the real world. That's a language I speak. So instead of telling you I'm credible, here's a working tool from my POC archive that lives in exactly your wheelhouse.
A neutral model benchmark — and I make zero dollars from which one wins.
No gateway. No routing. No resale of inference. Just commodity API keys, an honest harness, and cost computed from each response's own reported token usage × published price — never an estimate. The whole run below cost about ten cents. Poke at it:
Here's the actual pitch in one line: single-agent skills → multi-agent orchestration. Your repo nails the "what good skills look like" layer. I've been building the layer above it — a decision console that sits over multi-agent ideation runs (one run took 53 raw ideas down to 36 that cleared the gates, each with an adversarial six-dimension verdict). Skills are the instruction set; orchestration is the runtime. I think that's the interesting frontier, and I think you do too.
I build like an orchestrator, not a typist.
Spec first, then dispatch parallel agents against interface contracts, then a convergence gate before anything merges. Memory-native, so the system remembers across sessions instead of re-learning every morning. Ship fast, verify harder. The page you're reading was built this way.
Lived experience + systems thinking
I build for communities everyone else overlooks — bilingual by default, because the people I build for are. Equity data, civic infrastructure, working-shift tools. The systems lens is what turns lived experience into something that scales.
Spunky, warm, allergic to corporate
I'd rather show you a working thing than send you a deck about it. If something's a toy model, I say it's a toy model. If a number's directional, I label it directional. Trust is the only product that compounds.
Seven things I shipped this month. Plus the platform underneath.
The platform underneath it all: DojoGenesis — AgenticGateway (multi-provider routing), a CLI, and MCP servers. Real multi-agent infrastructure, not a demo reel. And the velocity is the point — here's the archive the benchmark above came from:
Where your skills end, ours begin.
I'd love to map the seam between your skills repo and our orchestration layer — find the complementary edge and see if there's a collaboration worth building. No pitch deck. Just two builders comparing notes on what comes after single-agent skills.
What a great skill looks like. The instruction set. The discipline a single agent runs on.
Orchestrating many of them — parallel dispatch, memory, convergence gates, neutral evals. The runtime.