On Bio Tokens and Agentic Thinking
Since publishing my last essay on agentic operating, the conversations it sparked have pulled me toward a deeper layer of the idea.
The operating model was one thing. The philosophy underneath it was another.
What I found myself trying to explain, again and again, was not just how these systems work, but the philosophy and mental models that led me there in the first place. This essay is a distillation of that layer.
The scarce resource in knowledge work is not time. It is biological processing capacity.
Lately I have been calling these bio tokens. Not API tokens. Human ones. Units of attention, judgment, synthesis, emotional steadiness, and cognitive energy. Each of us gets a finite daily budget. The budget changes with sleep, stress, and mental state. Some days the cap is high. Some days it collapses. But there is always a cap.
That cap matters more than most productivity systems admit.
A lot of ambitious people still behave as if the problem is scheduling. Better calendars. Better task managers. Better routines. Those things help, but they miss the deeper constraint. The real bottleneck is how many high-quality bio tokens you can deploy before your thinking degrades. Once those are gone, the day may continue, but the sharpness does not.
This is why two hours of clear thinking can outperform ten hours of fragmented motion. And it is why the most valuable systems are not the ones that help you do more things. They are the ones that help you spend your bio tokens on the right layer of the stack.
That is increasingly how I think about leverage.
Engineering Saw It First
Zed's framing around agentic engineering put language around something real. The model is not “AI replaces the engineer.” It is closer to this:
- quality responsibility stays with the human
- taste and craftsmanship still matter
- directing agents is its own craft
- leverage comes from tight feedback loops
That is exactly right. But engineering is only the first place this pattern became legible.
Why? Because engineering already had a culture of tools, automation, iteration, and explicit feedback loops. It was the right terrain for agentic work to emerge early. Developers were already used to compilers, terminals, scripts, tests, CI, and composable systems. Adding agents was a continuation of that logic.
But the underlying principle is much larger than software.
Any field where the product is judgment is governed by the same constraint. Investing. Research. Operating. Writing. Strategy. In each case, the visible output may differ, but the hidden bottleneck is the same: a finite human capacity to absorb context, hold state, compare possibilities, and make good calls.
That is the real abstraction. Agentic engineering is not the full category. It is the first clear instance of a broader shift toward agentic thinking.
The New Problem Is Not Work. It Is Token Allocation.
Once you see knowledge work through this lens, a lot of bad productivity advice starts to look shallow.
The question is not: how do I get through my inbox faster?
The question is: what am I spending irreplaceable bio tokens on that should have been converted into cheaper digital tokens?
That distinction matters.
A bio token is scarce, variable, and tightly coupled to human judgment. A digital token is cheap by comparison. It can be used for retrieval, assembly, formatting, summarization, cross-referencing, drafting, and state tracking. Digital tokens are imperfect, of course. They need constraints. They need supervision. They need taste upstream and judgment downstream. But they are abundant enough to absorb a growing share of lower-order cognitive work.
From my own experience, most leverage has come from learning how to make that conversion deliberately.
At first, the gains are small: offload note cleanup, task extraction, first-pass research, context assembly before a meeting or a writing session. Then, over time, the system becomes more ambitious. It starts carrying more of the surrounding state: what changed, what is unresolved, what evidence supports a claim, what counterarguments matter, what should be followed up tomorrow.
The point is not that I stop spending bio tokens.
The point is that I stop spending premium bio tokens on retrieval, coordination, and re-entry. I can spend them on higher-order thinking instead: framing the real question, pressure-testing the thesis, noticing what does not fit, making the call.
This is what agentic systems are really buying you when they work well. Not raw speed. Better allocation.
Delegation Is Old. Loop Closure Is New.
None of this begins with AI.
Long before models, ambitious people learned how to extend themselves by delegation. Founders hired chiefs of staff. Executives built layers of assistants and operators. Investors built research teams. Institutions created entire managerial systems to preserve executive attention for the decisions that mattered most.
Empires were built this way.
In other words, the basic insight is not new: if human judgment is scarce, protect it.
What is new is the speed at which loops can close.
Traditional delegation creates leverage, but it also creates coordination overhead. Every handoff needs framing. Every follow-up needs clarification. Every update carries latency. Every loop depends on human availability. The work moves, but the metadata around the work moves more slowly.
That metadata is where much of the hidden drag lives.
Why does this task matter? What changed since yesterday? Which source is strongest? What remains uncertain? What is blocked? What is the next best question? What prior context should shape the decision?
In older systems, that metadata was expensive to keep current. It lived in scattered documents, in assistants' heads, in status meetings, in memory, in half-written notes. A lot of executive energy was burned just reconstituting the state of the work.
Agentic systems change this because they can carry the metadata forward continuously.
They do not just help complete a task. They help preserve and update the context around the task. They can ingest the new information, compare it against prior state, mark what changed, suggest what matters, and tee up the next loop. That means the bottleneck is no longer just execution speed. It is the combined speed of execution and context propagation.
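The ingest-compare-mark-suggest loop described above can be sketched in a few lines. This is a minimal illustration, not any real tool's API: `WorkContext`, `ingest`, and the field names are all hypothetical stand-ins for what an agentic system would track.

```python
from dataclasses import dataclass, field

@dataclass
class WorkContext:
    """Illustrative container for the metadata around a task:
    current state, open questions, and a log of what changed."""
    task: str
    state: dict = field(default_factory=dict)       # current facts/beliefs
    unresolved: list = field(default_factory=list)  # open questions
    changelog: list = field(default_factory=list)   # (key, old, new) tuples

    def ingest(self, updates: dict, new_questions: list) -> None:
        """Compare new information against prior state, mark what
        changed, and queue up the next questions."""
        for key, value in updates.items():
            if self.state.get(key) != value:
                self.changelog.append((key, self.state.get(key), value))
                self.state[key] = value
        # Questions answered by this update drop off; new ones queue up.
        self.unresolved = [q for q in self.unresolved if q not in updates]
        self.unresolved.extend(new_questions)

ctx = WorkContext(task="evaluate vendor",
                  state={"price": "unknown"},
                  unresolved=["price"])
ctx.ingest({"price": "$40k/yr"}, new_questions=["contract length"])
# changelog now records the delta; unresolved holds the next best question
```

The point of the sketch is the shape of the loop: the system, not the human, carries forward what changed since yesterday and what remains open, so re-entry starts from fresh state rather than reconstruction.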
That is where the day-to-day velocity becomes visceral.
You do not just feel that more work is getting done. You feel that the system around the work is moving with you. Context is fresher. Open loops close faster. Re-entry costs fall. The distance between thought, test, feedback, and revision shrinks.
This is not a small optimization. It changes the texture of operating.
What This Looks Like In Practice
I built a system for myself around this idea.
The core goal was simple: spend fewer bio tokens on remembering, reassembling, and restarting.
It has three loops.
First, stream capture: one place to drop thoughts, notes, tasks, observations, fragments, whatever passes through the day. No ceremony. Minimal friction.
Second, daily closeout: an agent-assisted triage process that turns the stream into structured next states. Tasks become tasks. Ideas become research threads. Observations get linked to ongoing work. What matters gets surfaced in the context of my priorities.
Third, morning briefing: a prepared context layer for the next day. Open loops, calendar, project deltas, research movement, outstanding questions.
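The three loops can be sketched as a tiny pipeline. The classification rules below are deliberately naive stand-ins for what an agent would do with real judgment; every function name here is illustrative, not part of any actual tool.

```python
stream = []  # loop 1: one frictionless place to drop everything

def capture(item: str) -> None:
    """Stream capture: no ceremony, minimal friction."""
    stream.append(item)

def daily_closeout(items: list) -> dict:
    """Loop 2: triage the raw stream into structured next states.
    (A real agent would classify; these rules are placeholders.)"""
    buckets = {"tasks": [], "research": [], "observations": []}
    for item in items:
        if item.startswith("todo:"):
            buckets["tasks"].append(item[len("todo:"):].strip())
        elif item.endswith("?"):
            buckets["research"].append(item)
        else:
            buckets["observations"].append(item)
    return buckets

def morning_briefing(buckets: dict) -> str:
    """Loop 3: a prepared context layer for the next day."""
    return "\n".join([
        f"Open tasks: {len(buckets['tasks'])}",
        f"Research threads: {buckets['research']}",
    ])

capture("todo: reply to the draft memo")
capture("does latency matter more than cost here?")
capture("the team moved faster after the standup change")
print(morning_briefing(daily_closeout(stream)))
```

The specifics do not matter; what matters is that capture stays cheap, triage happens once per day, and the next morning opens with assembled context rather than an unsorted pile.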
The system matters less than the principle behind it.
I am not trying to automate myself out of the loop. I am trying to preserve my best bio tokens for the part of the loop only I can do.
I feel this most clearly when I start exploring a new domain. The loop now closes differently. I can capture a question in passing, have the surrounding research and counterarguments assembled into a working frame, review it the next morning with context intact, push on the weak points, share it with others, and update the thesis quickly. The gain is not only that I reached fluency faster. It is that far fewer bio tokens were burned on reconstruction.
The same daily budget now gets spent on sharper questions.
That is the unlock I keep feeling.
Toward an Architecture of Leverage
This is why I think the implications extend well beyond engineering.
The frontier is not merely better chat interfaces or smarter copilots. It is the design of systems that convert low-value bio-token spend into digital-token throughput while preserving human responsibility for quality.
That last part matters. This is not a case for surrendering judgment. If anything, the better the system gets, the more valuable human taste becomes. Someone still has to define what matters, detect when the model is wrong, choose between trade-offs, and decide when a conclusion is actually good enough to act on.
But the composition of the day changes.
Less time spent rebuilding context. Less energy spent on first-order assembly. More capacity for second-order reasoning. More room for taste. More room for synthesis. More room for actual thought.
That is why I think the people who benefit most from agentic systems will not simply be the ones who “use AI.” It will be the ones who understand cognition as a scarce resource and design around it accordingly.
The real opportunity is not outsourcing thought. It is increasing the share of your finite human thought budget that gets spent where it compounds.
For a long time, leverage came from capital allocation, team design, and delegation. Those still matter. But we are entering a phase where a new form of leverage sits on top of them: the ability to systematically convert lower-order bio-token consumption into digital-token flow, while accelerating the metadata that keeps work coherent.
That is what I mean by an architecture of leverage.
Not a productivity trick, but a system for preserving scarce human cognition, closing loops faster, and directing finite judgment toward the layer where it compounds.
The people who learn to build those systems early will not just move faster. They will think with a different level of continuity.