Agent-Compatible Systems

Software whose surfaces remain workable when the user on the other side is a program — the design problem of building tools that humans and agents can both inhabit, without one having to pretend to be the other.


A friend showed me, half embarrassed, the output of a chat assistant he had pointed at his quarterly spreadsheet. The agent had read the visible cells, missed the formulas underneath, summarised what it saw in confident prose, and produced a number that was almost — but not quite — what the document actually meant. The mistake was not really the agent’s. The spreadsheet had been built for a person who could see at a glance that one column was derived from another, and the agent had no glance.

I want a name for the design problem this exposes, and the most honest one I have found is agent-compatibility. The question is not whether software can be wrapped in a chat interface, or whether a model can be coaxed into producing valid API calls. It is whether the software, considered as a system of affordances, can be operated by something that does not have eyes, does not have intuition about screen layout, and does not have the patience to read documentation between mouse clicks. Most software, judged on that standard, cannot — not because the agent is stupid, but because the surfaces we built for humans hide the very structure an agent would need.

The literature here is younger than the problem. Anthropic’s Model Context Protocol and the various agent-to-agent proposals being floated by the larger labs are early attempts at a shared vocabulary between agents and the world they act in, and they will not be the last. I read these efforts the way I read the early HTTP drafts — provisional, opinionated, almost certainly not the shape we will end up with, but valuable for naming the joints where future systems will need to bend.

What “compatible” is doing in the phrase

I am deliberately not saying agent-native. A native system is one built for agents from the start, often at the cost of human use. That is an interesting research direction, but it is not the one I find most pressing. The more common situation, and the one most worth thinking about, is the system that already has human users and now needs to admit a second class of caller without betraying the first.

Compatibility, in this register, is closer to what accessibility means in interface design than to what API-first means in cloud architecture. Accessibility succeeds when a screen-reader user and a sighted user can complete the same task using the same underlying structure, with neither path treated as a stripped-down version of the other. The analogy is imperfect — agents have very different perceptual constraints from humans with disabilities — but the design discipline of refusing to fork the system into two unequal copies carries over. Agent-compatibility, when it works, works on similar terms: the agent and the human are not competing for the system's attention, but reading the same affordances at different bandwidths.

Where current software fails the test

A few patterns keep recurring in the systems I have looked at, and they are not what the marketing material warns about.

The most common is invisible state. A button that is disabled for logical reasons — a quota, a lock, a pending operation — often communicates that fact in the human UI only visually, with no machine-readable signal of why. The human reads the surrounding context and infers. The agent guesses, and frequently guesses wrong, and if it is the kind of agent that retries on failure, it will guess wrong many times in quick succession.
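One remedy is to make the reason for a disabled affordance part of the payload itself, rather than something only the surrounding pixels convey. A minimal sketch in Python, with every name hypothetical:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Affordance:
    """An action as exposed to any caller, human UI or agent."""
    name: str
    enabled: bool
    # Machine-readable reason, present whenever enabled is False.
    disabled_reason: Optional[str] = None      # e.g. "quota_exceeded"
    retry_after_seconds: Optional[int] = None  # None means retrying won't help

def render_export_button(quota_remaining: int) -> Affordance:
    # The human UI greys the button out; the agent reads the same struct
    # and learns not only that it cannot export, but why, and whether
    # retrying is worthwhile.
    if quota_remaining <= 0:
        return Affordance("export", enabled=False,
                          disabled_reason="quota_exceeded",
                          retry_after_seconds=3600)
    return Affordance("export", enabled=True)
```

The point is not the particular fields but that both readers consume one structure: the human path renders it, the agent path reasons over it.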

Underneath that sits implicit composition. Most non-trivial workflows in mature software are not single API calls but choreographies — open this, edit that, save in the right order, close in the right order. The choreography lives in the user’s head, reinforced by years of muscle memory. An agent meeting the same software for the first time encounters a flat list of endpoints and no theory of when to call which.
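The choreography can be made explicit instead of tribal: each operation declares its preconditions, so a first-time caller can derive a valid order rather than guess one. A sketch under that assumption, all names illustrative:

```python
# Each operation declares what must already have happened.
# An agent (or a validator) can then check a planned sequence
# without any muscle memory.
PRECONDITIONS = {
    "open":  set(),
    "edit":  {"open"},
    "save":  {"edit"},
    "close": {"save"},
}

def valid_sequence(ops: list) -> bool:
    done = set()
    for op in ops:
        if not PRECONDITIONS[op] <= done:
            return False  # a required step has not happened yet
        done.add(op)
    return True
```

Published alongside the flat list of endpoints, a table like this is the missing theory of when to call which.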

The third pattern is the cruellest, because it looks like care. Call it consequence opacity. A destructive action presents the human with a confirmation dialog because the designer wanted a moment of friction before something irreversible. The agent receives the same dialog as a parseable JSON response, accepts it, and proceeds. The friction was performative, not informational; it carried no machine-legible warning that the next operation could not be undone.
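The fix is to make the warning informational rather than performative: the confirmation payload carries the irreversibility itself, and an agent-side policy can refuse to auto-accept it. A hedged sketch, with all field names invented:

```python
def confirm(prompt: dict, *, allow_irreversible: bool = False) -> bool:
    # A human reads the dialog text; the agent reads the flag.
    # Irreversible operations require an explicit, out-of-band grant
    # rather than a reflexive "OK".
    if prompt.get("reversible", True):
        return True
    return allow_irreversible

dialog = {"message": "Delete all archived projects?", "reversible": False}
```

The friction survives for both readers, but for the agent it arrives as data rather than as a dialog to click through.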

None of these are unsolvable. All of them require treating the agent as a real reader of the system, with its own perceptual limits, rather than as a faster mouse with a worse memory.

Open questions

I am most interested in the questions that sit between the two extremes — neither wrap everything in a chat box nor rebuild the world for autonomous swarms. The middle ground is where most software will actually live for the next several years, and it is the part of the problem that is least discussed.

State and provenance pose the most basic question: how to expose the history of a system so an agent can reason about what changed and why. There is also a documentation question — what a capability description should look like when its primary reader is a model with a finite context window, rather than a developer with a search bar and time to read. Undo and history present a different opportunity: they already exist for the human, and could, with care, be the same surface the agent reads, if anyone designs them to be. Authorisation is harder and, I think, understudied — how to communicate consent and intent to a caller who has no skin in the game and cannot, by construction, feel the weight of a decision. The last question, and the least clear, concerns failure modes: what it means for an interface to fail loudly enough for a model to notice, without screaming at the human.
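The undo question, at least, suggests a concrete shape: if the system keeps an event log to drive the human's history panel, the same log can serve as the agent's provenance. A sketch under that assumption, with hypothetical names:

```python
import time
from typing import Optional

class EventLog:
    """One surface, two readers: the history panel and the agent."""
    def __init__(self) -> None:
        self.events = []

    def record(self, action: str, target: str, inverse: Optional[str]) -> None:
        # `inverse` names the action that would undo this one;
        # None declares, machine-readably, that there is no undo.
        self.events.append({"t": time.time(), "action": action,
                            "target": target, "inverse": inverse})

    def undoable(self) -> bool:
        return bool(self.events) and self.events[-1]["inverse"] is not None

log = EventLog()
log.record("rename", "report.xlsx", inverse="rename")
log.record("purge_trash", "/trash", inverse=None)
```

An agent reading this log gets what changed, in what order, and which of it can be walked back — the provenance, documentation, and undo questions answered by one structure.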

Most of the questions are old. The CSCW literature has been asking related ones about non-human collaborators for decades, and the API-design canon already contains most of the technical vocabulary one would need. The contribution, if there is one, is in noticing how rapidly the assumptions of those older fields are about to be tested by callers that read everything and understand most of it.

A working stance

The systems I find most promising are the ones whose authors treat agent-compatibility as a property of the underlying model rather than as a translation layer bolted on top. Once you accept that agents will read your software, the design pressure flows backwards: state wants to be inspectable, actions want to be named, side effects want to be declared, and the gap between what the UI shows and what the system means starts wanting to close.
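That pressure has a simple code shape: an action registry in which names and side effects are data rather than documentation. A minimal sketch, every identifier invented for illustration:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str                # actions want to be named
    side_effects: tuple      # ...and their side effects declared
    reversible: bool = True

REGISTRY = {
    a.name: a for a in [
        Action("list_invoices", side_effects=()),
        Action("send_invoice", side_effects=("email_sent",), reversible=False),
    ]
}

def is_safe(name: str) -> bool:
    # An agent can decide, before calling, whether an action
    # mutates anything and whether the mutation can be walked back.
    a = REGISTRY[name]
    return not a.side_effects or a.reversible
```

Nothing here is novel; it is the old API-design discipline made load-bearing, because the new reader cannot infer what the old one could.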

These are not new asks. The instinct that drives Malleable Software — that systems should not hide their structure from their readers — sits underneath them, as does the older API-design discipline of making side effects explicit. What has changed is the population of readers, and with it the cost of pretending those properties were optional.

That last gap is where I expect most of the interesting work to happen. It is also, not coincidentally, where most of the interesting work in software design has always happened.