
Your autonomous agent is only as safe as the discipline around it


An autonomous agent writing working code is not the impressive part anymore. The impressive part is the discipline around it that lets it run against your codebase while you sleep — and you don't wake up to a production incident.

Most of the autonomous-agent conversation is stuck on capability. Can it close the ticket. Can it open the PR. Can it run the loop overnight. Those are settled questions in 2026. The variable that actually decides whether you ship or bleed is the one nobody screenshots: the management discipline around the agent.

So here's my position. An autonomous agent is only as safe as the harness and the management around it. Capability is a commodity. The harness is the moat. I observe teams handing an agent the keys to things that matter — prod data, merge rights, customer-facing surfaces — with less ceremony than they'd give a first-week intern. I recommend the opposite: no autonomous agent touches anything that matters until six things exist around it. Harness engineering is the scaffolding around AI-native work — retrieval, scoped tasks, review gates, self-report verification, rollback, and human ownership. Not a tool you install. A perimeter you draw.

Retrieval, or the agent is guessing

An agent with no retrieval is an agent improvising from training data and the last thing you typed. It will write something plausible. Plausible is not the same as correct in your repo, with your conventions, against your schema. Retrieval means pulling the agent's working context from your real code, docs, and decisions instead of its imagination, and it is the minimum infra, not a nice-to-have. Anthropic's own engineering writeup on building effective agents lands in the same place: start simple, ground the agent in real context, and add autonomy only where it earns its keep (www.anthropic.com). "Research, never guess" applies to the agent too.

A perimeter, not the keys

The single highest-leverage decision is scope. What is this agent allowed to touch, and what is the blast radius if it's wrong? On Nerdie — my autonomous board solver running in production, 624 cron cycles in eleven days, 4.7× the team's prior ticket throughput, ~24% of post-launch work carrying a "done by AI" label — the agent's live runtime is comment-only. It can research, draft, and propose. It cannot transition tickets, edit fields, or merge from the cron loop. That's not timidity. That's a low-blast-radius perimeter: the autonomy is real, the surface it can damage unsupervised is deliberately small. Give the agent a perimeter, not the keys to prod.
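Here is a minimal sketch of what a comment-only perimeter can look like in code. It is not Nerdie's actual implementation; the names `Action`, `CRON_PERIMETER`, and `authorize` are hypothetical and exist only to illustrate the idea that every action the agent requests is checked against an explicit allowlist before anything executes.

```python
from enum import Enum


class Action(Enum):
    COMMENT = "comment"
    TRANSITION_TICKET = "transition_ticket"
    EDIT_FIELD = "edit_field"
    MERGE = "merge"


# Hypothetical perimeter: the unattended cron loop may only comment.
CRON_PERIMETER = frozenset({Action.COMMENT})


class PerimeterViolation(Exception):
    """Raised when the agent requests an action outside its perimeter."""


def authorize(action: Action, perimeter: frozenset = CRON_PERIMETER) -> Action:
    """Return the action if it is inside the perimeter, otherwise raise."""
    if action not in perimeter:
        raise PerimeterViolation(f"{action.value} is outside the agent's perimeter")
    return action
```

The design choice worth copying is the default: anything not listed is denied, so widening the perimeter is a deliberate edit a human makes, not something the agent drifts into.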

The gate is where safety lives, not the generation

Cheaper code generation is not the win people think it is. Cheaper unreviewed change is a liability. Every change an autonomous agent proposes should pass through the same gate a human's would — a PR, green CI, a review that resolves before anything lands. The agent makes change cheap; the gate is what keeps cheap change from becoming expensive change. If your safety story is "the agent is usually right," you don't have a safety story. You have a vibe.
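As a rough sketch, the gate can be expressed as one rule that ignores who wrote the change. The `ChangeRequest` fields and the `may_land` function below are hypothetical; in practice this logic usually lives in your code host's branch protection settings rather than in your own code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeRequest:
    author: str              # "agent" or a human; deliberately unused by the gate
    ci_green: bool           # did the CI run pass?
    approvals: int           # number of human review approvals
    unresolved_threads: int  # open review conversations


def may_land(change: ChangeRequest, required_approvals: int = 1) -> bool:
    """Same rule for every author: green CI, enough approvals, no open threads."""
    return (
        change.ci_green
        and change.approvals >= required_approvals
        and change.unresolved_threads == 0
    )
```

Note that `author` never appears in the decision. The agent gets no shortcut and no extra penalty; it goes through the same door a human does.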

Trust the output, verify the self-report

This is the part most people skip, and it's the sharpest one. An autonomous agent will tell you it's done. It will say the tests pass, the bug is fixed, the edge case is handled. That self-report is generated text, not a guarantee — it is exactly the kind of confident sentence the model is best at producing whether or not it's true. The harness verifies independently: the CI run, the actual test output, the diff, the rated outcome. On Nerdie, 42 outcomes were rated by the team, not by the agent grading its own homework. Working isn't the bar; surviving is — and "the agent said it works" is not evidence that it works.
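One way to make this concrete, as a sketch under assumptions: the harness runs the test command itself and compares the exit code with what the agent claimed. The names `Verification` and `verify_claim` are hypothetical, and the test command is whatever your project already uses.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class Verification:
    agent_claimed_pass: bool
    actually_passed: bool

    @property
    def claim_matches_reality(self) -> bool:
        return self.agent_claimed_pass == self.actually_passed


def verify_claim(agent_claimed_pass: bool, test_command: list[str]) -> Verification:
    """Run the test command ourselves and compare the exit code with the claim."""
    result = subprocess.run(test_command, capture_output=True, text=True)
    return Verification(agent_claimed_pass, result.returncode == 0)


if __name__ == "__main__":
    # Stand-in for a failing test suite: a command that exits non-zero.
    failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(verify_claim(agent_claimed_pass=True, test_command=failing))
```

The agent's sentence is an input to the check, never the result of it. The exit code is the evidence.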

Rollback and an owner with a name

Two things close the loop. Rollback: every autonomous change needs a fast path back, because the question is never if it ships something wrong, only when. And ownership: a human owns the outcome. The agent is not accountable, cannot be on call, and will not explain itself to a customer. Someone with a name signs for what the agent shipped. That's the difference between automation and abdication.
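A small sketch of how a harness could refuse to record an autonomous change that has no owner or no way back. The `AutonomousChange` record and its fields are hypothetical, and the rollback command is assumed to be something you have already tested, such as a revert in your version control system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AutonomousChange:
    change_id: str
    owner: str             # a named human, not a team alias and not the agent
    rollback_command: str  # the fast path back, assumed to be tested

    def __post_init__(self) -> None:
        if not self.owner.strip():
            raise ValueError("every autonomous change needs a named human owner")
        if not self.rollback_command.strip():
            raise ValueError("every autonomous change needs a rollback path")
```

Making both fields mandatory at construction time means the questions "who signs for this?" and "how do we undo it?" get answered before the change ships, not during the incident.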

The skeptic is right until you show the harness

If you have a senior engineer who refuses to let an autonomous agent near the main branch, that engineer is not behind. They're doing risk assessment, and on the evidence most teams give them, they're correct. The answer to legitimate skepticism is never a mandate or a hype deck. It's the harness — retrieval, perimeter, gate, verification, rollback, owner — made visible. The skeptic is right until you prove otherwise. Show the perimeter and the skeptic becomes your best harness reviewer.

A harness isn't process for its own sake. It's the difference between "touch it and break it" and "touch it with confidence." The autonomous agent is the easy half. The discipline around it is the half that decides whether you built leverage or a liability.

That discipline is portable, but it is not generic: it has to be cut to your repo, your risk surface, and your team. I give away the map; the hands are mentorship. If you're running autonomous agents and want the harness built around your actual stack, that's what I do inside Carevia mentorship.