What does 'loops instead of prompts' mean for coding?

It means you stop typing a new instruction for every step and instead design a system that prompts the agent for you: set a goal, let the agent act, check whether the goal is met, and repeat until it is or a limit stops it. The prompts do not disappear. A program issues them inside the loop rather than a person typing each one. This is the same mechanism Anthropic described in 2024, an LLM using tools in a loop, now run on a schedule against a goal.

Is prompt engineering dead?

No, but it has moved from a headline skill to table stakes. The hype around it as a standalone job has cooled (searches for 'ai prompt engineering course' are down about 90% year over year), even as 'prompt engineering' itself stays a high-volume topic, and leading builders now talk about designing loops rather than crafting single prompts. Good prompts still live inside the loop. The higher-value skill in 2026 is designing the loop around them: the goal, the check that proves the goal, and the limits that stop the run.

What is loop engineering?

Loop engineering, a term popularized by Addy Osmani in June 2026, is designing the system that prompts an AI agent on a schedule instead of prompting it by hand. In his words it is 'replacing yourself as the person who prompts the agent.' In practice it means writing down a goal, a check that verifies the goal, and stop conditions, then letting the agent run between them. It sits on top of two earlier skills: writing a good prompt, and building the harness a single agent runs inside.

What makes an AI agent loop reliable enough to run unattended?

A check it cannot skip and limits it cannot exceed. The loop needs a way to verify each pass (tests that pass, a schema that validates, a reviewer that signs off), a cap on iterations and spend, and a stop condition. Risky or irreversible actions should sit behind a named human approval. In our own field study of 942 agent runs, about 70% succeeded and nearly every failure was operational (bad configuration, timeouts, abandoned tool calls) rather than the model reasoning wrong, which is encouraging: those are the fixable, engineerable kind.

Should my team run coding agents in a loop?

On the right work, yes, and it is worth doing now. Start where verification is cheap and the action is reversible, so a wrong result is caught and redone at low cost. Keep risky, irreversible steps behind a human approval, and build an evaluation that tells you when the loop has drifted. The deciding factor is not the model. It is whether you have a check you trust for the work you are automating.

Agentic coding grew up: from prompts to loops

TL;DR

Real shift: builders who ship the tools (Anthropic's Claude Code lead, an engineering leader at Google Chrome) and Andrej Karpathy now set a goal and let the agent loop (act, check, repeat) instead of typing a prompt for every step.
A loop is not a new model. It is the mechanism Anthropic described in 2024, an LLM using tools in a loop, plus a goal, a check, and a stop condition. Practitioners have been doing it by hand for over a year. One built a working programming language this way.
The check is the lever. A strong check turns a loop's speed into output you can trust. Skip it and the speed evaporates: in a 2025 METR study, experienced developers were 19% slower with AI, because the time went into reviewing and correcting output by hand.
We run agents in loops in production. The approval gate is what makes them safe to leave alone, and strengthening the check is what lets a loop run further. We call that limit the verifier ceiling.

In June, the developer Peter Steinberger posted two sentences that landed because a lot of people building with AI had already felt the shift. The post drew more than 8 million views.

[Embedded post: Peter Steinberger on X, June 7, 2026, @steipete]

He is describing something real and genuinely useful. Instead of typing a new instruction for every step, you set a goal and let the agent run: act, check the result, adjust, repeat, until the work is done. Boris Cherny, who built Claude Code at Anthropic, put it the same way: “I don’t prompt Claude anymore. I have loops running. They’re the ones prompting Claude… My job is to write loops.” Andrej Karpathy, who popularized “vibe coding” a year ago, told the No Priors podcast he has not typed a line of code by hand since December.

We have been running agents in loops in production for a while, and the excitement is earned. A loop can carry real work overnight. The teams getting the most out of it have also learned the one habit that turns a loop from a neat trick into something you can trust, and that habit is most of what follows.

A loop is an old idea, newly within reach

None of this needs a new model or a new capability. Anthropic described the mechanism back in December 2024: an agent is “just” an LLM “using tools based on environmental feedback in a loop.” Act, look at what happened, decide the next move, repeat, until the goal is met or a stop condition trips. The loop is the runtime. The model is one step inside it.
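The act, check, repeat cycle described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Anthropic's implementation: `run_loop`, `act`, and `check` are hypothetical names, and `act` stands in for whatever call drives your agent.

```python
from typing import Callable


def run_loop(
    act: Callable[[str], str],      # hypothetical: one agent step, given feedback from the last pass
    check: Callable[[str], bool],   # hypothetical: returns True once the goal is met
    max_iterations: int = 10,       # stop condition so the loop cannot run forever
) -> tuple[bool, str, int]:
    """Act, check, repeat until the check passes or the limit trips."""
    feedback = ""
    result = ""
    for i in range(1, max_iterations + 1):
        result = act(feedback)      # act
        if check(result):           # look at what happened
            return True, result, i
        # decide the next move: feed the failure back into the next pass
        feedback = f"attempt {i} failed the check: {result!r}"
    return False, result, max_iterations
```

The model appears only inside `act`. Everything else is ordinary control flow, which is the sense in which the loop is the runtime.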

Builders have been running that loop by hand for more than a year. In July 2025, Geoffrey Huntley published what he called Ralph, which in its plainest form is one line of shell: a Bash loop that pipes the same prompt file into a coding agent over and over (while :; do cat PROMPT.md | claude-code ; done). Each pass starts with a fresh context window and reloads its specs and plans from files, so state lives on disk instead of in the agent’s memory. Huntley used it to build a working programming language, compiler and all. One line of shell, run patiently, produced something most teams would scope as a quarter of work.
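For readers who prefer to see the pattern outside a shell one-liner, here is a rough Python sketch of the same idea: re-read the prompt from disk on every pass and keep state in files rather than in the agent's memory. It is a simplification, not Huntley's code. The `agent` callable, the `NOTES.md` file, the `done` check, and the `max_passes` cap are all assumptions added for illustration (the original loop has no cap and lets the agent write its own files).

```python
from pathlib import Path
from typing import Callable


def ralph_loop(
    workdir: Path,
    agent: Callable[[str], str],     # hypothetical stand-in for invoking a coding agent
    done: Callable[[Path], bool],    # hypothetical check run against the files on disk
    max_passes: int = 50,            # assumption: the original one-liner runs until interrupted
) -> tuple[bool, int]:
    prompt_file = workdir / "PROMPT.md"
    notes_file = workdir / "NOTES.md"   # hypothetical file holding state between passes
    for n in range(1, max_passes + 1):
        # Fresh context each pass: everything the agent knows is reloaded from disk.
        prompt = prompt_file.read_text()
        prior = notes_file.read_text() if notes_file.exists() else ""
        output = agent(prompt + "\n\n" + prior)
        with notes_file.open("a") as f:
            f.write(output + "\n")
        if done(workdir):
            return True, n
    return False, max_passes
```

Because nothing carries over in memory, a pass that goes wrong can be discarded and the next pass starts clean from the files.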

So the jargon is friendlier than it sounds. Loop engineering, the name Addy Osmani gave the pattern in June, is just writing down three things and letting the agent run between them: a goal, a way to check the goal is met, and the limits that stop the loop. The prompts did not disappear. They moved inside the loop, where a program issues them instead of a person typing each one. The skill moved up a level too, which is why the “become a prompt engineer” course searches are fading while the work itself is growing.

We built one, and the gate is what makes it work

Our agent workbench, Packwolf, runs this way. Each agent wakes on its own tasks and works the loop without anyone watching: it researches, drafts, writes, and keeps going until it has a spec or a plan worth showing. At that point it does not ship. It pushes the work into an approval gate, which triggers a review. The reviewer, another agent on routine work or a person on the higher-stakes calls, does one of three things: sends it back into the loop with notes, promotes it to the next agent in the chain, or escalates it to a human for sign-off. Nothing leaves the loop without clearing that gate.
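The three outcomes of the gate map naturally onto a small state machine. The sketch below is illustrative only and is not Packwolf's code: `Verdict`, `Review`, `run_through_gate`, and the choice to escalate when the revision rounds run out are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Verdict(Enum):
    REVISE = "revise"      # send back into the loop with notes
    PROMOTE = "promote"    # hand to the next agent in the chain
    ESCALATE = "escalate"  # require human sign-off


@dataclass
class Review:
    verdict: Verdict
    notes: str = ""


def run_through_gate(
    draft: Callable[[str], str],      # hypothetical: agent produces work from reviewer notes
    review: Callable[[str], Review],  # hypothetical: another agent or a person
    max_rounds: int = 3,
) -> tuple[Verdict, str, int]:
    notes = ""
    work = ""
    for round_no in range(1, max_rounds + 1):
        work = draft(notes)
        result = review(work)
        if result.verdict is not Verdict.REVISE:
            return result.verdict, work, round_no
        notes = result.notes  # reviewer notes steer the next pass
    # Assumption: work that never clears review goes to a human rather than shipping.
    return Verdict.ESCALATE, work, max_rounds
```

Note that there is no code path that returns work without a verdict attached, which mirrors the rule that nothing leaves the loop without clearing the gate.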

The loop is the cheap part, and that is good news: starting an agent and letting it churn is a few lines of code. The gate is what makes the whole thing worth running. It is what lets the team leave the loop going overnight and trust what comes back in the morning, because every result has been checked before it counts. The gate is not a brake on the loop. It is the thing that lets you take your hands off the wheel.

A loop is only as autonomous as its check is trustworthy. Strengthen the check and you can let it run further.

Strengthen the check and the loop runs further

This is the rule that has held across every loop we have run, and it is an encouraging one once you see it. A loop’s trustworthy output rises with the strength of the check that gates each pass. Call it the verifier ceiling. Raise the ceiling, by making the check stronger, and you can safely let the loop run further and lean on it harder. The check is the lever, and it is squarely in your control.

[Figure: Weak check versus strong check, on the same loop output. Credit: ideius]

You can watch what happens when teams skip the check. In 2025, METR ran a careful study of 16 experienced open-source developers working on their own large repositories. With AI tools allowed, they were 19% slower, even though they felt about 20% faster, because the time went into reviewing and correcting output by hand. Read it the builder’s way and it is a roadmap: the speed is real, and a check you trust is what converts it into shipped work instead of a second pass of manual review. Osmani, who named loop engineering, lands in the same place. Your job, he writes, is to ship code you confirmed works, and to build the loop “like someone who intends to stay the engineer.” The check earns its keep on the quiet wins and the quiet misses alike. A confident wrong answer that sails through unchecked costs far more to fix later. It is the same failure tax that makes a cheap coding model expensive once you count what its mistakes cost to fix.

What a loop needs before you let it run

Name three things and you can turn a loop loose with confidence. A goal the agent can pursue without you re-steering it at every step. A check that proves the goal is met: tests that pass, a schema that validates, a reviewer that signs off. And limits that end the run: a cap on iterations, a stop when the changes stop changing, and a ceiling on tokens or spend. If you cannot yet name the check, that is useful information: the thing to fix is the goal, which is not yet specific enough to verify, not the agent.
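The three limits named above (an iteration cap, a stop when the output stops changing, and a token ceiling) can be sketched as follows. Treat this as one possible shape under assumptions: `Limits`, `run_bounded`, the default values, and the convention that `step` reports its own token use are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Limits:
    max_iterations: int = 20     # cap on iterations
    max_tokens: int = 200_000    # ceiling on tokens (a proxy for spend)
    max_unchanged: int = 2       # stop after this many consecutive passes with identical output


def run_bounded(
    step: Callable[[], tuple[str, int]],  # hypothetical: returns (output, tokens used this pass)
    check: Callable[[str], bool],         # hypothetical: True when the goal is met
    limits: Limits,
) -> tuple[str, int, int]:
    """Return (stop reason, passes run, tokens used)."""
    tokens_used = 0
    unchanged = 0
    previous: Optional[str] = None
    for i in range(1, limits.max_iterations + 1):
        output, tokens = step()
        tokens_used += tokens
        if check(output):
            return "goal_met", i, tokens_used
        # Count consecutive passes whose output matches the pass before.
        unchanged = unchanged + 1 if output == previous else 0
        previous = output
        if unchanged >= limits.max_unchanged:
            return "stalled", i, tokens_used
        if tokens_used >= limits.max_tokens:
            return "budget_exhausted", i, tokens_used
    return "iteration_cap", limits.max_iterations, tokens_used
```

Returning the stop reason alongside the counts makes it easy to tell, the next morning, whether a run finished or merely ran out of road.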

The good news from running these at scale is where loops actually stumble, which is rarely the reasoning. In our own field study of an autonomous agent workforce, across 942 runs about 70% succeeded, and almost every failure was operational: a bad configuration, a timeout, a tool call abandoned halfway. Those are the engineerable kind. Build the recovery, the checks, and the stop conditions well, and the loop gets reliable fast. That plumbing is most of what we build when a client wants agents they can rely on.
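Since most failures of this kind are operational, a thin recovery layer around each tool call goes a long way. The sketch below retries transient errors with exponential backoff and lets everything else surface immediately. It is an assumption-laden example, not the plumbing from the field study: `call_with_recovery`, the choice of which exceptions count as operational, and the retry defaults are all hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Assumption: these built-in exceptions stand in for "operational" failures.
OPERATIONAL_ERRORS = (TimeoutError, ConnectionError)


def call_with_recovery(
    tool_call: Callable[[], T],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,  # injectable so tests need not wait
) -> T:
    """Retry transient operational failures; re-raise once retries are exhausted."""
    for attempt in range(retries + 1):
        try:
            return tool_call()
        except OPERATIONAL_ERRORS:
            if attempt == retries:
                raise
            sleep(base_delay * 2 ** attempt)  # back off: 1s, 2s, 4s, ...
    raise RuntimeError("unreachable")  # only reached if retries is negative
```

Errors outside `OPERATIONAL_ERRORS` are deliberately not retried, so a genuine bug or a failed check is reported instead of being hidden behind repeated attempts.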

What this means if you run a business

This is one of the better shifts in how software gets built in years, and it is available to you now. The teams that win with it are not the ones chasing the headline or handing a loop their codebase on faith. They put agents in loops behind checks they trust, starting on the work where a check is cheap and a mistake is catchable. Pick a reversible task. Keep the risky and irreversible steps behind a named human approval. Build the evaluation that tells you when a loop has drifted. And make sure an unattended loop with broad access cannot move client data somewhere it should not go, which is the same care that shadow AI calls for. It also helps to know which model belongs inside the loop, which is its own comparison worth running on your own work.

We have a stake to name here. ideius builds these systems, so a team that reads this may later hire us to build one. The advice does not change either way: the loop is the easy part to buy, and the check is the part that deserves your judgment. We are vendor-neutral and earn the same whatever you end up running, which is why we would rather see you spend on the verifier than on a bigger model.

The work did not disappear. It moved up a level, from typing the next instruction to designing the loop and the check that decides when the work is done. That is a better job, and the teams building it now are the ones who will trust their agents while everyone else is still typing.