The Five Failure Modes — What Breaks in Long-Running Agents

January 7, 2026 · 7 min read

Long-running agents fail in five predictable layers, each harder to detect than the last. Crashes get the most attention — they're visible, recoverable, and well-understood in distributed systems. Silent drift does the most damage: the agent is running, producing output, compiling code, and pointed entirely in the wrong direction.

I've built enough of these to have opinions about the order of operations.

What breaks — and what breaks worse

Long-running agents don't fail in mysterious ways. They fail in layers, and each layer is harder to detect than the last.

Hard crashes. The most obvious failure and the most preventable, which makes it the most frustrating. The API times out. Your internet drops. Sandboxed execution environments help contain blast radius, but they don't solve the state recovery problem. Three hours of work, gone. The fix is straightforward — checkpointing, retry logic, timeout handlers. These are solved problems in distributed systems. But now the agent recovers from crashes and keeps running. Which surfaces the next problem.
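A minimal sketch of that recovery layer, assuming a hypothetical `run_task` driver where each step is an `(id, callable)` pair that may raise `TimeoutError` on transient failure. Names and the JSON checkpoint format are illustrative, not any particular framework's API:

```python
import json
import time
from pathlib import Path

CHECKPOINT = Path("agent_checkpoint.json")

def save_checkpoint(state: dict) -> None:
    # Persist completed work so a crash costs one step, not three hours.
    CHECKPOINT.write_text(json.dumps(state))

def load_checkpoint() -> dict:
    if CHECKPOINT.exists():
        return json.loads(CHECKPOINT.read_text())
    return {"completed_steps": []}

def run_with_retries(step_fn, retries: int = 3, backoff: float = 2.0):
    # Retry transient failures (timeouts, dropped connections) with backoff.
    for attempt in range(retries):
        try:
            return step_fn()
        except TimeoutError:
            if attempt == retries - 1:
                raise
            time.sleep(backoff ** attempt)

def run_task(steps):
    state = load_checkpoint()
    for step_id, step_fn in steps:
        if step_id in state["completed_steps"]:
            continue  # finished before the crash; skip on resume
        run_with_retries(step_fn)
        state["completed_steps"].append(step_id)
        save_checkpoint(state)
```

On restart, `run_task` replays the step list and skips anything already recorded, so recovery is a resume, not a redo.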

Resource bleed. A runaway agent burning through Opus-tier tokens for two hours on a task that should have taken thirty minutes. I've had sessions where the cost of a single agent run exceeded what I'd normally spend in a week, because the agent was churning through retries and I wasn't watching. So you add cost caps and monitoring. The agent stays within budget. But it's still running — and staying within budget doesn't mean it's doing useful work.
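A hard cost cap can be a few lines wrapped around every model call. The per-token prices below are placeholders, not real pricing; the point is the hard stop, not the numbers:

```python
class BudgetExceeded(Exception):
    pass

class CostTracker:
    """Abort the run when cumulative spend crosses a hard cap."""

    def __init__(self, cap_usd: float, price_per_1k_in: float, price_per_1k_out: float):
        self.cap = cap_usd
        self.p_in = price_per_1k_in
        self.p_out = price_per_1k_out
        self.spent = 0.0

    def record(self, tokens_in: int, tokens_out: int) -> float:
        # Call after every model response, with that response's token counts.
        self.spent += tokens_in / 1000 * self.p_in + tokens_out / 1000 * self.p_out
        if self.spent >= self.cap:
            # Hard stop: better to kill the run than discover the bill later.
            raise BudgetExceeded(f"spent ${self.spent:.2f} of ${self.cap:.2f} cap")
        return self.spent
```

Raising an exception rather than logging a warning is deliberate: a warning nobody is watching is exactly how a week's budget disappears in two hours.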

Context window overflow. Every tool call, every file read, every response accumulates. Eventually the conversation gets so long that the model starts losing track of earlier decisions. It doesn't announce this. It just quietly forgets that it already considered and rejected an approach, then tries it again. Cost caps don't catch this — the agent isn't wasting tokens on retries. It's wasting them on reasoning it's already done.
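One common mitigation is compacting old turns into a summary once the history gets long. This is a rough sketch with a crude characters-per-token estimate and a placeholder summarizer; a real harness would call the model to summarize, and should preserve rejected approaches, not just conclusions:

```python
def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token. Real tokenizers differ.
    return max(1, len(text) // 4)

def compact_history(messages: list[dict], limit: int, keep_recent: int = 4) -> list[dict]:
    """Fold older turns into one summary message once the estimated token
    count crosses `limit`, keeping the most recent turns verbatim."""
    total = sum(estimate_tokens(m["content"]) for m in messages)
    if total <= limit or len(messages) <= keep_recent:
        return messages
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    # Placeholder summary text; in practice, generate this with a model call.
    summary = f"Earlier context ({len(old)} turns) summarized: key decisions and rejected approaches."
    return [{"role": "system", "content": summary}] + recent
```

The failure mode to watch: if the summary drops the "considered and rejected" record, compaction actively causes the circular reasoning described next.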

Circular reasoning. The direct consequence of overflow. The model gets stuck in a loop — try approach A, it fails, try approach B, it fails, go back to A. I've watched Claude Code attempt the same three fixes in rotation for twenty minutes. The model isn't stupid. It just doesn't have a good mechanism for tracking "I already tried this and here's specifically why it didn't work" across a long conversation. If you're watching, you can catch it. But what about the failure mode you can't see at all?

Silent drift. The agent doesn't crash. It doesn't error out. It doesn't loop. It just slowly stops doing the thing you asked it to do. Maybe it was supposed to refactor a module for performance and it starts refactoring for readability instead. Maybe it was adding a feature and it gets pulled into fixing tangentially related bugs. You come back after an hour and the work is technically competent but pointed in the wrong direction.

Why drift is the real problem

Most writing about agent reliability focuses on crash recovery. That's the easy problem. If your agent crashes, you know it crashed. You can build infrastructure for that.

Drift is harder because there's no error signal. The agent is running fine. It's producing output. The output is plausible. It's just not what you wanted. And by the time you notice — if you notice — you've burned through context window, tokens, and time.

The insidious thing about drift is that it looks like productive work. The agent is writing code. The code compiles. It might even pass tests. But the direction shifted twenty minutes ago, and every minute since then has been competent effort in the wrong direction. Crashes are loud failures. Drift is a quiet one, which makes it far more expensive in practice.

And drift doesn't just waste the drifted work — it contaminates the context. Once the agent has spent twenty minutes going in the wrong direction, all that wrong-direction reasoning is now in the conversation history, influencing subsequent decisions. Correcting the drift isn't as simple as saying "go back to what I originally asked" — the model's context is now polluted with a detailed record of the wrong approach. Sometimes it's faster to start a fresh session than to try to redirect.

A crash loses your work. Drift corrupts your context. Both are bad. Corruption is harder to recover from.

So the natural response is to build more infrastructure — drift detection, state tracking, periodic health checks, anomaly monitoring. But that instinct has its own failure mode.

The trap

There's a tension between harness complexity and harness reliability that organizes this entire problem. Every layer of infrastructure you add to make your agent more resilient is another layer that can break.

Each failure mode from the escalation above tempts you to add a layer. Crashes → checkpointing. Resource bleed → cost monitoring. Overflow → context management. Circular reasoning → state tracking. Drift → anomaly detection. Five failure modes, five infrastructure layers, five new surfaces that can break. You start with "I need checkpointing." Reasonable. Then retry logic. Then health monitoring. Then dependency graphs for multi-step tasks. Then a message broker for job queuing. Suddenly you're building a distributed system instead of a thin coordination layer, and the orchestrator is more complex than the work it's orchestrating.

I've seen agent systems where the monitoring and recovery code was more complex than the agent itself — and the monitoring code had bugs that caused more failures than it prevented. The cure starts to look like the disease.

What survives the trade-off

Given the trap, the question isn't "what's the ideal agent harness?" It's "what's the minimum infrastructure that handles the most failure modes without becoming the next thing that breaks?"

Smaller tasks beat longer sessions. The single most effective pattern, and it sidesteps nearly every failure mode at once. Instead of one four-hour agent session, run four one-hour sessions with clear handoff points. Each session starts with a summary of what's been done and what's next. Context overflow eliminated. Circular reasoning bounded. Drift caught at natural checkpoints. Resource bleed capped by session length. It's not sophisticated. It works.
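The session-splitting pattern is simple enough to sketch in a few lines. `run_session` here is a hypothetical callable standing in for "start a fresh agent session": it takes a handoff summary plus a slice of steps and returns a new summary. Each session starts from the summary, never the full prior transcript:

```python
def run_in_sessions(task_steps, run_session, max_steps_per_session: int = 5):
    """Split one long task into bounded sessions with explicit handoffs.

    Context overflow is bounded by session length; drift gets a natural
    checkpoint at every handoff.
    """
    handoff = "No prior work."
    for i in range(0, len(task_steps), max_steps_per_session):
        chunk = task_steps[i:i + max_steps_per_session]
        handoff = run_session(handoff, chunk)  # fresh context each session
    return handoff
```

The quality of the whole scheme rides on the handoff summary: if it's vague, each new session re-derives decisions the last one already made.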

Checkpointing at the task boundary. Save state between sessions, not within them. If a one-hour session crashes, you lose an hour, not four. The recovery point is always a clean handoff document, not a corrupted mid-session state.

Cost caps before you start. Set a budget. Not a vague sense of "I'll keep an eye on it" — an actual number. "This task should cost no more than $X. If it hits that, stop." I learned this one the expensive way.

Periodic summaries for drift. Structure long tasks so the agent reports back every N steps with a summary of what it's done and what it plans to do next. This gives you a chance to catch drift early. It's manual. It's annoying. It works better than the alternative, which is coming back to a finished task that's wrong.
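The report-every-N-steps loop can be sketched as follows. `execute` and `report` are hypothetical hooks: `execute` runs one step, and `report` surfaces recent progress to a human (or an automated checker) and returns False to abort:

```python
def run_with_reports(steps, execute, report, every: int = 3):
    """Pause every `every` steps and surface a progress report, so drift
    is caught at step N rather than at the end of the run."""
    log = []
    for i, step in enumerate(steps, start=1):
        log.append(execute(step))
        if i % every == 0:
            if not report(log[-every:]):  # reviewer says: off course, stop
                return log
    return log
```

The abort path is the whole point: a summary you can't act on mid-run is just a more detailed version of coming back to a finished task that's wrong.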

Where this is going

The tooling for long-running agents is still primitive. We're at the "write your own checkpointing" stage, roughly where web development was in the "write your own HTTP server" era. Someone will build the framework that makes this boring and reliable.

The interesting open question is whether the right abstraction is a harness that wraps the agent, or whether the models themselves need to get better at long-running tasks — better at tracking their own state, recognizing when they're going in circles, knowing when to stop and ask. Both will probably happen. I'm more optimistic about the model improvements, because every harness I've built has felt like a workaround for things the model should just handle.

Until then, save your state, set your cost limits, and resist the urge to build the sixth layer. The boring infrastructure is what makes the interesting work possible — and the trap is building so much of it that it stops being boring.
