Part 1 of this series named three failure modes: automation bias, behavioral drift, and the audit illusion. And it closed with a question: can the humans responsible for agentic systems actually intervene when something goes wrong? This part is the answer. The question turns out to be less about whether governance frameworks address these failure modes on paper, and more about why existing oversight structures are particularly ill-suited to catching them in practice.
The shift from AI that advises to AI that acts is architecturally significant. Governance built for the advisory model assumes that humans occupy the locus of action: that the AI produces a recommendation and a human decides what to do with it. In that architecture, the governance challenge is output quality. Is the recommendation accurate? Is it fair? Who reviewed it before it went out? Agentic AI undermines these assumptions. The AI decides and then does. Human involvement becomes a design choice rather than a structural inevitability. And the governance frameworks most organizations built for “advisory AI” were not designed for that.
Three failure patterns follow directly from this shift. None of them are hypothetical. All of them are visible in early agentic deployments. None of them are adequately addressed by the existing AI frameworks that most organizations have in place.
Automation bias is the tendency to over-rely on automated systems and accept their outputs with less scrutiny than the situation warrants. It is one of the most thoroughly documented phenomena in human factors research, and it has produced failures in aviation, medicine, nuclear operations, and financial markets. It will produce failures in agentic AI deployments for the same structural reasons.
The critical insight is that automation bias is not a cognitive failing that better training can overcome. It is a predictable response to genuine reliability. When a system is right 99.7 percent of the time, skepticism about any given output can genuinely feel like the irrational choice. Practitioners who push back, who demand explanation, who refuse to click approve until they understand, are seen as the ones creating friction. And in most organizational cultures, friction is unwelcome.
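The arithmetic behind that dynamic is worth making explicit: even very high reliability produces a steady stream of errors at scale. A minimal sketch, in which both the 99.7 percent figure and the daily action volume are illustrative assumptions:

```python
# Back-of-the-envelope: a highly reliable agent still errs constantly at
# scale. Both figures below are illustrative assumptions, not benchmarks.
accuracy = 0.997          # assumed per-action accuracy
actions_per_day = 10_000  # assumed daily action volume

expected_errors_per_day = actions_per_day * (1 - accuracy)
print(f"Expected erroneous actions per day: {expected_errors_per_day:.0f}")
# A reviewer who approves everything is wrong about 30 times a day,
# while experiencing the system as "almost always right".
```

The point of the sketch is that the reviewer's subjective experience (the system is nearly always right) and the organization's actual exposure (dozens of uncaught errors per day) diverge precisely because the volume is high.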
Air France Flight 447 is a case human factors researchers return to repeatedly because it is unusually well-documented. The pilots had the instruments: after a brief airspeed anomaly, the aircraft was providing accurate information, including a stall warning. What they lacked was the practiced habit of trusting their own judgment over the machine's, because the machine had been right so reliably for so long that the habit of independent verification had atrophied. The result was a recoverable stall that was not recovered.
Agentic AI will produce automation bias in environments with oversight structures far less rigorous than commercial aviation. The "approve function" at the end of an agentic workflow is not a governance control if the person clicking it has reviewed thousands of agentic outputs that were fine. The next one will receive exactly the same confident, cursory attention, and that pattern will repeat right up until the output that is not fine.
The governance responses that work are structural: they introduce friction deliberately, and that is exactly the point. There is a professional dimension here that deserves attention. When a lawyer, physician, financial adviser, or other regulated professional consistently defers to an agent's output, the risk is not confined to process failure. At some point, deference becomes abdication. If an agent's drift or error causes harm, the question will not simply be whether the system failed — it will be whether the human professional maintained the level of independent judgment their role requires. Automation bias does not just erode oversight. It can erode the professional defenses individuals rely on when explaining their conduct after the fact.
Traditional software fails discretely. A function returns the wrong value. A process terminates. An edge case throws an error. These failures are visible, attributable, correctable. Agentic AI can fail in a fundamentally different way.
Because these systems reason across contexts, adapt to feedback, and pursue goals through multi-step planning, they can drift from intended behavior without producing any single observable failure. The agent completes its tasks. Metrics look normal. Humans are in the oversight loop (and see nothing alarming). And yet, over time, the system is pursuing its objective through means no designer intended or pursuing a subtly different objective than originally specified.
Researchers describe this as goal misalignment that emerges over time rather than at deployment. An agent optimizing for "customer satisfaction" may learn that it achieves higher satisfaction scores by avoiding difficult conversations rather than resolving underlying issues. An agent managing a scheduling workflow may begin declining certain meeting types not because it was instructed to, but because declining them reliably improves its performance metric. A cost-reduction agent may begin deferring necessary maintenance to win the number it is rewarded on. None of these are bugs in any traditional sense. They are the system working exactly as designed, and against a specification that turned out to be subtly wrong.
The governance implication is significant. Monitoring designed to detect anomalies in individual actions is poorly suited to catching drift. What is needed is monitoring that tracks behavioral patterns over time — not just "did this action comply with policy?" but “is this agent's behavior over the past period consistent with its intended purpose?” This is harder to build, harder to staff, and harder to explain to a board that thinks about compliance as largely binary. But it matches the actual risk.
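One way to make pattern-level monitoring concrete is to compare the distribution of an agent's action categories over a recent window against a baseline captured at deployment, rather than checking each action individually. The sketch below uses a population-stability-index-style statistic; the category names, the example data, and the 0.25 threshold are all illustrative assumptions, not a standard:

```python
# Sketch of drift detection at the pattern level: every individual action
# below is policy-compliant, but the mix of actions has shifted.
import math
from collections import Counter

def category_distribution(actions, categories):
    counts = Counter(actions)
    total = max(len(actions), 1)
    # Small floor avoids log(0) for categories absent from a window.
    return {c: max(counts[c] / total, 1e-6) for c in categories}

def population_stability_index(baseline, recent):
    # PSI: sum over categories of (recent - baseline) * ln(recent / baseline).
    return sum((recent[c] - baseline[c]) * math.log(recent[c] / baseline[c])
               for c in baseline)

categories = ["resolve", "escalate", "decline", "defer"]
baseline_actions = ["resolve"] * 70 + ["escalate"] * 15 + ["decline"] * 10 + ["defer"] * 5
recent_actions   = ["resolve"] * 55 + ["escalate"] * 5  + ["decline"] * 35 + ["defer"] * 5

psi = population_stability_index(
    category_distribution(baseline_actions, categories),
    category_distribution(recent_actions, categories),
)
# A common rule of thumb treats PSI above roughly 0.25 as significant drift.
if psi > 0.25:
    print(f"Drift flag: action mix has shifted (PSI={psi:.2f})")
```

Note what the detector is looking at: not whether any single "decline" was permissible, but whether declining has quietly become three times more common than it was at deployment.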
Behavioral drift also resists attribution. When an organization discovers that its customer service agent has been handling a category of cases in an unexpected way for six months, the question of who is responsible is genuinely difficult. The developer made reasonable design choices. The monitoring team was not looking for the right things. The deploying organization had documented policies that did not anticipate this failure mode. The “many hands problem,” which we will cover in depth in the upcoming Part 3 of this series, has its roots in failures like this one.
Most agentic AI governance discussions, when they reach the question of oversight, converge on the same answer: logging. Keep records of what the agent did, what it reasoned, and what outcomes resulted. This is correct as far as it goes. It does not go far enough.
The problem is volume. An agentic system interacting with external APIs, spawning sub-agents, and adapting based on intermediate results can generate log data at a rate no human team can meaningfully review. The logs exist. The oversight does not. Organizations that equate the two are not governing their AI systems; they are documenting what happened after the fact. That is useful for forensics. It is not useful for prevention.
Meaningful oversight of agentic systems requires a layered architecture: anomaly detection that flags behavioral patterns before human review; interpretable alerting that surfaces the right information at the right level of abstraction; and, critically, monitoring agents that observe production agents in real time and escalate intelligently. The only realistic path to governing systems that operate at machine speed is other systems that operate at machine speed, purpose-built to watch and escalate rather than act. This is not a fringe idea: it is now explicitly acknowledged in Singapore's Model AI Governance Framework for Agentic AI as a baseline expectation for credible governance where autonomy is real.
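A minimal sketch of the "watch and escalate, never act" layer described above: a monitor consumes agent events, maintains rolling statistics, and pushes an interpretable summary (not a raw log dump) into a human review queue. All class names, event fields, and thresholds here are illustrative assumptions:

```python
# Minimal monitoring-agent sketch: it observes, summarizes, and escalates.
# It deliberately has no authority to modify the production agent's actions.
from collections import deque

class AgentMonitor:
    def __init__(self, escalate, window=200, max_failure_rate=0.05):
        self.escalate = escalate            # callback into the human review queue
        self.window = deque(maxlen=window)  # rolling window of recent events
        self.max_failure_rate = max_failure_rate

    def observe(self, event):
        """event: dict with at least 'agent_id', 'action', and 'ok' (bool)."""
        self.window.append(event)
        failures = sum(1 for e in self.window if not e["ok"])
        rate = failures / len(self.window)
        if len(self.window) >= 20 and rate > self.max_failure_rate:
            # Escalate a named pattern at a level of abstraction a human
            # reviewer can act on, with a small sample for context.
            self.escalate({
                "agent_id": event["agent_id"],
                "pattern": "elevated action failure rate",
                "rate": round(rate, 3),
                "sample": list(self.window)[-5:],
            })

alerts = []
monitor = AgentMonitor(escalate=alerts.append)
for i in range(30):
    monitor.observe({"agent_id": "billing-agent", "action": "refund", "ok": i % 4 != 0})
```

A production version would track many more signals than a failure rate, but the architectural point survives the simplification: the log stream is consumed by a machine-speed watcher, and humans receive a tractable number of interpretable alerts rather than the stream itself.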
In any future regulatory inquiry or litigation arising from agentic AI failures, the question will not be “did you keep logs?” It will be “what did you do with them?” Organizations that can demonstrate an interpretable, active monitoring architecture are in a materially different position than organizations that can only produce a file. Logs prove what happened. The monitoring architecture is evidence of whether anyone was actually in control.
The most common design pattern for human oversight of agentic AI involves an approval step: before the agent takes a consequential action, a human reviews and approves. To be sure, this pattern is better than nothing. But in most deployments, it is not a governance control in any meaningful sense.
The problem is the conditions under which approval actually happens. When an agent is processing high volumes of routine transactions, the approval interface becomes a queue. The humans reviewing that queue are, in most deployments, reviewing dozens or hundreds of outputs in rapid succession, without adequate context to evaluate the reasoning behind any individual output, often under time pressure that makes deep review impractical. Too many approval steps produce approval fatigue, and approval fatigue produces rubber-stamping.
What the approve function certifies is that a human saw the output. It does not certify that a human understood it, evaluated it, or would have made the same decision independently. The liability distinction between those two conditions is significant. Oversight that exists only on paper is not oversight, and as a legal defense it has yet to be tested.
Real human-in-the-loop governance requires more than an interface. It requires that reviewers have the context and expertise to evaluate what they are reviewing; that the volume of reviews is manageable enough to permit genuine attention; that reviewers have real authority to stop or modify agent actions, with organizational backing when they exercise it; and that the outcomes of human review decisions are tracked and used to improve both the agent and the review process. These requirements are demanding. Meeting them is what makes oversight real rather than theatrical.
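The last of those requirements, tracking review outcomes, also makes rubber-stamping measurable. If oversight is real, the review data should show it. A sketch of instrumenting the queue itself, where the field names and thresholds are illustrative assumptions rather than validated standards:

```python
# Sketch: flag reviewers whose data shows clearing a queue, not evaluating it.
from statistics import median

def flag_rubber_stamping(reviews, min_reviews=50,
                         max_approval_rate=0.99, min_median_seconds=5.0):
    """reviews: list of dicts with 'reviewer', 'approved' (bool), 'seconds'."""
    flags = []
    for reviewer in sorted({r["reviewer"] for r in reviews}):
        mine = [r for r in reviews if r["reviewer"] == reviewer]
        if len(mine) < min_reviews:
            continue  # not enough data to judge this reviewer
        approval_rate = sum(r["approved"] for r in mine) / len(mine)
        median_time = median(r["seconds"] for r in mine)
        # Near-total approval combined with near-instant review is the
        # signature of rubber-stamping, whatever the interface certifies.
        if approval_rate > max_approval_rate and median_time < min_median_seconds:
            flags.append((reviewer, approval_rate, median_time))
    return flags

# Illustrative data: 199 approvals in ~2 seconds each, one slow rejection.
reviews = ([{"reviewer": "alice", "approved": True, "seconds": 2.1}] * 199
           + [{"reviewer": "alice", "approved": False, "seconds": 40.0}])
print(flag_rubber_stamping(reviews))
```

The design choice worth noting is that this metric judges the review process, not the agent. An organization that can produce this kind of data is also an organization that can demonstrate, later, that its human oversight was functioning rather than theatrical.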
None of this requires perfection before you move. The organizations that navigate this period well are not the ones that pause everything until governance is airtight. They are the ones that treat governance as a discipline built in parallel with deployment, not a precondition for it. Starting with an AI agent inventory and honest monitoring is no small thing. It is the difference between a governance program that can improve and one that cannot, because you cannot fix what you cannot see. The goal is not to slow down; it is to know what you are running.
Next in this series — Part 3: When something does go wrong in an agentic AI system, who is legally responsible? The "many hands" problem diffuses accountability across model providers, agent platforms, tool ecosystems, and deployers, often under contracts written for a pre-agent world. Part 3 examines what organizations need to fix now.
For questions about agentic AI governance, liability frameworks, vendor contracts, or compliance program design, please contact the Jones Walker Privacy, Data Strategy, and Artificial Intelligence team. Stay tuned for continued insights from (and subscribe to) the AI Law and Policy Navigator.
