White Paper  ·  Aletheia Security Consulting  ·  June 2026

The Evidence Gap

AI Agents, Attribution, and SOC 2 CC8.1

AI coding agents are running in production at organizations that haven't thought through what this means for their change management evidence package. CC8.1 hasn't changed. The way changes get made has.

CC8.1 requires an evidence package. AI agents don't produce one.

SOC 2 CC8.1 requires that every production change come with a consistent evidence package: who authorized it, what the plan was, who implemented it, whether it was tested, and what the backout path is. That package has worked for human engineers and automated pipelines for years.

AI coding agents don't fit either model. They receive a goal and determine their own method at runtime — which files to touch, which commands to run, how many changes to make. They typically operate under a human engineer's credentials. The result is a log that looks internally clean while failing to answer the questions the audit depends on.

This isn't a logging problem. It's an evidence problem. And it's the kind auditors are starting to ask about.

Every part of the evidence package is affected.

The breaks aren't subtle. They go to the core structure of what change management evidence is supposed to prove.

Break 01  ·  Attribution

The log says Jane.
Jane was in a meeting.

Most organizations run AI agents under the invoking engineer's credentials — her SSH keys, her git identity. Every action the agent takes lands under her name. The log is internally accurate. It is also a lie with a clean format. It cannot tell you which changes a human evaluated and which ones no one ever looked at. That distinction is exactly what review and deviation analysis depend on.

Break 02  ·  Method

Approved before it existed.

A traditional change record contains the implementation method before work starts. The approver saw the method. Testing can be compared against the stated approach. Deviations are identifiable because there's a plan to deviate from. For an agent given a goal, the method materializes at machine speed after the approval is granted. There is no left side to the planned-versus-actual comparison the audit depends on.

Break 03  ·  Review

Approving what you can't evaluate.

One deployment can contain hundreds of agent-generated commits. The reviewer approving the merge faces a diff that may be impossible to meaningfully evaluate in the time available. The approval record looks identical whether the review was substantive or reflexive. An approval that can't be exercised as a review isn't evidence of review. It's evidence that a control exists on paper.

Most organizations are solving the wrong problem.

The most common responses to the agent governance question are all partially correct — and each one is routinely mistaken for a complete answer.

Correct first step · Not the destination

Dedicated service accounts

Service account treatment ends the attribution problem — it gives auditors a truthful actor in the log. But it says nothing about what method was planned, what scope was authorized for this session, or whether a backout plan existed. The deeper breaks remain fully open. And a standing service account carries persistent, broad credentials — the opposite of bounded, per-change authorization.

Creates the thing it's trying to fix

Per-action human approval

Require sign-off before every agent action and you've restored human review in theory while destroying it in practice. At any meaningful scale, per-action approvals at agent speed become rubber stamps: recorded as if evaluation occurred, impossible to exercise with actual judgment. Manufactured evidence of review is worse than honestly documenting that an agent operated under appropriate governance.

Necessary · Not testable alone

Policy documentation

Written policy is necessary. "Agents must not deploy without review" is the right policy to have. But the auditor doesn't test whether the policy exists — she tests whether the control operated. If agent activity isn't instrumented before the action occurs, the compliance evidence is generated after the fact by the same systems that did the work. A well-formatted self-attestation is still self-attestation.

The Evidence Parity Framework

The framework's premise is straightforward: hold AI agents to the same evidentiary bar human engineers already meet — at plan granularity, not keystroke granularity. A named human approves a concrete plan before work starts. An identified actor executes it. Deviations are detected and handled. Records exist that no one can quietly rewrite.

Pillar 01

Plan-Bound Authorization

Before any session starts, the agent generates a plan record — objective, method, scope, test approach, backout path — and a human with appropriate authority approves it. Approval is bound to this artifact, not to a goal.
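As a minimal sketch of what binding approval to the plan artifact could look like, the example below hashes a plan record and stores that digest on the approval. The class names, field names, and the choice of SHA-256 over canonical JSON are illustrative assumptions, not part of the framework as published.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PlanRecord:
    # Fields mirror the plan elements named in the text; the schema is hypothetical.
    objective: str
    method: str
    scope: tuple[str, ...]
    test_approach: str
    backout_path: str

    def digest(self) -> str:
        # Canonical JSON (sorted keys, fixed separators) so an identical plan
        # always produces an identical digest.
        canonical = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Approval:
    approver: str
    plan_digest: str  # the approval points at one exact plan, not at a goal


def approve(plan: PlanRecord, approver: str) -> Approval:
    """Record that `approver` signed off on this exact plan content."""
    return Approval(approver=approver, plan_digest=plan.digest())


def approval_covers(approval: Approval, plan: PlanRecord) -> bool:
    """True only if the plan is byte-for-byte the one that was approved."""
    return approval.plan_digest == plan.digest()
```

Under this sketch, any change to the method or scope after sign-off produces a different digest, so the earlier approval no longer covers the plan and a fresh approval is needed.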

Pillar 02

Agent Identity Separation

Each session runs under its own ephemeral credential, issued for that session, bound to the approved plan, expired when the session ends. The record can now distinguish what the human did from what the agent did.
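One way to picture a per-session credential is sketched below. It is an illustration under assumptions: the `SessionCredential` type, the `agent-session-` prefix, the 60-minute default lifetime, and the opaque `plan_digest` string are all hypothetical, and a real deployment would issue credentials through its own identity provider or secrets manager.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class SessionCredential:
    session_id: str      # distinct actor name that appears in logs
    plan_digest: str     # identifier of the approved plan this session is bound to
    token: str           # random secret, never reused across sessions
    expires_at: datetime
    revoked: bool = False

    def is_valid(self, now: datetime) -> bool:
        # Valid only while unexpired and not revoked.
        return not self.revoked and now < self.expires_at


def issue_credential(plan_digest: str, now: datetime, ttl_minutes: int = 60) -> SessionCredential:
    """Issue a fresh credential for one agent session (hypothetical helper)."""
    return SessionCredential(
        session_id=f"agent-session-{secrets.token_hex(8)}",
        plan_digest=plan_digest,
        token=secrets.token_urlsafe(32),
        expires_at=now + timedelta(minutes=ttl_minutes),
    )


if __name__ == "__main__":
    start = datetime.now(timezone.utc)
    cred = issue_credential("example-plan-digest", start, ttl_minutes=30)
    print(cred.session_id, cred.is_valid(start))
```

Because the session identifier is generated per session and differs from the engineer's own identity, log entries written under it can be separated from actions the human took directly.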

Pillar 03

Execution Records & Deviation Handling

Every action is logged under the session credential. Resources the agent actually touched are compared against the declared scope. Action outside the envelope triggers revocation and an exception record — proof the boundary was real.
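The scope comparison can be sketched in a few lines. This is an assumption-laden illustration: scope is expressed as glob patterns, the `Session` type and `record_action` helper are hypothetical, and revocation is modeled as a flag rather than a call to a real credential service.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class Session:
    session_id: str
    declared_scope: list[str]  # glob patterns taken from the approved plan
    revoked: bool = False
    log: list[dict] = field(default_factory=list)
    exceptions: list[dict] = field(default_factory=list)


def in_scope(resource: str, declared_scope: list[str]) -> bool:
    # fnmatchcase's "*" also matches "/" so "services/billing/*" covers subdirectories.
    return any(fnmatchcase(resource, pattern) for pattern in declared_scope)


def record_action(session: Session, action: str, resource: str) -> bool:
    """Log an action; revoke the session and file an exception if it is out of scope."""
    if session.revoked:
        raise PermissionError(f"{session.session_id} has been revoked")
    entry = {"actor": session.session_id, "action": action, "resource": resource}
    session.log.append(entry)  # every attempted action is logged under the session identity
    if not in_scope(resource, session.declared_scope):
        session.revoked = True
        session.exceptions.append(entry)  # the exception record is the evidence
        return False
    return True
```

A short usage example, with made-up paths:

```python
session = Session("agent-session-01", ["services/billing/*"])
record_action(session, "write", "services/billing/invoice.py")  # in scope, returns True
record_action(session, "write", "infra/prod/iam.tf")            # out of scope, returns False
print(session.revoked, session.exceptions)
```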

Pillar 04

Two-Stage Verification

Verification runs twice: at credential issuance (plan complete, approver authenticated, no change freeze) and at promotion (vulnerability scan, test evidence, scope conformance). Both stages write to the audit ledger.
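The two gates might be modeled as below. This is a sketch, not a reference implementation: the check names come from the text, while the `AuditLedger` class, the function signatures, and the use of plain booleans for scan and test results are assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Plan elements that must be present before a credential is issued (assumed field names).
REQUIRED_PLAN_FIELDS = ("objective", "method", "scope", "test_approach", "backout_path")


@dataclass
class AuditLedger:
    entries: list[dict] = field(default_factory=list)

    def write(self, stage: str, failures: list[str]) -> bool:
        # Pass and fail outcomes are both recorded, so the ledger shows the gate ran.
        passed = not failures
        self.entries.append({"stage": stage, "passed": passed, "failures": list(failures)})
        return passed


def verify_issuance(plan: dict, approver_authenticated: bool,
                    change_freeze: bool, ledger: AuditLedger) -> bool:
    """Stage 1: run before a session credential is issued."""
    failures = [f"missing plan field: {name}"
                for name in REQUIRED_PLAN_FIELDS if not plan.get(name)]
    if not approver_authenticated:
        failures.append("approver not authenticated")
    if change_freeze:
        failures.append("change freeze in effect")
    return ledger.write("issuance", failures)


def verify_promotion(scan_clean: bool, tests_passed: bool,
                     scope_conformant: bool, ledger: AuditLedger) -> bool:
    """Stage 2: run before the change is promoted."""
    failures = []
    if not scan_clean:
        failures.append("vulnerability scan failed")
    if not tests_passed:
        failures.append("test evidence missing or failing")
    if not scope_conformant:
        failures.append("actions outside declared scope")
    return ledger.write("promotion", failures)
```

Writing a ledger entry on failure as well as success is a deliberate choice in this sketch: a record of a blocked issuance is itself evidence that the control operated.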

Pillar 05

Independent Anchoring

A governance ledger operated by the same team running the agents is self-attestation with cryptographic decoration. Independence requires key custody separation and external anchoring to a system the organization cannot rewrite.
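To illustrate why the external anchor matters, the sketch below builds a simple hash chain over ledger entries and compares its head against a previously anchored value. The chain construction, the all-zero genesis value, and the function names are assumptions; the anchored head is held in a plain variable here, whereas in practice it would live in a system the organization cannot rewrite, and key custody separation is not modeled at all.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for an empty chain


def entry_hash(prev_hash: str, payload: dict) -> str:
    # Each entry's hash covers the previous hash, so editing any earlier
    # entry changes every hash after it.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def chain_head(payloads: list[dict]) -> str:
    """Fold all entries into a single head digest."""
    head = GENESIS
    for payload in payloads:
        head = entry_hash(head, payload)
    return head


def matches_anchor(payloads: list[dict], anchored_head: str) -> bool:
    """Compare the ledger as it stands now against the externally held head."""
    return chain_head(payloads) == anchored_head


if __name__ == "__main__":
    ledger = [
        {"seq": 1, "actor": "agent-session-01", "action": "write"},
        {"seq": 2, "actor": "agent-session-01", "action": "deploy"},
    ]
    anchor = chain_head(ledger)  # would be published externally at this point
    ledger[0]["actor"] = "jane"  # a quiet rewrite by whoever controls the ledger
    print(matches_anchor(ledger, anchor))
```

A chain on its own proves little if the same team can recompute every hash after an edit. The comparison only carries weight when the anchored head sits somewhere that team cannot change.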

The questions are in fieldwork now.

In the absence of formal guidance, questions about agent-made changes are beginning to appear in CC8.1 fieldwork. Organizations that can answer them from records are ahead of the requirement. Organizations that can't are accumulating compliance debt with each agent session running today.

Download the full paper.

The Evidence Gap: AI Agents, Attribution, and SOC 2 CC8.1

The complete paper covers the evidence package CC8.1 actually requires, the three specific ways AI agents break it, why the most common responses close only part of the gap, the full Evidence Parity Framework with all five pillars, a five-phase implementation roadmap, and the questions auditors are already starting to ask.

  • How AI agents break SOC 2 CC8.1's evidence package — in three specific ways
  • Why service account treatment and policy documentation aren't enough
  • The Evidence Parity Framework: plan-bound authorization, agent identity separation, independent anchoring
  • A five-phase implementation roadmap ordered by evidentiary yield
  • The questions auditors are already starting to ask in CC8.1 fieldwork

Steve Weltman, CISSP  ·  Aletheia Security Consulting  ·  © 2026

Your next audit will ask about this.
Better to answer from records.

Book a 30-minute conversation. We'll talk about where your agent governance stands today, what an auditor would find, and what it would take to close the gap before they do. No pitch. No proposal. Just an honest conversation.