Agent audit and replay is the discipline of making every agent run provable, reviewable, and reversible, even after the chat transcript is gone or disputed. In Claw EA, the operational unit is a run governed by a WPC, authenticated by a CST, and evidenced by gateway receipts and a proof bundle.

OpenClaw is the baseline agent runtime, but prompt-only control is not enough because prompts are not permission boundaries. You need a permissioned execution layer with policy-as-code so tools, model calls, and replay constraints are enforced by machines, not “best effort” instructions.

Step-by-step runbook

This runbook aims for audit-ready operation without inventing new infrastructure. It assumes OpenClaw is your runtime, and Claw Bureau services are used for policy and evidence.

  1. Define the work boundary for a run: objective, data classes touched, and allowed tools. Write it as a WPC and treat it like a change-controlled artifact that you can hash, sign, and fetch deterministically.

    Keep the WPC stable for the entire run so later audits can answer “what was the agent allowed to do at that time?”

  2. Issue a CST for the job and pin it to the policy hash when you need strict replay controls. The CST is the on-wire authorization, and the scope hash is the concise representation that auditors can compare across runs.

    For anti-replay, use job-scoped CST binding so a token captured from one job cannot be reused to launder a different job’s activity (both checks are sketched after this runbook).

  3. Route model calls through clawproxy so each call produces a gateway receipt. This is the evidence layer for “what model was called, with what request envelope, and what came back,” in a form that can be verified later.

    If you use OpenRouter via fal, keep it on that path through clawproxy so receipts cover the routed call.

  4. Lock down tool execution locally in OpenClaw using sandboxing and tool policy, and confirm the effective configuration before enabling external triggers. Prompt text can request anything, but OpenClaw tool allow/deny and sandbox settings decide what can actually execute.

    Run periodic checks with OpenClaw security audit guidance, especially after changing network exposure, plugins, or channel policies.

  5. At the end of the run, emit a proof bundle that includes gateway receipts and run metadata needed for later verification. Store the proof bundle with your retention policy, and optionally publish a Trust Pulse for audit viewing.

    Retention should cover at least the period where decisions can be challenged (for example: access reviews, incident response, financial close).

  6. Verify on demand: when an incident occurs, you want to verify evidence without re-running the agent. Treat verification as a fail-closed gate in your incident workflow: if a proof bundle cannot be verified, the run is not admissible as evidence (a minimal gate is sketched after this runbook).

    Replay should be “replay the evidence,” not “replay the model.” You can re-simulate tool steps in a safe sandbox, but you should not claim deterministic reproduction of frontier model outputs.
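
To make steps 1, 2, and the anti-replay note concrete, here is a minimal Python sketch of the two checks an enforcement point might run before executing anything. The field names (pinned_policy_hash, job_id) echo the policy example later in this piece; the canonicalization scheme and the CST claim layout are assumptions for illustration, not a published schema.

import hashlib
import json

def canonical_json(obj) -> bytes:
    # Assumption: canonicalize by sorting keys and stripping whitespace.
    # A real deployment would pin one canonicalization scheme alongside the WPC.
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")

def wpc_ref(wpc_document: dict) -> str:
    # Hash-address the policy artifact so “what was allowed” has a stable identity.
    return "wpc:sha256:" + hashlib.sha256(canonical_json(wpc_document)).hexdigest()

def check_pin_and_job(cst_claims: dict, wpc_document: dict, request_job_id: str) -> None:
    # Fail closed: any mismatch aborts the run before a tool or model call happens.
    if cst_claims.get("pinned_policy_hash") != wpc_ref(wpc_document):
        raise PermissionError("CST is not pinned to the policy in effect")
    if cst_claims.get("job_id") != request_job_id:
        raise PermissionError("job-scoped binding failed: token issued for a different job")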
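
Step 6’s fail-closed gate can be as plain as the sketch below: it refuses a bundle unless every receipt passes a signature check and is bound to the bundle’s job and scope hash. verify_signature stands in for whatever signature scheme gateway receipts actually use, and the bundle layout is illustrative, not a documented format.

from typing import Callable

def verify_bundle(bundle: dict, verify_signature: Callable[[dict], bool]) -> bool:
    # Fail closed: a missing field, a bad signature, or a mismatched binding
    # makes the whole bundle inadmissible as evidence.
    try:
        job_id = bundle["job"]["job_id"]
        scope_hash = bundle["auth"]["cst_scope_hash"]
        receipts = bundle["receipts"]
        if not receipts:
            return False
        for receipt in receipts:
            if not verify_signature(receipt):
                return False
            if receipt["job_id"] != job_id or receipt["cst_scope_hash"] != scope_hash:
                return False
        return True
    except (KeyError, TypeError):
        return False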

Threat model

Audit and replay fail most often from small operational gaps: missing policy pinning, permissive tool profiles, or logs that cannot be tied to a specific job. The table below lists concrete failure modes and what to do about them.

| Threat | What happens | Control |
| --- | --- | --- |
| Prompt injection triggers unintended tool use | The agent follows hostile instructions embedded in content and calls file, shell, browser, or network tools. | Permissioned execution: enforce tool allow/deny in OpenClaw, use sandboxing for tool execution, and bind the run to a WPC that restricts tool categories. |
| Policy drift between runs | A team changes config and later cannot prove what was in effect when a decision was made. | Use a WPC as the hash-addressed policy artifact and pin the CST to the policy hash so the run carries its policy identity. |
| Token replay across jobs | A captured token is reused to make calls that appear to belong to a different run. | Use job-scoped CST binding for anti-replay, and store the job identifier inside the proof bundle metadata. |
| Evidence without verifiability | You have transcripts or app logs, but cannot prove which model calls occurred or whether content was altered. | Route model calls via clawproxy to produce gateway receipts, and package them into a proof bundle for verification. |
| Over-collection of sensitive data in logs | Logs become a second breach surface, or retention violates internal policy. | Keep retention scoped to what you can justify; store proof bundles with access controls; minimize raw prompt and tool output retention where feasible while keeping receipts and policy hashes. |
| Unsafe local execution surface | An agent with host execution can modify system state, credentials, or other workloads. | Prefer OpenClaw sandboxing modes, avoid elevated execution unless explicitly required, and regularly run the OpenClaw security audit checks. |

Policy-as-code example

Prompt-only “rules” are easy to bypass because they are interpreted by the model. A WPC is policy-as-code: it is a signed, hash-addressed policy artifact served by clawcontrols, so enforcement can check the hash and fail closed.

Below is a compact, JSON-like sketch of what teams typically pin for audit and replay. Treat it as an example shape, not a guaranteed schema.

{
  "wpc_ref": "wpc:sha256:<policy_hash>",
  "job": {
    "job_id": "job_2026_02_11_001",
    "owner": "team-finops",
    "purpose": "invoice_reconciliation"
  },
  "auth": {
    "cst_scope_hash": "<scope_hash>",
    "pin_policy_hash": true,
    "anti_replay": "job_scoped_binding"
  },
  "runtime": {
    "baseline": "OpenClaw",
    "sandbox": { "mode": "all", "workspaceAccess": "ro" },
    "tools": {
      "allow": ["read", "write", "http", "browser"],
      "deny": ["exec", "elevated"]
    }
  },
  "models": {
    "route": "OpenRouter via fal (through clawproxy)",
    "require_gateway_receipts": true
  },
  "retention": {
    "proof_bundle_days": 180,
    "access": "security-review-only"
  }
}
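
If you pin a shape like the one above, it is cheap to lint it before issuing a CST. The sketch below assumes the example’s field names and flags a few obviously unsafe combinations; the specific rules are illustrative, not an official validator.

def lint_policy(policy: dict) -> list[str]:
    # Returns human-readable problems; an empty list means “no objections.”
    problems = []
    tools = policy.get("runtime", {}).get("tools", {})
    allowed = set(tools.get("allow", []))
    denied = set(tools.get("deny", []))
    if {"exec", "elevated"} & allowed:
        problems.append("host execution is allowed; prefer sandboxed tools")
    if allowed & denied:
        problems.append("a tool appears in both allow and deny lists")
    if not policy.get("auth", {}).get("pin_policy_hash"):
        problems.append("CST is not pinned to the policy hash")
    if not policy.get("models", {}).get("require_gateway_receipts"):
        problems.append("model calls may bypass the receipt layer")
    return problems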

What proof do you get?

For audit, you need evidence that is machine-verifiable and tied to a specific job. In Claw EA, the core artifacts are gateway receipts and the proof bundle that packages them with run identifiers and policy references.

Gateway receipts are signed receipts emitted by clawproxy for model calls. They help you answer: which model route was used, which request envelope was sent, what response was returned, and whether the call was authorized under the CST presented.

A proof bundle is the harness artifact bundling receipts and related metadata for audit and verification. In practice, you store the proof bundle as the authoritative record of the run’s model interaction evidence, alongside the WPC reference and the CST scope hash used at execution time.
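
In code terms, a proof bundle can be as simple as a record that freezes the run’s identity next to its receipts. The layout below is a guess consistent with the artifacts named in this piece (receipts, WPC reference, CST scope hash), not a documented format.

import datetime

def build_proof_bundle(job_id: str, wpc_ref: str, cst_scope_hash: str,
                       receipts: list[dict]) -> dict:
    # Freeze run identity and evidence together so later verification
    # does not depend on mutable application logs.
    return {
        "job": {"job_id": job_id},
        "policy": {"wpc_ref": wpc_ref},
        "auth": {"cst_scope_hash": cst_scope_hash},
        "receipts": receipts,  # signed gateway receipts, one per model call
        "emitted_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }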

For operational review, you can also publish a Trust Pulse, a marketplace-stored artifact for audit viewing. Use it when you want a stable, shareable view of the run evidence without handing out raw internal logs by default.

Retention is a policy decision, not a logging accident. Keep the proof bundle long enough to support incident response and contested decisions, and keep access tight so evidence does not become an unbounded data lake.

Rollback posture

Rollback in agent systems is mostly about containing blast radius and restoring known-good state. You rarely “roll back the model,” but you can roll back credentials, permissions, channel exposure, and the tool surface while preserving evidence.

| Action | Safe rollback | Evidence |
| --- | --- | --- |
| Suspected compromised job token | Revoke the CST and issue a new one for a new job. Keep job-scoped CST binding so old tokens cannot be repurposed. | Proof bundle shows the CST scope hash used per run, and the gateway receipts show calls made under that authorization. |
| Policy mistake (too-permissive tools) | Publish a new WPC with tightened permissions, then pin the new policy hash for subsequent runs. Do not “edit” the old WPC for historical runs. | WPC hash identifies the policy in effect; proof bundle and receipts tie activity to that specific policy identity. |
| Unexpected model behavior in production | Freeze the model route for the workflow and require future calls to route through clawproxy for receipting. If needed, disable high-risk tools in OpenClaw tool policy immediately. | Gateway receipts provide the model-call evidence for the impacted window; OpenClaw configuration and audit guidance support post-change review. |
| Channel exposure or inbound abuse | Tighten inbound allowlists and mention requirements in OpenClaw channel policy. Reduce the number of places where strangers can trigger runs. | OpenClaw security audit guidance and configuration state support the “what changed” narrative; proof bundles cover model-call evidence for triggered runs. |
| Need to prove what happened without rerunning | Do not rerun the job as a first step. Verify the proof bundle and review receipts and run metadata; use sandboxed re-simulation only for contained reproduction. | Proof bundle is the authoritative evidence package; receipts are signed and can be checked for integrity. |

FAQ

Why isn’t a transcript enough for audit and replay?

A transcript is easy to truncate, redact incorrectly, or reformat without detection. Gateway receipts and a proof bundle are designed to be verified, and they bind model-call evidence to a job, a CST scope hash, and a WPC reference.

What does “replay” mean if model outputs are nondeterministic?

Replay means replaying the evidence and the execution constraints, not guaranteeing identical text output. You can validate that specific model calls occurred and then re-simulate tool steps safely, but you should not treat a second model run as the same event.

How does policy-as-code stop prompt injection better than prompt rules?

Prompt rules are suggestions to the model, and the model can be coerced or can make mistakes. Policy-as-code is enforced outside the model: OpenClaw controls which tools can run, and WPC plus CST pinning makes the allowed surface explicit and verifiable.
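
A concrete way to see the difference: the gate below runs outside the model, so no amount of prompt text changes its answer. The deny-wins, default-deny behavior mirrors the tool policy in the earlier example; the function itself is an illustrative sketch, not OpenClaw’s implementation.

def is_tool_allowed(tool: str, policy: dict) -> bool:
    # Deny wins over allow, and anything unlisted is denied by default,
    # so a prompt-injected request for a new tool fails closed.
    tools = policy.get("runtime", {}).get("tools", {})
    if tool in set(tools.get("deny", [])):
        return False
    return tool in set(tools.get("allow", []))

With the earlier policy pinned, is_tool_allowed("exec", policy) returns False regardless of what the prompt asks for.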

What is the minimum set of artifacts to retain for an audit-ready posture?

At minimum: the WPC reference (hash), the CST scope hash used for the job, and the proof bundle containing gateway receipts. Keep additional application logs only if they are necessary, and keep access to evidence limited to review roles.

How does this relate to Microsoft audit logs?

If you operate in Microsoft environments, keep your agent governance aligned with Microsoft audit logging and review flows for agent activity. Use Microsoft terminology and controls where applicable (for example: Entra ID, Conditional Access, PIM), and integrate via official APIs where needed rather than assuming native connectors.
