To run production deploys with two-person approval, treat “deploy” as a permissioned action gated by policy-as-code, not a prompt instruction. In Claw EA, you do this by binding an agent run to a Work Policy Contract (WPC) and issuing a CST (a scoped, job-bound token) pinned to that policy hash, so the execution layer can fail closed when approvals or dry-run evidence are missing.

OpenClaw is the baseline agent runtime, but prompts alone cannot guarantee that a deploy happens only after two humans approve. The enforcement must live in the execution path, so that tool calls and model calls are accepted only when the WPC conditions are satisfied and verifiable artifacts are attached.

Step-by-step runbook

Use this runbook when an agent proposes a production change and you want machine-enforced approvals, a forced dry-run, and audit-ready evidence. The details of the approval UI and ticketing can be implemented via an official API or via an MCP server, but the enforcement points stay the same.

  1. Define the production deploy WPC. Put “deploy-to-prod” behind an explicit policy gate that requires two distinct approvers and a dry-run artifact. Store the WPC in the WPC registry (served by clawcontrols) so it is signed and hash-addressed.

  2. Issue a CST that pins the policy hash. When a deploy job is created, mint a CST (issued by clawscope) scoped to only the deploy tool and the target environment, optionally pinning the WPC hash. This makes the job non-replayable across unrelated runs and gives you a stable reference for verification; see the token sketch after this runbook.

  3. Run “plan/dry-run” first and capture outputs. The agent must execute a forced dry-run before any production apply. If you use Terraform, this is “plan”; if you use Kubernetes, this might be “server-side dry run” plus a diff; if you use a custom deployer, produce a deterministic preview artifact.

  4. Collect two-person approval as signed inputs. Two humans approve the exact dry-run output, not a paraphrase. The approvals can come from your control plane (enterprise buildout), or from an existing workflow system via official API, but each approval must record approver identity, timestamp, and the hash of the dry-run artifact.

  5. Execute “apply” only after approvals validate. The deploy tool validates the WPC requirements locally before running any irreversible action. If approvals are missing, mismatched to the dry-run hash, or from the same identity, the tool fails closed and produces a denial record.

  6. Proxy model calls through clawproxy. Route the agent’s model calls through clawproxy so that every model call produces a Gateway receipt. This lets you later prove which prompts and outputs were used during the decision path without relying on best-effort logging.

  7. Bundle evidence for audit. At the end of the job, assemble a proof bundle that includes the Gateway receipts, the WPC hash, the CST scope hash, dry-run artifacts, and approval records. Optionally publish the resulting artifact into Trust Pulse for viewing.
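
To make step 2 concrete, here is a minimal sketch, in Python, of minting and validating a job-bound CST. The claim names, helper functions, and one-hour lifetime are assumptions for illustration, not the clawscope API; the point is that the token carries a scope hash and an optional WPC hash pin that the execution layer re-checks before honoring any deploy tool call.

# Sketch only: claim names and helpers are assumptions, not the clawscope API.
import hashlib
import json
import time
import uuid

def canonical_hash(obj: dict) -> str:
    """Hash a JSON object with sorted keys so the result is stable."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()

def mint_cst_claims(wpc: dict, environment: str) -> dict:
    """Build the claims for a job-bound CST scoped to the deploy tools."""
    scope = {
        "tools": ["deploy.dry_run", "deploy.apply_prod"],  # mirror the WPC allow list
        "environment": environment,
    }
    return {
        "job_id": str(uuid.uuid4()),          # binds the token to this run only
        "scope": scope,
        "scope_hash": canonical_hash(scope),  # stable reference for later verification
        "wpc_hash": canonical_hash(wpc),      # optional policy pinning
        "iat": int(time.time()),
        "exp": int(time.time()) + 3600,       # short-lived: one deploy job
    }

def validate_cst(claims: dict, expected_wpc_hash: str, requested_tool: str) -> None:
    """Fail closed if the token is expired, mis-scoped, or pinned to a different WPC."""
    if claims["exp"] < time.time():
        raise PermissionError("CST expired")
    if claims["scope_hash"] != canonical_hash(claims["scope"]):
        raise PermissionError("CST scope hash mismatch")
    if claims.get("wpc_hash") and claims["wpc_hash"] != expected_wpc_hash:
        raise PermissionError("CST is pinned to a different WPC")
    if requested_tool not in claims["scope"]["tools"]:
        raise PermissionError(f"tool {requested_tool} is outside the CST scope")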

Threat model

Two-person approval exists because production deploys are irreversible or expensive to unwind. The common failure mode in agent systems is that a prompt suggests a constraint, but the tool still executes because nothing in the runtime enforces it.

| Threat | What happens | Control |
|---|---|---|
| Prompt injection pushes a direct “deploy now” instruction | The agent attempts to skip approvals and calls the deploy tool directly | Policy-as-code WPC gating on the deploy tool, enforced at execution time; missing approvals cause the tool to fail closed |
| Single approver rubber-stamps twice | Two approvals exist but are not independent | Two-person rule requires distinct identities; enforce “unique approver” and reject duplicates |
| Approval does not match what is deployed | Humans approve a plan, but the agent applies a different change | Forced dry-run; approvals bind to the dry-run artifact hash; the deploy tool verifies the hash match before apply |
| Replay of a previously approved deploy | A past approval is reused for a new deploy attempt | Anti-replay binding via the job-scoped CST; approvals must be job-bound and hash-bound |
| Unbounded tool blast radius in the runtime | The agent uses extra tools to exfiltrate secrets or change IAM as a side effect | OpenClaw tool policy and sandboxing reduce available tools and isolate execution; the WPC restricts which actions are in scope |
| Dispute about what the model saw or produced | Post-incident, you cannot prove whether a risky instruction came from the model or a human | Gateway receipts from clawproxy provide signed records of model calls; receipts are packaged into the proof bundle |

Policy-as-code example

This is a compact, JSON-like sketch of a WPC that enforces three things: forced dry-run, two distinct approvals, and a strict tool scope for production apply. The enforcement should happen inside the deploy tool wrapper or the execution gateway that brokers the tool call.

{
  "wpc_version": "v1",
  "policy_name": "prod_deploy_two_person_approval",
  "risk_class": "irreversible",
  "tool_scope": {
    "allow": [
      "deploy.dry_run",
      "deploy.apply_prod",
      "read.repo",
      "read.build_artifacts"
    ],
    "deny": [
      "rotate_credentials",
      "change_firewall",
      "iam.write"
    ]
  },
  "required_sequence": [
    {
      "step": "dry_run",
      "evidence": {
        "artifact_type": "deploy_preview",
        "hash_alg": "sha256"
      }
    },
    {
      "step": "approval",
      "rule": "two_person",
      "constraints": {
        "distinct_identities": true,
        "bind_to_artifact_hash": "dry_run.deploy_preview.sha256"
      }
    },
    {
      "step": "apply",
      "guard": {
        "must_match_artifact_hash": "dry_run.deploy_preview.sha256"
      }
    }
  ],
  "token_requirements": {
    "cst": {
      "scope_hash_required": true,
      "optional_policy_hash_pinning": true,
      "job_bound": true
    }
  }
}

Why policy-as-code instead of prompt-only: prompts can request approvals, but they cannot prevent a tool call from being made. A WPC is evaluated in the execution path, so “no approvals” is not a suggestion; it is a hard stop.
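
As a rough illustration of that hard stop, the sketch below shows a deploy tool wrapper evaluating the policy’s dry-run and two-person rules before the apply step. The record shapes and function names are assumptions for this example, not Claw EA or OpenClaw APIs; the real check lives in whatever brokers the deploy tool call.

# Sketch only: record shapes and function names are assumptions, not Claw EA APIs.
import hashlib

def sha256_file(path: str) -> str:
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def check_two_person_approval(approvals: list[dict], dry_run_hash: str) -> None:
    """Enforce the WPC rules: two approvals, distinct identities, bound to the dry-run hash."""
    bound = [a for a in approvals if a.get("artifact_sha256") == dry_run_hash]
    if len(bound) < 2:
        raise PermissionError("need two approvals bound to the dry-run artifact hash")
    if len({a["approver_id"] for a in bound}) < 2:
        raise PermissionError("approvals must come from two distinct identities")

def apply_prod(deploy_fn, plan_path: str, approved_plan_hash: str, approvals: list[dict]):
    """Run the irreversible apply only after every WPC guard passes; otherwise fail closed."""
    current_hash = sha256_file(plan_path)
    if current_hash != approved_plan_hash:
        raise PermissionError("plan changed after approval: hash mismatch, refusing to apply")
    check_two_person_approval(approvals, current_hash)
    return deploy_fn(plan_path)  # only reached when all guards pass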

What proof do you get?

Each deploy job can produce evidence that is checkable after the fact, without trusting the agent’s narrative. The core artifacts are Gateway receipts, a proof bundle, and the policy and token bindings that show what was allowed.

  • Gateway receipts. Signed receipts emitted by clawproxy for model calls, including the job context needed to verify that the model traffic was routed through the proxy. This is the backbone for answering “what did the model see and output?”

  • WPC reference. The proof bundle includes the WPC hash (and optionally the full WPC) so reviewers can reconstruct the exact constraints in force for that job. Because WPCs are signed and hash-addressed, you can verify you are looking at the same policy that was enforced.

  • CST scope hash and policy pinning. The CST (issued by clawscope) can be validated against its scope hash and, when used, the pinned WPC hash. This ties the job’s permissions to a specific contract and blocks “scope drift” during execution.

  • Proof bundle. A harness artifact bundling receipts and related metadata for audit and verification, including dry-run artifact hashes and the two approval records. This is what you hand to security, change management, or auditors; a verification sketch follows this list.

  • Trust Pulse (optional). If you need a consistent place for reviewers to view an audit artifact, you can store the proof bundle as a Trust Pulse artifact for audit/viewing.
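
As a sketch of what an after-the-fact check could look like, the snippet below assumes the proof bundle is stored as JSON with the fields described above; the field names and file layout are illustrative, not a fixed Claw EA format.

# Sketch only: the bundle layout is an assumption based on the artifacts listed above.
import hashlib
import json

def sha256_bytes(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def verify_proof_bundle(bundle_path: str, wpc_path: str, dry_run_path: str) -> list[str]:
    """Return a list of verification failures; an empty list means the bundle checks out."""
    with open(bundle_path) as f:
        bundle = json.load(f)
    problems = []

    # 1. The enforced policy matches the WPC we think was in force.
    with open(wpc_path, "rb") as f:
        if bundle["wpc_hash"] != sha256_bytes(f.read()):
            problems.append("WPC hash in bundle does not match the registered policy")

    # 2. The approved dry-run artifact is the one that was applied.
    with open(dry_run_path, "rb") as f:
        dry_run_hash = sha256_bytes(f.read())
    if bundle["dry_run_sha256"] != dry_run_hash:
        problems.append("dry-run artifact hash mismatch")

    # 3. Two distinct approvers signed off on that exact hash.
    approvers = {a["approver_id"] for a in bundle["approvals"]
                 if a.get("artifact_sha256") == dry_run_hash}
    if len(approvers) < 2:
        problems.append("fewer than two distinct approvals bound to the dry-run hash")

    # 4. Gateway receipts are present for the job (signature checks would go here).
    if not bundle.get("gateway_receipts"):
        problems.append("no Gateway receipts attached for model calls")

    return problems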

Rollback posture

Two-person approval reduces the chance of a bad deploy, but you still need an operational rollback posture. The key is to make rollback a first-class action with its own policy gates and its own evidence, rather than an ad hoc “fix forward” chat.

| Action | Safe rollback | Evidence |
|---|---|---|
| Prod deploy apply | Prefer atomic deploy mechanisms (blue/green, canary, or versioned releases) so rollback is a version switch, not a manual edit | Proof bundle includes the dry-run hash, two approvals bound to that hash, and Gateway receipts for model calls |
| Config change | Store the prior configuration and use an automated revert path that re-applies the last known good version | WPC identifies the exact tool scope used; receipts show the sequence of calls leading to the change |
| Credential rotation | Use staged rotation with overlap and a documented break-glass path; treat rotation as high risk and gate it separately | Separate WPC and CST scope; the proof bundle for rotation should be distinct from deploy proof bundles |
| IAM or firewall changes | Use least privilege and time-bound access; apply changes via infrastructure-as-code so rollback is a revert and apply | WPC denies these in the deploy workflow by default; any exception requires a different WPC and new approvals |

OpenClaw-specific note: do not rely on “the agent will behave” when host execution is available. Use OpenClaw sandboxing and tool policy to limit what the agent can touch, and keep elevated host execution tightly controlled.

FAQ

How is two-person approval enforced if the agent is the one running the deploy?

The deploy tool wrapper checks for two distinct approvals bound to the dry-run hash before executing apply. The WPC makes that requirement machine-evaluable, and the CST can be pinned to the WPC hash so the job cannot “switch policies” mid-run.

Why can’t we just put “wait for approval” in the prompt?

A prompt is not an enforcement boundary. If the runtime allows the tool call, the agent can still call it, especially under injection or accidental instructions, so the control must live in policy-as-code evaluated at execution time.

What counts as a “dry-run” in this workflow?

A dry-run is any deterministic preview output that can be hashed and later compared against the apply inputs. Examples include Terraform plan output, Kubernetes server-side dry run diffs, or a signed release manifest generated by your build system.
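
For example, with Terraform you might render the plan to JSON and hash that file; the commands and file names below are placeholders, and the same hash must reappear in the approval records and at apply time.

# Sketch: hash a Terraform plan rendered to JSON (file names are placeholders).
#   terraform plan -out=tfplan
#   terraform show -json tfplan > plan.json
import hashlib

with open("plan.json", "rb") as f:
    plan_hash = hashlib.sha256(f.read()).hexdigest()

print(plan_hash)  # approvals reference this value; apply must see the same hash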

Can the approvals come from Microsoft tools?

Yes, via an official API and your existing identity controls (for example, Entra ID-backed approvers), but Claw EA does not assume a native connector. The important part is that each approval record includes the approver identity and the dry-run artifact hash it is approving.

What do we show auditors after an incident?

Provide the proof bundle for the job: the WPC hash, the CST scope hash (and any policy pinning), the dry-run artifact hash, the two approval records, and Gateway receipts for model calls. This lets an auditor independently verify that the deploy was gated and that the model traffic was receipted.
