A timeboxed privileged break-glass workflow lets an OpenClaw-based agent perform a narrowly scoped, high-risk action (such as a temporary admin grant or disabling a control) only after step-up approvals, with an explicit expiry and a pre-planned rollback.
In Claw EA, this is enforced at the execution layer with policy-as-code rather than prompt-only instructions. Enforcement uses a Work Policy Contract (WPC), a signed, hash-addressed policy artifact served by clawcontrols, plus a CST (a scoped token issued by clawscope) that is pinned to that policy and job.
Step-by-step runbook
- Define the break-glass objective and rollback before you run anything. Write down the exact high-risk action (for example: “temporarily activate an Entra ID privileged role for user X” or “temporarily exclude a device group from Conditional Access”) and the precise rollback (for example: “remove role assignment, restore policy, validate sign-in logs”).
  Keep the rollback mechanically executable, not a narrative. If you cannot articulate a rollback, treat the action as not eligible for agent execution.
- Create a timeboxed WPC and publish it. The WPC should define the allowed tools, allowed targets, and a hard TTL for privilege. Keep it task-specific and prohibit tool expansion during the run.
  Host it in the WPC registry and require proxy fetch/verify so the running agent cannot silently swap policy.
- Issue a job-scoped CST that is pinned to the WPC. The CST should include a scope hash and optional policy hash pinning, and it must be bound to a single job to prevent replay.
  Use short TTLs. Make renewal require the same approvals as the original break-glass request.
- Enforce the two-person rule with step-up approvals. Require two separate humans: one requester and one approver who is not the requester. For Microsoft actions, align approval to your existing control plane (for example: Entra ID PIM approval workflows) and treat the agent as the executor, not the policy decision-maker.
  If the approvals are outside Claw EA, treat them as prerequisites and record the approval identifiers in job metadata for audit correlation.
- Run under constrained execution, not prompt promises. Use OpenClaw as the baseline runtime and enforce tool policy and sandboxing so the agent can only execute the exact “break-glass tool” path. Do not rely on “the prompt says do not do X”; prompts do not stop tool calls when a model is confused or injected.
  For model traffic, route through clawproxy so you get Gateway receipts (signed receipts emitted by clawproxy for model calls).
- Execute the privileged step via the official API, then verify state. If the action touches Microsoft, execute it through the official Microsoft Graph API using explicit Graph permissions/scopes that match the minimum operation. Immediately perform a verification read to confirm the change took effect and that no additional principals were modified.
  Keep “verification reads” in-scope in the same WPC so verification cannot be skipped without failing policy.
- End the window: roll back, then invoke the kill switch. Perform rollback steps even if the incident appears resolved. Then halt the job and revoke the CST to guarantee no further privileged calls can succeed under that run.
  If you need to stop mid-run, the emergency halt is: revoke the CST, stop the OpenClaw session, and rotate any credentials touched during the window.
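The approval gate and token pinning described in the runbook can be sketched in a few lines. This is an illustration only, not the clawscope API: `Approval`, `ScopedToken`, `issue_cst`, and `is_usable` are hypothetical names, and a real CST would be signed and issued by clawscope rather than built in process.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class Approval:
    requester: str    # human who asked for break-glass
    approver: str     # human who approved; must differ from the requester
    approval_id: str  # external identifier (e.g. a PIM request ID) kept for audit


@dataclass(frozen=True)
class ScopedToken:
    job_id: str
    policy_hash: str
    expires_at: datetime
    approval_id: str


def issue_cst(approval: Approval, job_id: str, policy_hash: str,
              ttl_minutes: int, max_ttl_minutes: int = 30,
              now: Optional[datetime] = None) -> ScopedToken:
    """Hypothetical issuance gate: refuse unless the two-person rule and TTL cap hold."""
    if approval.requester == approval.approver:
        raise PermissionError("two-person rule: approver must not be the requester")
    if not 0 < ttl_minutes <= max_ttl_minutes:
        raise ValueError("TTL must be positive and within the policy timebox")
    now = now or datetime.now(timezone.utc)
    return ScopedToken(job_id, policy_hash,
                       now + timedelta(minutes=ttl_minutes), approval.approval_id)


def is_usable(token: ScopedToken, job_id: str, policy_hash: str, now: datetime) -> bool:
    """A token is valid only for the job and policy it was pinned to, and before expiry."""
    return (token.job_id == job_id
            and token.policy_hash == policy_hash
            and now < token.expires_at)
```

Because the token carries the job ID and policy hash, presenting it for a different job or after the window closes fails the check, which is the replay protection the runbook asks for.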
Threat model
Break-glass failures are rarely “a model went rogue” in isolation. They are usually a combination of overbroad tokens, unclear operator approval, and missing rollback under time pressure.
The table below lists concrete failure modes and how to control them using shipped Claw Bureau primitives and OpenClaw runtime constraints.
| Threat | What happens | Control |
|---|---|---|
| Prompt injection causes unintended privileged action | The agent follows untrusted input and calls a privileged tool path. | Permissioned execution: WPC tool allowlist and target constraints, plus CST scope hash and policy hash pinning. Use OpenClaw tool policy and sandboxing; avoid elevated host execution for break-glass unless explicitly required. |
| Token replay or lateral use | A CST leaks and is reused for another job or by another component. | Marketplace anti-replay binding with job-scoped CST binding. Short TTLs and immediate revocation after use. |
| Approval spoofing | A single operator claims approval without a second reviewer. | Two-person rule and step-up approvals as a gating prerequisite. Record approver identities and timestamps in job metadata and require both before CST issuance. |
| Privilege does not expire | Temporary admin remains active and becomes the new steady state. | Timeboxed WPC plus operational rollback checklist. Run post-rollback verification reads as part of the same run. |
| Operator cannot prove what the model was asked or what it did | After incident, you cannot reconstruct model calls and tool intent. | Gateway receipts for model calls routed through clawproxy, bundled into a proof bundle for audit and verification. |
| Host escape via elevated execution | A “temporary” break-glass job becomes host-level access. | Prefer OpenClaw sandboxed tool execution. If elevated is unavoidable, make it explicit in the WPC and keep the window shorter with stricter approvals. |
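The "permissioned execution" control in the table comes down to a deny-by-default check before every tool call. The sketch below is a hypothetical illustration of that check; `ToolPolicy` and `authorize_call` are invented names and do not describe how OpenClaw or the WPC enforcement layer is actually implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolPolicy:
    tools_allowlist: frozenset
    tools_denylist: frozenset
    allowed_principals: frozenset
    allowed_roles: frozenset


def authorize_call(policy, tool, principal, role):
    """Return (allowed, reason). Deny wins, and anything not allowlisted is denied."""
    if tool in policy.tools_denylist:
        return False, "tool is explicitly denied"
    if tool not in policy.tools_allowlist:
        return False, "tool is not on the allowlist"
    if principal not in policy.allowed_principals:
        return False, "principal is outside target constraints"
    if role not in policy.allowed_roles:
        return False, "role is outside target constraints"
    return True, "ok"


policy = ToolPolicy(
    tools_allowlist=frozenset({"breakglass.microsoft_graph.write_role_assignment"}),
    tools_denylist=frozenset({"shell.exec"}),
    allowed_principals=frozenset({"user:alice@corp.example"}),
    allowed_roles=frozenset({"Privileged Role Administrator"}),
)

# An injected instruction to run a shell command is refused regardless of the prompt.
print(authorize_call(policy, "shell.exec", "user:alice@corp.example",
                     "Privileged Role Administrator"))
```

The point of the design is that the decision depends only on the policy and the requested call, so a confused or injected model cannot widen it.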
Policy-as-code example
This snippet shows the shape of a timeboxed break-glass policy. Treat it as JSON-like pseudocode to communicate intent, then implement as a signed WPC and enforce it with CST pinning.
{
  "policy_name": "breakglass-temporary-admin-timeboxed",
  "wpc": {
    "hash": "b64u:POLICY_HASH",
    "requires_proxy_fetch_verify": true
  },
  "approvals": {
    "two_person_rule": true,
    "step_up": {
      "required": true,
      "min_approvers": 2,
      "constraints": { "approver_must_not_equal_requester": true }
    }
  },
  "timebox": {
    "max_duration_minutes": 30,
    "cst_max_ttl_minutes": 30,
    "no_renew_without_reapproval": true
  },
  "execution": {
    "runtime": "OpenClaw",
    "tools_allowlist": [
      "breakglass.microsoft_graph.write_role_assignment",
      "breakglass.microsoft_graph.read_role_assignment",
      "breakglass.rollback.remove_role_assignment"
    ],
    "tools_denylist": [
      "shell.exec",
      "tools.elevated"
    ],
    "target_constraints": {
      "tenant_id": "GUID",
      "allowed_principals": ["user:alice@corp.example"],
      "allowed_roles": ["Privileged Role Administrator"]
    }
  },
  "network": {
    "model_calls_must_use": "clawproxy",
    "provider": "OpenRouter via fal (through clawproxy)"
  },
  "kill_switch": {
    "on_trigger": ["revoke_cst", "halt_job"],
    "triggers": ["approval_revoked", "timebox_expired", "unexpected_target_detected"]
  },
  "audit": {
    "require_gateway_receipts": true,
    "emit_proof_bundle": true
  }
}
Prompt-only guidance cannot enforce any of the above. A prompt can be ignored, overwritten, or become ambiguous under pressure, while a WPC plus CST constrains what can execute even if the model output degrades.
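A policy of this shape can be sanity-checked before it is signed and published. The following is a minimal sketch that assumes the field names from the pseudocode above; `lint_breakglass_policy` is a hypothetical helper, not part of Claw EA, and the real WPC schema may differ.

```python
def lint_breakglass_policy(policy):
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = []

    approvals = policy.get("approvals", {})
    constraints = approvals.get("step_up", {}).get("constraints", {})
    if not approvals.get("two_person_rule"):
        problems.append("two_person_rule must be true")
    if not constraints.get("approver_must_not_equal_requester"):
        problems.append("approver must be constrained to differ from requester")

    timebox = policy.get("timebox", {})
    max_minutes = timebox.get("max_duration_minutes")
    cst_ttl = timebox.get("cst_max_ttl_minutes")
    if not isinstance(max_minutes, int) or max_minutes <= 0:
        problems.append("max_duration_minutes must be a positive integer")
    elif not isinstance(cst_ttl, int) or not 0 < cst_ttl <= max_minutes:
        problems.append("cst_max_ttl_minutes must be positive and no longer than the window")

    execution = policy.get("execution", {})
    allow = set(execution.get("tools_allowlist", []))
    deny = set(execution.get("tools_denylist", []))
    if not allow:
        problems.append("tools_allowlist must not be empty")
    if allow & deny:
        problems.append("tools appear on both lists: " + ", ".join(sorted(allow & deny)))

    return problems
```

Running a check like this in review or CI catches a token TTL that outlives the window, or a tool that is both allowed and denied, before the policy reaches an incident.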
What proof do you get?
For privileged runs, the minimum useful evidence is: which policy was in force, which model calls occurred, and which job the authorization belonged to. Claw EA focuses on producing that evidence as a machine-checkable record.
You get Gateway receipts for model calls from clawproxy, and a proof bundle that packages receipts plus run metadata so you can verify the run later. If you store the result in Trust Pulse, you have a consistent place to view the artifact and correlate it with the incident ticket.
Operationally, this means you can answer “did the model ever see secret X,” “did the agent call a model outside the approved route,” and “was this CST bound to this job and this WPC hash” without relying on screenshots or human recollection.
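The "bound to this job and this WPC hash" question amounts to recomputing a policy hash and comparing it with what the run recorded. The sketch below is only an illustration of that idea. The canonical JSON encoding and SHA-256 are assumptions; the text shows a `b64u:` prefix but does not say how WPC hashes are computed. The `job_id` and `policy_hash` record fields are also hypothetical.

```python
import base64
import hashlib
import json


def policy_hash(policy: dict) -> str:
    """Hash a policy document. Assumption: SHA-256 over key-sorted compact JSON."""
    canonical = json.dumps(policy, sort_keys=True, separators=(",", ":")).encode("utf-8")
    digest = hashlib.sha256(canonical).digest()
    return "b64u:" + base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def binding_matches(record: dict, job_id: str, policy: dict) -> bool:
    """Check that a recorded token binding names this job and this exact policy."""
    return (record.get("job_id") == job_id
            and record.get("policy_hash") == policy_hash(policy))


policy = {"policy_name": "breakglass-temporary-admin-timeboxed",
          "timebox": {"max_duration_minutes": 30}}
record = {"job_id": "job-123", "policy_hash": policy_hash(policy)}

# Any edit to the policy after issuance changes the hash and breaks the match.
tampered = {**policy, "timebox": {"max_duration_minutes": 300}}
print(binding_matches(record, "job-123", policy))    # True
print(binding_matches(record, "job-123", tampered))  # False
```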
Rollback posture
Break-glass is only safe if rollback is treated as part of the workflow, not a follow-up task. The rollback plan should be executable even if the original operator is offline.
| Action | Safe rollback | Evidence |
|---|---|---|
| Grant temporary admin | Remove role assignment; confirm no additional principals gained the role; revoke CST; re-run verification reads. | Proof bundle with receipts for the privileged call and the rollback call, plus the pinned WPC hash. |
| Disable security controls | Re-enable the control; validate the effective policy state; rotate any credentials exposed during the window; halt the job. | Receipts showing the model call sequence and job-scoped CST binding preventing reuse. |
| Suspected compromise mid-run | Emergency halt: revoke CST immediately, stop OpenClaw session, and invalidate any temporary admin grants made in the window. | CST revocation record (from issuance system) plus proof bundle showing last successful model calls before halt. |
If you need stronger network containment, you can add egress allowlists enforced outside clawproxy as an additional control, but treat them as optional and environment-specific.
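Treating rollback as part of the workflow means it runs even when the privileged step or its verification fails. One way to sketch that ordering for the normal (non-emergency) path is with nested `finally` blocks. `run_window` and its callables are hypothetical placeholders for your own apply, verify, rollback, and revoke operations.

```python
def run_window(apply, verify, rollback, revoke, log):
    """Run apply and verify, then always roll back and always revoke, in that order."""
    try:
        apply()
        log.append("applied")
        if not verify():
            raise RuntimeError("verification read did not match expected state")
        log.append("verified")
    finally:
        try:
            # Rollback runs whether or not the steps above succeeded.
            rollback()
            log.append("rolled_back")
        finally:
            # Revocation runs even if rollback itself raised.
            revoke()
            log.append("revoked")


events = []
run_window(apply=lambda: None, verify=lambda: True,
           rollback=lambda: None, revoke=lambda: None, log=events)
print(events)  # ['applied', 'verified', 'rolled_back', 'revoked']
```

Revocation sits in the innermost `finally` so that a failed rollback still ends with the token revoked. The failure is re-raised for a human to handle.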
FAQ
Why is policy-as-code required instead of putting “do not escalate” in the prompt?
Prompts shape model behavior but do not enforce execution. A privileged break-glass workflow needs a hard boundary: the tool calls and targets must be restricted by WPC and CST so the runtime cannot execute outside the approved envelope.
How do two-person rule and step-up approvals work in practice?
Two-person rule means one human requests and a different human approves, with both captured before the CST is issued. Step-up approvals mean the privileged step is blocked until the approval is present, commonly aligned to existing enterprise controls like Entra ID PIM approval flows.
Can this be used for Microsoft Entra ID and Microsoft Graph?
Yes, if your break-glass tool calls Microsoft Graph through its official API and the Graph permissions/scopes are narrowly granted for the specific operation. Do not give broad directory write scopes to an agent run; use the minimum permissions that satisfy the change and require verification reads.
What is the kill switch during a live incident?
The practical kill switch is to revoke the CST and halt the job so further privileged calls fail closed. Pair that with immediate rollback steps like removing temporary role assignments and restoring configuration.
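"Fail closed" here means that once a trigger fires, every later privileged call is refused rather than attempted. A minimal sketch of that behavior follows. It reuses the trigger names from the policy pseudocode; the `Job` class itself is hypothetical.

```python
KILL_TRIGGERS = {"approval_revoked", "timebox_expired", "unexpected_target_detected"}


class Job:
    def __init__(self):
        self.token_revoked = False
        self.halted = False

    def handle_event(self, event):
        """Flip the kill switch on any configured trigger; other events are ignored."""
        if event in KILL_TRIGGERS:
            self.token_revoked = True
            self.halted = True

    def call_privileged_tool(self, name):
        """Refuse every privileged call once the job is halted or the token revoked."""
        if self.halted or self.token_revoked:
            raise PermissionError("job halted: privileged calls fail closed")
        return "executed " + name


job = Job()
print(job.call_privileged_tool("breakglass.microsoft_graph.read_role_assignment"))
job.handle_event("timebox_expired")
# Any further call_privileged_tool(...) now raises PermissionError.
```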
What should be timeboxed: the policy, the token, or the role assignment?
All three. The WPC should declare the maximum window, the CST should have an equal or shorter TTL with no silent renewal, and the privileged assignment should be created with an explicit expiration where supported.
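As a worked example of how the three timeboxes relate, the sketch below checks them against each other and returns the two resulting deadlines. `check_timeboxes` is a hypothetical helper. The rule that the role assignment must not outlive the policy window is an assumption added for illustration; the text only says the assignment should have an explicit expiration.

```python
from datetime import datetime, timedelta, timezone


def check_timeboxes(start, policy_minutes, cst_ttl_minutes, assignment_minutes):
    """Validate the three timeboxes and return (agent_cutoff, assignment_expiry)."""
    if cst_ttl_minutes > policy_minutes:
        raise ValueError("CST TTL must not exceed the policy window")
    if assignment_minutes > policy_minutes:
        # Assumption: the privileged assignment should not outlive the policy window.
        raise ValueError("role assignment must not outlive the policy window")
    agent_cutoff = start + timedelta(minutes=cst_ttl_minutes)
    assignment_expiry = start + timedelta(minutes=assignment_minutes)
    return agent_cutoff, assignment_expiry


start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
cutoff, expiry = check_timeboxes(start, policy_minutes=30,
                                 cst_ttl_minutes=15, assignment_minutes=20)
print(cutoff.isoformat())  # 2024-01-01T12:15:00+00:00
print(expiry.isoformat())  # 2024-01-01T12:20:00+00:00
```

In this example the agent loses the ability to act at 12:15, and the role itself lapses at 12:20 even if the explicit rollback never ran.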