MCP security for enterprise agents comes down to one rule: treat every MCP tool as a production capability with its own authorization, audit trail, and rollback plan. Prompt-only “don’t do bad things” controls fail because the model can be tricked, the context can be spoofed, and the tool surface is often broader than you think.
Claw EA runs OpenClaw as the baseline agent runtime and adds a permissioned execution layer with four parts: the Work Policy Contract (WPC), a signed, hash-addressed policy artifact served by clawcontrols; the scoped token (CST), issued by clawscope; gateway receipts, emitted by clawproxy; and proof bundles. The goal is to make MCP tool usage verifiable, minimally scoped, and easy to disable safely when something goes wrong.
Step-by-step runbook
- Inventory MCP servers and classify tool actions. List every MCP server your OpenClaw agents can reach (local and remote), then group tools into read-only, write, and irreversible actions (payments, deletes, role changes). This classification becomes the input to your policy and token scopes.
- Define a WPC that expresses the allowed MCP tool surface. Use a WPC to pin what tools are callable, under what conditions (environment, job type), and what data classes are permitted. Keep the first policy narrow, then widen only when you have proof that the agent needs it.
- Issue CST per job and pin the policy hash. For each agent job, mint a CST with a scope hash and (optionally) policy hash pinning so the runtime cannot silently expand privileges mid-run. Use short CST lifetimes and job-scoped binding to reduce replay risk.
- Route model calls through clawproxy to produce gateway receipts. Put clawproxy on the model egress path so you get signed gateway receipts for each model call used during tool planning and tool execution. This is your “what did the model see and decide” evidence when an incident happens.
- Run OpenClaw with tool policy and sandboxing aligned to the WPC. OpenClaw already separates sandbox (where tools run) from tool policy (which tools exist) and elevated execution (host escape hatch). Configure OpenClaw so local execution is constrained even if an MCP tool misbehaves or the model is prompt-injected.
- Collect proof bundles for every job that can change state. Generate a proof bundle that ties together the WPC reference, the CST scope hash, and gateway receipts. Store the bundle for review and later verification, and publish to Trust Pulse when you need a marketplace-stored artifact for audit/viewing.
- Practice rollback and kill switches. Predefine which MCP tools can be globally disabled by policy, which require human approval in your surrounding systems, and which require enterprise buildout to safely mediate. The safe posture is “fail closed” when policy cannot be fetched or verified.
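The first and third runbook steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the clawscope API: the inventory entries, the `billing-mcp/refund_payment` tool, the claim field names, and the helper functions are all hypothetical, and token signing is left out entirely.

```python
import hashlib
import json
import time

# Hypothetical inventory: "server/tool" -> action class (runbook step 1).
INVENTORY = {
    "crm-mcp/search_customer": "read_only",
    "crm-mcp/get_case": "read_only",
    "tickets-mcp/comment_ticket": "write",
    "billing-mcp/refund_payment": "irreversible",
}


def canonical_hash(obj) -> str:
    """SHA-256 over canonical JSON (sorted keys, no extra whitespace)."""
    data = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def scope_for(allowed_classes) -> list:
    """Return the sorted tool names whose action class is allowed."""
    return sorted(t for t, cls in INVENTORY.items() if cls in allowed_classes)


def mint_cst_claims(job_id, scope, policy_hash, ttl_seconds=900, now=None) -> dict:
    """Build the claims of a job-scoped token (runbook step 3).

    Assumed field names; a real issuer would also sign these claims.
    """
    issued = int(time.time()) if now is None else int(now)
    return {
        "job_id": job_id,                     # job-scoped binding
        "scope_hash": canonical_hash(scope),  # what the job may call
        "policy_hash": policy_hash,           # pinned WPC hash
        "iat": issued,
        "exp": issued + ttl_seconds,          # short lifetime
    }


# Usage: a read-only job pinned to one policy hash.
scope = scope_for({"read_only"})
claims = mint_cst_claims("job-123", scope, canonical_hash({"policy_name": "demo"}))
print(scope)
print(claims["exp"] - claims["iat"])
```

Hashing the canonical form of the scope means two jobs with the same allowed tools get the same scope hash regardless of listing order, which makes later comparison in a proof bundle straightforward.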
Threat model
MCP changes the attack surface from “bad answer in a chat” to “untrusted instructions that can invoke real tools.” Your security plan should assume prompt injection, context spoofing, mis-scoped credentials, and tool servers that return malicious outputs.
| Threat | What happens | Control |
|---|---|---|
| Confused deputy via MCP tool chaining | The agent is tricked into using a high-privilege tool to act on attacker-supplied targets (emails, URLs, tenants, repos). | Express allowed targets and action classes in a WPC; issue per-job CST with scope hash; prefer narrow tools over general “http_request” style tools. |
| Context spoofing from an MCP server | An MCP server returns “trusted looking” context that causes the model to authorize actions it should not take. | Treat MCP outputs as untrusted; require explicit allowlisted tool calls in policy; use OpenClaw tool policy to deny generic executors unless needed. |
| Prompt injection via documents or tickets | Untrusted text instructs the agent to exfiltrate secrets, expand permissions, or disable logging. | Permissioned execution (WPC + CST) so the model cannot grant itself tools; run OpenClaw tools in sandbox where possible; keep elevated execution tightly gated. |
| Credential overbreadth in enterprise identity systems | A token with broad Microsoft Graph permissions (or other vendor scopes) enables unintended data access and write actions. | Use least-privilege permissions and Conditional Access plus PIM where applicable; map the effective permission set into the WPC and keep CST scopes narrower than the underlying identity token. |
| Replay of an agent run token | A captured token is reused to rerun actions out of band, potentially against new targets. | Bind each CST to a single job (marketplace anti-replay binding); keep CST TTLs short; pin the policy hash so a replayed token cannot be combined with a broader policy. |
| Audit gaps when model calls bypass the proxy | Some model calls happen “direct,” leaving no evidence of what the model instructed the agent to do. | Route model calls through clawproxy to emit gateway receipts; treat missing receipts as a verification failure for regulated jobs. |
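The replay row in the table above combines three checks: job binding, a short TTL, and a pinned policy hash. A minimal sketch of how a verifier might apply them follows; the `CstClaims` fields and the `check_cst` function are hypothetical names for illustration, not part of clawscope.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CstClaims:
    """Assumed shape of the claims carried by a CST."""
    job_id: str
    policy_hash: str
    expires_at: int  # Unix seconds


def check_cst(claims: CstClaims, job_id: str, active_policy_hash: str, now: int):
    """Return (ok, reason). Any mismatch rejects the token (fail closed)."""
    if now >= claims.expires_at:
        return False, "expired"
    if claims.job_id != job_id:
        # A token captured from one job cannot drive another job.
        return False, "job_mismatch"
    if claims.policy_hash != active_policy_hash:
        # A replayed token cannot be paired with a broader policy.
        return False, "policy_hash_mismatch"
    return True, "ok"


# Usage: the same token is valid for its own job and rejected elsewhere.
token = CstClaims(job_id="job-123", policy_hash="abc", expires_at=2000)
print(check_cst(token, "job-123", "abc", now=1500))
print(check_cst(token, "job-999", "abc", now=1500))
```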
Policy-as-code example
This is a simplified JSON-like sketch of a WPC that governs MCP tools. In practice, a WPC is a signed, hash-addressed policy artifact served by clawcontrols, and you pin its hash when issuing a CST.
    {
      "wpc_version": "v1",
      "policy_name": "mcp-enterprise-agent-minimal",
      "allowed_tools": [
        { "kind": "mcp", "server": "crm-mcp", "tools": ["search_customer", "get_case"] },
        { "kind": "mcp", "server": "tickets-mcp", "tools": ["read_ticket", "comment_ticket"] }
      ],
      "denied_tools": [
        { "kind": "mcp", "server": "*", "tools": ["http_request", "shell", "file_write"] }
      ],
      "data_classes": {
        "allow": ["internal", "customer_provided"],
        "deny": ["secrets", "credentials", "payment_instruments"]
      },
      "model_egress": {
        "require_proxy_receipts": true,
        "provider": "openrouter_via_fal"
      },
      "token_constraints": {
        "cst_max_ttl_seconds": 900,
        "require_scope_hash": true,
        "policy_hash_pinning": "required"
      }
    }
The key design choice is that tools are authorized by machine-enforced policy, not by natural-language instructions. Prompts help the agent do the job, but they do not decide what the job is allowed to do.
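To make "machine-enforced" concrete, here is a small Python sketch of how a tool call could be checked against the tool lists in the WPC sketch above. The evaluation semantics are assumptions made for illustration: deny rules take precedence, `"*"` matches any server, and anything not explicitly allowed is denied. The function names are hypothetical.

```python
# Tool lists copied from the WPC sketch above.
WPC = {
    "allowed_tools": [
        {"kind": "mcp", "server": "crm-mcp", "tools": ["search_customer", "get_case"]},
        {"kind": "mcp", "server": "tickets-mcp", "tools": ["read_ticket", "comment_ticket"]},
    ],
    "denied_tools": [
        {"kind": "mcp", "server": "*", "tools": ["http_request", "shell", "file_write"]},
    ],
}


def _matches(rule: dict, server: str, tool: str) -> bool:
    """True if the rule covers this server (or any server via "*") and tool."""
    return (
        rule["kind"] == "mcp"
        and rule["server"] in ("*", server)
        and tool in rule["tools"]
    )


def is_tool_allowed(wpc: dict, server: str, tool: str) -> bool:
    """Deny rules win; anything not explicitly allowed is denied."""
    if any(_matches(r, server, tool) for r in wpc.get("denied_tools", [])):
        return False
    return any(_matches(r, server, tool) for r in wpc.get("allowed_tools", []))


# Usage: the check runs before the tool call, regardless of what the prompt says.
print(is_tool_allowed(WPC, "crm-mcp", "search_customer"))
print(is_tool_allowed(WPC, "crm-mcp", "shell"))
print(is_tool_allowed(WPC, "unknown-mcp", "get_case"))
```

Default-deny matters here: a newly added MCP server exposes nothing to the agent until someone writes an explicit allow rule for it.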
What proof do you get?
For MCP-governed work, you want proof that (1) the agent ran under a specific policy, (2) the policy could not be silently widened, and (3) the model calls used to plan actions are auditable. Claw Bureau primitives are built around those needs.
Gateway receipts are signed receipts emitted by clawproxy for model calls, so you can later verify which model endpoint was used and when. A proof bundle packages the receipts plus related metadata for audit/verification, including the CST scope hash and (when used) the pinned WPC hash.
If you need an artifact that can be viewed and shared in a controlled way, you can store the result as a Trust Pulse. For external verification workflows, the proof bundle is the portable unit you hand to a verifier or auditor.
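As a rough illustration of what a verifier does with a proof bundle, the sketch below assembles a bundle and checks it. The bundle layout and field names are assumptions, and HMAC-SHA256 with a shared key stands in for whatever signature scheme clawproxy actually uses for receipts.

```python
import hashlib
import hmac
import json


def _canonical(obj) -> bytes:
    """Canonical JSON bytes, so signer and verifier hash the same input."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_receipt(body: dict, key: bytes) -> dict:
    """Stand-in for a gateway receipt: the body plus an HMAC-SHA256 tag."""
    tag = hmac.new(key, _canonical(body), hashlib.sha256).hexdigest()
    return {"body": body, "sig": tag}


def build_bundle(job_id, wpc_hash, cst_scope_hash, receipts) -> dict:
    """Tie the policy hash, the token scope hash, and the receipts together."""
    return {
        "job_id": job_id,
        "wpc_hash": wpc_hash,
        "cst_scope_hash": cst_scope_hash,
        "receipts": list(receipts),
    }


def verify_bundle(bundle: dict, key: bytes, expected_wpc_hash: str) -> list:
    """Return a list of problems; an empty list means the bundle checks out."""
    problems = []
    if bundle["wpc_hash"] != expected_wpc_hash:
        problems.append("wpc_hash_mismatch")
    if not bundle["receipts"]:
        # Missing receipts count as a verification failure.
        problems.append("no_receipts")
    for i, receipt in enumerate(bundle["receipts"]):
        expected = hmac.new(key, _canonical(receipt["body"]), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, receipt["sig"]):
            problems.append(f"bad_signature:{i}")
    return problems


# Usage with a demo key and a single receipt.
key = b"demo-key"
receipt = sign_receipt({"endpoint": "model-a", "ts": 1700000000}, key)
bundle = build_bundle("job-123", "wpc-hash", "scope-hash", [receipt])
print(verify_bundle(bundle, key, "wpc-hash"))
```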
Rollback posture
MCP incidents are usually not “the agent went rogue”; they are “a tool had too much power” or “a context source was untrusted.” Rollback needs to be operational: disable capabilities quickly, preserve evidence, and restart with narrower scopes.
| Action | Safe rollback | Evidence |
|---|---|---|
| Suspect an MCP server is returning malicious context | Update WPC to deny that MCP server’s tools; reissue CST for affected jobs; rerun only read-only workflows until validated. | Proof bundle showing the prior WPC hash and the gateway receipts for model calls during the incident window. |
| Tool permissions found too broad (e.g., write actions enabled) | Create a narrower WPC, pin it for new CST issuance, and rotate underlying vendor credentials via official API procedures. | Diff of WPC versions by hash; job-scoped CST binding limits replay; receipts show which model calls led to tool invocation. |
| Model egress bypass detected | Fail closed for regulated jobs until egress is routed through clawproxy; for emergency continuity, run read-only jobs only. | Absence of gateway receipts is itself a detection signal; subsequent runs produce receipts for comparison. |
| Need to invalidate active runs quickly | Revoke CST issuance at clawscope and deny the WPC for new jobs; restart runs with new CST and updated policy. | Proof bundles for completed jobs remain verifiable; revoked tokens stop new work from continuing under the old scope. |
Some controls are commonly implemented around this core, but are not assumed shipped in all deployments. For example: egress allowlists enforced outside clawproxy can be implemented, and automatic cost budget enforcement is planned.
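Two rollback moves from the table above, narrowing a policy and failing closed, can be sketched as follows. This assumes the WPC layout from the policy sketch earlier in this article; `deny_server`, `load_policy_or_fail_closed`, and the `fetch` callable are hypothetical names, not clawcontrols functions.

```python
import copy
import hashlib
import json


def policy_hash(wpc: dict) -> str:
    """SHA-256 over canonical JSON; stands in for the WPC's hash address."""
    data = json.dumps(wpc, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def deny_server(wpc: dict, server: str) -> dict:
    """Return a narrowed copy: the server's allow rules are removed and its
    previously allowed tools are listed under an explicit deny rule."""
    narrowed = copy.deepcopy(wpc)
    allowed = narrowed.get("allowed_tools", [])
    removed = [t for r in allowed if r["server"] == server for t in r["tools"]]
    narrowed["allowed_tools"] = [r for r in allowed if r["server"] != server]
    if removed:
        narrowed.setdefault("denied_tools", []).append(
            {"kind": "mcp", "server": server, "tools": removed}
        )
    return narrowed


def load_policy_or_fail_closed(fetch, pinned_hash: str):
    """Return the policy only if it can be fetched and matches the pinned
    hash; otherwise return None so the caller refuses to run the job."""
    try:
        wpc = fetch()
    except Exception:
        return None
    return wpc if policy_hash(wpc) == pinned_hash else None


# Usage: narrow the policy, then note that the hash changes with it.
old = {"allowed_tools": [
    {"kind": "mcp", "server": "crm-mcp", "tools": ["get_case"]},
    {"kind": "mcp", "server": "tickets-mcp", "tools": ["read_ticket"]},
]}
new = deny_server(old, "crm-mcp")
print(policy_hash(old) != policy_hash(new))
```

Because the narrowed policy has a different hash, a CST pinned to the old hash no longer matches it, which is why the table pairs the policy update with reissuing CSTs for affected jobs.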
FAQ
Why is prompt-only security not enough for MCP tools?
Because MCP turns text into actions, and the model can be manipulated by prompt injection, context spoofing, or tool outputs. A permissioned execution layer uses WPC and CST so that even a tricked model cannot exceed the allowed tool surface.
What is the minimal policy you recommend for first MCP deployments?
Start with read-only tools and a short-lived CST per job, with policy hash pinning. Deny generic executors and any tool that can exfiltrate arbitrary data until you have a concrete, reviewed need.
How does this relate to OpenClaw sandboxing and tool policy?
OpenClaw controls where tools run (sandbox), which tools are available (tool policy), and whether any execution can escape to host (elevated). Your WPC should align with those settings so the local runtime boundary matches the remote authorization boundary.
How do you handle Microsoft environments without over-granting access?
Use least-privilege Microsoft Graph permissions/scopes, and apply Conditional Access and PIM where appropriate. Then ensure the agent’s CST and WPC are narrower than the underlying identity grants, so the agent can only exercise a reviewed subset of capabilities.
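A quick way to check the "narrower than the identity grants" condition is a set comparison. The sketch below assumes that agent scopes and identity grants can be expressed in the same vocabulary (Microsoft Graph permission names are used as an example); the function names and that mapping are illustrative assumptions.

```python
def excess_scopes(agent_scopes, identity_grants) -> list:
    """Scopes the agent policy asks for that the identity does not hold."""
    return sorted(set(agent_scopes) - set(identity_grants))


def is_strictly_narrower(agent_scopes, identity_grants) -> bool:
    """True only if the agent's scopes are a proper subset of the grants."""
    return set(agent_scopes) < set(identity_grants)


# Usage: the identity holds two permissions, the agent exercises one.
grants = {"User.Read", "Mail.Read"}
print(is_strictly_narrower({"User.Read"}, grants))
print(excess_scopes({"User.Read", "Mail.ReadWrite"}, grants))
```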
What artifacts should I retain for audit after an MCP incident?
Retain the proof bundle for the affected jobs, including gateway receipts, the CST scope hash, and the referenced WPC hash. That lets you show what policy was in force and what model calls occurred, even if you later rotate credentials.