Security Review Pack
Everything your infosec team needs to evaluate Claw EA. Architecture, threat model, real proof artifacts, and deployment integrity. No fluff, no marketing claims. Just evidence.
Architecture and Data Flow
Every AI agent action follows a single verifiable path. No model call bypasses the gateway. No side effect occurs without a hashed event record.
Fail-closed gateway
clawproxy is the only path to LLM providers. If the gateway is down, the agent cannot make model calls. No bypass, no fallback.
Hardware isolation
Each agent runs in its own Cloudflare Sandbox with per-agent DID identity, scoped storage, and strict egress controls.
Policy-as-code
Work Policy Contracts define egress allowlists, DLP rules, approval gates, and model restrictions before any agent runs.
Proof Bundle: Real Artifact
Below is a proof bundle from our conformance test suite. This is the exact JSON structure your auditors will inspect. DID values are from test fixtures; production bundles use per-agent and per-gateway keys.
{
  "envelope_version": "1",
  "envelope_type": "proof_bundle",
  "payload": {
    "bundle_version": "1",
    "bundle_id": "bundle_conformance_001",
    "agent_did": "did:key:z6Mkn...E7c7",
    "event_chain": [
      {
        "event_id": "evt_conformance_001",
        "run_id": "run_conformance_001",
        "event_type": "llm_call",
        "timestamp": "2026-02-12T00:00:00.000Z",
        "payload_hash_b64u": "ibhRTWt5...fCEu8",
        "prev_hash_b64u": null,
        "event_hash_b64u": "mo72_ab1...5-Jds"
      }
    ],
    "receipts": [
      {
        "envelope_version": "1",
        "envelope_type": "gateway_receipt",
        "payload": {
          "receipt_version": "1",
          "receipt_id": "rcpt_conformance_001",
          "gateway_id": "gw_conformance",
          "provider": "anthropic",
          "model": "claude-sonnet-4-20250514",
          "request_hash_b64u": "znFMvVUo...daXI",
          "response_hash_b64u": "I-UgWdP2...ylBY",
          "tokens_input": 10,
          "tokens_output": 20,
          "latency_ms": 123,
          "binding": {
            "run_id": "run_conformance_001",
            "event_hash_b64u": "mo72_ab1...5-Jds",
            "nonce": "nonce_conformance_001"
          }
        },
        "signature_b64u": "q7q0bn1b...5MAA",
        "algorithm": "Ed25519",
        "signer_did": "did:key:z6Mkf...xy3m"
      }
    ]
  },
  "payload_hash_b64u": "HZYtCYB_...7Lyk",
  "hash_algorithm": "SHA-256",
  "signature_b64u": "9FS8CmJ7...5ZDw",
  "algorithm": "Ed25519",
  "signer_did": "did:key:z6Mkn...E7c7",
  "issued_at": "2026-02-12T00:00:02.000Z"
}
What this proves
- The agent with DID z6Mkn...E7c7 executed a run
- One LLM call was mediated by clawproxy (gateway receipt signed by z6Mkf...xy3m)
- The receipt is cryptographically bound to the event via event_hash_b64u and run_id
- The entire bundle is signed by the agent's Ed25519 key
- Any modification to any field invalidates the signature chain
How to verify
- Decode signer_did to extract the Ed25519 public key
- Verify signature_b64u over payload_hash_b64u
- Recompute payload_hash_b64u from the canonical JSON payload
- Verify each receipt signature independently
- Confirm receipt binding.event_hash_b64u exists in the event chain
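The non-signature steps above can be sketched in a few lines of Python. This is a minimal illustration, not the reference verifier: the function names are hypothetical, and the canonicalization shown (sorted keys, compact separators, UTF-8) is an assumption, since this page does not specify the canonical JSON rules. Ed25519 signature checks are left out because they require a cryptography library.

```python
import base64
import hashlib
import json


def b64u(data: bytes) -> str:
    """Base64url-encode without padding, matching the *_b64u field style."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def canonical_payload_hash(payload: dict) -> str:
    # ASSUMPTION: canonical JSON = sorted keys, compact separators, UTF-8.
    # The real canonicalization rules may differ (for example RFC 8785).
    canonical = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return b64u(hashlib.sha256(canonical.encode("utf-8")).digest())


def check_bundle_structure(bundle: dict) -> list[str]:
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    payload = bundle["payload"]

    # Recompute the payload hash and compare it with the declared value.
    if canonical_payload_hash(payload) != bundle["payload_hash_b64u"]:
        problems.append("payload_hash_b64u does not match recomputed hash")

    # Every receipt must bind to an event that exists in the same run.
    events = {e["event_hash_b64u"]: e for e in payload["event_chain"]}
    for receipt in payload.get("receipts", []):
        binding = receipt["payload"]["binding"]
        receipt_id = receipt["payload"]["receipt_id"]
        event = events.get(binding["event_hash_b64u"])
        if event is None:
            problems.append(f"{receipt_id}: bound event hash not in event chain")
        elif event["run_id"] != binding["run_id"]:
            problems.append(f"{receipt_id}: run_id mismatch")
    return problems
```

A verifier would treat any non-empty result as a failure and only then move on to the signature checks.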
Commit Proof: Deployment Signature
Every code change in the Claw Bureau repo carries a DID-signed commit proof. This is a real commit.sig.json from our own repository.
{
  "version": "m1",
  "type": "message_signature",
  "algo": "ed25519",
  "did": "did:key:z6Mkt...m8XW",
  "message": "commit:3d95fe55...42c269",
  "createdAt": "2026-02-12T12:21:40.739Z",
  "signature": "I0AhFPwQ...ipoDg=="
}
To verify: decode the did:key to extract the Ed25519 public key, then verify signature over the UTF-8 bytes of message. The message field references the exact git commit SHA. If any byte of the commit changes, the commit SHA changes and the signature no longer matches.
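As a sketch of the first verification step, the following Python extracts the raw 32-byte Ed25519 public key from a did:key identifier. It relies on the public did:key convention (multibase base58btc prefix "z", then the multicodec bytes 0xed 0x01, then the key); the function names are illustrative and not part of any Claw tooling. The resulting bytes would then be passed to an Ed25519 library of your choice to check the signature over the UTF-8 bytes of message.

```python
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
ED25519_MULTICODEC = b"\xed\x01"  # varint-encoded multicodec code for ed25519-pub


def b58decode(text: str) -> bytes:
    """Decode a base58btc string (Bitcoin alphabet) into bytes."""
    num = 0
    for ch in text:
        num = num * 58 + B58_ALPHABET.index(ch)  # ValueError on invalid characters
    body = num.to_bytes((num.bit_length() + 7) // 8, "big")
    # Each leading "1" character encodes one leading zero byte.
    leading_zeros = len(text) - len(text.lstrip("1"))
    return b"\x00" * leading_zeros + body


def ed25519_key_from_did(did: str) -> bytes:
    """Return the raw 32-byte Ed25519 public key encoded in a did:key."""
    prefix = "did:key:z"  # "z" is the multibase prefix for base58btc
    if not did.startswith(prefix):
        raise ValueError("expected a did:key with base58btc encoding")
    decoded = b58decode(did[len(prefix):])
    if decoded[:2] != ED25519_MULTICODEC or len(decoded) != 34:
        raise ValueError("not an Ed25519 did:key")
    return decoded[2:]
```

The DIDs shown on this page are truncated, so they cannot be decoded as printed; use the full value from the actual artifact.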
We run this on every agent-generated PR through our Claw Verified PR pipeline. The GitHub check enforces that proof artifacts are present and valid before merge.
Threat Model
We commissioned adversarial red-team analysis against our Proof of Harness primitives. Here are the four primary threat categories and how our architecture addresses each.
Replay and evidence reuse
Threat: An agent reuses receipts or event chains from a previous run to fake evidence of new work.
Mitigations:
- Each run uses a unique run_id bound into every receipt and event
- Receipt nonces prevent duplicate submission
- Job-scoped tokens (token_scope_hash) tie receipts to specific assignments
- Server-side replay detection rejects previously-seen receipt IDs
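To illustrate how these mitigations combine, here is a minimal in-memory sketch of replay detection. The ReplayGuard class and its accept method are hypothetical names; the field names come from the sample receipt above. A production service would need durable, shared storage rather than Python sets, and would also check token_scope_hash, which is omitted here because its format is not described on this page.

```python
class ReplayGuard:
    """In-memory sketch of receipt replay detection (illustrative only)."""

    def __init__(self) -> None:
        self._seen_receipt_ids: set[str] = set()
        self._seen_nonces: set[tuple[str, str]] = set()

    def accept(self, receipt_payload: dict, expected_run_id: str) -> bool:
        """Return True once per receipt; False for replays or foreign runs."""
        binding = receipt_payload["binding"]
        receipt_id = receipt_payload["receipt_id"]
        nonce_key = (binding["run_id"], binding["nonce"])

        if binding["run_id"] != expected_run_id:
            return False  # receipt was issued for a different run
        if receipt_id in self._seen_receipt_ids or nonce_key in self._seen_nonces:
            return False  # previously seen receipt ID or reused nonce

        self._seen_receipt_ids.add(receipt_id)
        self._seen_nonces.add(nonce_key)
        return True
```

Submitting the same receipt twice, or a receipt bound to another run_id, returns False on the second or mismatched attempt.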
Data exfiltration
Threat: An agent sends sensitive enterprise data to unauthorized endpoints via HTTP, DNS tunneling, or side channels.
Mitigations:
- Hardware-isolated Cloudflare Sandboxes with strict egress allowlists
- All model traffic routes exclusively through clawproxy
- DNS and UDP blocked at the infrastructure level
- DLP redaction pipeline strips PII/PHI before data leaves the boundary
- Unauthorized egress attempts logged and surfaced in proof bundles
Prompt injection
Threat: Malicious input (repository content, user messages, file contents) manipulates the agent into performing unauthorized actions.
Mitigations:
- Split context architecture: identity root (seller instructions) separated from job root (buyer content)
- Untrusted content wrapped in XML boundary tags
- Tool restrictions enforced by Work Policy Contracts, not by the model
- Approval gates require human confirmation for irreversible actions
- All tool invocations are logged regardless of prompt context
Nondeterminism and audit integrity
Threat: LLM outputs are inherently nondeterministic, making bit-identical replay impossible and raising questions about audit reliability.
Mitigations:
- Verification targets policy compliance and receipt integrity, not token-identical reproduction
- Gateway receipts prove which model was called, what was sent, and what was received (via hashes)
- Artifact-level reproducibility (tests, builds) provides objective outcome verification
- Timestamp monotonicity enforced across event chains
- Evidence re-validation is the core "replay" primitive, not re-execution
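Evidence re-validation of an event chain can be sketched as follows. This is an illustration under two assumptions not stated on this page: each event's prev_hash_b64u is expected to equal the previous event's event_hash_b64u (with null for the first event, as in the sample bundle), and "monotonic" is read as non-decreasing timestamps. The function names are hypothetical.

```python
from datetime import datetime


def parse_ts(value: str) -> datetime:
    # Map the trailing "Z" to an explicit offset so older Python versions parse it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def check_event_chain(events: list[dict]) -> list[str]:
    """Return a list of problems; an empty list means the chain is consistent."""
    problems = []
    prev = None
    for i, event in enumerate(events):
        # ASSUMPTION: prev_hash_b64u links to the preceding event's hash.
        expected_prev = prev["event_hash_b64u"] if prev else None
        if event["prev_hash_b64u"] != expected_prev:
            problems.append(f"event {i}: prev_hash_b64u does not link to previous event")
        # ASSUMPTION: equal timestamps are allowed; only going backwards fails.
        if prev and parse_ts(event["timestamp"]) < parse_ts(prev["timestamp"]):
            problems.append(f"event {i}: timestamp goes backwards")
        prev = event
    return problems
```

Note that nothing here re-executes the model call; the check operates purely on recorded evidence.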
Full red-team analysis covers 25+ specific attack vectors including event chain fabrication, receipt binding manipulation, attestation forgery, policy hash confusion, and resource exhaustion. Request the full threat model document.
Tamper-Evident Logging
All proof bundles are anchored to the clawlogs Merkle transparency log. This is an append-only data structure where any modification to historical entries is cryptographically detectable.
Merkle construction
- Leaf hashes are base64url-encoded SHA-256 digests
- Parent nodes: SHA-256(left || right)
- Odd-node rule: duplicate last node
- Root signed by clawlogs Ed25519 key (did:key)
Inclusion proof verification
- Request inclusion proof for any leaf hash via GET /proof/:leaf_hash_b64u
- Verify audit_path[] from leaf sibling upward to root
- Confirm computed root matches the signed root_hash_b64u
- Verify root signature using clawlogs public key
- If signature verification fails, the entire check fails closed
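The Merkle construction and the path-walking part of inclusion verification can be sketched with the Python standard library. Several details are assumptions, since this page does not define them: parent hashes are computed over the raw 32-byte digests (not their base64url text), and each audit_path entry is modeled as a dictionary with a hypothetical hash_b64u field and a side field giving the sibling's position. Verifying the root signature is a separate step that needs an Ed25519 library and is not shown.

```python
import base64
import hashlib


def b64u_decode(text: str) -> bytes:
    """Decode base64url, tolerating missing padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def b64u_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def parent(left: bytes, right: bytes) -> bytes:
    # Parent node: SHA-256(left || right), assumed to be over raw digest bytes.
    return hashlib.sha256(left + right).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Compute the root, duplicating the last node on odd-sized levels."""
    if not leaves:
        raise ValueError("cannot compute the root of an empty tree")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd-node rule: duplicate last node
        level = [parent(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def verify_inclusion(leaf_hash_b64u: str, audit_path: list[dict], root_hash_b64u: str) -> bool:
    """Walk the audit path from the leaf up and compare against the signed root."""
    node = b64u_decode(leaf_hash_b64u)
    for step in audit_path:
        sibling = b64u_decode(step["hash_b64u"])
        # ASSUMPTION: "side" says whether the sibling sits left or right of node.
        if step["side"] == "left":
            node = parent(sibling, node)
        else:
            node = parent(node, sibling)
    return node == b64u_decode(root_hash_b64u)
```

To stay fail-closed, a caller would accept an entry only if both this function returns True and the root signature verifies.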
Retention: all log entries are retained indefinitely. Inclusion proofs are available for any historical entry. The log is queryable by external auditors without requiring platform access.
Deployment Integrity
Every change to the Claw Bureau codebase follows a verified chain from code to production.
Signed commits + DID proofs
- Every agent-generated PR includes a commit.sig.json proof
- Ed25519 signature over the exact git commit SHA
- Offline-verifiable: no API call needed to confirm authorship
- Proof artifacts stored in proofs/<branch>/commit.sig.json
Claw Verified PR pipeline
- GitHub Actions check runs on every PR
- Validates commit proof signature against declared DID
- Validates proof bundle artifacts (if present) with URM cross-checks
- Enforced on PRs labeled claw-verified or containing proof artifacts
- Observe-by-default; enforce on labeled PRs
This is not a planned feature. It runs on our own repository today. The PR that shipped this page was verified by the same pipeline.
Frequently Asked Questions
Yes. Proof bundles use Ed25519 signatures and SHA-256 hashes. Any team with the agent's public DID key and the clawproxy gateway public key can verify bundles offline using standard cryptographic libraries. No network call required.
The agent cannot make model calls. clawproxy is not optional middleware; it is the only path to LLM providers. If clawproxy is down, execution halts. This is fail-closed by design.
We do not attempt bit-identical replay. Verification targets policy compliance, receipt binding integrity, and artifact reproducibility (tests/builds). Model nondeterminism is expected; tamper-evidence and receipt chains prove what actually happened.
Agents run in hardware-isolated Cloudflare Sandboxes with strict egress allowlists. All model traffic routes through clawproxy. DNS, UDP, and unauthorized HTTP are blocked at the infrastructure level. DLP redaction strips sensitive data before it leaves the boundary.