Knowledge base updates with review | Secure Agent Workflow

This workflow lets an OpenClaw-based agent propose knowledge base (KB) changes while keeping publishing permissioned and reviewable. The agent can draft, cite sources, and open a change request, but it cannot publish until two humans approve under a Work Policy Contract (WPC).

Prompt-only guardrails fail when the agent is given real credentials or tool access. Claw EA enforces policy-as-code with WPC and scoped tokens (CST), then produces gateway receipts and a proof bundle so you can verify what happened and roll back safely.

Step-by-step runbook

Define the KB update boundary and “irreversible” actions. Treat “publish KB article” and “edit policies” as high risk. Everything else (read, summarize, draft, open PR) stays in the low-risk lane by default.

Write this boundary into a WPC so the execution layer can enforce it even if the prompt is manipulated.
Register a WPC and pin it for the job. Store the signed WPC in the WPC registry (served by clawcontrols). Configure the run so the CST carries a pinned policy hash (an optional CST feature), which makes the token valid only for that exact WPC.

This prevents “policy drift” mid-run and makes approvals reference a stable policy hash.
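
One way to picture policy hash pinning is the minimal Python sketch below: hash a canonical encoding of the WPC and compare it with the hash carried in the token. The claim name `policy_hash`, the canonical-JSON-plus-SHA-256 scheme, and both function names are assumptions made for illustration; the text does not say how clawcontrols or clawscope actually compute or carry the hash.

```python
import hashlib
import hmac
import json


def wpc_hash(wpc: dict) -> str:
    """Hash a canonical JSON encoding so key order and whitespace do not change the result."""
    canonical = json.dumps(wpc, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def token_matches_policy(token_claims: dict, active_wpc: dict) -> bool:
    """Return True only if the token is pinned to exactly this WPC."""
    pinned = token_claims.get("policy_hash")  # hypothetical claim name
    if not isinstance(pinned, str):
        # Assumption: this job requires pinning, so a missing pin fails closed.
        return False
    expected = wpc_hash(active_wpc)
    return hmac.compare_digest(pinned.encode("utf-8"), expected.encode("utf-8"))
```

Under these assumptions, any edit to the WPC mid-run changes the hash, so a token pinned to the earlier version stops matching.
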
Issue a CST that can draft but cannot publish. Use a CST (issued by clawscope) that allows only reading the KB and writing drafts or change requests. If you use Git-based docs, allow “create branch / open PR” actions, but do not allow merging to protected branches.

Bind the CST to the job scope so replaying the token in a different job is rejected (marketplace anti-replay binding).
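
The job binding can be sketched as a check that runs before any tool call. This is a simplified illustration, not the clawscope token format: the `Cst` class, its fields, the job IDs, and the `authorize` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cst:
    """Hypothetical, simplified view of a scoped token's claims."""
    job_id: str
    scopes: frozenset


def authorize(token: Cst, current_job_id: str, action: str) -> bool:
    """Allow an action only inside the job the token was issued for."""
    if token.job_id != current_job_id:
        return False  # a token replayed in a different job is rejected
    return action in token.scopes


# An authoring token: it can read and draft, but it has no publish scope.
authoring = Cst(
    job_id="job-123",
    scopes=frozenset({"kb.read", "kb.write_draft", "kb.create_change_request"}),
)
```

With this token, `authorize(authoring, "job-123", "kb.write_draft")` succeeds, while a `kb.publish` request or any request presented under another job ID is refused.
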
Run the authoring agent in OpenClaw with a constrained tool profile. Use OpenClaw tool policy to expose only the tools you intend, such as “kb.read”, “kb.write_draft”, and “submit_for_review”, and keep tool execution sandboxed where feasible. Run OpenClaw’s security audit regularly to catch common configuration footguns before you scale this workflow out.

Route model calls through clawproxy so each call produces gateway receipts that you can later verify.
Perform step-up approvals with a two-person rule. Reviewer 1 checks content quality, citations, and that the change matches the request. Reviewer 2 confirms risk posture (no secrets, no policy changes, no unsafe instructions) and approves publication.

Only after both approvals are recorded do you mint a second, short-lived CST that includes the publish scope for that single article revision.
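
The minting condition can be sketched as follows. The `Approval` record, the claim names, and the five-minute default TTL are assumptions chosen for illustration; the text only requires two approvals, a publish scope limited to one article revision, and a short lifetime.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Approval:
    reviewer: str       # identity of the human approver
    artifact_hash: str  # hash of the exact revision they approved


def mint_publish_cst(approvals, artifact_hash, job_id, ttl_seconds=300, now=None):
    """Return publish-token claims only if two distinct reviewers approved this revision."""
    reviewers = {a.reviewer for a in approvals if a.artifact_hash == artifact_hash}
    if len(reviewers) < 2:
        raise PermissionError("two distinct approvals for this revision are required")
    issued_at = time.time() if now is None else now
    return {
        "job_id": job_id,
        "scopes": ["kb.publish"],
        "artifact_hash": artifact_hash,  # token is usable for this revision only
        "expires_at": issued_at + ttl_seconds,
    }
```

Counting distinct reviewer identities, rather than approval events, is what keeps one account from satisfying the rule by approving twice.
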
Publish using a separate “publisher” action under the step-up CST. The publish action is a distinct tool call and should require explicit human confirmation in the control plane. The approval metadata (reviewers, timestamps, artifact hashes) should be attached to the run record so it is included in the proof bundle.

If the KB system is external, publish via the official API or an MCP server, and restrict the publish CST to the minimum endpoint and object identifiers needed.
Retain evidence and store the run artifact. Persist the proof bundle produced by the harness, including the WPC hash, CST scope hash, gateway receipts, and content hashes for drafts and published versions. Store the bundle in Trust Pulse for later viewing and audit review.

This evidence retention is what makes rollback practical: you can identify exactly what changed, by which token, under which policy.
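
To show what such a record might contain, here is a minimal sketch that assembles the listed fields into a manifest. In the workflow itself the bundle is produced by the harness; the field names, the `build_bundle` function, and the SHA-256 choice below are illustrative assumptions, not the actual bundle format.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def build_bundle(wpc_hash, cst_scope_hash, receipts, draft_text, published_text):
    """Assemble a hypothetical bundle manifest; field names are illustrative only."""
    manifest = {
        "wpc_hash": wpc_hash,
        "cst_scope_hash": cst_scope_hash,
        "gateway_receipts": receipts,
        "content_hashes": {
            "draft": sha256_hex(draft_text.encode("utf-8")),
            "published": sha256_hex(published_text.encode("utf-8")),
        },
    }
    # A digest over the canonical manifest lets a reader detect later edits,
    # provided the digest itself is stored or signed separately.
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return {"manifest": manifest, "bundle_digest": sha256_hex(canonical)}
```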

Threat model

The core risks are not “bad text”; they are unauthorized tool use, replay of credentials, and silent changes that cannot be reconstructed later. The controls below assume the attacker can influence prompts, retrieved content, and chat messages, and that failures must be detectable after the fact.

Threat	What happens	Control
Prompt injection triggers “publish now”	The agent attempts to bypass review and publishes a KB article directly.	WPC denies publish tool unless step-up approval conditions are satisfied; publish requires a separate CST with publish scope and short TTL.
Single reviewer rubber-stamps changes	One compromised account approves harmful content or policy edits.	Two-person rule: two distinct human approvals required before minting the step-up publish CST; approvals are bound into the proof bundle metadata.
Replay of a token in another job	A CST captured from logs is reused to publish different content later.	Marketplace anti-replay binding (job-scoped CST binding) plus optional policy hash pinning so the CST is valid only for the intended WPC and job.
Model call disputes and “we never sent that”	After an incident, you cannot prove which model calls happened or what input constraints were applied.	Gateway receipts emitted by clawproxy for model calls, assembled into a proof bundle for verification and audit.
Agent edits governance policy instead of KB content	The agent changes permissions, review rules, or tool access to widen its own capabilities.	WPC explicitly blocks policy-edit tools; policy changes are treated as high risk and require a separate workflow and CST, not the KB update job.

Policy-as-code example

This snippet is intentionally “JSON-like” to show the operational shape. Your actual WPC should enumerate the specific tools, object IDs, and approval requirements your KB platform needs.

{
  "wpc_version": "1",
  "policy_name": "kb-updates-with-review",
  "risk": {
    "high_risk_actions": ["kb.publish", "governance.policy_edit"]
  },
  "tools": {
    "allow": [
      "kb.read",
      "kb.write_draft",
      "kb.create_change_request",
      "kb.attach_citations",
      "notifications.request_review"
    ],
    "deny": [
      "governance.policy_edit"
    ],
    "gates": [
      {
        "tool": "kb.publish",
        "requires": {
          "step_up_approvals": 2,
          "two_person_rule": true,
          "artifact_hash_match": true
        }
      }
    ]
  },
  "auth": {
    "cst": {
      "policy_hash_pinning": "optional",
      "job_scoped_binding": true
    }
  },
  "model_traffic": {
    "proxy_required": "clawproxy",
    "receipts_required": true
  }
}

Why this must be policy-as-code: a prompt can ask the agent to “ignore prior instructions,” but the WPC is enforced by the execution layer. The result is fail-closed behavior for publish and policy-edit actions, regardless of what the model says.
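
As a rough illustration of fail-closed evaluation, the sketch below checks a tool call against the `tools` section of the snippet above. The `evaluate` function and the `context` keys (`approvers`, `approved_hash`, `artifact_hash`) are hypothetical; the real execution layer's interface is not described in the text.

```python
def evaluate(wpc: dict, tool: str, context: dict) -> str:
    """Return "allow" or "deny" for a tool call; anything not explicitly allowed is denied."""
    tools = wpc.get("tools", {})
    if tool in tools.get("deny", []):
        return "deny"
    for gate in tools.get("gates", []):
        if gate.get("tool") != tool:
            continue
        requires = gate.get("requires", {})
        # A set keeps only distinct approvers, which models the two-person rule.
        approvers = set(context.get("approvers", []))
        if len(approvers) < requires.get("step_up_approvals", 0):
            return "deny"
        if requires.get("artifact_hash_match"):
            approved = context.get("approved_hash")
            if approved is None or approved != context.get("artifact_hash"):
                return "deny"
        return "allow"
    # Default deny: a tool absent from the allow list never runs.
    return "allow" if tool in tools.get("allow", []) else "deny"
```

Note that the model's output never appears as an input here: the decision depends only on the policy and on recorded approval state.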

What proof do you get?

Every model call routed through clawproxy yields gateway receipts that can be verified later. These receipts are not “logs”; they are signed artifacts intended for downstream verification and dispute resolution.
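
The difference between a log line and a signed artifact can be sketched in a few lines. The text does not specify which signature scheme clawproxy uses, so the HMAC-SHA-256 below is only a stand-in, and the receipt fields and function names are hypothetical. A real deployment might use asymmetric signatures so that verifiers never hold the signing key.

```python
import hashlib
import hmac
import json


def sign_receipt(receipt: dict, key: bytes) -> str:
    """Sign a canonical encoding of the receipt (HMAC used as a stand-in scheme)."""
    payload = json.dumps(receipt, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_receipt(receipt: dict, signature: str, key: bytes) -> bool:
    """Return True only if the receipt is byte-for-byte what was signed."""
    expected = sign_receipt(receipt, key)
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

Editing any field of a receipt after the fact makes verification fail, which an ordinary log entry cannot guarantee.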

At the end of the run, Claw EA produces a proof bundle that packages the gateway receipts with run metadata such as WPC hash, CST scope hash, timestamps, and content hashes of drafts and published outputs. You can store the proof bundle in Trust Pulse so reviewers and auditors can view the artifact without rerunning the job.

If you need external verification, you can hand the proof bundle to a verifier workflow and check that the receipts match the policy constraints you expected. Operationally, this is how you prove “publishing happened only after two approvals” instead of trusting a UI screenshot.
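
A verifier check for that claim might look like the sketch below. The bundle layout (`publish_event`, `approvals`, numeric `timestamp` values) is an assumption for illustration and is not the documented proof bundle schema.

```python
def published_after_two_approvals(bundle: dict) -> bool:
    """Check that the publish event followed two distinct approvals of the same revision.

    Returns False when the bundle records no publish event at all.
    """
    publish = bundle.get("publish_event")
    if publish is None:
        return False
    approvers = {
        a["reviewer"]
        for a in bundle.get("approvals", [])
        if a["artifact_hash"] == publish["artifact_hash"]
        and a["timestamp"] <= publish["timestamp"]
    }
    return len(approvers) >= 2
```

An approval recorded after the publish timestamp, or for a different artifact hash, does not count toward the two required reviewers.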

Rollback posture

KB updates should be treated like production changes: they need a clear revert path and evidence that the revert restored the last known good state. The table below maps common actions to safe rollback steps and the evidence you should retain.

Action	Safe rollback	Evidence to retain
Draft created or edited	Discard the draft or revert to the previous draft revision; keep the change request open for re-review.	Proof bundle with draft content hash and the tool call sequence that produced it.
Change request opened (PR or KB review ticket)	Close or supersede the request; link to the replacement request so audit trails remain linear.	WPC hash, CST scope hash, and the review comments captured alongside the run record.
KB article published	Revert to the last published version or unpublish if your KB system supports it; then rotate the step-up CST issuance policy.	Gateway receipts for model calls plus the publish event metadata (article ID, version, timestamps) in the proof bundle.
Policy or permission changed (should be blocked here)	Revert the policy artifact to the prior WPC; revoke any CST issued under the widened policy.	Proof bundle showing the denied attempt (or the separate policy-change workflow’s bundle if it occurred outside this job).

Rollback is also where replay resistance matters. If a publish CST can be replayed, “rollback” becomes a loop of re-compromise, so keep publish CSTs short-lived, job-bound, and pinned to the WPC when possible.
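
For the revert itself, one way to produce evidence that a rollback restored the last known good state is to compare content hashes. The version-history shape and the function names below are hypothetical; adapt them to whatever your KB system exposes.

```python
import hashlib


def content_hash(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def revert_restored_last_good(history: list, live_text: str) -> bool:
    """Compare the live article with the most recent version marked known good.

    history: dicts with "version", "content_hash", and "known_good" keys,
    ordered oldest first (a hypothetical shape).
    """
    good = [v for v in history if v["known_good"]]
    if not good:
        return False  # nothing to compare against, so the revert is unproven
    return content_hash(live_text) == good[-1]["content_hash"]
```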

FAQ

How does review work without blocking productivity?

The agent does the time-consuming part: gather references, propose diffs, and format the draft consistently. Humans only step in at defined gates, and publishing is a separate step with a step-up CST.

What stops prompt injection from publishing directly?

The WPC denies publish by default, and the authoring CST does not include publish scope. Even if the model requests a publish, the tool call fails closed until two approvals trigger a separate, short-lived publish CST.

Can this workflow update SharePoint, Confluence, or another KB system?

Yes, via the official API or an MCP server, but you should treat “publish” as a distinct tool with its own approval gate. Keep API permissions narrow (object and scope restricted) and avoid giving the authoring CST any tenant-wide write rights.

What evidence can we show to auditors after an incident?

You can provide the proof bundle containing the WPC hash, CST scope hash, gateway receipts from clawproxy, and content hashes for what was drafted and what was published. That lets an auditor verify that publishing occurred under the intended policy and after the two-person approval gate.

How do we handle urgent hotfixes to a broken article?

Create a separate “emergency publish” WPC that still requires two-person approval but relaxes non-essential checks, then expire it quickly. Keep emergency publishes isolated so they do not become the default path for routine edits.
