Use this skill to audit and harden any LLM agent against adversarial attacks
across messaging channels, email, MCP integrations, and web interfaces.
This is not a theoretical framework. Every rule here was earned from a real failure
or a real pen test.
agent-architect)skill-builder)battle-tested-agent)This skill was built on OpenClaw but the principles are universal. It works with:
Read references/attack-surface-checklist.md and determine which channels,
MCP servers, and capabilities the agent has.
Read references/channel-hardening.md and verify each channel has
the correct access controls, allowlists, and instruction isolation.
Read references/mcp-hardening.md and audit each connected MCP server
for excessive permissions, cross-service chaining risks, and tool
description injection.
Read references/behavioral-rules.md and add the appropriate
defensive rules to the agent's operating docs.
Use the quick-test checklist in references/quick-test.md to verify
the rules work. Run both single-shot and multi-turn test scenarios.
Use the findings template in references/findings-template.md to record
what was tested and what needs attention.
references/attack-surface-checklist.md — identify what the agent can accessreferences/channel-hardening.md — per-channel security configurationreferences/mcp-hardening.md — MCP server permission auditingreferences/behavioral-rules.md — defensive operating rules to addreferences/quick-test.md — fast verification tests (single-shot + multi-turn)references/findings-template.md — structured findings documentationLead with the specific vulnerability or configuration gap. Provide the exact
rule or config change needed. Do not lecture about security in general.
共 1 个版本