
ZT4AI Self-Audit

Zero Trust security audit for AI agent workspaces, skills, and configurations. Based on Microsoft's Zero Trust for AI (ZT4AI) framework and the "Caging the AI" approach.
Author: tanarchytan · Source: clawhub · Category: uncategorized · Version: v1.0.0 (1 version) · Key: not required

Overview

Audit your agent's skills, workspace, and configuration against Zero Trust for AI principles.

Background

AI agents process instructions and data as indistinguishable tokens in a context window. This means:

  • Skill files loaded into context can inject behavioral instructions
  • Workspace files (SOUL.md, AGENTS.md) are both operating instructions AND attack surface
  • External inputs (web content, emails, ClawHub skills) can contain prompt injection
  • Credentials in plaintext config files have no access scoping or rotation

This skill applies three frameworks:

  1. Microsoft ZT4AI — Verify explicitly, least privilege, assume breach
  2. "Caging the Agents" (arXiv:2603.17419) — Four-layer defense: workload isolation, credential proxy, network egress, prompt integrity
  3. OWASP Agentic AI Top 10 — Trust boundary violations, privilege escalation, resource exhaustion

Audit Process

Step 1: Inventory Skills

Scan all three skill locations:

echo "=== System ===" && ls /usr/lib/node_modules/openclaw/skills/ 2>/dev/null
echo "=== User ===" && ls ~/.openclaw/skills/ 2>/dev/null
echo "=== Workspace ===" && ls ~/.openclaw/workspace/skills/ 2>/dev/null
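
For later steps it can help to have the inventory in a machine-readable form. The following is a minimal sketch, assuming each skill lives in its own subdirectory of a skill location; the function name `list_skills` is hypothetical and not part of OpenClaw.

```shell
# List every installed skill as "<location><TAB><skill-name>", one per line.
# Assumption: one subdirectory per skill. Missing locations are skipped silently.
list_skills() (
  for dir in "$@"; do
    [ -d "$dir" ] || continue
    for entry in "$dir"/*/; do
      # An unmatched glob stays literal and fails this test, so it is skipped.
      [ -d "$entry" ] || continue
      printf '%s\t%s\n' "$dir" "$(basename "$entry")"
    done
  done
)
```

Called with the three paths above, for example `list_skills /usr/lib/node_modules/openclaw/skills "$HOME/.openclaw/skills" "$HOME/.openclaw/workspace/skills"`, it produces one row per skill that can be fed into the classification step.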

Step 2: Classify Each Skill

Assign every skill to a risk category using the classification guide in references/risk-classification.md.

Categories:

  • Behavioral modifiers (🔴 highest risk) — Skills that change how you think, override safety instincts, or inject decision-making patterns into your context
  • Credential handlers (🟡 elevated risk) — Skills that read, write, or transmit API keys, tokens, passwords
  • System modifiers (🟡 elevated risk) — Skills that write to config files, modify system state, or execute with elevated privileges
  • Tool wrappers (🟢 standard risk) — Skills that wrap external tools with well-scoped inputs/outputs
  • Read-only (🟢 low risk) — Skills that only read data and produce reports
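
A rough first pass over many skills can be scripted before the manual review. The sketch below is only a triage hint: the function name `suggest_category` and every grep pattern in it are assumptions chosen for illustration, not the criteria in references/risk-classification.md.

```shell
# Suggest a provisional risk category for one skill directory.
# Checks run from highest to lowest risk; the first match wins.
# All patterns are illustrative assumptions and will miss or over-match cases.
suggest_category() (
  skill_dir=$1
  if grep -rqiE 'ignore (all )?previous|you must always|override|system prompt' \
       "$skill_dir" 2>/dev/null; then
    echo "behavioral-modifier"
  elif grep -rqE 'API_KEY|SECRET|TOKEN|PASSWORD' "$skill_dir" 2>/dev/null; then
    echo "credential-handler"
  elif grep -rqE 'openclaw\.json|\.env|/etc/|sudo ' "$skill_dir" 2>/dev/null; then
    echo "system-modifier"
  elif find "$skill_dir" -type f \( -name '*.sh' -o -name '*.py' -o -name '*.js' \) \
       2>/dev/null | grep -q .; then
    # Ships executable code but matched none of the riskier patterns.
    echo "tool-wrapper"
  else
    echo "read-only"
  fi
)
```

A skill flagged as read-only by this heuristic still needs a human read of its instruction files, since behavioral injection can be phrased in ways no fixed pattern catches.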

Step 3: Audit Each Skill Against ZT4AI Principles

For each skill, evaluate against the checklist in references/audit-checklist.md.

Quick reference — the three questions:

  1. Verify explicitly: Does this skill verify identity/authorization before acting? Does it distinguish owner from non-owner input?
  2. Least privilege: Does this skill request only the access it needs? Could its scope be narrowed?
  3. Assume breach: If this skill were compromised (poisoned update, prompt injection in its files), what's the worst outcome? How would you detect it?

Step 4: Check Scripts and Executables

Find all executable code in the user and workspace skill directories (add the system path from Step 1 if system skills are in scope):

find ~/.openclaw/skills/ ~/.openclaw/workspace/skills/ \
  -type f \( -name "*.sh" -o -name "*.py" -o -name "*.js" \) \
  2>/dev/null | sort

For each script, check:

  • Does it access credentials? (grep -li "API_KEY\|SECRET\|TOKEN\|PASSWORD" <script>)
  • Does it make network calls? (grep -li "curl\|wget\|requests\|fetch\|http" <script>)
  • Does it write to system config? (grep -li "openclaw.json\|\.env\|/etc/" <script>)
  • Does it execute arbitrary input? (grep -li "eval\|exec\|subprocess\|system(" <script>)
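
The four checks can be combined into one pass per script. This is a minimal sketch using the same patterns as the list above, rewritten as extended regular expressions; the function name `scan_script` and the flag labels are hypothetical.

```shell
# Print which of the four Step 4 concerns a script matches.
# Output: "<file>: <space-separated flags>" or "<file>: none".
scan_script() (
  file=$1
  flags=""
  grep -qiE 'API_KEY|SECRET|TOKEN|PASSWORD'  "$file" && flags="$flags credentials"
  grep -qiE 'curl|wget|requests|fetch|http'  "$file" && flags="$flags network"
  grep -qiE 'openclaw\.json|\.env|/etc/'     "$file" && flags="$flags config-write"
  grep -qiE 'eval|exec|subprocess|system\('  "$file" && flags="$flags dynamic-exec"
  echo "$file:${flags:- none}"
)
```

Piping the `find` output from this step through `while IFS= read -r f; do scan_script "$f"; done` gives a one-line summary per script. A match only means the keyword appears, so each flagged file still needs to be read.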

Step 5: Generate Integrity Baseline

Create SHA256 checksums of all skill files for future drift detection:

find ~/.openclaw/skills/ ~/.openclaw/workspace/skills/ \
  -type f \( -name "*.md" -o -name "*.sh" -o -name "*.py" -o -name "*.js" -o -name "*.json" \) \
  -exec sha256sum {} \; | sort -k2 > memory/skill-integrity-baseline.md

To verify against an existing baseline:

sha256sum -c memory/skill-integrity-baseline.md 2>&1 | grep -v ": OK$"

Any output indicates a modified, missing, or unreadable file; investigate before trusting the affected skill.
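
Checksum verification only covers files that are listed in the baseline, so a file added after the baseline was taken goes unreported. The sketch below lists such files. It assumes the baseline keeps the default `sha256sum` line layout (64 hex digits, a two-character separator, then the path); the function name `find_untracked` is hypothetical.

```shell
# List skill files that exist on disk but have no entry in the baseline.
# Usage: find_untracked <baseline-file> <skill-dir>...
find_untracked() (
  baseline=$1; shift
  # Column 67 onward is the path: 64 hex digits plus a two-character separator.
  known=$(cut -c67- "$baseline")
  find "$@" -type f \( -name '*.md' -o -name '*.sh' -o -name '*.py' \
      -o -name '*.js' -o -name '*.json' \) 2>/dev/null | sort |
  while IFS= read -r path; do
    # -F fixed string, -x whole-line match, so "a.md" never matches "a.md.bak".
    printf '%s\n' "$known" | grep -Fxq -- "$path" || echo "NEW: $path"
  done
)
```

Paths must be written the same way in both places, so pass the directories exactly as they were given to `find` when the baseline was created.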

Step 6: Assess Workspace File Security

Check the self-modification surface:

  • Can the agent modify its own SOUL.md / AGENTS.md? (Yes by default — flag it)
  • Are memory files loaded into context? (Yes — they're instruction vectors)
  • Is MEMORY.md loaded in non-owner contexts? (Should NOT be — data leak risk)
  • Are there credentials in workspace files? (grep -rli "api_key\|password\|secret" ~/.openclaw/workspace/)
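
The first question in the list can be answered mechanically by testing write permission as the user the agent runs under. This is a minimal sketch; the function name `check_self_modification` is hypothetical, and the file names are the ones mentioned in this document.

```shell
# Report whether the current user can modify each instruction file.
# Usage: check_self_modification <workspace-dir>
check_self_modification() (
  workspace=$1
  for name in SOUL.md AGENTS.md MEMORY.md; do
    file="$workspace/$name"
    if [ ! -e "$file" ]; then
      echo "$name: absent"
    elif [ -w "$file" ]; then
      echo "$name: WRITABLE (flag)"
    else
      echo "$name: read-only"
    fi
  done
)
```

Run it as `check_self_modification "$HOME/.openclaw/workspace"`. A "read-only" result is weaker than it looks when the agent's user owns the file, because the owner can restore write permission with `chmod`.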

Step 7: Check Network Egress

Assess outbound network restrictions:

# Check for firewall rules
iptables -L OUTPUT -n 2>/dev/null || echo "No iptables access"
ufw status 2>/dev/null || echo "No UFW"

# Check what the agent can reach
curl -s -o /dev/null -w "%{http_code}" https://httpbin.org/get --max-time 5

If the agent has unrestricted outbound access, flag as a security gap — a compromised agent could exfiltrate data to any destination.
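
A single probe says little about an allowlist, so it can be worth testing several destinations and recording a verdict for each. The sketch below relies on curl printing `000` for `%{http_code}` when no HTTP response was received; the function names and verdict labels are hypothetical.

```shell
# Map a curl "%{http_code}" value to an egress verdict.
# 000 (or empty) means no HTTP response: blocked, DNS failure, or timeout.
egress_verdict() {
  case "$1" in
    000|"") echo "blocked-or-unreachable" ;;
    *)      echo "reachable" ;;
  esac
}

# Probe one URL and print "<url> <verdict>".
probe_egress() (
  url=$1
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url" 2>/dev/null)
  echo "$url $(egress_verdict "$code")"
)
```

Any HTTP status, including 403 or 500, counts as reachable here because the request left the host and a response came back. Calling `probe_egress` once for a destination the agent needs and once for an unrelated one shows whether egress filtering distinguishes between them.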

Step 8: Produce Report

Generate a structured report using the template in references/report-template.md. Include:

  • Risk classification for each skill
  • Specific findings with severity ratings
  • Recommended remediations with priority
  • Action tier assignments (see references/action-tiers.md)

Save report to memory/zt4ai-audit-YYYY-MM-DD.md.
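
The dated file name and an empty report skeleton can be generated as follows. This is a sketch only: the section headings simply mirror the four items listed above, the real layout is defined in references/report-template.md (not reproduced here), and both function names are hypothetical.

```shell
# Build the dated report path; an explicit date overrides today's date.
report_path() {
  printf 'memory/zt4ai-audit-%s.md\n' "${1:-$(date +%Y-%m-%d)}"
}

# Write an empty skeleton with one section per required report item.
write_report_skeleton() (
  out=$1
  mkdir -p "$(dirname "$out")"
  {
    echo "# ZT4AI Audit"
    echo
    echo "## Risk classification"
    echo "## Findings and severity"
    echo "## Remediations and priority"
    echo "## Action tier assignments"
  } > "$out"
)
```

For example, `write_report_skeleton "$(report_path)"` creates today's report file under memory/ relative to the current directory.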

Ongoing Monitoring

After the initial audit:

  1. Re-verify integrity after any skill install/update (sha256sum -c against baseline)
  2. Re-audit behavioral skills whenever they're updated — these are the highest risk
  3. Update baseline after intentional skill modifications
  4. Schedule periodic audits via cron (monthly recommended)
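
For scheduled runs, the integrity check needs an exit status that cron or a wrapper can act on. The following is a minimal sketch, assuming the baseline from Step 5; the function name `check_drift`, the script path, and the schedule in the comment are all hypothetical.

```shell
# Verify the baseline and print only entries that are not OK.
# Returns 0 when every file matches, 1 when drift is found.
check_drift() {
  drift_lines=$(sha256sum -c "$1" 2>&1 | grep -v ': OK$')
  if [ -n "$drift_lines" ]; then
    printf 'DRIFT DETECTED\n%s\n' "$drift_lines"
    return 1
  fi
  echo "baseline OK"
}

# Hypothetical crontab entry: 03:00 on the first day of each month.
# Assumes the workspace is ~/.openclaw/workspace and the script calls check_drift.
# 0 3 1 * * cd "$HOME/.openclaw/workspace" && /path/to/zt4ai-check.sh
```

The `cd` matters because the baseline path in this document (memory/skill-integrity-baseline.md) is relative to the working directory.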

References

  • references/risk-classification.md — Detailed classification criteria with examples
  • references/audit-checklist.md — Per-skill audit checklist
  • references/action-tiers.md — Graduated trust model for agent actions
  • references/report-template.md — Audit report template

Version History

1 version in total

  • v1.0.0 (current)
    2026-03-31 05:11 · Safe · Safe

Security Scans

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
