
OpenClaw Security PII Audit

Multi-region async PII detection for OpenClaw sessions. Scans user input, prompts, context, and knowledge base content for sensitive personal data across CN,...
Author: mtoby8326 · Source: clawhub · Category: uncategorized · v1.0.0 (1 version) · Key: not required
Stars: 0 · Downloads: 397 · Installs: 1
Tags: #compliance #latest #pii #security

Overview

OpenClaw Security - PII Audit Skill

Multi-region async PII detection engine for OpenClaw sessions. Detects 8 categories of sensitive personal data across 10 country/region jurisdictions and logs audit events locally as NDJSON.

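Quick overview (PII audit)

Basics

  • Skill name: openclaw-security
  • Capability: multi-region async PII detection, with background auditing and a local compliance trail

Detection scope

  • 8 labels: PHONE / EMAIL / PERSON_NAME / ADDRESS / PASSPORT / BANK_CARD / NATIONAL_ID / SOCIAL_ACCOUNT
  • 10 regions: CN / US / AU / SG / MY / TH / ID / DE / UK / FR (international phone numbers with a +CC prefix are also supported)
  • Source types: input / prompt / context / knowledge_base

Key rules

  • Risk levels: high (ID document or bank card, or combined information), low (a single weak identifier)
  • Smart sampling: input 100% (5m), prompt 20% (24h), context 20% (1h), knowledge_base 100% (24h)
  • Callers do not need to decide whether to skip a scan; to force a scan, use --no-cache
  • Background scans must not use --text; use --file + --delete-after-read instead
  • Input is capped at 32,768 characters; longer input is truncated and recorded with truncated: true
  • Audit results are written locally as NDJSON and kept for 7 days by default; run cleanup.py --dry-run first to rehearse a cleanup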

Quick Start

Scan via file (recommended for background / automated scans):

python scripts/audit_worker.py --session-id SESSION_001 --source-type input --file content.txt

Scan via file + auto-delete (secure temp-file workflow):

python scripts/audit_worker.py --session-id SESSION_001 --source-type input --file tmp_scan.txt --delete-after-read

Scan via stdin:

echo "张三的手机号是13812345678" | python scripts/audit_worker.py --session-id SESSION_001 --source-type input

Quick manual test (WARNING: content visible in process list):

python scripts/audit_worker.py --session-id S001 --source-type input --text "short test" --json

Source Types

  • input — User input text
  • prompt — System or user prompts
  • context — Conversation context
  • knowledge_base — Knowledge base content

Detection Labels

PHONE, EMAIL, PERSON_NAME, ADDRESS, PASSPORT, BANK_CARD, NATIONAL_ID, SOCIAL_ACCOUNT

Supported Regions

CN, US, AU, SG, MY, TH, ID, DE, UK, FR (+ INTL via +CC phone prefix)
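
The "+CC" note can be illustrated with a small sketch that maps an international prefix to one of the listed regions. The calling codes are the standard E.164 country codes; the regex, the function names, and the choice to report unlisted prefixes as INTL are assumptions made for illustration, not the patterns in references/patterns.md.

```python
import re

# E.164 calling codes for the ten listed regions.
# The table and function names are hypothetical, not the skill's own code.
CALLING_CODES = {
    "86": "CN", "1": "US", "61": "AU", "65": "SG", "60": "MY",
    "66": "TH", "62": "ID", "49": "DE", "44": "UK", "33": "FR",
}

# "+" followed by 7 to 15 digits, allowing one space or hyphen between digits.
INTL_PHONE = re.compile(r"\+\d(?:[ -]?\d){6,14}")


def region_for_phone(number: str) -> str:
    """Return a region for a +CC number, or "INTL" when the prefix is not listed."""
    digits = re.sub(r"\D", "", number)
    # Try the 2-digit prefix before the 1-digit one ("+1" is the only 1-digit code here).
    for length in (2, 1):
        region = CALLING_CODES.get(digits[:length])
        if region:
            return region
    return "INTL"


def find_intl_phones(text: str) -> list[tuple[str, str]]:
    """Return (matched_number, region) pairs for every +CC number in the text."""
    return [(m.group(0), region_for_phone(m.group(0))) for m in INTL_PHONE.finditer(text)]
```

Note that "+1" is shared by the whole North American Numbering Plan, so mapping it to US is a simplification.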

Risk Levels

  • high: NATIONAL_ID / PASSPORT / BANK_CARD detected, or combination of PERSON_NAME + contact info + ADDRESS
  • low: Single weak identifier (EMAIL, SOCIAL_ACCOUNT, PHONE alone)
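
The two rules above can be read as a small decision function. This is a minimal sketch, assuming that "contact info" means PHONE or EMAIL and that any combination not listed as high falls back to low; the function name is hypothetical and the actual worker may decide differently.

```python
from typing import Optional

HIGH_RISK_LABELS = {"NATIONAL_ID", "PASSPORT", "BANK_CARD"}
# Assumption: "contact info" means a phone number or an email address.
CONTACT_LABELS = {"PHONE", "EMAIL"}


def risk_level(labels: set) -> Optional[str]:
    """Classify a set of detected labels; returns None when nothing was detected."""
    if not labels:
        return None
    # Any document or bank-card label is high risk on its own.
    if labels & HIGH_RISK_LABELS:
        return "high"
    # Name + contact info + address together identify a person.
    if "PERSON_NAME" in labels and "ADDRESS" in labels and labels & CONTACT_LABELS:
        return "high"
    # Assumption: every other combination is reported as low.
    return "low"
```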

Smart Sampling

The audit worker includes built-in smart sampling to efficiently handle large contexts:

  • User input (input): 100% scan rate, 5-min cache TTL — every user message is scanned, but identical repeats within 5 minutes are skipped.
  • System prompts (prompt): 20% scan rate, 24-hour cache TTL — prompts rarely change; first scan is cached for 24 hours.
  • Conversation context (context): 20% scan rate, 1-hour cache TTL — context overlaps heavily; only sample 1 in 5 submissions.
  • Knowledge base (knowledge_base): 100% first-scan rate, 24-hour cache TTL — static content is fully scanned once, then deduped for 24 hours.
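
One way to picture how the rate and the cache TTL interact is the sketch below. The rates and TTLs come from the list above; the order of the checks (cache first, then sampling), the reason strings, and the in-memory dict used as a cache are assumptions, since the worker's internals are not documented here.

```python
import random
import time

# Rates and TTLs (in seconds) as listed above; the structure is hypothetical.
SAMPLING = {
    "input": {"rate": 1.0, "ttl": 5 * 60},
    "prompt": {"rate": 0.2, "ttl": 24 * 3600},
    "context": {"rate": 0.2, "ttl": 3600},
    "knowledge_base": {"rate": 1.0, "ttl": 24 * 3600},
}


def should_scan(source_type, content_hash, cache, *, no_cache=False, now=None, rng=random.random):
    """Return (scan, reason). `cache` maps (source_type, hash) to the last scan time."""
    if no_cache:
        return True, "forced"  # mirrors the --no-cache flag
    cfg = SAMPLING[source_type]
    now = time.time() if now is None else now
    key = (source_type, content_hash)
    last = cache.get(key)
    # Identical content scanned within the TTL is skipped.
    if last is not None and now - last < cfg["ttl"]:
        return False, "cached"
    # Otherwise scan only the configured fraction of submissions.
    if rng() >= cfg["rate"]:
        return False, "sampled_out"
    cache[key] = now
    return True, "scan"
```

Passing `now` and `rng` explicitly keeps the decision deterministic in tests; in normal use both defaults apply.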

Bypass sampling for manual / forced scans:

python scripts/audit_worker.py --session-id S001 --source-type context --text "text" --no-cache

Async Audit Workflow

When auditing session content as a background task:

  1. Respond to user first — never block the main response for audit.
  2. Feed all content types — the script internally decides whether to actually scan based on sampling config and cache. The Agent does not need to decide when to skip.
  3. Use temp-file + --delete-after-read — NEVER pass content via --text in background scans. Write content to a temp file, pass --file, and let the script auto-delete it.
  4. Run audit in background (PowerShell):

    # Step 1: Write content to temp file (no PII in command-line args)
    $tmpFile = [System.IO.Path]::GetTempFileName()
    [System.IO.File]::WriteAllText($tmpFile, $userInput, [System.Text.Encoding]::UTF8)
    
    # Step 2: Background scan; the script reads and deletes the temp file
    # (the path is quoted so temp directories containing spaces still work)
    Start-Process -NoNewWindow -FilePath python -ArgumentList "scripts/audit_worker.py --session-id $sid --source-type input --file `"$tmpFile`" --delete-after-read"
    
    # Same pattern for other source types:
    $tmpPrompt = [System.IO.Path]::GetTempFileName()
    [System.IO.File]::WriteAllText($tmpPrompt, $systemPrompt, [System.Text.Encoding]::UTF8)
    Start-Process -NoNewWindow -FilePath python -ArgumentList "scripts/audit_worker.py --session-id $sid --source-type prompt --file `"$tmpPrompt`" --delete-after-read"
    
  5. Review results: openclaw-security-audit/YYYY-MM-DD/events.ndjson
  6. All outcomes (detected, clean, skipped) are logged for complete audit trail.
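
For callers that launch the worker from Python rather than PowerShell, the same temp-file pattern might look like the sketch below. It uses only the flags documented above; the helper names are hypothetical, and the script path assumes the current directory is the skill root.

```python
import os
import subprocess
import sys
import tempfile


def write_temp(content: str) -> str:
    """Write content to a UTF-8 temp file and return its path."""
    fd, path = tempfile.mkstemp(prefix="audit_", suffix=".txt")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(content)
    return path


def build_command(session_id: str, source_type: str, path: str) -> list[str]:
    """Build the argv list; the content itself never appears in the arguments."""
    return [
        sys.executable, "scripts/audit_worker.py",
        "--session-id", session_id,
        "--source-type", source_type,
        "--file", path,
        "--delete-after-read",
    ]


def audit_in_background(session_id: str, source_type: str, content: str) -> subprocess.Popen:
    """Start the worker without waiting for it, so the main response is not blocked."""
    path = write_temp(content)
    return subprocess.Popen(
        build_command(session_id, source_type, path),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
```

Passing the arguments as a list avoids shell quoting problems with temp paths, and only the file path, not the content, is visible in the process list.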

Retention

Default: 7 days. Cleanup:

python scripts/cleanup.py --days 7

Dry run first:

python scripts/cleanup.py --days 7 --dry-run
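
To make the retention rule concrete, here is a rough sketch of what a cleanup pass over dated directories could do. It assumes the YYYY-MM-DD directory layout shown under "Review results" and treats a directory as expired when its date is strictly older than the cutoff; it is not the code of scripts/cleanup.py.

```python
import shutil
from datetime import date, datetime, timedelta
from pathlib import Path
from typing import Optional


def expired_dirs(root: Path, days: int, today: date) -> list[Path]:
    """List YYYY-MM-DD directories under root that are older than the retention window."""
    cutoff = today - timedelta(days=days)
    expired = []
    for child in sorted(root.iterdir()):
        if not child.is_dir():
            continue
        try:
            day = datetime.strptime(child.name, "%Y-%m-%d").date()
        except ValueError:
            continue  # not a dated audit directory; leave it alone
        if day < cutoff:
            expired.append(child)
    return expired


def cleanup(root: Path, days: int = 7, dry_run: bool = True, today: Optional[date] = None) -> list[Path]:
    """Return the expired directories and delete them unless dry_run is set."""
    targets = expired_dirs(root, days, today or date.today())
    if not dry_run:
        for path in targets:
            shutil.rmtree(path)
    return targets
```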

Input Size Limit

Maximum input: 32,768 characters (32K). Content exceeding this limit is truncated to the first 32,768 characters. The audit record then carries truncated: true, and input_chars keeps the original (pre-truncation) character count.
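
The truncation rule is simple enough to show directly. A minimal sketch, assuming characters are counted as Python string length (Unicode code points); the function name is hypothetical.

```python
MAX_INPUT_CHARS = 32_768


def cap_input(text: str, limit: int = MAX_INPUT_CHARS) -> tuple[str, dict]:
    """Return the text to scan plus the two size fields for the audit record."""
    fields = {"input_chars": len(text), "truncated": len(text) > limit}
    return text[:limit], fields
```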

Audit Record Schema

Every scan invocation writes an NDJSON record — including clean and skipped outcomes.

Each NDJSON line contains:

  • event_id — UUID
  • session_id — Caller-provided session ID (required)
  • source_type — One of: input, prompt, context, knowledge_base
  • status — detected, clean, or skipped
  • labels — Array of detected PII types (detected only)
  • regions — Array of matched regions/country codes (detected only)
  • risk_level — high or low (detected only)
  • matched_count — Number of PII matches
  • matches — Array of {label, confidence, masked_preview, region} (detected only)
  • content_hash — SHA256 prefix for dedup (no raw content stored)
  • input_chars — Original input size in characters
  • truncated — Whether input was truncated to 32K
  • created_at — ISO 8601 UTC timestamp
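
Because every line is a standalone JSON object, a day's events file can be summarized with the standard library alone. The sketch below relies only on the status, risk_level, and event_id fields listed above; the function name and the shape of the summary are illustrative.

```python
import json
from collections import Counter
from pathlib import Path


def summarize(events_path: Path) -> dict:
    """Count records by status and collect the IDs of high-risk events."""
    statuses = Counter()
    high_risk = []
    with events_path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            statuses[record["status"]] += 1
            # risk_level is documented as present on detected records only.
            if record.get("risk_level") == "high":
                high_risk.append(record["event_id"])
    return {"statuses": dict(statuses), "high_risk_event_ids": high_risk}
```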

Safety Rules

  • NEVER store raw sensitive values — only masked previews + content hash
  • NEVER pass content via --text in background scans — use --file + --delete-after-read
  • Audit logs are local-only, never transmitted externally
  • All file I/O uses UTF-8 encoding explicitly, with file locking for concurrent safety
  • No external dependencies — stdlib only
  • Input capped at 32K characters to prevent resource exhaustion
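
The first rule (masked previews plus a content hash, never raw values) could be implemented along these lines. This is only a sketch: the 16-character hash prefix and the number of characters kept visible in the preview are assumptions, and the function names are hypothetical.

```python
import hashlib


def content_hash(text: str, prefix_len: int = 16) -> str:
    """Hex SHA-256 of the UTF-8 text, shortened to a prefix (the length is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:prefix_len]


def masked_preview(value: str, keep_start: int = 3, keep_end: int = 2) -> str:
    """Keep a few leading and trailing characters and replace the rest with '*'."""
    if len(value) <= keep_start + keep_end:
        return "*" * len(value)  # too short to reveal anything safely
    hidden = len(value) - keep_start - keep_end
    return value[:keep_start] + "*" * hidden + value[-keep_end:]
```

For example, `masked_preview("13812345678")` yields `138******78`, which is enough to recognize a match during review without storing the number.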

Configuration

Environment variable override for audit output directory:

$env:OPENCLAW_AUDIT_DIR = "C:\path\to\custom\audit\dir"
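
A script that reads the audit output can resolve the directory the same way. In this sketch the fallback directory name is taken from the path shown under "Review results"; treating it as the default, and resolving it relative to the current directory, are assumptions.

```python
import os
from datetime import date
from pathlib import Path

# Assumption: the default directory matches the path shown under "Review results".
DEFAULT_AUDIT_DIR = "openclaw-security-audit"


def events_file(day: date, env=os.environ) -> Path:
    """Return the events.ndjson path for a given day, honoring OPENCLAW_AUDIT_DIR."""
    root = Path(env.get("OPENCLAW_AUDIT_DIR") or DEFAULT_AUDIT_DIR)
    return root / day.isoformat() / "events.ndjson"
```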

See references/patterns.md for detection pattern details.

Version history

1 version in total

  • v1.0.0 (current)
    2026-03-30 18:04, security status: safe

Security checks

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
