
Vigilance

Evaluate-before-Execute (EBE) guardrail for OpenClaw agents. Issues a mandatory GO / NO-GO decision before any high-stakes tool call. Enforces child-safety policy.
sanjeet-toosi
Uncategorized · clawhub · v1.0.3 · 1 version · 100000 · Key: required

Overview

agent-sentinel

Purpose

This skill is the mandatory evaluation layer between the agent's intent and any high-stakes tool execution. You are not permitted to use the tools listed under Interception Triggers without first calling eval_engine.py and receiving "decision": "ALLOW" or "decision": "ADVISE" in the result. An ADVISE result additionally requires user confirmation, as described under Decision Handling Rules.

Think of this as a circuit breaker: if the Sentinel trips, the circuit opens and the action stops.


Interception Triggers

> **You are FORBIDDEN from invoking any of the following tools without first running eval_engine.py and parsing its response.**

| Tool | Trigger condition |
| --- | --- |
| web_search | Every search, without exception |
| booking_tool | Every flight, hotel, or travel booking |
| shell_command | Every shell or terminal execution |
| payment_tool | Every payment, checkout, or purchase |

No exceptions apply. Even if you are certain the action is safe, the Sentinel must still be called. This is a policy requirement, not a suggestion.
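The trigger table can be expressed as a simple membership check. The following is a minimal sketch, not part of the skill itself: the tool names are copied from the table above, while `TRIGGER_TOOLS` and `requires_sentinel` are hypothetical names introduced for illustration.

```python
# Tool names copied from the Interception Triggers table.
# The constant and helper names are hypothetical, not part of the skill.
TRIGGER_TOOLS = frozenset(
    {"web_search", "booking_tool", "shell_command", "payment_tool"}
)


def requires_sentinel(tool_name: str) -> bool:
    """Return True when the tool must be evaluated before it is invoked."""
    return tool_name in TRIGGER_TOOLS
```

An agent loop could call `requires_sentinel(tool)` before each tool invocation and route matching calls through eval_engine.py first.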


How to Call the Sentinel

Run the following command before invoking any trigger tool:

python3 ~/.openclaw/skills/agent-sentinel/eval_engine.py \
  --intent  "<what the user asked for>" \
  --action  <booking_tool|web_search|shell_command|payment_tool|other> \
  --data    "<the exact payload: URL, flight details, command, amount>" \
  [--provider  anthropic|openai|ollama] \
  [--model     <model-id>] \
  [--config_path /path/to/SENTINEL_CONFIG.md]

Example — flight booking:

python3 ~/.openclaw/skills/agent-sentinel/eval_engine.py \
  --intent "Book a family trip to Orlando for spring break" \
  --action booking_tool \
  --data   "Delta Airlines, dep 08:30, arr 11:45, $389 total, non-stop, economy"

Example — web search:

python3 ~/.openclaw/skills/agent-sentinel/eval_engine.py \
  --intent "Find age-appropriate science videos for my daughter" \
  --action web_search \
  --data   "https://www.youtube.com/results?search_query=kids+science+experiments"

> Important: The script writes Chain-of-Thought reasoning to stderr and emits only valid JSON to stdout. Parse stdout with json.loads(...). Do not parse stderr.
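For callers that drive the Sentinel from Python rather than a shell, the invocation above might be wrapped as follows. This is a sketch under stated assumptions: the script path and flags are taken from the command shown earlier, while `build_command` and `run_sentinel` are hypothetical helper names. The script's exit-code behavior is not documented here, so the sketch does not inspect it.

```python
import json
import os
import subprocess

# Path taken from the command shown above.
SCRIPT = os.path.expanduser("~/.openclaw/skills/agent-sentinel/eval_engine.py")


def build_command(intent, action, data, provider=None, model=None, config_path=None):
    """Assemble the argument list; optional flags are added only when set."""
    cmd = ["python3", SCRIPT, "--intent", intent, "--action", action, "--data", data]
    optional = (
        ("--provider", provider),
        ("--model", model),
        ("--config_path", config_path),
    )
    for flag, value in optional:
        if value is not None:
            cmd += [flag, value]
    return cmd


def run_sentinel(intent, action, data, **options):
    """Run the Sentinel and return the decoded JSON object from stdout."""
    proc = subprocess.run(
        build_command(intent, action, data, **options),
        capture_output=True,
        text=True,
    )
    # stderr carries the chain-of-thought text and is deliberately ignored.
    # json.loads raises json.JSONDecodeError if stdout is not valid JSON.
    return json.loads(proc.stdout)
```

Passing the arguments as a list avoids shell quoting problems when `--data` contains characters such as `$` or `?`.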


Response Schema

The script always returns a single JSON object:

{
  "decision":     "ALLOW" | "BLOCK" | "ADVISE",
  "severity":     "LOW"   | "MEDIUM" | "HIGH",
  "reason":       "<clear explanation>",
  "alternatives": "<suggestion to resolve the violation>"
}
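Before acting on a result, it may be worth checking that it matches the schema above. The sketch below is illustrative only: the allowed values come from the schema, but `parse_result` is a hypothetical helper, and treating any off-schema output as an error is an assumption rather than documented behavior of the skill.

```python
import json

# Allowed values copied from the Response Schema.
DECISIONS = {"ALLOW", "BLOCK", "ADVISE"}
SEVERITIES = {"LOW", "MEDIUM", "HIGH"}


def parse_result(stdout_text):
    """Decode stdout and raise ValueError if it does not match the schema."""
    try:
        result = json.loads(stdout_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"stdout is not valid JSON: {exc}") from exc
    if not isinstance(result, dict):
        raise ValueError("expected a single JSON object")
    if result.get("decision") not in DECISIONS:
        raise ValueError(f"unexpected decision: {result.get('decision')!r}")
    if result.get("severity") not in SEVERITIES:
        raise ValueError(f"unexpected severity: {result.get('severity')!r}")
    for field in ("reason", "alternatives"):
        if not isinstance(result.get(field), str):
            raise ValueError(f"field {field!r} must be a string")
    return result
```

A caller that catches `ValueError` here would most safely decline to run the tool, since no valid ALLOW or ADVISE was received.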

Decision Handling Rules

"ALLOW" — Proceed

The action passed all checks. Continue with the intended tool call.

If the result contains "severity": "LOW" alongside ALLOW, surface any informational notes to the user as a soft advisory but do not block.


"ADVISE" — Pause and Confirm

The action is not blocked but a preference mismatch or soft-limit warning was detected. You must:

  1. Stop before invoking the tool.
  2. Show the reason and alternatives fields to the user verbatim.
  3. Ask the user explicitly: "Would you like to proceed anyway?"
  4. Only continue if the user confirms. If they do not confirm within the turn, treat it as a BLOCK.

Example ADVISE response to user:

> I noticed an advisory before completing your request:
>
> Advisory: Price $480 is within 15% of your $500 budget cap.
>
> Suggestion: Confirm this cost is acceptable or I can search for cheaper alternatives.
>
> Would you like me to proceed with this booking, or should I look for less expensive options?
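The ADVISE steps can be sketched in code. This is a minimal illustration, assuming the result has already been parsed into a dict with the schema's fields; `advise_prompt` and `may_proceed_after_advise` are hypothetical names, and the message wording follows the example above.

```python
def advise_prompt(result):
    """Build the user-facing message, quoting reason and alternatives verbatim."""
    lines = [
        "I noticed an advisory before completing your request:",
        f"Advisory: {result['reason']}",
    ]
    if result["alternatives"]:
        lines.append(f"Suggestion: {result['alternatives']}")
    lines.append("Would you like to proceed anyway?")
    return "\n".join(lines)


def may_proceed_after_advise(user_confirmed):
    """Only an explicit confirmation allows the tool call.

    None models "no answer within the turn", which is treated as a BLOCK.
    """
    return user_confirmed is True
```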


"BLOCK" — Stop Immediately

You are strictly forbidden from proceeding. Do not attempt to:

  • Retry the same action with different parameters
  • Find a workaround or alternative path to the same outcome
  • Bypass the Sentinel by splitting the action into smaller steps
  • Claim the Sentinel is wrong and proceed anyway

Instead, you must:

  1. Refrain from calling the trigger tool.
  2. Apologize to the user and clearly explain the violation.
  3. Quote the reason field exactly.
  4. If alternatives is non-empty, present it as the recommended path forward.
  5. Offer an explicit override if the user wishes to continue, within the limits of the Override Protocol.

Example BLOCK response to user (budget violation):

> I'm sorry, I can't complete this booking.
>
> Blocked: Price $650.00 exceeds your maximum budget of $500.00.
>
> What you can do: Look for options priced at or below $500. Consider flexible dates or alternate airports.
>
> If you'd like to override this limit for this booking only, please say "override" and I'll ask you to confirm the amount before proceeding.

Example BLOCK response to user (child-safety violation):

> I'm sorry, I can't perform this search.
>
> Blocked: This content is restricted under the household child-safety policy (severity: HIGH).
>
> Reason: [reason from the Sentinel]
>
> Please modify your request. If you believe this is an error, an adult in the household can review and override the policy in SENTINEL_CONFIG.md.


Override Protocol

If a user explicitly says "override" for a BLOCK decision, you must:

  1. Repeat the blocking reason and severity back to the user.
  2. Ask for explicit written confirmation: *"Please type 'I confirm' to proceed despite this policy violation."*
  3. Log the override in your response (e.g., "Proceeding with user override.").
  4. Never offer override for a severity: HIGH (Tier-1 child-safety) BLOCK unless an adult user has explicitly established that permission in writing within the same conversation turn.
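The override rules can be summarized as two small checks. The sketch below is an assumption-laden illustration: both function names are hypothetical, the `adult_permission_in_turn` flag stands in for however the agent establishes that an adult granted permission in writing, and matching 'I confirm' case-insensitively is a choice made here rather than something the protocol specifies.

```python
def may_offer_override(decision, severity, adult_permission_in_turn=False):
    """Decide whether an override may be offered for this result."""
    if decision != "BLOCK":
        return False  # overrides only apply to BLOCK decisions
    if severity == "HIGH":
        # Tier-1 child-safety: only with written adult permission in this turn.
        return adult_permission_in_turn
    return True


def override_confirmed(user_text):
    """Check for the written confirmation phrase (case-insensitive by assumption)."""
    return user_text.strip().lower() == "i confirm"
```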


Installing Dependencies

cd ~/.openclaw/skills/agent-sentinel
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Configuration

Edit SENTINEL_CONFIG.md (in the skill directory or ~/.openclaw/) to update your preferences and safety policy. See that file for full documentation of all supported keys.

| Key | Type | Effect |
| --- | --- | --- |
| Child_Age_Limit | integer | Activates child-safety tier |
| Max_Budget | $NNN | Hard budget cap (BLOCK above, ADVISE at 85%) |
| Night_Flights_Blocked | true/false | Blocks flights in night window |
| Night_Flight_Window | HH:MM - HH:MM | Night restriction hours |
| Preferred_Airlines | comma list | Soft preference (ADVISE if absent) |
| Blocked_Airlines | comma list | Hard block on listed carriers |
| Max_Stops | integer | BLOCK if flight exceeds stop count |
| Preferred_Cabin | string | ADVISE if different cabin detected |
| Max_Booking_Advance_Days | integer | ADVISE if booking too far ahead |
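To make the Max_Budget and Night_Flight_Window semantics concrete, here is a rough sketch of how such rules could be evaluated. It is not the logic of eval_engine.py, which is not shown in this document. All function names are hypothetical, and the sketch assumes that ADVISE starts at exactly 85% of the cap, that the window start is inclusive and the end exclusive, and that a window such as 22:00 - 06:00 wraps past midnight.

```python
from datetime import time


def budget_decision(price, max_budget, advise_ratio=0.85):
    """BLOCK above the cap, ADVISE from 85% of the cap upward, else ALLOW."""
    if price > max_budget:
        return "BLOCK"
    if price >= advise_ratio * max_budget:
        return "ADVISE"
    return "ALLOW"


def parse_window(text):
    """Parse 'HH:MM - HH:MM' into a (start, end) pair of datetime.time values."""
    start_text, end_text = (part.strip() for part in text.split("-"))
    return time.fromisoformat(start_text), time.fromisoformat(end_text)


def in_night_window(departure, start, end):
    """True when departure falls inside the window, including midnight wrap."""
    if start <= end:
        return start <= departure < end
    # Window wraps past midnight, e.g. 22:00 - 06:00.
    return departure >= start or departure < end
```

With a $500 cap, the $480 and $650 prices from the earlier examples map to ADVISE and BLOCK respectively under these assumptions.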

Version history

1 version in total

  • v1.0.3 (current)
    2026-05-07 04:56 · Safe · Safe

Security checks

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
