Gateway Sentinel

Production-hardened OpenClaw gateway watchdog. Monitors the gateway process using graduated health checks, performs escalating repairs (restart → doctor fix...
zurbrick
Category: AI Intelligence · Source: clawhub · Version: v1.0.0 (#latest, 1 version) · Key: required · Stars: 0 · Downloads: 559 · Installs: 22

Overview

🛡️ OpenClaw Guardian

A battle-hardened watchdog that keeps your OpenClaw gateway running — and tells you when it can't.

What It Does

OpenClaw Guardian runs as a background service and continuously monitors the OpenClaw gateway using two independent health signals. When the gateway goes down, it works through an escalating repair sequence before entering a cooldown and waiting for manual help. Every significant event is logged and sent to your configured alert channel(s).

Health Check Strategy (graduated)

  1. CLI check: `openclaw gateway status` (the authoritative signal)
  2. HTTP fallback: `curl http://localhost:${OPENCLAW_PORT}/health` (5s timeout)
  3. Both must fail before the guardian considers the gateway truly down
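
The graduated check above can be sketched as a single shell function. This is a minimal sketch, not the code in guardian.sh: the function name `gateway_is_healthy` is hypothetical, and it assumes `openclaw gateway status` exits non-zero when the gateway is not running.

```shell
#!/usr/bin/env bash
# Sketch of the graduated health check: the CLI is asked first, and the
# HTTP endpoint is consulted only when the CLI check fails.

# Hypothetical helper; returns 0 when the gateway looks healthy.
gateway_is_healthy() {
  # Signal 1: the CLI status command (treated as authoritative).
  # Assumption: it exits non-zero when the gateway is down.
  if openclaw gateway status >/dev/null 2>&1; then
    return 0
  fi

  # Signal 2: HTTP fallback with a 5 second timeout.
  # -f makes curl exit non-zero on HTTP error responses.
  if [ -n "${OPENCLAW_PORT:-}" ] &&
     curl -fsS --max-time 5 "http://localhost:${OPENCLAW_PORT}/health" >/dev/null 2>&1; then
    return 0
  fi

  # Both signals failed: the gateway is considered down.
  return 1
}
```

A monitoring loop could then call `gateway_is_healthy || start_repairs`, where `start_repairs` stands in for whatever repair entry point the script uses.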

Repair Strategy (escalating)

| Level | Action | Trigger |
|-------|--------|---------|
| 1: Restart | `openclaw gateway restart` | First failure |
| 2: Doctor Fix | `openclaw doctor --fix`, then `openclaw gateway start` | After Level 1 fails |
| 3: Git Rollback | Stash → reset to last stable commit → pop stash | After `GUARDIAN_MAX_REPAIR` failures, only if `GUARDIAN_ENABLE_ROLLBACK=true` |
| Cooldown | Sleep `GUARDIAN_COOLDOWN` seconds | After all levels exhausted |
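
One way to read the escalation rules is as a function of how many repair attempts have already been made in the current outage. The sketch below is an interpretation, not the script's actual logic: `next_repair_action` is a hypothetical name, and the mapping of attempt counts to levels is an assumption based on the description of `GUARDIAN_MAX_REPAIR` as the maximum number of Level 1+2 attempts before Level 3.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the next action given the number of
# repair attempts already made during the current outage.
next_repair_action() {
  local attempts=$1
  local max=${GUARDIAN_MAX_REPAIR:-3}

  if [ "$attempts" -eq 0 ]; then
    echo "restart"     # Level 1: openclaw gateway restart
  elif [ "$attempts" -lt "$max" ]; then
    echo "doctor"      # Level 2: openclaw doctor --fix, then gateway start
  elif [ "$attempts" -eq "$max" ] &&
       [ "${GUARDIAN_ENABLE_ROLLBACK:-false}" = "true" ]; then
    echo "rollback"    # Level 3: opt-in git rollback
  else
    echo "cooldown"    # sleep GUARDIAN_COOLDOWN seconds
  fi
}
```

With the default `GUARDIAN_MAX_REPAIR=3` and rollback disabled, attempts 0 through 3 map to restart, doctor, doctor, cooldown.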

> Note: Level 3 rollback is off by default and requires explicit opt-in via GUARDIAN_ENABLE_ROLLBACK=true. Even then, it always stashes uncommitted work before resetting — your changes are never silently discarded.
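
The stash, reset, pop sequence might look like the following sketch. `rollback_workspace` is a hypothetical name, and because the source does not say how the "last stable commit" is chosen, the commit is passed in as an argument. Note that a plain `git stash push` covers tracked files only; untracked files are left in place by `git reset --hard`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: roll a workspace back to a given commit while
# preserving uncommitted changes to tracked files.
rollback_workspace() {
  local workspace=$1 stable_commit=$2
  local before after reset_status

  # Count stash entries before and after, because "git stash push"
  # exits 0 even when there was nothing to stash.
  before=$(git -C "$workspace" stash list | wc -l)
  git -C "$workspace" stash push -q -m "guardian: pre-rollback" || return 1
  after=$(git -C "$workspace" stash list | wc -l)

  git -C "$workspace" reset -q --hard "$stable_commit"
  reset_status=$?

  # Restore the stashed work regardless of whether the reset succeeded.
  if [ "$((after))" -gt "$((before))" ]; then
    git -C "$workspace" stash pop -q ||
      echo "stash pop failed; changes remain in 'git stash list'" >&2
  fi
  return "$reset_status"
}
```

If the pop hits a conflict, git keeps the stash entry, so the work can still be recovered by hand.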

Alerting

Guardian supports both Telegram and Discord simultaneously. If neither is configured, it runs in log-only mode.

Alert events:

  • Guardian started / stopped
  • Gateway down detected
  • Each repair attempt (with level)
  • Repair success / failure
  • Rollback triggered
  • All repairs exhausted (cooldown entered)
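
As an illustration of dual-channel alerting, the sketch below posts one message to every configured channel and falls back to printing it when none is set. `send_alert` is a hypothetical name and the 10 second timeout is an assumption. The Telegram call uses the Bot API `sendMessage` method; the Discord call posts a JSON body with a `content` field to the webhook URL. Only single-line messages are handled, since newlines are not JSON-escaped here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: send one alert to every configured channel.
send_alert() {
  local message=$1 configured=0 token escaped

  if [ -n "${GUARDIAN_TELEGRAM_BOT_TOKEN:-}" ] &&
     [ -n "${GUARDIAN_TELEGRAM_CHAT_ID:-}" ]; then
    configured=1
    # The API path already contains "bot", so strip a leading "bot"
    # from the token if one is present.
    token=${GUARDIAN_TELEGRAM_BOT_TOKEN#bot}
    curl -fsS --max-time 10 \
      "https://api.telegram.org/bot${token}/sendMessage" \
      --data-urlencode "chat_id=${GUARDIAN_TELEGRAM_CHAT_ID}" \
      --data-urlencode "text=${message}" >/dev/null 2>&1
  fi

  if [ -n "${GUARDIAN_DISCORD_WEBHOOK_URL:-}" ]; then
    configured=1
    # Escape backslashes and double quotes for the JSON string.
    escaped=$(printf '%s' "$message" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
    curl -fsS --max-time 10 -H "Content-Type: application/json" \
      -d "{\"content\": \"${escaped}\"}" \
      "$GUARDIAN_DISCORD_WEBHOOK_URL" >/dev/null 2>&1
  fi

  # Log-only mode when neither channel is configured.
  if [ "$configured" -eq 0 ]; then
    echo "[log-only] $message"
  fi
  return 0
}
```

Delivery failures are deliberately ignored in this sketch so that an unreachable alert channel never blocks the repair sequence.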

Daily Snapshots

Once per calendar day, guardian runs git add -A && git commit in your workspace. It respects .gitignore, so secrets you've excluded stay excluded. Commit message format: guardian: daily snapshot YYYY-MM-DD.
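
A once-per-day snapshot can be driven by a small marker file holding the date of the last successful run. The sketch below assumes that approach, using the marker path listed under runtime files; `daily_snapshot` and the `SNAPSHOT_MARKER` override are hypothetical names added for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: commit the workspace at most once per calendar day.
daily_snapshot() {
  local workspace=${GUARDIAN_WORKSPACE:-$HOME/.openclaw/workspace}
  # SNAPSHOT_MARKER is a hypothetical override, handy for testing.
  local marker=${SNAPSHOT_MARKER:-/tmp/openclaw-guardian-last-snapshot}
  local today
  today=$(date +%Y-%m-%d)

  # Skip when today's snapshot has already been recorded.
  [ "$(cat "$marker" 2>/dev/null)" = "$today" ] && return 0

  # "git add -A" honours .gitignore, so excluded files stay excluded.
  git -C "$workspace" add -A || return 1

  # Commit only when something is staged; "git commit" fails otherwise.
  if ! git -C "$workspace" diff --cached --quiet; then
    git -C "$workspace" commit -q -m "guardian: daily snapshot $today" || return 1
  fi

  echo "$today" > "$marker"
}
```

Writing the marker only after a successful commit means a failed snapshot is retried on the next pass rather than skipped for the day.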


Quick Start

1. Configure environment variables

Create ~/.openclaw/guardian.env (or export in your shell profile):

# Required for alerts — set at least one
export GUARDIAN_TELEGRAM_BOT_TOKEN="bot123456:ABC..."
export GUARDIAN_TELEGRAM_CHAT_ID="-1001234567890"
# OR
export GUARDIAN_DISCORD_WEBHOOK_URL="https://discord.com/api/webhooks/..."

# Optional tuning
export GUARDIAN_CHECK_INTERVAL=30
export GUARDIAN_MAX_REPAIR=3
export GUARDIAN_COOLDOWN=600
export GUARDIAN_ENABLE_ROLLBACK=false  # set true to enable git rollback
export GUARDIAN_WORKSPACE="$HOME/.openclaw/workspace"
export GUARDIAN_LOG="/tmp/openclaw-guardian.log"
export OPENCLAW_PORT=3578

2. Install as a system service

# macOS or Linux — auto-detects
./scripts/install-guardian.sh

# With a custom log path
GUARDIAN_LOG=/var/log/openclaw-guardian.log ./scripts/install-guardian.sh

3. Verify it's running

# macOS
launchctl list | grep openclaw

# Linux
systemctl --user status openclaw-guardian

# Both
tail -f /tmp/openclaw-guardian.log

4. Run manually (testing / foreground)

# Source your config first
source ~/.openclaw/guardian.env

# Run guardian in the foreground (Ctrl-C to stop)
./scripts/guardian.sh

5. Uninstall

./scripts/uninstall-guardian.sh

Environment Variable Reference

| Variable | Default | Description |
|----------|---------|-------------|
| `GUARDIAN_CHECK_INTERVAL` | `30` | Seconds between health checks |
| `GUARDIAN_MAX_REPAIR` | `3` | Max Level 1+2 attempts before Level 3 |
| `GUARDIAN_COOLDOWN` | `600` | Cooldown sleep (seconds) after all repairs fail |
| `GUARDIAN_ENABLE_ROLLBACK` | `false` | Enable Level 3 git rollback (off by default) |
| `GUARDIAN_LOG` | `/tmp/openclaw-guardian.log` | Log file path (rotates at 1 MB) |
| `GUARDIAN_WORKSPACE` | `$HOME/.openclaw/workspace` | Path to the OpenClaw workspace git repo |
| `GUARDIAN_TELEGRAM_BOT_TOKEN` | (unset) | Telegram Bot API token |
| `GUARDIAN_TELEGRAM_CHAT_ID` | (unset) | Telegram chat or channel ID |
| `GUARDIAN_DISCORD_WEBHOOK_URL` | (unset) | Discord incoming webhook URL |
| `OPENCLAW_PORT` | (auto-detected) | Gateway HTTP port; auto-parsed from `openclaw gateway status` if not set |
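
For reference, the documented defaults can be applied with standard shell parameter expansion. This is a sketch under the assumption that unset and empty values are treated alike; `load_guardian_defaults` is a hypothetical name. `OPENCLAW_PORT` is left alone because the format of the `openclaw gateway status` output it would be parsed from is not described here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: fill in the documented defaults for any tuning
# variable that is unset or empty. Values already set are kept.
load_guardian_defaults() {
  : "${GUARDIAN_CHECK_INTERVAL:=30}"
  : "${GUARDIAN_MAX_REPAIR:=3}"
  : "${GUARDIAN_COOLDOWN:=600}"
  : "${GUARDIAN_ENABLE_ROLLBACK:=false}"
  : "${GUARDIAN_LOG:=/tmp/openclaw-guardian.log}"
  : "${GUARDIAN_WORKSPACE:=$HOME/.openclaw/workspace}"
}
```

Calling `load_guardian_defaults` after `source ~/.openclaw/guardian.env` keeps any value exported in that file and fills in the rest.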

File Layout

skills/openclaw-guardian/
├── SKILL.md                    ← this file
└── scripts/
    ├── guardian.sh             ← main watchdog (run continuously)
    ├── install-guardian.sh     ← sets up launchd / systemd service
    └── uninstall-guardian.sh   ← clean removal

Runtime files (created automatically, not committed):

| File | Purpose |
|------|---------|
| `/tmp/openclaw-guardian.lock` | Single-instance lockfile containing PID |
| `/tmp/openclaw-guardian-last-snapshot` | Date of last successful daily snapshot |
| `/tmp/openclaw-guardian.log` | Current log (rotated to `.log.1` at 1 MB) |
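
A PID lockfile with stale-lock detection is commonly built on `kill -0`, which tests whether a process exists without signalling it. The sketch below assumes that technique; `acquire_lock` is a hypothetical name, and the `noclobber` step is an added safeguard against two instances starting at the same moment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: take the single-instance lock, or return 1 when
# another live guardian already holds it.
acquire_lock() {
  local lockfile=${1:-/tmp/openclaw-guardian.lock}
  local pid

  if [ -f "$lockfile" ]; then
    pid=$(cat "$lockfile" 2>/dev/null)
    # kill -0 sends no signal; it only checks that the PID is alive.
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
      echo "guardian already running (PID $pid)" >&2
      return 1
    fi
    rm -f "$lockfile"   # stale lock left by a dead process
  fi

  # noclobber makes the redirect fail if the file reappeared meanwhile.
  ( set -o noclobber; echo "$$" > "$lockfile" ) 2>/dev/null
}
```

A caller would typically pair this with `trap 'rm -f /tmp/openclaw-guardian.lock' EXIT` so the lock is released on shutdown.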

How It Improves on myclaw-guardian

| Issue in myclaw-guardian | Fix in openclaw-guardian |
|--------------------------|--------------------------|
| `git reset --hard` without stashing, which could silently destroy uncommitted work | Always `git stash` before any reset; `git stash pop` to restore regardless of outcome |
| Process detection via `pgrep`: fragile, can match the wrong process | Uses `openclaw gateway status` (the actual CLI) as primary, with HTTP fallback |
| No lockfile, so multiple instances could run simultaneously | `/tmp/openclaw-guardian.lock` with PID written; stale lock detection on startup |
| Only Discord alerts | Supports Telegram and Discord simultaneously; log-only if neither configured |
| Level 3 rollback always enabled, a risky default | Level 3 off by default (`GUARDIAN_ENABLE_ROLLBACK=false`), explicit opt-in required |
| No graduated health checking | Two independent checks: CLI → HTTP; both must fail before declaring gateway down |
| No cooldown after exhausting repairs | Configurable cooldown (`GUARDIAN_COOLDOWN`) before resuming monitoring |

Logging

Logs are timestamped and structured:

[2026-03-05 11:30:00] [INFO] OpenClaw Guardian started (PID 12345)
[2026-03-05 11:30:30] [INFO] Gateway healthy
[2026-03-05 11:31:00] [WARN] CLI status check failed — trying HTTP health endpoint
[2026-03-05 11:31:05] [WARN] Gateway health check FAILED
[2026-03-05 11:31:05] [INFO] ALERT: 🔴 Gateway is DOWN — beginning repair sequence
[2026-03-05 11:31:05] [INFO] Repair Level 1: restarting gateway
[2026-03-05 11:31:35] [INFO] Level 1 repair succeeded

Log rotates automatically when it exceeds 1 MB (one backup: .log.1).
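
The timestamped format and single-backup rotation could be implemented along these lines. This is a sketch, not the script's own logger: the function name `log` is hypothetical, and 1 MB is read here as 1048576 bytes.

```shell
#!/usr/bin/env bash
# Hypothetical logger: appends "[timestamp] [LEVEL] message" and keeps
# one rotated backup once the log grows past the size limit.
log() {
  local level=$1
  shift
  local logfile=${GUARDIAN_LOG:-/tmp/openclaw-guardian.log}
  local max_bytes=1048576   # assumption: 1 MB = 1024 * 1024 bytes
  local size

  if [ -f "$logfile" ]; then
    size=$(wc -c < "$logfile")
    # Rotate: the current log becomes .log.1, replacing any older backup.
    if [ "$((size))" -gt "$max_bytes" ]; then
      mv -f "$logfile" "$logfile.1"
    fi
  fi

  printf '[%s] [%s] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$level" "$*" >> "$logfile"
}
```

For example, `log WARN "Gateway health check FAILED"` appends a line in the same shape as the sample output above.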


Security Notes

  • No secrets in git — daily snapshots use git add -A which respects .gitignore. Ensure your .gitignore excludes .env, *.key, etc.
  • Level 3 rollback is destructive by nature — only enable it if you understand git reset semantics and have tested your .gitignore coverage.
  • Alert tokens in env only — never put GUARDIAN_TELEGRAM_BOT_TOKEN or webhook URLs in files that get committed.

Version History

1 version in total

  • v1.0.0 (current)
    2026-03-30 14:08 · Safe · Safe

Security Scans

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
