Canary Deploy

Safe system changes with automatic baseline capture, canary testing, and rollback for critical infrastructure modifications. Use when making changes such as SSH configuration.
lolaopenclaw
Developer Tools · clawhub v1.0.0 · 1 version · Key: not required

Overview

Safe system changes with pre-flight checks, validation, and automatic rollback.

The Problem

System changes can lock you out:

  • SSH hardening breaks remote access
  • Firewall rules block needed ports
  • Kernel parameters cause instability
  • Service restarts break dependencies

Recovery without physical access is painful or impossible.

Quick Start

Before any critical change

# Capture baseline (connectivity, services, ports)
bash scripts/canary-test.sh baseline

# Make your change
sudo nano /etc/ssh/sshd_config

# Validate change didn't break anything
bash scripts/canary-test.sh validate

# If validation fails:
bash scripts/canary-test.sh rollback

For automated changes

# Full pipeline: baseline → apply → validate → rollback-if-failed
bash scripts/critical-update.sh \
  --name "SSH hardening" \
  --backup "/etc/ssh/sshd_config" \
  --command "sudo sed -i 's/PermitRootLogin yes/PermitRootLogin no/' /etc/ssh/sshd_config && sudo sshd -t && sudo systemctl reload sshd" \
  --validate "ssh -o BatchMode=yes -o ConnectTimeout=5 localhost echo ok"
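
The source of critical-update.sh is not shown here, so the following is only a minimal sketch of the baseline → apply → validate → rollback-if-failed idea for a single file. The function name `run_with_rollback` and its argument order are assumptions made for illustration, not the script's real interface.

```shell
# Hypothetical helper, NOT the real critical-update.sh.
# Usage: run_with_rollback <file-to-protect> <apply-command> <validate-command>
run_with_rollback() {
  local target="$1" apply_cmd="$2" validate_cmd="$3"
  local backup
  backup="$(mktemp)" || return 2

  # 1. Back up the file, preserving mode and timestamps.
  cp -p -- "$target" "$backup" || { rm -f -- "$backup"; return 2; }

  # 2. Apply the change, then 3. validate it.
  #    A failure in either step falls through to the rollback below.
  if bash -c "$apply_cmd" && bash -c "$validate_cmd"; then
    rm -f -- "$backup"
    echo "OK: change applied and validated"
    return 0
  fi

  # 4. Roll back by restoring the saved copy.
  cp -p -- "$backup" "$target"
  rm -f -- "$backup"
  echo "FAIL: change rolled back" >&2
  return 1
}
```

For a service such as sshd, restoring the file is only half of a rollback: the service would also need to be reloaded so that the restored configuration takes effect.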

Protocol A+B (Manual Workflow)

For interactive sessions where you want a human in the loop:

Protocol A: Test interactively

  1. Tell the human: "Open a second SSH session as backup"
  2. Apply change in the first session
  3. Ask: "Test connectivity from the second session"
  4. If it works → confirm
  5. If it fails → rollback from the backup session
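
Steps 3 to 5 of Protocol A could be scripted roughly as below. This is a sketch under assumptions: `confirm_or_rollback` is a hypothetical name, and the rollback command is whatever restores the previous state in your setup. Anything other than an explicit "yes" (including a closed input stream) is treated as a failure, so the default is to roll back.

```shell
# Hypothetical helper for the human-in-the-loop check.
# Usage: confirm_or_rollback <rollback-command>
confirm_or_rollback() {
  local rollback_cmd="$1" answer=""
  # Prompt on stderr so stdout carries only the result.
  printf 'Test connectivity from the second session. Does it work? [yes/no] ' >&2
  read -r answer || answer=""
  if [ "$answer" = "yes" ]; then
    echo "confirmed"
    return 0
  fi
  # No explicit confirmation: run the rollback command.
  bash -c "$rollback_cmd"
  echo "rolled back"
  return 1
}
```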

Protocol B: Backup first

  1. Run bash scripts/canary-test.sh baseline
  2. Verify backup is valid
  3. Apply change
  4. Run bash scripts/canary-test.sh validate
  5. If validation fails → bash scripts/canary-test.sh rollback

Always use both A + B together for maximum safety.
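
The document does not say how "Verify backup is valid" (Protocol B, step 2) is performed. One plausible reading, sketched here with a hypothetical `backup_and_verify` helper, is to check that the copy is non-empty and byte-identical to the source before touching the original.

```shell
# Hypothetical helper; the real scripts may verify backups differently.
# Usage: backup_and_verify <source-file> <backup-file>
backup_and_verify() {
  local src="$1" dest="$2"
  # Copy while preserving mode and timestamps.
  cp -p -- "$src" "$dest" || return 1
  # Treat the backup as valid only if it is non-empty and identical to the source.
  [ -s "$dest" ] && cmp -s -- "$src" "$dest"
}
```

A typical call might look like `backup_and_verify /etc/ssh/sshd_config /root/sshd_config.bak || exit 1`, where the backup path is just an example.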

What Gets Checked

Baseline capture

  • SSH connectivity (local + remote)
  • Open ports (ss -tlnp)
  • Running services (systemctl)
  • Firewall rules (ufw/iptables)
  • Network routes
  • DNS resolution
  • Config file checksums
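
As an illustration of what a baseline capture might collect, here is a minimal sketch. It is not the contents of canary-test.sh: the function names, output file names, and the choice of commands are assumptions. It uses `ss -tln` rather than `ss -tlnp` so that it runs without root and so that process IDs do not cause spurious diffs, and it omits firewall rules because reading them usually requires root.

```shell
# Hypothetical sketch; the real canary-test.sh may collect data differently.

# Run a command only if it is installed. A missing tool or a failing
# command leaves an empty file instead of aborting the capture.
_snapshot() {
  local file="$1"; shift
  if command -v "$1" >/dev/null 2>&1; then
    "$@" >"$file" 2>/dev/null || true
  else
    : >"$file"
  fi
}

# Usage: capture_baseline <output-dir> [config-file ...]
capture_baseline() {
  local out="$1"; shift
  mkdir -p -- "$out" || return 1

  _snapshot "$out/ports.txt"    ss -tln
  _snapshot "$out/services.txt" systemctl list-units --type=service \
                                --state=running --no-legend --plain --no-pager
  _snapshot "$out/routes.txt"   ip route show
  _snapshot "$out/dns.txt"      getent hosts localhost

  # Checksums of the config files passed as arguments.
  : >"$out/checksums.txt"
  local f
  for f in "$@"; do
    sha256sum -- "$f" >>"$out/checksums.txt" || return 1
  done
}
```

The snapshots are raw command output; a production script would probably normalize them (for example, keeping only the listening addresses) so that harmless changes such as queue counters do not register as differences.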

Validation

  • All baseline checks re-run
  • Diff against baseline
  • Any regression = FAIL
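
The "diff against baseline, any regression = FAIL" rule can be sketched as follows. The directory layout (one `.txt` snapshot per check) and the function name `validate_against_baseline` are assumptions for illustration only.

```shell
# Hypothetical sketch of the validation step.
# Usage: validate_against_baseline <baseline-dir> <current-dir>
validate_against_baseline() {
  local base="$1" current="$2" status=0 checked=0 f name

  for f in "$base"/*.txt; do
    [ -e "$f" ] || continue              # glob matched nothing
    name="$(basename -- "$f")"
    checked=$((checked + 1))
    # A differing or missing snapshot counts as a regression.
    if ! diff -u -- "$f" "$current/$name"; then
      echo "REGRESSION: $name differs from baseline" >&2
      status=1
    fi
  done

  # An empty baseline proves nothing, so treat it as a failure too.
  [ "$checked" -gt 0 ] || status=1

  if [ "$status" -eq 0 ]; then echo "PASS"; else echo "FAIL"; fi
  return "$status"
}
```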

Critical Change Categories

| Category | Risk | Example | Recovery |
|---|---|---|---|
| SSH config | 🔴 HIGH | sshd_config changes | Backup session |
| Firewall | 🔴 HIGH | UFW/iptables rules | Pre-change snapshot |
| Network | 🔴 HIGH | Interface/routing changes | Console access |
| Services | 🟡 MEDIUM | systemd unit changes | systemctl restart |
| Kernel params | 🟡 MEDIUM | sysctl changes | Reboot to defaults |
| Packages | 🟢 LOW | apt install/upgrade | Reinstall the previous version (apt has no built-in rollback) |
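
If a wrapper script needs to act on these risk levels, for example to insist on Protocol A for HIGH-risk changes, the table could be encoded as a small lookup. The category keys below are hypothetical labels chosen for this sketch; they do not come from the skill's scripts.

```shell
# Hypothetical lookup mirroring the risk table above.
# Usage: risk_for_category <ssh|firewall|network|services|kernel|packages>
risk_for_category() {
  case "$1" in
    ssh|firewall|network) echo "HIGH" ;;
    services|kernel)      echo "MEDIUM" ;;
    packages)             echo "LOW" ;;
    *)                    echo "UNKNOWN"; return 1 ;;
  esac
}
```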

References

See references/incident-report.md for the real incident that inspired this skill.

Version History

1 version in total

  • v1.0.0 (current)
    2026-03-30 11:11 Safe

Security Checks

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
