← 返回
未分类 中文

Change Verification

Pre/post change verification with baseline capture, diff analysis, and rollback decision guidance across Cisco IOS-XE/NX-OS, Juniper JunOS, and Arista EOS. S...
在 Cisco IOS‑XE/NX‑OS、Juniper JunOS 和 Arista EOS 上进行变更前后验证、基线捕获、差异分析以及回滚决策指导。
vahagn-madatyan vahagn-madatyan 来源
未分类 clawhub v1.0.0 1 版本 100000 Key: 无需
★ 0
Stars
📥 335
下载
💾 1
安装
1
版本
#latest

概述

Change Verification

Event-driven change verification skill for structured change windows. Guides

baseline capture before a change, provides change execution safety patterns,

performs post-change diff analysis, and supports rollback decision-making when

unexpected deviations are detected.

This skill covers a single change event lifecycle (before → during →

after). For ongoing configuration drift detection and compliance auditing, use

the config-management skill instead.

Commands are labeled [Cisco], [JunOS], or [EOS] where syntax

diverges. Unlabeled statements apply to all three vendors.

> Safety Note — Read-Write Operations: This skill includes procedures that

> modify device state during change execution and rollback phases. Steps that

> write to devices are marked with ⚠️ WRITE. Always confirm authorization,

> change ticket approval, and maintenance window status before executing write

> operations. Baseline capture and post-change verification steps are read-only

> and safe to run at any time.

When to Use

  • Planned maintenance window requiring structured pre/post verification
  • Configuration change (routing policy, ACLs, interface config) with rollback plan
  • Software upgrade or patch requiring before/after state comparison
  • Hardware replacement (linecard, SFP, PSU) with service validation
  • Circuit turn-up or decommission with adjacency and traffic verification
  • Emergency change requiring rapid baseline capture and rollback readiness
  • Post-change soak period with periodic re-verification against baselines

Prerequisites

  • SSH or console access to all devices in the change scope (read-only for

baselines; enable/configure privilege for change execution and rollback)

  • Approved change ticket with documented scope, expected impact, and rollback

plan including timing criteria

  • Pre-identified list of devices and interfaces in the change scope
  • Knowledge of expected state changes: which routes will move, which interfaces

will bounce, which adjacencies will flap

  • Access to a file store (flash, SCP server, or local disk) for baseline

archival

  • Contact information for escalation if rollback criteria are met
  • See references/checklist-templates.md for per-change-type prerequisites

Procedure

Follow these steps sequentially for each change event. Steps 1–2 are

always read-only. Steps 3–4 include write operations. Steps 5–6 are

analytical and drive the rollback decision.

Step 1: Pre-Change Baseline Capture

Capture device state snapshots before any changes. Store outputs with

timestamps for post-change comparison.

Routing state:

[Cisco]

show ip route summary
show ip bgp summary
show ip ospf neighbor

[JunOS]

show route summary
show bgp summary
show ospf neighbor

[EOS]

show ip route summary
show ip bgp summary
show ip ospf neighbor

Interface and adjacency state:

All vendors — capture interface status, error counters, and neighbor tables:

[Cisco]

show interfaces summary
show cdp neighbors
show ip arp

[JunOS]

show interfaces terse
show lldp neighbors
show arp no-resolve

[EOS]

show interfaces status
show lldp neighbors
show ip arp

Configuration and hardware:

[Cisco]

show running-config
show environment all
show inventory

[JunOS]

show configuration
show chassis environment
show chassis hardware

[EOS]

show running-config
show environment all
show inventory

⚠️ WRITE — Archive baseline config to persistent storage:

[Cisco] copy running-config flash:pre-change-[ticket]-[date].cfg

[JunOS] request system configuration save /var/tmp/pre-change-[ticket].conf

[EOS] copy running-config flash:pre-change-[ticket]-[date].cfg

Record baseline metrics for comparison: total route count, BGP peer count

(Established), OSPF neighbor count (Full), interface error counters, and

hardware sensor readings.

Step 2: Change Scope Documentation

Before executing any changes, document:

  1. Change description — what configuration lines are being added, modified,

or removed

  1. Expected impact — which peers will flap, which routes will shift, which

interfaces will bounce, expected duration of disruption

  1. Rollback trigger criteria — specific thresholds that mandate rollback

(see Threshold Tables below)

  1. Rollback procedure — exact commands to revert (see

references/cli-reference.md for vendor-specific rollback commands)

  1. Success criteria — what "done" looks like: all baselines restored,

intended changes visible, no unexpected deviations

  1. Soak period — how long to monitor after change before declaring success

Step 3: Change Execution

⚠️ WRITE — Apply changes using commit-confirm patterns when available.

[Cisco] — No native commit-confirm. Apply changes in config mode and

immediately verify. For bulk changes, use configure replace with a prepared

config file.

[JunOS] — Use commit confirmed [minutes] to auto-rollback if not

confirmed within the timer window. Confirm with commit after verification.

configure
# ... apply changes ...
commit confirmed 5
# ... verify ...
commit

[EOS] — Use configure session for atomic staged changes. Review before

committing.

configure session change-[ticket]
# ... apply changes ...
show session-config
commit

Staged rollout for multi-device changes: Apply to one device first, verify

post-change state (Step 4), then proceed to remaining devices only after the

first device passes all checks.

Step 4: Post-Change Verification

Re-capture all baseline metrics from Step 1 using identical commands. Perform

a structured diff against pre-change baselines.

Key comparisons:

MetricCompare AgainstExpected Outcome
------------------------------------------
Route countPre-change summaryWithin deviation threshold
BGP peers EstablishedPre-change peer listAll peers restored (or changed per plan)
OSPF neighbors FullPre-change neighbor listAll adjacencies restored
Interface errorsPre-change countersNo new sustained errors
Config diffArchived pre-change configOnly intended lines changed

Config diff verification:

[Cisco]

show archive config differences flash:pre-change-[ticket]-[date].cfg system:running-config

[JunOS]

show | compare rollback 1

[EOS]

diff running-config flash:pre-change-[ticket]-[date].cfg

Review every line in the diff output. Classify each changed line as expected

(directly part of the change plan) or unexpected (not in the change scope).

Step 5: Impact Assessment

Classify all deviations from baseline into categories:

  1. Expected — Intended: Changes that are the direct goal of the change

window (e.g., new BGP peer appearing, old ACL removed). No action needed.

  1. Expected — Side Effect: Changes caused by the intended change but not

the primary goal (e.g., route count increase because a new peer is now

advertising). Verify they are benign.

  1. Unexpected — Minor: Deviations not related to the change scope but low

severity (e.g., a single interface counter increment). Investigate but do

not necessarily roll back.

  1. Unexpected — Critical: Deviations indicating collateral damage (e.g.,

adjacency loss on an unrelated interface, route withdrawal not in change

scope). Evaluate rollback immediately.

If any deviation is classified as Unexpected — Critical, proceed directly

to Step 6 rollback evaluation.

Step 6: Rollback Decision

Evaluate whether to accept the change or roll back using the criteria below.

Rollback if ANY of these conditions are true:

  • Service-affecting outage on interfaces/peers outside the change scope
  • Route count deviation exceeds threshold AND routes are not accounted for in

the change plan

  • Adjacency loss persists beyond the expected convergence window
  • Hardware errors (PSU, fan, temperature) emerged that were not present in

baseline

  • The change did not achieve its intended effect (success criteria from Step 2

not met)

Accept if ALL of these conditions are true:

  • All success criteria from Step 2 are met
  • Config diff contains only expected change lines
  • All baseline metrics are within acceptable deviation thresholds
  • No unexpected critical deviations detected
  • Soak period has elapsed without regression

⚠️ WRITE — If rolling back:

[Cisco] configure replace flash:pre-change-[ticket]-[date].cfg force

[JunOS] rollback 1 then commit

[EOS] configure replace flash:pre-change-[ticket]-[date].cfg

After rollback, re-run Step 4 post-change verification to confirm the device

has returned to its pre-change state.

Threshold Tables

Acceptable Deviation Thresholds

MetricNormal DeviationWarningRollback Trigger
----------------------------------------------------
IPv4 route count±2% of baseline±5% of baseline>10% or unplanned loss
IPv6 route count±2% of baseline±5% of baseline>10% or unplanned loss
BGP Established peers0 lost (unless planned)1 lost (if in scope)≥1 lost outside scope
OSPF Full adjacencies0 lost (unless planned)Flap then recover <2 minLost >2 min
Interface errors (new)0 new CRC/input errors<10 in first 5 minSustained >10/min
Interface flaps0 (unless planned bounce)1 flap on in-scope intfAny flap outside scope

Rollback Timing Thresholds

PhaseMaximum DurationAction if Exceeded
-------------------------------------------
Change executionPer change ticketPause and escalate
Post-change convergence5 minutesBegin rollback assessment
Adjacency re-establishment2 minutes per peerEscalate if critical peer
Route table stabilization3 minutesCheck for route oscillation
Soak period (minor change)15 minutesDeclare success or investigate
Soak period (major change)60 minutesDeclare success or investigate
Rollback execution5 minutesEscalate to senior engineer

Decision Trees

Post-Change Diff Contains Unexpected Lines

Config diff shows unexpected changes
├── Lines are in sections RELATED to change scope
│   ├── Side effect of intended change (e.g., auto-generated route-map sequence)
│   │   └── Classify as Expected — Side Effect → Document and accept
│   └── Unintended consequence (e.g., wrong interface affected)
│       └── Classify as Unexpected — Critical → Evaluate rollback
└── Lines are in sections UNRELATED to change scope
    ├── Timestamps, counters, or cosmetic changes (e.g., "Last configuration change")
    │   └── Classify as Expected — Side Effect → Ignore
    └── Substantive config changes (e.g., ACL modified, route-map added)
        └── Classify as Unexpected — Critical → Immediate rollback

Adjacency Loss Detected Post-Change

Neighbor/peer no longer in expected state
├── Device IS in the change scope
│   ├── Interface was intentionally bounced per change plan
│   │   ├── Adjacency recovers within timing threshold
│   │   │   └── Expected — document recovery time
│   │   └── Adjacency does NOT recover within threshold
│   │       └── Investigate → Check interface state, cable, peer config
│   └── Interface was NOT intentionally bounced
│       └── Unexpected — Critical → Check for config error → Rollback if unresolved
└── Device is NOT in the change scope
    ├── Peer is on a device that IS in scope (far-end impact)
    │   └── Expected side effect → Verify peer recovers within threshold
    └── Peer is on a device NOT in scope (unrelated)
        └── Unexpected — Critical → Unrelated failure, separate investigation

Route Count Deviation Outside Normal Threshold

Route count differs from baseline beyond ±2%
├── Change plan includes prefix addition or removal
│   ├── Deviation direction matches plan (added routes = count increase)
│   │   └── Expected — verify exact prefix matches plan
│   └── Deviation direction opposes plan (planned addition but count decreased)
│       └── Unexpected — Critical → Check BGP/OSPF process, peer state
├── Change plan does NOT include routing changes
│   ├── Deviation is <5% and routes are from in-scope device peers
│   │   └── Warning — likely convergence artifact → Monitor for 3 min
│   └── Deviation is >5% OR routes from out-of-scope sources
│       └── Unexpected — Critical → Evaluate rollback
└── Route oscillation detected (count fluctuating)
    └── Unexpected — Critical → Routing loop or flapping → Immediate rollback

Report Template

# Change Verification Report — [Ticket ID]

## Change Summary
- **Ticket:** [ID]
- **Date/Time:** [Start] — [End]
- **Devices:** [list]
- **Change Type:** [routing | switching | security | upgrade | other]
- **Executed By:** [name/team]

## Pre-Change Baseline
- Route count (IPv4/IPv6): [count]
- BGP peers Established: [count]
- OSPF adjacencies Full: [count]
- Interface errors (notable): [any]
- Config archived to: [location]

## Change Execution
- Method: [manual | commit-confirm | session | replace]
- Duration: [minutes]
- Issues during execution: [none | description]

## Post-Change Verification
- Route count (IPv4/IPv6): [count] (Δ [change])
- BGP peers Established: [count] (Δ [change])
- OSPF adjacencies Full: [count] (Δ [change])
- Interface errors (new): [count]
- Config diff lines: [expected: N, unexpected: N]

## Impact Assessment
- Expected — Intended: [list]
- Expected — Side Effect: [list]
- Unexpected — Minor: [list or none]
- Unexpected — Critical: [list or none]

## Decision
- **Result:** [ACCEPTED | ROLLED BACK | ESCALATED]
- **Rationale:** [reason]
- **Soak period:** [duration, outcome]

## Action Items
- [ ] [any follow-up tasks]

Troubleshooting

Baseline capture commands fail or return incomplete output

  • Verify SSH session stability — long command outputs may be truncated by

terminal buffer limits. Use terminal length 0 [Cisco/EOS] or

set cli screen-length 0 [JunOS] before capture.

  • Check device CPU — high CPU may cause CLI timeouts. Run show processes cpu

[Cisco] / show system processes extensive [JunOS] /

show processes top [EOS] to verify.

  • If archival to remote storage fails, save to local flash as fallback and

note the location for later retrieval.

Config diff shows excessive noise

  • Filter out timestamp and comment lines that change on every config display

(e.g., ! Last configuration change at ...).

  • On [JunOS], show | compare rollback 1 gives clean structured diffs. On

[Cisco], show archive config differences may include line-order

differences that are not real changes — focus on substantive config lines.

  • Use references/checklist-templates.md checklists to focus verification on

change-relevant sections rather than full config comparison.

Adjacency does not recover after expected bounce

  • Check interface state: show interfaces [intf] — look for down/down vs

up/down to distinguish physical vs protocol issues.

  • Verify the peer device accepted the change — a mismatched configuration on

both sides of a link (e.g., mismatched OSPF area, BGP ASN) will prevent

adjacency formation.

  • Check for hold-timer expiry: OSPF default dead interval is 40s; BGP default

hold time is 180s. Wait at least one full timer cycle before escalating.

Rollback command fails or produces unexpected state

  • [Cisco] configure replace requires the IOS archive feature to be

enabled. If unavailable, manually reverse the config changes line by line.

  • [JunOS] rollback N may fail if the commit history has been cleared or

the device has rebooted since the baseline commit. Use

show system commit to verify available rollback points.

  • [EOS] configure replace requires the replacement file to be a complete

config, not a partial fragment. Verify the archived file is a full

show running-config capture.

  • After any rollback, re-run the full post-change verification (Step 4) to

confirm the device has returned to its pre-change state.

Change window time exceeded before verification completes

  • Prioritize critical services: check routing adjacencies and interface states

first, defer detailed config diff analysis to after the window if services

are healthy.

  • If the soak period must be shortened, document the reduced observation window

and schedule a follow-up verification at the next opportunity.

  • Escalate if service impact is detected and the change window has closed —

do not delay rollback due to window constraints if there is active service

degradation.

版本历史

共 1 个版本

  • v1.0.0 当前
    2026-05-07 06:27 安全 安全

安全检测

腾讯云安全 (Keen)

安全,无风险
查看报告

腾讯云安全 (Sanbu)

安全,无风险
查看报告

🔗 相关推荐

it-ops-security

MoltGuard - Security & Antivirus & Guardrails

thomaslwang
MoltGuard — OpenClaw 安全守卫,由 OpenGuardrails 提供。安装后可防止您和您的用户受到提示注入、数据泄露及恶意行为的侵害。
★ 116 📥 30,989
it-ops-security

Free Ride - Unlimited free AI

shaivpidadi
管理OpenClaw的OpenRouter免费AI模型,自动按质量排名模型,配置速率限制备用方案,并更新opencla...
★ 471 📥 78,323
it-ops-security

OpenClaw Backup

alex3alex
备份与恢复 OpenClaw 数据。适用于创建备份、设置自动备份计划、从备份恢复或管理备份轮转。处理 ~/.openclaw 目录归档并包含适当的排除规则。
★ 90 📥 31,030