Overview

System Restoration

A guide to restoring Advantage HPE's operational intelligence systems after a failure or shutdown.

Investigation Workflow

1. System Status Assessment

Before fixing anything, map out what's broken:

Core Intelligence Systems:

Zero Revenue Alerts → #margin-alerts (Every 30 min)
Morning Pulse → #manager-nudges (Daily 6:35 AM)
Live Nudges → #manager-nudges (Every 15 min)
Material Truth Report → #material-intel-systems (Daily 7:00 AM)
Friend-Zone Reformatter → #live-ops (ServiceTitan email alerts)

Investigation Commands:

# Check LaunchD services
launchctl list | grep ranger

# Check cron jobs  
cron list

# Check running processes
ps aux | grep -E "(keel|pulse|margin|nudge)" | grep -v grep

# Find system code
find /Users/stephendobbins/.config/ranger -name "*.py" | grep -E "(pulse|margin|nudge)"
find /Users/stephendobbins/.openclaw/workspace -name "*.py" | grep -E "(zero|revenue)"
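
The LaunchD check above can also be scripted so the result is easier to compare against the system list. This is a minimal sketch, assuming `launchctl list` prints its usual tab-separated PID, status, and label columns; the function names are illustrative and not part of the existing ranger scripts.

```python
import subprocess


def parse_launchctl_list(output: str, needle: str = "ranger") -> list[dict]:
    """Parse `launchctl list` output and keep rows whose label contains `needle`."""
    services = []
    for line in output.splitlines():
        parts = line.split("\t")
        if len(parts) != 3 or parts[2] == "Label":
            continue  # skip the header row and anything malformed
        pid, status, label = parts
        if needle not in label:
            continue
        services.append({
            "label": label,
            # "-" in the PID column means the job is loaded but not running now
            "pid": None if pid == "-" else int(pid),
            "last_exit": int(status) if status.lstrip("-").isdigit() else None,
        })
    return services


def loaded_ranger_services() -> list[dict]:
    """Run `launchctl list` (macOS only) and return the ranger services it reports."""
    result = subprocess.run(
        ["launchctl", "list"], capture_output=True, text=True, check=True
    )
    return parse_launchctl_list(result.stdout)
```

A service that appears with `pid` set to `None` is loaded but idle, which is normal for interval jobs between runs; a service that does not appear at all is not loaded.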

2. Locate Code & Determine Failure Cause

Common Locations:

/Users/stephendobbins/.config/ranger/scripts/ - Main operational scripts
/Users/stephendobbins/.config/ranger/materials/ - Material intelligence
/Users/stephendobbins/.openclaw/workspace/ - Recent scripts & fixes
/Users/stephendobbins/Library/LaunchAgents/ - LaunchD service definitions

Common Failure Patterns:

LaunchD services unloaded - Emergency shutdown or system restart
Data source broken - ServiceTitan API returning wrong data
Scheduling missing - Functions exist but no cron/LaunchD trigger
Script errors - Import failures, credential issues
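
The first and third patterns above can be told apart mechanically: a plist that exists but is not loaded points to an unloaded service, while a missing plist points to missing scheduling. The sketch below assumes the `com.ranger.<service>` label convention used elsewhere in this guide; `classify_service` and its return values are hypothetical names for illustration.

```python
from pathlib import Path


def classify_service(label: str, loaded_labels, agents_dir) -> str:
    """Classify one LaunchD label as 'loaded', 'unloaded', or 'no-schedule'.

    loaded_labels: labels currently reported by `launchctl list`.
    agents_dir:    directory holding the plist files, e.g.
                   /Users/stephendobbins/Library/LaunchAgents
    """
    if label in set(loaded_labels):
        return "loaded"
    plist = Path(agents_dir) / f"{label}.plist"
    if plist.exists():
        # Definition is on disk but launchd does not know about it
        return "unloaded"
    # No plist at all: the function may exist but nothing triggers it
    return "no-schedule"
```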

System-Specific Restoration

Zero Revenue Alerts

Script: /Users/stephendobbins/.config/ranger/scripts/margin_alerts.py

Channel: #margin-alerts (C0A5L7MG60P)

Schedule: Every 30 minutes

Restoration Steps:

Verify script exists and posts to Slack
Load LaunchD service: launchctl load /Users/stephendobbins/Library/LaunchAgents/com.ranger.margin-alerts.plist
Test manually: cd /Users/stephendobbins/.config/ranger/scripts && python3 margin_alerts.py
Check logs: tail /tmp/margin_alerts.log
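
For the log check in the last step, a small helper can pull only the suspicious lines from the end of the log. This is a sketch under assumptions: the marker strings are guesses at what a failing Python script would write, and the real format of /tmp/margin_alerts.log may differ.

```python
from collections import deque
from pathlib import Path

# Assumed markers; adjust to whatever the script actually logs.
ERROR_MARKERS = ("Traceback", "Error", "Exception")


def recent_errors(log_path, max_lines: int = 200) -> list[str]:
    """Return lines from the last `max_lines` of the log that contain an error marker."""
    path = Path(log_path)
    if not path.exists():
        return []  # no log yet, e.g. the service has never run
    with path.open(errors="replace") as handle:
        tail = deque(handle, maxlen=max_lines)  # keeps only the final lines
    return [
        line.rstrip("\n")
        for line in tail
        if any(marker in line for marker in ERROR_MARKERS)
    ]
```

Calling `recent_errors("/tmp/margin_alerts.log")` after a manual test run gives a quick view of import failures or credential errors without reading the whole file.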

Morning Pulse

Script: /Users/stephendobbins/.config/ranger/scripts/pulse_os_full.py

Channel: #manager-nudges (C0A5V9JL2KV)

Schedule: Daily 6:35 AM CT

Restoration Steps:

If broken API data: Check for .bak backup with working data sources
Restore backup: cp pulse_os_full.py.bak pulse_os_full.py
Fix data sources: Replace API calls with browser automation (see references/browser-data-sources.md)
Load LaunchD service: launchctl load /Users/stephendobbins/Library/LaunchAgents/com.ranger.morning-pulse.plist
Test: python3 pulse_os_full.py pulse
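
The restore step overwrites the current script, so it can help to keep a copy of the broken version for later comparison. A minimal sketch, assuming the backup sits next to the script with a `.bak` suffix as described above; the `.broken-<timestamp>` naming is an invention for illustration.

```python
import shutil
import time
from pathlib import Path


def restore_from_backup(script_path, suffix: str = ".bak"):
    """Copy <script><suffix> over <script>, saving the current file first.

    Returns the path of the saved broken copy, or None if there was no
    current file to save.
    """
    script = Path(script_path)
    backup = script.with_name(script.name + suffix)
    if not backup.exists():
        raise FileNotFoundError(f"No backup found at {backup}")
    saved = None
    if script.exists():
        stamp = time.strftime("%Y%m%d-%H%M%S")
        saved = script.with_name(f"{script.name}.broken-{stamp}")
        shutil.copy2(script, saved)  # keep the failing version for diffing
    shutil.copy2(backup, script)
    return saved
```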

Live Nudges

Script: /Users/stephendobbins/.config/ranger/scripts/pulse_os_full.py nudges

Channel: #manager-nudges

Schedule: Every 15 minutes

Function: run_nudges() on lines 548-617

Features: 🚗 dispatched / 📍 arrived / ✅ completed alerts

Restoration Steps:

Verify function exists: grep -n "def run_nudges" pulse_os_full.py
Create LaunchD service (see scripts/create-live-nudges-service.py)
Load service: launchctl load /Users/stephendobbins/Library/LaunchAgents/com.ranger.live-nudges.plist
Test: python3 pulse_os_full.py nudges
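
For the service-creation step, the plist for a 15-minute job can be generated with Python's standard `plistlib`. This is a sketch of what such a generator might look like, not the contents of scripts/create-live-nudges-service.py. The interpreter path `/usr/bin/python3` and the log file name are assumptions; `Label`, `ProgramArguments`, `StartInterval`, `StandardOutPath`, and `StandardErrorPath` are standard launchd keys.

```python
import plistlib


def build_interval_plist(label, script_path, args, interval_seconds, log_name) -> bytes:
    """Return XML plist bytes for a launchd job that runs every `interval_seconds`."""
    job = {
        "Label": label,
        # Assumed interpreter path; use the python3 that runs the script manually
        "ProgramArguments": ["/usr/bin/python3", script_path, *args],
        "StartInterval": interval_seconds,
        # Assumed log names, following the /tmp/<service>.log convention
        "StandardOutPath": f"/tmp/{log_name}.log",
        "StandardErrorPath": f"/tmp/{log_name}.err",
    }
    return plistlib.dumps(job)


live_nudges_plist = build_interval_plist(
    label="com.ranger.live-nudges",
    script_path="/Users/stephendobbins/.config/ranger/scripts/pulse_os_full.py",
    args=["nudges"],
    interval_seconds=15 * 60,
    log_name="live-nudges",
)
```

The resulting bytes would be written to /Users/stephendobbins/Library/LaunchAgents/com.ranger.live-nudges.plist before running the load step.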

Material Truth Report

Script: /Users/stephendobbins/.config/ranger/materials/reconciliation_report.py

Channel: #material-intel-systems (C0A5L7RB5EK)

Schedule: Daily 7:00 AM CT

Restoration Steps:

Test script: cd /Users/stephendobbins/.config/ranger/materials && python3 reconciliation_report.py --no-email
Create cron job with 7:00 AM schedule
Verify channel posting
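
When checking that the new schedule behaves as intended, it can help to compute when the next 7:00 AM run should fire. A small sketch, assuming the caller passes a timezone-aware datetime; `next_daily_run` is a hypothetical helper, and it does not use or depend on the OpenClaw cron syntax.

```python
from datetime import datetime, timedelta


def next_daily_run(now: datetime, hour: int, minute: int) -> datetime:
    """Return the next occurrence of hour:minute strictly after `now`.

    `now` must be timezone-aware; the result uses the same timezone.
    """
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)  # today's slot has passed, use tomorrow
    return candidate


# Example (assuming CT means US Central time):
#   from zoneinfo import ZoneInfo
#   next_daily_run(datetime.now(ZoneInfo("America/Chicago")), 7, 0)
```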

Data Source Repair

ServiceTitan API vs UI Data

Problem: ServiceTitan API often returns test/historical data instead of real operational data.

Solution: Replace API calls with browser automation:

Create browser data source module (see scripts/browser_data_sources.py)
Import in main script: Replace parse functions with browser equivalents
Preserve output format - Same sections, different data source

Browser Data Functions:

get_browser_low_margin_jobs()
get_browser_stale_estimates()
get_browser_revenue_leaks()
get_browser_driver_incidents()
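
One way to preserve the output format while swapping the data source is to have the report renderer accept its sources as callables, so API and browser versions are interchangeable. The sketch below is illustrative only: the section titles, the row shape with a `summary` key, and `render_report` itself are assumptions, and the real signatures of the `get_browser_*` functions are not shown in this guide.

```python
from typing import Callable, Dict, List

Row = dict
Source = Callable[[], List[Row]]

# Assumed section titles, inferred from the browser function names above
SECTION_ORDER = [
    "Low Margin Jobs",
    "Stale Estimates",
    "Revenue Leaks",
    "Driver Incidents",
]


def render_report(sources: Dict[str, Source]) -> str:
    """Render every section in a fixed order, whatever the data source is."""
    lines = []
    for title in SECTION_ORDER:
        rows = sources[title]()  # API-backed or browser-backed, same contract
        lines.append(f"{title} ({len(rows)})")
        lines.extend(f"  - {row['summary']}" for row in rows)
    return "\n".join(lines)
```

With this shape, switching to browser automation would mean passing, for example, `{"Low Margin Jobs": get_browser_low_margin_jobs, ...}` instead of the API-backed functions, leaving the rendering code untouched.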

KEEL System Issues

Script: /Users/stephendobbins/.config/ranger/keel/keel_slack_bot.py

Safe restart (field tech DMs only):

Disable operational intelligence: Set OPERATIONAL_INTELLIGENCE_ENABLED = False
Restart process: cd /Users/stephendobbins/.config/ranger/keel && python3 keel_slack_bot.py &
Verify running: ps aux | grep keel_slack_bot | grep -v grep
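
The flag change in the first step can be made without hand-editing the file. This sketch assumes the flag is a single top-level `NAME = True` or `NAME = False` assignment in keel_slack_bot.py, which this guide does not confirm; `set_bool_flag` is a hypothetical helper that refuses to act if it finds anything else.

```python
import re


def set_bool_flag(source: str, name: str, value: bool) -> str:
    """Rewrite a top-level `name = True/False` assignment in Python source text."""
    pattern = re.compile(
        rf"^({re.escape(name)}\s*=\s*)(True|False)\b", flags=re.MULTILINE
    )
    updated, count = pattern.subn(rf"\g<1>{value}", source)
    if count != 1:
        # Fail loudly instead of silently leaving the flag enabled
        raise ValueError(f"expected exactly one assignment of {name}, found {count}")
    return updated
```

A caller would read the script, pass its text through `set_bool_flag(text, "OPERATIONAL_INTELLIGENCE_ENABLED", False)`, and write the result back before restarting the process.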

Service Management Commands

LaunchD Services

# List services
launchctl list | grep ranger

# Load service  
launchctl load /Users/stephendobbins/Library/LaunchAgents/com.ranger.<service>.plist

# Unload service
launchctl unload /Users/stephendobbins/Library/LaunchAgents/com.ranger.<service>.plist

# Start service immediately
launchctl start com.ranger.<service>

# Check service logs
tail /tmp/<service>.log
tail /tmp/<service>.err
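
When several services need restoring at once, the commands above can be built programmatically from the service name. A minimal sketch that only constructs the argument lists (it does not run them); `launchctl_command` is an illustrative name, and the paths follow the `com.ranger.<service>` convention shown above.

```python
from pathlib import Path

LAUNCH_AGENTS = Path("/Users/stephendobbins/Library/LaunchAgents")


def launchctl_command(action: str, service: str) -> list[str]:
    """Build the launchctl argument list for load, unload, or start."""
    label = f"com.ranger.{service}"
    if action in ("load", "unload"):
        # load/unload take the plist path
        return ["launchctl", action, str(LAUNCH_AGENTS / f"{label}.plist")]
    if action == "start":
        # start takes the job label
        return ["launchctl", "start", label]
    raise ValueError(f"unsupported action: {action}")
```

Each returned list can be passed to `subprocess.run(..., check=True)` on the Mac that hosts the services.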

Cron Jobs (OpenClaw)

# List jobs
cron list

# Add job  
cron add <job-definition>

# Remove job
cron remove <job-id>

Emergency Shutdown Recovery

When systems are emergency-stopped due to bad data:

Investigate root cause - Usually ServiceTitan API data issues
Fix data sources - Switch to browser automation or correct API endpoints
Test manually - Verify data accuracy before re-enabling
Restore services - Load LaunchD services and cron jobs
Monitor initially - Check logs and channel posts for accuracy
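
The ordering above matters: services should not be re-enabled until the data has been verified. A sketch of that gating, assuming each step can be expressed as a check that returns True on success; `run_recovery` and the step names in the usage comment are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Step = Tuple[str, Callable[[], bool]]


def run_recovery(steps: List[Step]) -> Tuple[List[str], Optional[str]]:
    """Run steps in order and stop at the first one that fails.

    Returns (names of completed steps, name of the failed step or None).
    """
    completed: List[str] = []
    for name, check in steps:
        if not check():
            return completed, name  # later steps are never attempted
        completed.append(name)
    return completed, None


# Usage idea (each callable is something you supply):
#   run_recovery([
#       ("fix data sources", data_sources_fixed),
#       ("manual test", manual_test_passed),
#       ("restore services", load_services),
#   ])
```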

Resources

scripts/

create-live-nudges-service.py - Generate LaunchD plist for live nudges
browser_data_sources.py - Browser automation replacement for broken APIs

references/

launchd-service-templates.md - LaunchD plist templates for different schedules
channel-ids.md - Slack channel IDs for all operational intelligence channels
troubleshooting-checklist.md - Step-by-step debugging guide

Version History

1 version in total

v1.0.0 (current)

2026-05-07 16:51 Safe

Security Check

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
