Use this skill when an OpenClaw host was just updated, is about to be updated, or is behaving strangely after an update. It is a generic operator runbook, not a release-specific checklist.
This skill is meant to be installed as a folder, not copied as a single file. It expects references/failure-patterns.md to exist locally beside SKILL.md inside the same skill bundle.
The goal is not only to get it running, but to prove which layer is broken:
For remote multi-host updates, first prove SSH reachability to each host
with a short timeout. If a host cannot be reached directly or through an
available jump host, record it as a transport/access blocker instead of an
OpenClaw update failure, because no OpenClaw command has executed on that
host yet.
If you are connected over non-interactive SSH, do not assume the
login-shell PATH is available. First locate the binary with common install
paths such as a package-manager prefix and ~/.local/bin/openclaw, then
export the correct PATH for the audit session.
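One way to locate the binary in a non-interactive session is sketched below. `find_openclaw_dir` is a hypothetical helper, and apart from ~/.local/bin the candidate directories in the usage comment are assumptions to adjust per host.

```shell
# Hypothetical helper: print the first directory (from the arguments) that
# contains an executable named "openclaw".
find_openclaw_dir() {
  for dir in "$@"; do
    if [ -x "$dir/openclaw" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  return 1
}

# Usage (directories other than ~/.local/bin are assumptions):
#   dir=$(find_openclaw_dir "$HOME/.local/bin" /opt/homebrew/bin /usr/local/bin) \
#     && export PATH="$dir:$PATH"
```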
If the gateway process is owned by a different OS user than the SSH login
user, run OpenClaw diagnostics as the gateway service user. The SSH user can
have no openclaw on PATH, or a private package-manager shim can be
unreadable, while the LaunchAgent/systemd service is healthy under another
home directory. Derive the service user, state dir, CLI path, and port from
the live process/service definition before running doctor or editing
config.
Check:
openclaw --version
openclaw update status
openclaw status --deep
openclaw doctor --non-interactive --no-workspace-suggestions
openclaw channels status --deep
openclaw tasks audit
Look at service-manager state, running PID, and /health.
Derive the service label/name and gateway port from openclaw status --deep
and/or the service definition instead of guessing them.
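The baseline commands above can be run as one pass that keeps going after a failure, so every layer is sampled. This is a minimal sketch that assumes each command exits non-zero on failure, which is not confirmed here; `check` and `baseline_audit` are hypothetical helper names.

```shell
# Run one check, report ok/FAIL, and continue regardless of the result.
check() {
  if "$@" >/dev/null 2>&1; then
    echo "ok   $*"
  else
    echo "FAIL $*"
  fi
}

# The command list is taken from the runbook; the exit-code convention is assumed.
baseline_audit() {
  check openclaw --version
  check openclaw update status
  check openclaw status --deep
  check openclaw doctor --non-interactive --no-workspace-suggestions
  check openclaw channels status --deep
  check openclaw tasks audit
}
```

Treat the ok/FAIL summary as a first pass only. Rerun any failing command without the output redirect to read its actual report.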
Do not trust only one of:
It is common to have:
First inspect plugin health:
openclaw plugins doctor
openclaw plugins list --json
openclaw plugins inspect
Important rule:
codex, compare plugins inspect with plugins list --json; inspect can report a runtime as loaded while
raw plugin metadata still says disabled.
codex, compare the plugin version against the host version even when plugins doctor is clean. Use
openclaw plugins update to see whether an official
matching package exists before changing broader model config.
Pay attention to:
tools.web.search.provider
plugins.allow
plugins.entries.*
openai/, openai-codex/, codex, and pi
If doctor says a provider or plugin is unknown, inspect the actual config file and do not assume doctor --fix fully cleaned it.
Inspect:
~/.openclaw/plugins/installs.json
~/.openclaw/npm/node_modules/@openclaw/...
~/.openclaw/extensions/...
Look for:
~/.openclaw/extensions/ that load successfully but lag the host cohort
resolvedSpec, integrity, and installed version are exact, but the stored spec is still a bare package name such as
@openclaw/discord
openclaw update --channel ... installed from a fallback tag such as @latest
dist/
Read:
~/.openclaw/logs/gateway.log
~/.openclaw/logs/gateway.err.log
/tmp/openclaw/openclaw-YYYY-MM-DD.log
Prioritize recent startup lines and warnings involving:
config overwrites/backups, and service reload timing
~/.openclaw/service-env/*.env for token-line quote corruption, see Pattern #23, before assuming the upstream credential was rotated)
gateway is ready
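To keep attention on recent startup lines rather than the whole log, a small filter like the one below can help. It is a sketch: `recent_matches` is a hypothetical helper, and the default pattern is an assumption about how warnings are worded in these logs.

```shell
# Print lines matching a pattern from the last N lines of a log file.
# Arguments: file, extended regex (default assumed), line count (default 200).
recent_matches() {
  file="$1"
  pattern="${2:-warn|error}"
  lines="${3:-200}"
  [ -r "$file" ] || { echo "unreadable: $file" >&2; return 1; }
  tail -n "$lines" "$file" | grep -iE "$pattern"
}

# Usage:
#   recent_matches ~/.openclaw/logs/gateway.err.log
#   recent_matches ~/.openclaw/logs/gateway.log 'gateway is ready' 50
```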
Check for:
sessionKey points at a live channel lane such as agent: despite sessionTarget: isolated
A successful package update can still leave the system unhealthy if stale tasks block restarts or keep the audit red.
Run a narrow direct agent smoke test with a fresh session id and inspect the returned metadata:
fallbackAttempts
Treat status: ok as insufficient if the primary model failed and a fallback provider completed the run.
Treat a clean plugins doctor as insufficient for runtime plugins until a
fresh direct agent run proves that the intended harness can load and execute.
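The rule that status: ok alone is insufficient can be expressed as a small classifier over the returned metadata. The sketch below assumes the metadata is JSON with a string `status` field and an array-valued `fallbackAttempts` field. That shape is an assumption, not a documented format, and `smoke_verdict` is a hypothetical helper.

```shell
# Classify smoke-test metadata read from stdin.
# ASSUMED shape: {"status": "ok", "fallbackAttempts": [ ... ]}
smoke_verdict() {
  # Strip all whitespace so the matches below do not depend on formatting.
  meta=$(tr -d '[:space:]')
  case "$meta" in
    *'"status":"ok"'*) ;;
    *) echo "failed"; return 0 ;;
  esac
  # A non-empty array means a fallback provider took part in the run.
  if printf '%s' "$meta" | grep -Eq '"fallbackAttempts":\[[^]]'; then
    echo "fallback-used"
  else
    echo "clean"
  fi
}
```

Only a "clean" verdict shows that the primary model path worked; "fallback-used" should be reported as a degraded result even though the run completed.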
An OpenClaw agent can sometimes update the package it is running under, but
that path has repeatedly left hosts with the package changed and the managed
service unloaded or not restarted. From an outside SSH shell, verify:
/health endpoint and channels recovered
openclaw gateway restart repairs an installed-but-unloaded service without any further package changes
Do not treat the agent conversation's final message as authoritative. Trust
the post-update host state.
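A post-update probe of the gateway from an outside shell might look like this sketch. It assumes the gateway serves /health over plain HTTP on localhost, and the port must come from the live service definition. `health_ok` and `post_update_state` are hypothetical helpers.

```shell
# ASSUMPTION: /health is reachable at http://127.0.0.1:<port>/health.
# -f makes curl fail on HTTP errors; --max-time bounds the wait in seconds.
health_ok() {
  port="$1"
  curl -fsS --max-time 5 "http://127.0.0.1:${port}/health" >/dev/null 2>&1
}

post_update_state() {
  if health_ok "$1"; then
    echo "gateway healthy on port $1"
  else
    echo "gateway not answering /health on port $1"
  fi
}

# Usage (the port is a placeholder; derive it from the service definition):
#   post_update_state "$GATEWAY_PORT"
```

If the probe fails while the package version has changed, check whether the managed service is loaded before touching the package again.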
Check:
agentId so temporary provider workarounds can be rolled back without flattening full-size and mini cron routes together
sessionKey values, especially channel/direct-message keys on isolated cron jobs
cron run behavior
--expect-final actually waits for final completion on the current build
so stale last-run errors are separated from active regressions
If cron verification only proves enqueue, state that clearly in the handoff notes.
This is especially important after a failed plugin install, external plugin
fallback, or public npm compromise advisory. Check:
openclaw security audit --deep flags unpinned npm plugin specs after plugin update churn
~/.openclaw/npm/node_modules
/opt/homebrew/lib/node_modules
package.json
State the limits of the check: a live-system scan cannot prove a package was never installed and removed earlier.
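A presence check across the node_modules roots named above could be sketched as follows. `find_installed_package` is a hypothetical helper. It inspects only top-level packages under each root, not nested dependencies, and it shares the limit stated above: it reports current state only.

```shell
# Print every root that currently contains the named package.
# Arguments: package name (scoped names allowed), then node_modules roots.
find_installed_package() {
  pkg="$1"; shift
  found=1
  for root in "$@"; do
    if [ -f "$root/$pkg/package.json" ]; then
      echo "$root/$pkg"
      found=0
    fi
  done
  return "$found"
}

# Usage (the package name is a placeholder):
#   find_installed_package some-package \
#     "$HOME/.openclaw/npm/node_modules" /opt/homebrew/lib/node_modules
```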
Common fix sequence:
doctor, plugins doctor, status --deep, channels status --deep, and tasks audit
Use this order when diagnosing post-update failures:
/health
openclaw --version
openclaw plugins doctor
openclaw doctor
openclaw channels status --deep
openclaw tasks audit
Start with this file first.
Open references/failure-patterns.md when:
doctor or plugins doctor points to a known-looking regression
channels status or logs disagree with the apparent service health
Use the reference file for symptom matching and concrete examples after the main workflow has narrowed the likely failure area.
Do not assume a broken plugin means "plugin missing."
There are three common cases:
A channel plugin is a good example of the second case: a host can upgrade correctly while still loading an older globally installed plugin package.
If the feature is not bundled, check npm and ClawHub before rewriting config.
Prefer the smallest fix that makes state consistent again:
node_modules
Do not stop at "service is up." A good finish means:
If the upgrade exposed an OpenClaw bug rather than local drift, collect enough information for the next operator or project/support contact. Do not assume the user has any particular external account or wants a public report created.
fallbackAttempts
routes and inherited/default model cases
openclaw cron runs --id , not just the time the operator noticed the issue
sessionKey values that crossed channel/session boundaries, after replacing channel ids and account ids with placeholders
doctor/plugins doctor warning text
Sanitize handoff notes before sharing externally:
, , and ~/.openclaw
For concrete regression patterns and example symptoms, read references/failure-patterns.md.
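As one illustration of sanitizing notes, an absolute home directory can be collapsed to `~` so that paths read as ~/.openclaw/... without exposing the account name. This is a sketch: `sanitize_home` is a hypothetical helper, and it assumes the home path contains no `|` character or regex metacharacters other than `.` and `/`.

```shell
# Replace a literal home-directory prefix with "~" in text read from stdin.
# ASSUMPTION: the path is safe to embed in a sed pattern (see lead-in).
sanitize_home() {
  home_dir="$1"
  sed "s|${home_dir}|~|g"
}

# Usage:
#   sanitize_home "$HOME" < handoff-notes.txt > handoff-notes.sanitized.txt
```

Channel ids, account ids, and host names still need their own placeholder substitutions before the notes are shared.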
When another operator or agent learns something new from a different OpenClaw host:
references/failure-patterns.md
If a new finding is host-specific or uncertain, add it as a new failure pattern with:
Do not silently erase older patterns just because the current host did not hit them.