
bug-fixing

Zero-regression bug fix workflow: triage → reproduce → root cause → impact analysis → fix → verify → knowledge deposit → self-reflect. Use when: - Feature br...
tinkcarlos

Overview

Bug Fix v4.0 — OpenClaw Edition (Zero-Regression + Portable)

Core Promise: Fix completely. Fix everywhere. Break nothing. Learn from every fix.


Iron Rules (12 — NEVER Violate)

┌──────────────────────────────────────────────────────────────────────────┐
│  Rule 1:  Root cause MUST pass 4 gates before fixing                     │
│           (reproducible + causal + reversible + mechanistic)              │
│                                                                          │
│  Rule 2:  Scope MUST pass 5 gates before fixing                          │
│           (consumers + contracts + invariants + call sites + dup scan)    │
│                                                                          │
│  Rule 3:  MUST trace IMPACT CHAIN (code → data → time → event)          │
│           + scan ALL files for same pattern before writing fix            │
│                                                                          │
│  Rule 4:  MUST predict side effects + check blind spots before coding    │
│           (references/blind-spots.md is single source of truth)           │
│                                                                          │
│  Rule 5:  After fix, MUST run regression verification                    │
│           (functional + performance + concurrency + all impact levels)    │
│                                                                          │
│  Rule 6:  MUST verify fix is LOADED at runtime                           │
│           (clear __pycache__ + restart + exercise code path)              │
│                                                                          │
│  Rule 7:  Framework behavior → read source code first, never trust       │
│           docs/comments/assumptions alone                                │
│                                                                          │
│  Rule 8:  UI bugs MUST gather RUNTIME EVIDENCE before proposing fixes    │
│           (screenshot + DevTools DOM/console + user repro steps)          │
│           Do NOT fix UI bugs based on code reading alone.                │
│                                                                          │
│  Rule 9:  Fix is NOT done until: Bug Summary output + code-review        │
│           passes + knowledge files updated + self-reflection complete     │
│                                                                          │
│  Rule 10: Before fixing, CLASSIFY the problem layer:                     │
│           code bug? missing config? wrong architecture? AI capability?   │
│           Fix at the root layer, not at the symptom layer.               │
│                                                                          │
│  Rule 11: Pattern matching (regex, string match, name lookup) MUST       │
│           check boundary conditions (word boundaries / anchors / exact)   │
│                                                                          │
│  Rule 12: Before fix, MUST search bug pattern library + bug records      │
│           for known fixes and historical context                         │
└──────────────────────────────────────────────────────────────────────────┘

Workflow Overview

Phase 0: Triage → Severity (P0-P3) + Tier (Trivial/Standard/Complex)
  │
  ├─ Trivial → Quick Fix → test → done
  │
  ├─ Standard ─┐
  └─ Complex ──┘
      │
Phase 1: Reproduce (evidence required)
      │
Phase 2: Root Cause Analysis
    2A: Hypothesis ladder → 5 Whys → evidence
    2B: Search knowledge files (bug-patterns + bug-records)
    2C: Impact chain (code + data + time + event)
    2D: Similar issue scan across codebase
      │
Phase 3: Scope + Prediction
    3A: Consumer list → contracts → invariants → call sites → dup scan (5 gates)
    3B: Side effect prediction + blind spot check
    3C: Fix strategy comparison (when >10 LOC, Complex only)
      │
Phase 4: Fix (minimal change, prefer ≤50 LOC)
      │
Phase 5: Verify + Review
    5A: Regression verification (functional + perf + concurrency)
    5B: Runtime deployment verification
    5C: Bug Summary + code-review skill
      │
Phase 6: Knowledge Deposit + Self-Reflection

Phase 0: Triage + Severity

> Classify severity AND tier FIRST to control workflow depth.

Severity Classification (controls workflow depth)

| Severity | Criteria | Workflow | Time-box |
|----------|----------|----------|----------|
| P0 Critical | Production down / data loss / security | FULL (all phases) | 4h escalation |
| P1 High | Core feature broken / data corruption | FULL (all phases) | 8h escalation |
| P2 Medium | Non-core feature / UI issue | STANDARD (skip 3C) | 16h |
| P3 Low | Cosmetic / minor edge case | QUICK (skip 2C, 2D, 3A-3C) | No limit |
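
The severity table can be read as a lookup from severity to the phases that may be skipped. A minimal Python sketch follows; the phase identifiers mirror the Workflow Overview, while `ALL_PHASES`, `SKIPPED`, and `phases_to_run` are hypothetical names introduced here for illustration only.

```python
# Phase identifiers as they appear in the Workflow Overview.
ALL_PHASES = ["0", "1", "2A", "2B", "2C", "2D", "3A", "3B", "3C", "4", "5A", "5B", "5C", "6"]

# Phases each severity is allowed to skip, per the severity table.
SKIPPED = {
    "P0": set(),
    "P1": set(),
    "P2": {"3C"},
    "P3": {"2C", "2D", "3A", "3B", "3C"},
}

def phases_to_run(severity: str) -> list[str]:
    """Return the ordered phases required for a severity level."""
    if severity not in SKIPPED:
        raise ValueError(f"unknown severity: {severity!r}")
    return [p for p in ALL_PHASES if p not in SKIPPED[severity]]

print(phases_to_run("P3"))
```

Keeping the skip sets in one table makes it easy to see that P0 and P1 never skip anything.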

Tier Classification (controls fix path)

| Tier | Criteria | Path |
|------|----------|------|
| Trivial | Typo, config value, 1-line obvious fix, no behavioral change | Quick Fix (below) |
| Standard | Logic bug, 1-3 files, clear symptom, no cross-module risk | Standard Path (skip phases marked "Complex only") |
| Complex | Cross-module, >3 files, shared utility, schema change, multi-process | Full Path (all phases mandatory) |

Quick Fix Path (Trivial only)

## Quick Fix
- Bug: [one-line description]
- Fix: [one-line change]
- File: [path:line]
- Test: [how verified — lint/test/manual]
- Risk: None (isolated, no behavioral change)

After quick fix: update references/bug-records.md, done. No RCA, no impact chain, no self-reflection needed.

If "trivial" fix touches >1 file or changes behavior → upgrade to Standard.
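
The upgrade rule above is simple enough to state as a guard. The sketch below is one possible encoding, assuming the tier names from the Tier Classification table; `effective_tier` and its parameters are hypothetical names, not part of the skill.

```python
def effective_tier(tier: str, files_touched: int, changes_behavior: bool) -> str:
    """Apply the upgrade rule: a 'Trivial' fix that grows is handled as 'Standard'."""
    if tier not in {"Trivial", "Standard", "Complex"}:
        raise ValueError(f"unknown tier: {tier!r}")
    if tier == "Trivial" and (files_touched > 1 or changes_behavior):
        return "Standard"
    return tier

print(effective_tier("Trivial", files_touched=2, changes_behavior=False))  # Standard
```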

Auto-Initialize Knowledge Files

Check: references/bug-patterns.md exists?
  YES → search it in Phase 2B
  NO  → skip pattern search; create after first fix

Check: references/bug-records.md exists?
  YES → search it in Phase 2B
  NO  → skip records search; create after first fix

Check: references/blind-spots.md exists?
  YES → use it in Phase 3B
  NO  → skip blind spot check; create after first fix
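
The three existence checks above could be automated roughly as follows. This is a sketch under the assumption that the `references/` directory sits under a known skill root; `KNOWLEDGE_FILES` and `plan_knowledge_usage` are hypothetical names.

```python
from pathlib import Path

# Knowledge file -> phase that consumes it, as listed above.
KNOWLEDGE_FILES = {
    "references/bug-patterns.md": "2B",
    "references/bug-records.md": "2B",
    "references/blind-spots.md": "3B",
}

def plan_knowledge_usage(skill_root: Path) -> dict[str, str]:
    """Map each knowledge file to 'use in Phase X' or 'skip; create after first fix'."""
    plan = {}
    for rel_path, phase in KNOWLEDGE_FILES.items():
        if (skill_root / rel_path).is_file():
            plan[rel_path] = f"use in Phase {phase}"
        else:
            plan[rel_path] = "skip; create after first fix"
    return plan

for name, action in plan_knowledge_usage(Path(".")).items():
    print(f"{name}: {action}")
```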

Phase 1: Reproduce

> MUST have evidence before continuing. No evidence = no fix.

| Bug Type | Evidence Required |
|----------|------------------|
| Backend error | Stack trace + request/response |
| Frontend UI | Screenshot + browser console + user repro steps (Rule 8) |
| Performance | Before/after metrics + profiler output |
| Intermittent | Timing conditions + frequency estimate |

UI Bug Protocol (Rule 8):

  1. Get user screenshot or screen recording
  2. Open browser DevTools → check Console for errors/warnings
  3. Inspect DOM structure (check for overflow clipping, z-index, Portal needs)
  4. Reproduce the exact user steps
  5. ONLY THEN form hypotheses

Evidence Bundle Template

### Trigger Conditions
- Input/params: [...]
- Environment: [OS/browser/runtime version]
- Timing: [action sequence or time interval]

### Observable Output
- Error message: [full error text]
- Logs: [key log lines]
- Screenshot/recording: [if available]

### Correlation IDs
- requestId/traceId: [...]
- sessionId: [...]

Phase 2: Root Cause Analysis

2A: Hypothesis Ladder

| # | Hypothesis | Likelihood | Confirmation Test | Rejection Test | Status |
|---|-----------|-----------|-------------------|----------------|--------|
| 1 | [description] | High/Med/Low | [prove it IS this] | [prove it is NOT this] | [ ] |

Rules: Sort by likelihood → each must be falsifiable → run rejection tests first → test ONE at a time → use 5 Whys to reach root cause.

Root Cause Confirmation Gate (Rule 1)

Root cause is confirmed only when ALL 4 conditions are met:

| Gate | Meaning |
|------|---------|
| Reproducible | Can trigger symptom in controlled scenario |
| Causal | Minimal change makes bug disappear |
| Reversible | Reverting the change makes bug reappear |
| Mechanistic | Can point to exact code path / state transition |
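
One way to keep the four gates honest is to record them explicitly and refuse to proceed until all pass. The following is an illustrative sketch; the `RootCauseGates` class is a hypothetical helper, not something the skill ships.

```python
from dataclasses import dataclass, fields

@dataclass
class RootCauseGates:
    reproducible: bool = False  # symptom triggers in a controlled scenario
    causal: bool = False        # a minimal change makes the bug disappear
    reversible: bool = False    # reverting that change brings the bug back
    mechanistic: bool = False   # exact code path / state transition identified

    def failed(self) -> list[str]:
        """Names of the gates that have not passed yet."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def confirmed(self) -> bool:
        """Root cause counts as confirmed only when all four gates pass."""
        return not self.failed()

gates = RootCauseGates(reproducible=True, causal=True, mechanistic=True)
print(gates.confirmed(), gates.failed())  # False ['reversible']
```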

Framework Assumption Audit (Rule 7)

When fix involves framework/library behavior: list assumptions → read source code to verify → document in comments with source references.

2B: Search Knowledge Files (Rule 12)

> Search bug-patterns.md and bug-records.md for matching patterns.

> Skip if files don't exist (see Phase 0 auto-init).

| Match Level | Action |
|-------------|--------|
| High (symptom + root cause match) | Apply known fix, can skip remaining RCA |
| Medium (similar symptom) | Reference strategy, verify |
| No match | Full investigation, must deposit after fix |

2C: Impact Chain (Rule 3)

| Dimension | What to Check |
|-----------|---------------|
| Code | Bug file → direct callers → indirect callers → deep callers |
| Data | Corrupted records in DB/file/cache? Repair script needed? |
| Time | When introduced? Duration of exposure? Users affected? |
| Event | Message queues, WebSocket, background workers affected? |
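
The Code dimension (direct → indirect → deep callers) is a breadth-first walk over a reverse call graph. The sketch below assumes such a graph has already been collected by some other means; `callers_by_depth` and the sample graph are hypothetical.

```python
from collections import deque

def callers_by_depth(reverse_calls: dict[str, set[str]], start: str) -> dict[str, int]:
    """Breadth-first walk from the buggy function up through its callers.

    reverse_calls maps a function name to the set of functions that call it.
    Returns each affected caller with its distance from the bug
    (1 = direct caller, 2 = indirect caller, 3+ = deep caller).
    """
    depth = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for caller in reverse_calls.get(current, ()):
            if caller not in depth:  # guard against cycles in the call graph
                depth[caller] = depth[current] + 1
                queue.append(caller)
    del depth[start]
    return depth

# Hypothetical call graph, for illustration only.
graph = {
    "parse_args": {"run_tool"},
    "run_tool": {"execute_step", "retry_step"},
    "execute_step": {"handle_chat"},
}
print(callers_by_depth(graph, "parse_args"))
```

Every name in the result is a candidate for the consumer list in Phase 3.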

2D: Similar Issue Scan (Rule 3)

> Scan ALL files for the same bug pattern, not just the reported file.

rg -n "function_name|similar_pattern" --glob "*.{ts,tsx,py,js}"
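
Where ripgrep is not available, the same scan can be approximated with the standard library. This is a sketch only; `scan_for_pattern` is a hypothetical helper, and `function_name` / `similar_pattern` remain placeholders as in the command above.

```python
import re
from pathlib import Path

SUFFIXES = {".ts", ".tsx", ".py", ".js"}  # same file types as the rg glob above

def scan_for_pattern(root: Path, pattern: str) -> list[tuple[str, int, str]]:
    """Return (relative path, line number, line text) for every match under root."""
    regex = re.compile(pattern)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((path.relative_to(root).as_posix(), lineno, line.strip()))
    return hits
```

Calling `scan_for_pattern(Path("."), r"\b(function_name|similar_pattern)\b")` would list every occurrence; the `\b` anchors keep longer identifiers that merely contain the name out of the results.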

Phase 3: Scope + Prediction

3A: Scope Accuracy Gate (Rule 2)

| # | Gate | Meaning |
|---|------|---------|
| 1 | Consumer List | All consumers (callers/dependents) enumerated |
| 2 | Contract List | Modified contracts/interfaces/behaviors listed |
| 3 | Invariant Check | Must-hold invariants listed |
| 4 | Call Site Enum | All call sites enumerated and classified |
| 5 | Duplicate Scan | No parallel implementation left unfixed |

3B: Side Effect Prediction (Rule 4)

  1. Change Blueprint: what exactly will change
  2. Impact Ripple: L0 (code) → L1 (module) → L2 (feature) → L3 (system) → L4 (user)
  3. Blind Spot Check: read references/blind-spots.md and execute every active check
  4. Go/No-Go Decision

Quick version (for Standard-tier, ≤5 LOC, 1 file):

## Quick Impact Check
- Change: [one-line description]
- Direct callers: [list or "none - local function"]
- Duplicates: [checked: none / found and planned]
- Could break: [prediction or "low risk - isolated"]
- Decision: GO

3C: Fix Strategy Comparison (>10 LOC, Complex only)

| Dimension | Strategy A | Strategy B |
|-----------|-----------|-----------|
| LOC change | | |
| Impact scope | | |
| Regression risk | | |
| Rollback-able | | |


Phase 4: Fix

  • Minimal change, prefer ≤50 LOC; justify if more
  • ONE change at a time, never batch unrelated fixes
  • Layer Rule (Rule 10): Before writing fix code, verify you're fixing the right layer:

| Problem in… | Fix… | Do NOT fix… |
|-------------|------|-------------|
| Params/config | Config or param passing | Business logic |
| Single component | That component | Framework |
| Multiple components same issue | Framework/base class | Each component one by one |
| Docs vs code mismatch | Both sides in sync | Only one side |

  • Pattern matching safety (Rule 11): regex, string match, name lookup → always consider boundary conditions
  • DB schema change? Generate Alembic migration:

```bash
cd backend && alembic revision --autogenerate -m "describe change"
```
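
For the pattern-matching safety point (Rule 11), the difference between a loose and a bounded match is easy to demonstrate. This sketch uses made-up tool names; `find_tool_loose` and `find_tool_exact` are hypothetical helpers, and the exact boundary characters would depend on the identifiers in the real code.

```python
import re

def find_tool_loose(text: str, tool: str) -> bool:
    """Substring check: matches 'read' inside 'thread_dump' as well."""
    return tool in text

def find_tool_exact(text: str, tool: str) -> bool:
    """Match the tool name only when it is not part of a longer identifier."""
    pattern = rf"(?<![\w-]){re.escape(tool)}(?![\w-])"
    return re.search(pattern, text) is not None

message = "calling thread_dump now"
print(find_tool_loose(message, "read"))             # True: false positive
print(find_tool_exact(message, "read"))             # False
print(find_tool_exact("calling read now", "read"))  # True
```

`re.escape` matters here too: without it, a tool name containing `.` or `+` would be interpreted as regex syntax.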


Phase 5: Verify + Review

5A: Regression Verification (Rule 5)

| Category | Checks |
|----------|--------|
| Functional | Unit tests + integration + API + E2E + manual |
| Performance | No N+1 queries, no resource leaks, no response time increase |
| Concurrency | Thread-safe shared state, atomic operations, no race conditions |

Test the entire impact chain (L0-L3), not just the original bug.

5B: Runtime Deployment Verification (Rule 6)

| Step | Action | Evidence |
|------|--------|----------|
| 1 | Clear Python bytecode cache | __pycache__ removed |
| 2 | Restart backend service | PID changed from X to Y |
| 3 | Health check passes | /docs returns 200 |
| 4 | Exercise the fixed code path | Request triggers fixed logic |

If NOT deployed → restart and re-verify before proceeding.
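
Steps 1 and 3 of the table lend themselves to a small script. The sketch below uses only the standard library; `clear_pycache` and `health_check` are hypothetical helper names, and the restart in step 2 is environment-specific and deliberately left out.

```python
import shutil
from pathlib import Path
from urllib.request import urlopen

def clear_pycache(root: Path) -> int:
    """Delete every __pycache__ directory under root; return how many were removed."""
    removed = 0
    # Materialize the list first so deletion does not disturb the directory walk.
    for cache_dir in list(root.rglob("__pycache__")):
        if cache_dir.is_dir():
            shutil.rmtree(cache_dir)
            removed += 1
    return removed

def health_check(url: str, timeout: float = 5.0) -> bool:
    """Return True when the URL answers with HTTP 200 within the timeout."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # URLError and socket timeouts are OSError subclasses
        return False
```

After a restart, `health_check("http://127.0.0.1:8000/docs")` corresponds to step 3. Step 4 still requires sending a request that actually reaches the fixed code path.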

5C: Bug Summary + Code Review (Rule 9)

## Bug Summary [BUG-XXX]
- **Symptom**: [one-sentence user-visible problem]
- **Root Cause**: [one-sentence actual cause]
- **Fix**: [one-sentence fix description]
- **Files Modified**: [file1.py, file2.ts]
- **Severity**: P0/P1/P2/P3

Output Bug Summary → run code-review skill → if review finds issues → fix → re-verify

Stop condition: Code review clean + regression passed + deployment verified + original bug fixed.

Special Checks

| Bug Type | Key Checks |
|----------|-----------|
| API Bug | Frontend → API → Schema → Service → DB chain; field completeness |
| DB Migration | Model changed → alembic revision --autogenerate; no migration = schema drift |
| System-level | Draw E2E chain; define handshake evidence per edge; insert probes first |
| Cross-Surface | Shared artifact → identify contract → consumer list → regression matrix |


Phase 6: Knowledge Deposit + Self-Reflection

6.1 Update Knowledge Files (Rule 9)

| File | When to Update |
|------|---------------|
| references/bug-records.md | Every fix (project history) |
| references/bug-patterns.md | New pattern / new fix strategy (universal) |
| references/blind-spots.md | New blind spot discovered |

6.2 Self-Reflection (Rule 9)

| Dimension | Score (1-5) | Evidence |
|-----------|------------|----------|
| First-time correctness | [1-5] | Did the fix work on first attempt? |
| Scope accuracy | [1-5] | Did I find all affected areas? |
| Minimal change | [1-5] | Was the change as small as possible? |
| Side effect prediction | [1-5] | Did I predict all side effects? |
| Root cause depth | [1-5] | Did I fix root cause, not symptom? |
| Total | [/25] | |

| Issue | What Happened | Why I Missed It | Prevention |
|-------|--------------|-----------------|------------|

Regression Autopsy (when fix introduced a regression)

- **Original Bug**: [what was being fixed]
- **New Bug Introduced**: [what broke]
- **Why I didn't predict it**: [blind spot]
- **Classification**: [missed consumer / contract violation / edge case / ...]

Domain-Specific Checks

| Bug Type | Key Checks |
|----------|-----------|
| Backend/API | Schema drift, timeout/retry, transactions, N+1, connection pool, ORM lazy loading |
| Frontend/UI | State (useEffect deps, unmount), race conditions, CORS, hydration, overflow/Portal |
| System-level | Cross-layer chain, async/streaming, IPC, routing |
| Framework | Read source code first (Rule 7), verify assumptions with tests |
| AI/LLM | Tool binding modes, simulated vs native, streaming, token limits |


Skill Delegation

| Trigger | Delegate To |
|---------|-----------|
| Need new API endpoint | fullstack-developer |
| UI fix needed | frontend-design |
| Schema change needed | database-migrations |
| After fix (mandatory) | code-review |


Anti-Patterns (FORBIDDEN)

| Forbidden | Correct |
|-----------|---------|
| Fix without RCA | Hypothesis ladder first |
| Single hypothesis then fix | List 3-5 hypotheses, verify each |
| Fix UI bug by code reading alone | Get runtime evidence first (Rule 8) |
| Skip consumer list for shared code | Fill consumer list first |
| Tests pass but server runs old code | Clear cache + restart + verify fix is live (Rule 6) |
| Fix code but ignore corrupted data | Assess data impact + repair if needed |
| Trust framework docs blindly | Read source code or run tests (Rule 7) |
| Fix one copy, miss the duplicate | Grep function name; check both Path A and Path B |
| Pattern match without boundary check | Add word boundaries / anchors / exact match (Rule 11) |
| Model changed but no migration | Run alembic revision --autogenerate |
| Use full workflow for a typo | Use Quick Fix path (Phase 0 Trivial tier) |
| Skip self-reflection | Must score, analyze, and learn |


Final Checklist

Core (Standard + Complex tiers)

| # | Check | Phase |
|---|-------|-------|
| 1 | Severity (P0-P3) + Tier (Trivial/Standard/Complex) classified | 0 |
| 2 | Root cause passes 4 gates | 2A |
| 3 | Bug pattern library + records searched | 2B |
| 4 | Impact chain traced (code+data+time+event) | 2C |
| 5 | Similar issue scan completed | 2D |
| 6 | Scope passes 5 gates (incl. duplicate scan) | 3A |
| 7 | Side effect prediction + blind spot check | 3B |
| 8 | Regression verification ALL passed (L0-L3) | 5A |
| 9 | Runtime deployment verified | 5B |
| 10 | Bug Summary output + code-review passed | 5C |
| 11 | Knowledge files updated | 6.1 |
| 12 | Self-reflection completed | 6.2 |
| 13 | If DB model changed: Alembic migration generated | 4 |
| 14 | User confirmed fix + no new bugs | Final |

Trivial Tier Checklist (Quick Fix path only)

| # | Check | Status |
|---|-------|--------|
| 1 | Fix applied and tested (lint/test/manual) | [ ] |
| 2 | Bug record entry added | [ ] |
| 3 | No behavioral change introduced | [ ] |


OpenClaw Project Context

Architecture Map

backend/app/
├── api/v1/              # FastAPI routes (agents, auth, chat, skills, tools, profile)
├── core/
│   ├── graph/           # LangGraph StateGraph (agent_graph, nodes/llm_node, tool_node, prepare_node)
│   ├── langchain/       # LangChain tools (tools.py, shell_tool.py, e2b_tools.py)
│   ├── mcp/             # MCP server integration (pool.py)
│   ├── database.py      # SQLAlchemy async engine
│   └── security.py      # JWT auth
├── models/              # SQLAlchemy ORM models (agent, tool, user, skill)
├── schemas/             # Pydantic request/response schemas
├── services/            # Business logic (agent_executor, chat_service, tool_call_parser, ...)
├── middleware/           # Request logging, audit, error handling
└── main.py              # FastAPI app entry

frontend/src/
├── features/            # Feature modules (chat, settings, admin, knowledge, skills, agents)
│   ├── chat/            # ChatPageV2, MessageRenderer, ToolCallCard, SkillExecutionInline
│   └── ...
├── components/ui/       # shadcn/ui style components (dialog, switch, checkbox)
├── hooks/               # React hooks (useChatStream — SSE event handling)
├── store/               # Zustand state management
└── lib/                 # API client (api-client.ts), markdown utils

Tech Stack

| Layer | Technology |
|-------|-----------|
| Backend | FastAPI + Python 3.11+ |
| ORM | SQLAlchemy 2.0 (async) |
| DB | PostgreSQL (asyncpg) or MySQL (aiomysql) |
| Migrations | Alembic (backend/alembic/) |
| Cache | Redis |
| AI | LangChain 0.3.x + LangGraph 0.4.x |
| Vector DB | ChromaDB |
| Frontend | React 18 + Vite + TypeScript |
| UI | Radix UI + Tailwind CSS |
| State | Zustand + TanStack Query |
| Tests | pytest (backend), Vitest (frontend) |
| Deploy | Docker Compose, supports PyInstaller desktop build |

High-Risk Bug Zones

Backend Hot Zones

| Zone | Files | Why It's High-Risk |
|------|-------|--------------------|
| Simulated Tool Call Parsing | services/tool_call_parser.py, core/graph/nodes/llm_node.py | Regex-based; dual implementations; multi-arg edge cases |
| Agent Executor | services/agent_executor.py | 3000+ LOC; native + simulated modes; complex streaming |
| Tool Argument Remapping | core/graph/nodes/tool_node.py | LLM wrong param names → alphabetical guess |
| LLM Streaming (httpx) | services/llm_manager.py | Reasoning model fallback; SSE; reasoning_content |
| MCP Tool Integration | core/mcp/pool.py, services/tool_service.py | MCP lifecycle; command vs HTTP; timeout |
| Skill Runtime | services/skill_executor.py, services/skill_service.py | Script exec; env var injection; enhanced vs local |
| Chat Streaming | services/chat_service.py | SSE events; client disconnect; async save |
| Memory System | services/unified_memory_manager.py | L1/L2; embedding scoring; slow queries |

Frontend Hot Zones

| Zone | Files | Why It's High-Risk |
|------|-------|--------------------|
| SSE Chat Stream | hooks/useChatStream.ts | Event parsing; reconnection; reasoning_content |
| Tool Call Rendering | features/chat/components/ToolCallCard.tsx | Dynamic display; error states; loading |
| Skill Execution UI | features/chat/components/SkillExecutionInline.tsx | Inline status; progress; error display |
| Markdown Renderer | features/chat/components/MarkdownRenderer.tsx | Nested code fences; special chars; XSS |
| Agent Editor | features/agents/AgentEditorPage.tsx | Complex form state; tool/skill/KB associations |
| Zustand Store | store/ | State updates not re-rendering if reference unchanged |

Two Code Paths for Agent Execution

Path A: Direct Executor (most common)
  chat API → chat_service → agent_executor.py → tool_call_parser.py → tool execution

Path B: StateGraph (LangGraph)
  chat API → chat_service → agent_graph.py → llm_node.py → tool_node.py → tool execution

When fixing anything in Path A, always check Path B for the same issue (and vice versa).

Known Duplicate Implementations

| Function / Feature | Primary Location | Known Alternate Location |
|--------------------|-----------------|-------------------------|
| parse_simulated_tool_calls | services/tool_call_parser.py | core/graph/nodes/llm_node.py |
| Tool loading / binding | services/agent_executor.py | core/graph/agent_graph.py |
| Token counting | services/token_counter.py | May have inline counting in agent_executor.py |
| Memory management | services/unified_memory_manager.py | core/graph/nodes/prepare_node.py |
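
A table like this tends to go stale, so it can help to regenerate candidates mechanically. The sketch below walks Python files with the standard `ast` module and reports function names defined in more than one file; `find_duplicate_functions` is a hypothetical helper, not part of OpenClaw.

```python
import ast
from collections import defaultdict
from pathlib import Path

def find_duplicate_functions(root: Path) -> dict[str, list[str]]:
    """Map each function name defined in more than one .py file to those files."""
    locations = defaultdict(set)
    for path in root.rglob("*.py"):
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"))
        except (SyntaxError, ValueError):
            continue  # skip files that cannot be decoded or parsed
        for node in ast.walk(tree):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                locations[node.name].add(path.relative_to(root).as_posix())
    return {name: sorted(files) for name, files in locations.items() if len(files) > 1}
```

Common method names such as `__init__` will also show up and need filtering; the output is a list of candidates to inspect, not proof of a parallel implementation.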

OpenClaw Common Framework Pitfalls (Rule 7)

| Area | Assumption | Reality |
|------|-----------|---------|
| Config load order | First file has priority | Often last file wins (dict.update) |
| ORM lazy loading | Relations auto-load | Default is lazy loading, which causes N+1 queries |
| Async on Windows | Same as Linux | Windows uses ProactorEventLoop; run_dev.py forces SelectorEventLoop |
| Pydantic serialization | model_dump() includes all | exclude_unset=True omits fields that were never explicitly set |
| LangChain tool binding | All models support tools | Reasoning models need simulated mode |
| __pycache__ | Python uses latest source | Stale .pyc can persist across restarts |
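
The config load order row is worth verifying rather than assuming. The sketch below shows the "last file wins" behavior of `dict.update` on made-up config layers; `load_config` is a hypothetical loader, and the real OpenClaw loader may merge differently.

```python
def load_config(layers: list[dict]) -> dict:
    """Merge config layers in order; later layers overwrite earlier ones."""
    merged: dict = {}
    for layer in layers:
        merged.update(layer)  # shallow merge: a repeated key takes the newer value
    return merged

# Hypothetical config layers, for illustration only.
defaults = {"timeout": 30, "debug": False}
local_overrides = {"timeout": 5}

config = load_config([defaults, local_overrides])
print(config)  # {'timeout': 5, 'debug': False}
```

Because the merge is shallow, a nested dict in a later layer replaces the earlier nested dict wholesale instead of being merged key by key.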

Windows Development Environment Gotchas

| Issue | Symptom | Workaround |
|-------|---------|-----------|
| Path separators | \ vs / | Use pathlib.Path or os.path.join |
| asyncio event loop | ProactorEventLoop default | run_dev.py forces loop="asyncio" |
| __pycache__ file locks | Can't delete while running | Kill process FIRST, then clean |
| Console encoding | GBK/CP936 default | sys.stdout.reconfigure(encoding='utf-8') |
| Playwright on Windows | Browser launch may fail | Runs in separate thread with own loop |
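
Two of the workarounds above can be shown in a few lines. This is an illustrative sketch: the path parts are taken from the Architecture Map, and the `hasattr` guard is an assumption added here because a redirected stream may not support `reconfigure`.

```python
import sys
from pathlib import PurePosixPath, PureWindowsPath

# Build paths from parts instead of hard-coding "\\" or "/".
parts = ("backend", "app", "main.py")
print(PureWindowsPath(*parts))  # backend\app\main.py
print(PurePosixPath(*parts))    # backend/app/main.py

# Force UTF-8 console output where the stream supports it (Python 3.7+).
if hasattr(sys.stdout, "reconfigure"):
    sys.stdout.reconfigure(encoding="utf-8")
```

In application code, plain `pathlib.Path` picks the right flavor for the running OS; the pure variants are used here only so both results can be printed on any platform.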


Verification Commands (OpenClaw)

Backend

cd backend
ruff check app/                                    # Lint
python -m pytest tests/ -v --tb=short              # Unit tests
python -m pytest tests/test_specific.py -v         # Specific test
alembic revision --autogenerate -m "check"         # DB migration check

Frontend

cd frontend
npm run lint
npm run typecheck
npm test
npm run build

Backend Server Restart

Get-WmiObject Win32_Process -Filter "Name='python.exe'" | Where { $_.CommandLine -like "*run_dev*" }
Stop-Process -Id [PID] -Force
Get-ChildItem -Path "backend" -Recurse -Filter "__pycache__" -Directory | Remove-Item -Recurse -Force
Start-Process -FilePath "backend\venv\Scripts\python.exe" -ArgumentList "run_dev.py" -WorkingDirectory "backend"
Invoke-WebRequest -Uri "http://127.0.0.1:8000/docs" -UseBasicParsing -TimeoutSec 5

Reference Files

Living Data Files (update after every fix)

| File | Purpose |
|------|---------|
| references/bug-records.md | Project-specific bug history |
| references/blind-spots.md | Single source of truth for AI blind spot registry |

Pattern Libraries (domain knowledge)

| File | Purpose |
|------|---------|
| references/bug-patterns.md | Universal bug pattern library (11 categories) |
| references/backend-patterns.md | Backend issues (API, ORM, LLM integration, OpenClaw-specific) |
| references/frontend-patterns.md | Frontend issues (React hooks, race conditions, CORS) |

Detailed Guides

| File | Purpose |
|------|---------|
| references/system-rca.md | System-level RCA (cross-layer, multi-process bugs) |
| references/regression-matrix.md | Complete zero-regression verification matrix |


Skill Evolution

Update this skill when:

  • Code review finds a bug that the workflow should have prevented
  • A recurring bug class repeats across fixes

Prefer updating specific sections over adding new rules.

After updates, validate that the workflow is still coherent and not overly bureaucratic.

Version History

1 version in total

  • v1.0.3 (current)
    2026-03-29 18:11 Safe

Security Checks

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
