← 返回
效率工具 中文

Midos Self Improver

Structured learning pipeline with quality-gated promotion. Captures corrections, errors, and patterns — promotes only what proves itself through recurrence.
结构化学习流程,结合质量关卡晋升机制。捕获修正、错误与模式——仅推广经反复验证确证的内容。
msruruguay
效率工具 clawhub v1.0.0 1 版本 99829.1 Key: 无需
★ 0
Stars
📥 584
下载
💾 27
安装
1
版本
#latest

概述

midos-self-improver

An agent learning system that captures what goes wrong, what gets corrected, and what works — then promotes the best learnings into permanent project memory. With quality gates that prevent noise from polluting your knowledge base.

Most self-improving agents dump everything into a flat file. Over time, that file becomes a graveyard of one-off notes that never get cleaned up. midos-self-improver solves this with a capture → quality gate → staging → scoring → promotion pipeline where every learning must prove its value through recurrence before it becomes permanent.

Architecture

Agent Session
    ↓
[Detectors] — 5 trigger types
    ↓
.learnings/entries/{category}/{timestamp}.json
    ↓
[Quality Gate] — dedup + decision check
    ↓
.patterns/{domain}_pattern.md (staging)
    ↓
[4-Axis Scorer] — recurrence, freshness, specificity, impact
    ↓
.knowledge/ (permanent) ← only if score >= 0.7
    ↓
CLAUDE.md / AGENTS.md (promoted rules)

The 5 Detection Triggers

| Trigger | What It Captures | Example |

|---------|-----------------|---------|

| Correction | User corrects agent behavior | "Don't use git add ., use specific files" |

| Error | Tool call fails or returns unexpected result | ImportError, test failure, API timeout |

| Knowledge Gap | Agent didn't know something it should have | "The config file moved to /new/path" |

| Best Practice | Successful pattern worth repeating | "Running preflight before publish prevented 3 issues" |

| Pattern | Recurring code structure or workflow | "Every MCP tool needs tier guard + handler separation" |

Detection hooks

# Correction detector — fires on UserPromptSubmit when correction language detected
# Patterns: "no, do X instead", "that's wrong", "actually", "I said", "don't do that"

# Error detector — fires on PostToolUse when tool returns error
# Captures: exit code != 0, exception traces, "Error:" in output

# Gap detector — fires when agent says "I don't know" or searches >3 times for same thing

# Pattern detector — fires on PostToolUse Write|Edit
# Analyzes: what decisions were made, what trade-offs considered

Quality Gate (Deterministic)

Before any learning enters the staging area, it passes through a quality gate:

Deduplication

1. SHA-256 hash of normalized content (lowercase, strip whitespace)
2. Compare against all entries in last 30 days
3. If hash exists → increment recurrence counter, skip creation
4. If similar (>85% trigram overlap) → merge into existing entry

Decision Check

Rules (no LLM required):
  1. >= 2 decisions extracted from patterns → PASS
  2. >= 3 files across >= 2 domains → PASS (cross-cutting)
  3. Only docstrings, no decisions → FAIL (log, not pattern)
  4. All files in same trivial edit → FAIL (maintenance, not learning)

Only entries that pass both checks advance to the staging area.

4-Axis Scoring

Every staged learning gets scored on 4 axes:

| Axis | Weight | What It Measures |

|------|--------|-----------------|

| Recurrence | 0.35 | How many times this same issue/pattern appeared |

| Freshness | 0.25 | How recent (exponential decay, half-life 14 days) |

| Specificity | 0.20 | Concrete file paths/functions vs vague advice |

| Impact | 0.20 | Breadth of effect (multi-domain > single file) |

Scoring formulas

recurrence_score = min(count / 5, 1.0)  # saturates at 5 occurrences
freshness_score = exp(-0.693 * days_since / 14)  # half-life 14 days
specificity_score = (has_path * 0.4) + (has_function * 0.3) + (has_example * 0.3)
impact_score = min(n_domains / 3, 1.0) * 0.6 + min(n_files / 5, 1.0) * 0.4

composite = (recurrence * 0.35) + (freshness * 0.25) +
            (specificity * 0.20) + (impact * 0.20)

Promotion thresholds

composite >= 0.7  → PROMOTE to permanent knowledge base
composite < 0.3   → PRUNE (archive and stop tracking)
0.3 <= c < 0.7    → KEEP in staging (let it mature with more data)

Quick Start

Standalone Mode (zero dependencies)

Add to your CLAUDE.md or agent instructions:

## Self-Improvement Protocol

### On Corrections
When the user corrects you:
1. Log the correction to `.learnings/corrections/{date}.md`
2. Include: what you did wrong, what the correct behavior is, which file/function
3. If this is the 3rd+ time for the same correction → promote to CLAUDE.md rules

### On Errors
When a tool call fails:
1. Log to `.learnings/errors/{date}.md`
2. Include: command, error message, root cause, fix applied
3. If same error type appears 3+ times → create a prevention rule

### On Patterns
When you notice a recurring approach that works:
1. Log to `.learnings/patterns/{domain}/{date}.md`
2. Include: what decision, why this over alternatives, evidence it works
3. Pattern must have >= 2 concrete decisions to be logged (not just descriptions)

### Promotion Rules
- Recurrence >= 3 AND composite score >= 0.6 → promote to permanent memory
- Never promote without evidence of repeated value
- Deduplicate: check SHA-256 before writing new entries
- Archive entries older than 30 days with score < 0.3

With the capture hooks

# Correction capture (wired to UserPromptSubmit)
from hooks.learning_capture import capture_correction
capture_correction(
    user_message="no, always use specific files in git add",
    agent_response="I'll use git add file1 file2 instead of git add .",
    context={"file": "CLAUDE.md", "function": "commit_protocol"}
)

# Error capture (wired to PostToolUse)
from hooks.learning_capture import capture_error
capture_error(
    tool="Bash",
    command="python -m pytest tests/",
    error="ImportError: cannot import name 'AuthMiddleware'",
    fix="Changed to absolute import: from modules.community_mcp.auth import AuthMiddleware"
)

# Assess all staged patterns
from hooks.pattern_harvester import assess_pattern_value
results = assess_pattern_value()
# Returns: [{"file": "...", "score": 0.82, "action": "PROMOTE"}, ...]

Triggering promotion

# Run assessment on all staged patterns
python -c "from hooks.pattern_harvester import assess_pattern_value; assess_pattern_value()"

# Check what's in staging
ls docs/patterns/

# Check what was promoted
ls .knowledge/ | grep pattern

# Check what was discarded
cat knowledge/_discarded/LOG.md

Usage Patterns

Pattern 1: Correction Loop

User: "Don't read entire files, use grep first"
  ↓
Detector: correction language detected ("don't", imperative)
  ↓
Entry: .learnings/corrections/2026-03-04T10:23:45.json
  {
    "type": "correction",
    "wrong": "Read entire file with cat/Read",
    "right": "Grep for pattern first, then Read with offset",
    "context": {"domain": "efficiency"},
    "recurrence": 1
  }
  ↓
(Same correction appears 2 more times over 3 days)
  ↓
recurrence_score: 0.6 (3/5)
freshness_score: 0.95 (recent)
specificity_score: 0.7 (has concrete tool names)
impact_score: 0.8 (affects all file operations)
composite: 0.76 → PROMOTE
  ↓
.knowledge/efficiency_grep_before_read_pattern_20260307.md
  ↓
Added to CLAUDE.md: "Grep > Read — Never read full files, use offset/limit"

Pattern 2: Error Prevention

Error: ImportError on absolute vs relative import (3rd occurrence)
  ↓
Entry already exists with recurrence=3
  ↓
Assessment: composite 0.72 → PROMOTE
  ↓
Generated rule: "Always use absolute imports in package directories"
  ↓
Promoted to project-level AGENTS.md

Pattern 3: Noise Rejection

Agent writes docstrings to 2 files in same module
  ↓
Quality gate: "no decisions found — only docstrings (log, not pattern)"
  ↓
REJECTED — never enters staging

How It Compares

| Feature | midos-self-improver | self-improving-agent (101K) | proactive-agent (54K) |

|---------|--------------------|-----------------------------|----------------------|

| Promotion tiers | 4 (entry → staging → chunks → rules) | 2 (.learnings → CLAUDE.md) | 1 (WAL → manual) |

| Quality gate | Deterministic (dedup + decision check) | None | None |

| Deduplication | SHA-256 + trigram similarity | None | None |

| Scoring | 4-axis composite (recurrence, freshness, specificity, impact) | Manual review | VFM scoring (manual) |

| Promotion trigger | Automatic at threshold | Manual (activator.sh) | Manual |

| Noise rejection | Yes (quality gate rejects non-decisions) | No (logs everything) | No |

| Categories | 5 types with domain tagging | 3 files (LEARNINGS, ERRORS, FEATURES) | 1 file (WAL) |

| Maturation | Staging area with aging | None | None |

| Archival | Auto-prune at score < 0.3 | None | None |

| Hook integration | PostToolUse + UserPromptSubmit | PostToolUse + UserPromptSubmit | Manual |

| Works without LLM | Yes (all deterministic) | Yes | Yes |

Entry Format

{
  "id": "sha256-first-8-chars",
  "type": "correction|error|knowledge_gap|best_practice|pattern",
  "timestamp": "2026-03-04T10:23:45Z",
  "content": "Always use absolute imports in packages",
  "context": {
    "domain": "imports",
    "files": ["src/auth/server.py"],
    "functions": ["import_auth"],
    "trigger": "ImportError in CI"
  },
  "recurrence": 3,
  "hash": "a1b2c3d4",
  "scores": {
    "recurrence": 0.6,
    "freshness": 0.95,
    "specificity": 0.7,
    "impact": 0.4,
    "composite": 0.66
  },
  "status": "staging|promoted|pruned",
  "promoted_to": null
}

MidOS-Connected Mode

When running inside the MidOS ecosystem, the self-improver gains:

  • GEPA coherence scoring validates promoted chunks against the knowledge base
  • L2R reranker helps find truly similar existing patterns (prevents subtle duplicates)
  • Vector dedup via LanceDB cosine similarity (catches semantic duplicates, not just textual)
  • Auto-promotion pipeline with MC-2 deliverable gates (frontmatter, length, coherence)
  • Pattern harvester hook wired to every Write|Edit operation
  • Scheduled assessment via your cron/scheduler system (runs every 2 hours)
  • MCP tools: learning_log, learning_search, learning_stats exposed via MCP server

The standalone mode handles 80% of learning scenarios. The ecosystem adds deeper dedup, quality scoring, and integration with the 6-layer knowledge pipeline.


Built with MidOS — MCP Community Library.

This is 1 of 200+ skills in the MidOS ecosystem.

Free MCP access: midos.dev/dev (500 queries/mo)

Full ecosystem: midos.dev/pro ($20/mo)

版本历史

共 1 个版本

  • v1.0.0 当前
    2026-03-30 06:24 安全

安全检测

腾讯云安全 (Keen)

安全,无风险
查看报告

腾讯云安全 (Sanbu)

suspicious
查看报告

🔗 相关推荐

productivity

Weather

steipete
获取当前天气和预报(无需API密钥)
★ 445 📥 226,302
productivity

Word / DOCX

ivangdavila
创建、检查和编辑 Microsoft Word 文档及 DOCX 文件,支持样式、编号、修订记录、表格、分节符及兼容性检查等功能。
★ 438 📥 147,662
data-analysis

Rust Patterns

msruruguay
生产级 Rust 模式,涵盖所有权、异步 Tokio、Axum Web 框架、SQLx、错误处理、CLI 工具、WASM 及 PyO3 Python 绑定。
★ 0 📥 1,049