Deflate — Intelligent Context Compression

Intelligent context compression for OpenClaw agents. Applies Cornell-MapReduce methodology to preserve information quality while reducing token cost by 60-80%.
Author: thevibestack | Category: uncategorized | Source: clawhub | Version: v1.0.0 | Key: not required

🗜️ Deflate — Intelligent Context Compression for OpenClaw

> Every message you send, the LLM re-reads the ENTIRE conversation history.
> A 100K token chat = 100K tokens of INPUT per message. 20 messages = 2M tokens
> re-read. This skill exists to keep that number LOW without losing information.


PART 1: HOW CONTEXT COSTS WORK (Why This Matters)

Message #1:  Context 25K  → You pay for 25K input tokens
Message #2:  Context 30K  → You pay for 30K input tokens
Message #10: Context 65K  → You pay for 65K input tokens
Message #20: Context 105K → You pay for 105K input tokens
Message #50: Context 200K → You pay for 200K input tokens

Total paid for 50 messages WITHOUT compression: ~5.2M input tokens
Total paid for 50 messages WITH compression at 80K: ~2.8M input tokens
SAVINGS: 46% fewer tokens = 46% less money
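
The arithmetic behind these totals can be sketched in a few lines of Python. This is a simplified model, not the skill's own calculation: the growth rate of 3,500 tokens per message and the post-compression size of 40K are hypothetical values chosen for illustration, so the totals it prints will not match the figures above exactly. It also ignores the one-off cost of running each compression.

```python
def total_input_tokens(contexts):
    """Sum the input tokens billed across a session.

    Each message re-sends the full context, so the bill is the sum of the
    context size at every message.
    """
    return sum(contexts)


def simulate(messages, start, growth, cap=None, reset_to=None):
    """Return the context size at each message.

    The context grows by `growth` tokens per message. If `cap` is set, the
    context is cut down to `reset_to` whenever it has grown past `cap`.
    All parameters are illustrative assumptions.
    """
    sizes = []
    context = start
    for _ in range(messages):
        if cap is not None and context > cap:
            context = reset_to  # compression (or a fresh session) happens here
        sizes.append(context)
        context += growth
    return sizes


uncompressed = simulate(50, start=25_000, growth=3_500)
compressed = simulate(50, start=25_000, growth=3_500, cap=80_000, reset_to=40_000)

saved = 1 - total_input_tokens(compressed) / total_input_tokens(uncompressed)
print(f"without compression: {total_input_tokens(uncompressed):,} tokens")
print(f"with compression:    {total_input_tokens(compressed):,} tokens")
print(f"savings:             {saved:.0%}")
```

Changing `growth`, `cap`, and `reset_to` shows how sensitive the savings are to how fast the chat grows and how far each compression shrinks it.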

> The longer you stay in one chat without managing context,
> the more expensive EVERY SINGLE MESSAGE becomes.


PART 2: ZONE SYSTEM

Token Zones (CHECK EVERY MESSAGE)

| Zone | Range | Emoji | What to do |
|------|-------|-------|------------|
| GREEN | 0 - 80K | 🟢 | Work freely |
| YELLOW | 80K - 130K | 🟡 | Evaluate: compress or /new |
| RED | 130K+ | 🔴 | Act NOW: compress or /new |

Zone Reporting (MANDATORY)

Every response MUST include at the end:

[emoji] Context: XXK tokens

In audio/voice mode: omit token count (don't read numbers aloud).

Zone Math (NO EXCUSES)

   50K → 🟢 GREEN
   85K → 🟡 YELLOW  (80 < 85 < 130)
  110K → 🟡 YELLOW  (80 < 110 < 130)
  130K → 🔴 RED     (130+)
  166K → 🔴 RED     (130+)
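
The zone lookup above is a pair of threshold comparisons. A minimal sketch follows, assuming the boundaries are inclusive on the upper zone (exactly 80K counts as YELLOW, exactly 130K as RED), which is how the "130K → RED" example reads. The function and constant names are illustrative, not part of the skill.

```python
YELLOW_START = 80_000   # first token count that counts as YELLOW
RED_START = 130_000     # first token count that counts as RED


def zone(tokens):
    """Return (emoji, name) for a context size in tokens."""
    if tokens >= RED_START:
        return "🔴", "RED"
    if tokens >= YELLOW_START:
        return "🟡", "YELLOW"
    return "🟢", "GREEN"


# Reproduce the worked examples from the Zone Math list.
for size in (50_000, 85_000, 110_000, 130_000, 166_000):
    emoji, name = zone(size)
    print(f"{size // 1000:>4}K → {emoji} {name}")
```

Checking the RED threshold first keeps each branch a single comparison and makes the boundaries easy to retune for other context windows.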

PART 3: TOPIC TRACKING

What Is a Topic?

A topic is a DISTINCT subject of conversation. Examples:

  • "Configure the database" = 1 topic
  • "Fix the login bug" = 1 topic
  • "Discuss marketing strategy" = 1 topic
  • "Configure database AND fix login AND discuss marketing" = 3 topics

Track Topics Actively

Maintain a mental list of active topics. For each topic track:

TOPIC: [name]
STATUS: [active | completed | paused]
STARTED: message ~#N
KEY DATA: [IDs, decisions, configs that must survive]

Detect Completed Topics

A topic is COMPLETED when:

  • The user says "ok", "listo", "va", "next" and moves on
  • The task is done and results were delivered
  • No more questions or actions remain for that topic

Recommend /new When Topics Are Done

When ALL active topics are completed:

💡 I see we've wrapped up [topic A] and [topic B].
   Shall we open a new chat for the next topic?
   I've already saved everything to memory.

When SOME topics are complete but others continue:

📋 [Topic A] ✅ closed | [Topic B] 🔄 in progress
   Continuing with [Topic B]. Context: XXK tokens.
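
The skill asks for a mental list, but the same bookkeeping can be written down as a small data structure. The sketch below is one possible shape, with hypothetical names. It assumes a paused topic does not count as completed, so /new is only recommended when every tracked topic is done.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str
    status: str = "active"      # "active" | "completed" | "paused"
    started_at: int = 1         # approximate message number
    key_data: dict = field(default_factory=dict)  # IDs, decisions, configs


def should_recommend_new(topics):
    """True when at least one topic exists and all of them are completed."""
    return bool(topics) and all(t.status == "completed" for t in topics)


topics = [
    Topic("database", status="completed", key_data={"project_id": 42}),
    Topic("login bug", started_at=12),
]
print(should_recommend_new(topics))   # one topic is still active

topics[1].status = "completed"
print(should_recommend_new(topics))   # every topic is now closed
```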

PART 4: THE DEFLATE DECISION (Compress vs /new)

When You Hit Yellow Zone (80K+), Run This:

DEFLATE ANALYSIS:
──────────────────────────────
1. Active topics: [list with status]
2. Topics completed this session: [count]
3. Critical data in chat NOT yet in MEMORY.md: [list]
4. Session type: [focused / multi-topic / chaotic]
5. Previous compressions this session: [count + last reduction %]

DECISION:
├─ Is critical data already saved to MEMORY.md?
│  ├─ YES → recommend /new (FREE, fresh context) ✅
│  └─ NO → flush to MEMORY.md first, then:
│     ├─ All topics done? → /new ✅
│     └─ Topic in progress? → /compact (PAID) ⚠️
│
└─ If /compact chosen:
   ├─ 1-2 active topics → expect good reduction (40-60%)
   ├─ 3-4 active topics → expect moderate reduction (25-40%)
   └─ 5+ active topics → STOP. Flush + /new instead
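
Read literally, the decision tree reduces to two inputs: whether critical data is already in MEMORY.md, and how many topics are still active. The function below is a direct transcription under that reading. The name `deflate_decision` and the returned strings are illustrative only.

```python
def deflate_decision(data_saved, active_topics):
    """Return (action, note) following the DEFLATE decision tree.

    data_saved:    True if all critical data is already in MEMORY.md.
    active_topics: number of topics still in progress.
    """
    if data_saved:
        return "/new", "critical data already in MEMORY.md"

    # Data is not saved yet: every remaining branch starts with a flush.
    if active_topics == 0:
        return "flush + /new", "all topics done"
    if active_topics >= 5:
        return "flush + /new", "too many active topics to compact well"
    if active_topics >= 3:
        return "flush + /compact", "expect moderate reduction (25-40%)"
    return "flush + /compact", "expect good reduction (40-60%)"


print(deflate_decision(data_saved=True, active_topics=2))
print(deflate_decision(data_saved=False, active_topics=1))
print(deflate_decision(data_saved=False, active_topics=6))
```

Note that /compact is only ever reached when data is unsaved and one to four topics remain active, which matches the Golden Rule of preferring /new.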

The Golden Rule

> /new is FREE. /compact costs tokens.
> Always prefer /new when MEMORY.md has the important stuff.
> Only /compact when you're mid-topic and can't restart.


PART 5: COMPRESSION METHODOLOGY (Cornell-MapReduce)

When /compact is the right choice, use this 5-step method:

Step 1: MAP — Separate by Topic

Identify distinct topics in the conversation.
Group messages by topic mentally.

Step 2: FILTER — Remove Noise

ELIMINATE (zero information value):
- Greetings: "hola", "qué onda", "gracias"
- Confirmations: "ok", "va", "listo", "dale"
- Failed attempts: keep ONLY the final working solution
- Repeated info: if said 3 times, keep 1
- Tool raw output: keep results, discard JSON/logs
- Emotional reactions: "LOL", "wow", "nice"
- The agent explaining its thought process
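
Some of these rules are mechanical enough to sketch in code. The example below covers only the easy cases: standalone greetings, confirmations, reactions, and exact repeats. Deciding which attempt was the "final working solution" or which part of a tool output is the result still needs judgment. The word list is taken from the examples above and the function names are hypothetical.

```python
import re

# Standalone messages with no information value (from the list above).
NOISE = {
    "hola", "qué onda", "gracias",      # greetings
    "ok", "va", "listo", "dale",        # confirmations
    "lol", "wow", "nice",               # emotional reactions
}


def is_noise(message):
    """True if the whole message is a noise phrase, ignoring case and punctuation."""
    normalized = re.sub(r"[^\w\s]", "", message.lower()).strip()
    return normalized in NOISE


def drop_noise(messages):
    """Remove noise messages and keep only the first copy of exact repeats."""
    seen = set()
    kept = []
    for message in messages:
        if is_noise(message):
            continue
        key = " ".join(message.lower().split())  # collapse case and whitespace
        if key in seen:
            continue
        seen.add(key)
        kept.append(message)
    return kept


chat = ["¡Hola!", "project_id: 42", "OK.", "project_id: 42", "LOL", "Use port 5432"]
print(drop_noise(chat))
```

Matching the whole message rather than searching for substrings avoids deleting a sentence that merely contains "ok" or "va".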

Step 3: DISTILL — Cornell Notes per Topic

For each topic, create an atomic note:

┌─ TOPIC: [keyword/name] ─────────────────┐
│ SUMMARY: [1-2 lines max]                │
│ DECISION: [what was decided, by whom]    │
│ DATA: [IDs, configs, values — EXACT]     │
│ STATUS: [done / in-progress / blocked]   │
│ NEXT: [pending action, if any]           │
└──────────────────────────────────────────┘

Step 4: PRESERVE — Lossless Data (NEVER ALTER)

The following must survive compression EXACTLY as-is:
- Numeric IDs (project_id: 42, client_id: 7)
- Dates (2026-03-20)
- Money amounts ($450.00 MXN)
- URLs and file paths
- API keys and config values
- Names (people, projects, companies)
- Code snippets that are part of solutions
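
One way to check the "exactly as-is" requirement is to extract pattern-matchable items from the original text and confirm each one appears verbatim in the compressed text. The sketch below handles only a few of the categories (ISO dates, dollar amounts, URLs, and `*_id: N` pairs). The regular expressions are illustrative assumptions; names, API keys, and code snippets would need other checks.

```python
import re

LOSSLESS_PATTERNS = [
    r"\b\d{4}-\d{2}-\d{2}\b",                               # dates such as 2026-03-20
    r"\$\d+(?:,\d{3})*(?:\.\d+)?(?: [A-Z]{3}\b)?",          # money such as $450.00 MXN
    r"https?://\S+",                                        # URLs
    r"\b\w+_id: \d+",                                       # IDs such as project_id: 42
]


def lossless_items(text):
    """Collect every substring of `text` that matches a lossless pattern."""
    items = set()
    for pattern in LOSSLESS_PATTERNS:
        items.update(re.findall(pattern, text))
    return items


def missing_after_compression(original, compressed):
    """Return lossless items from `original` that do not appear verbatim in `compressed`."""
    return sorted(item for item in lossless_items(original) if item not in compressed)


original = ("Invoice for client_id: 7 due 2026-03-20, total $450.00 MXN, "
            "see https://example.com/inv/7 for details")
compressed = "client_id: 7 | due 2026-03-20 | $450 MXN"
print(missing_after_compression(original, compressed))
```

An empty result is what the "Lossless data verified: yes" line in the log would correspond to. Here the rounded amount and the dropped URL are both reported.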

Step 5: COMBINE — Reduce Phase

Merge all Cornell notes into the compressed summary.
Order: MOST IMPORTANT FIRST (prevents "lost in the middle" effect).

Format:
SESSION CONTEXT (compressed from XXK → YYK):
├── [topic-keyword] Summary... | Decision: ... | IDs: ...
├── [topic-keyword] Summary... | Status: in-progress
├── [PRESERVED DATA] {all lossless items}
└── [PENDING] {actionable next steps}
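
The reduce phase can be sketched as a function that sorts the per-topic notes and renders them in the format above. The `priority` field is a hypothetical addition (lower means more important), since the skill does not say how importance is ranked, and the sample data is made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class CornellNote:
    topic: str
    summary: str
    priority: int            # hypothetical ranking: lower = more important
    decision: str = ""
    status: str = "done"     # "done" | "in-progress" | "blocked"
    next_step: str = ""


def combine(notes, preserved, before_k, after_k):
    """Merge notes into one summary, most important topic first."""
    lines = [f"SESSION CONTEXT (compressed from {before_k}K → {after_k}K):"]
    for note in sorted(notes, key=lambda n: n.priority):
        parts = [f"[{note.topic}] {note.summary}"]
        if note.decision:
            parts.append(f"Decision: {note.decision}")
        parts.append(f"Status: {note.status}")
        lines.append("├── " + " | ".join(parts))
    lines.append("├── [PRESERVED DATA] " + "; ".join(preserved))
    pending = [note.next_step for note in notes if note.next_step]
    lines.append("└── [PENDING] " + "; ".join(pending))
    return "\n".join(lines)


notes = [
    CornellNote("marketing", "Q3 plan drafted", priority=2,
                status="in-progress", next_step="send deck"),
    CornellNote("database", "Postgres configured", priority=1,
                decision="use port 5432"),
]
print(combine(notes, ["project_id: 42"], before_k=110, after_k=45))
```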

PART 6: COMPRESSION QUALITY CONTROL

After Every Compression, Log:

DEFLATE LOG:
- Before: [X]K tokens
- After: [Y]K tokens
- Reduction: [Z]%
- Topics preserved: [list]
- Lossless data verified: [yes/no]
- Verdict: EFFECTIVE (>40%) | MARGINAL (20-40%) | FAILED (<20%)
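
The reduction and verdict fields of the log follow directly from the before and after token counts. A minimal sketch, assuming the 20-40% band includes both of its endpoints (so exactly 40% is MARGINAL, since EFFECTIVE requires more than 40%):

```python
def reduction_percent(before_tokens, after_tokens):
    """Percentage of tokens removed by a compression."""
    return (before_tokens - after_tokens) * 100 / before_tokens


def verdict(reduction):
    """Map a reduction percentage to the log's verdict label."""
    if reduction > 40:
        return "EFFECTIVE"
    if reduction >= 20:
        return "MARGINAL"
    return "FAILED"


for before, after in [(110_000, 45_000), (100_000, 70_000), (100_000, 90_000)]:
    r = reduction_percent(before, after)
    print(f"{before // 1000}K → {after // 1000}K: {r:.0f}% {verdict(r)}")
```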

Efficiency Rules

| Reduction | Verdict | Next Action |
|-----------|---------|-------------|
| >40% | ✅ EFFECTIVE | Session healthy, continue |
| 20-40% | ⚠️ MARGINAL | Last useful compress, /new next time |
| <20% | ❌ FAILED | STOP. Flush + /new immediately |

Session Type Impact

| Type | Expected Reduction | Notes |
|------|--------------------|-------|
| Debugging / logs | 60-80% | Logs = pure noise, highly compressible |
| Data entry (repetitive) | 60-80% | Same structure repeated, compresses well |
| Single-topic design | 40-60% | Good reduction, decisions accumulate slowly |
| Configuration / setup | 40-60% | Trial-and-error is compressible |
| Multi-topic (3-4) | 25-40% | Each topic needs its own summary |
| Strategy / negotiation | 15-25% | Everything is critical context |
| Brainstorm (5+ topics) | 10-20% | Don't compress, just /new |

PART 7: MEMORY FLUSH PROTOCOL

Before ANY /new or When in Red Zone:

Write to MEMORY.md (or memory/YYYY-MM-DD.md) with tagged sections:

## [DECISION] Brief title
Date: YYYY-MM-DD
- What: [the decision]
- Why: [1-line reason]
- Who: [user or agent decided]

## [PROJECT] Project Name
Date: YYYY-MM-DD  
- Status: active | paused | done
- Key IDs: [list]
- Next: [actionable step]

## [CONFIG] What changed
Date: YYYY-MM-DD
- Setting: [name] → [new value]
- Why: [reason]

## [LEARNING] Lesson learned
Date: YYYY-MM-DD
- Problem: [what went wrong]
- Fix: [what solved it]
- Rule: [how to prevent it next time]

Memory Tags Reference

| Tag | Use for |
|-----|---------|
| [DECISION] | Business or technical decisions |
| [PROJECT] | Project status and key data |
| [CONFIG] | System/tool configuration changes |
| [LEARNING] | Mistakes and lessons learned |
| [CONTACT] | People, clients, IDs |
| [TOOL] | New tools, commands, integrations |
| [COST] | Budget, API usage, optimization results |
| [RULE] | New operational rules or protocols |
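
A tagged entry in the format above can be produced by a small helper. The sketch below is hypothetical: the function names, the appending behaviour, and the choice to reject unknown tags are assumptions, not something the skill prescribes.

```python
from datetime import date

VALID_TAGS = {"DECISION", "PROJECT", "CONFIG", "LEARNING",
              "CONTACT", "TOOL", "COST", "RULE"}


def memory_entry(tag, title, fields, on=None):
    """Render one tagged section. `fields` maps labels to values, in order."""
    if tag not in VALID_TAGS:
        raise ValueError(f"unknown memory tag: {tag}")
    on = on or date.today()
    lines = [f"## [{tag}] {title}", f"Date: {on.isoformat()}"]
    lines += [f"- {label}: {value}" for label, value in fields.items()]
    return "\n".join(lines) + "\n"


def flush(path, entries):
    """Append rendered entries to a memory file, one blank line between them."""
    with open(path, "a", encoding="utf-8") as handle:
        for entry in entries:
            handle.write(entry + "\n")


entry = memory_entry(
    "CONFIG", "Raised request timeout",
    {"Setting": "timeout → 30s", "Why": "upstream API is slow"},
    on=date(2026, 3, 20),
)
print(entry)
```

Calling `flush("MEMORY.md", [entry])` would append the section to the file. Opening in append mode means earlier entries are never overwritten.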

PART 8: PRE-/NEW CHECKLIST

When the user types /new or you recommend it:

PRE-/NEW CHECKLIST:
□ All critical data written to MEMORY.md?
□ Active topic status saved (in-progress items noted)?
□ IDs and configs preserved exactly?
□ Pending actions clearly listed?
□ User confirmed ready for /new?

Steps:

  1. Run the checklist
  2. Flush anything missing to MEMORY.md
  3. Confirm to user: "Saved [N] decisions, [M] IDs, [P] pending items. Ready for /new."
  4. User sends /new
  5. New session: read MEMORY.md, confirm data loaded
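
The checklist works as a gate: /new should only go ahead when every box is ticked. A minimal sketch of that gate, with the item wording shortened and all names hypothetical:

```python
CHECKLIST = [
    "critical data written to MEMORY.md",
    "active topic status saved",
    "IDs and configs preserved exactly",
    "pending actions listed",
    "user confirmed ready for /new",
]


def unmet(answers):
    """Return checklist items that are not explicitly marked True."""
    return [item for item in CHECKLIST if not answers.get(item, False)]


def ready_for_new(answers):
    """True only when every checklist item is satisfied."""
    return not unmet(answers)


answers = {item: True for item in CHECKLIST}
answers["pending actions listed"] = False
print(ready_for_new(answers), unmet(answers))
```

Treating a missing answer as unmet makes the gate fail safe: an item nobody checked blocks /new just like an item that failed.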

PART 9: SESSION HEALTH REPORT

In every heartbeat or status report, include:

SESSION HEALTH:
[zone emoji] Context: XXK tokens
📊 Compressions: N (last: Z% reduction)
📋 Topics: [active] active, [done] closed
💰 Estimated session cost: ~$X.XX
💡 Recommendation: [continue | compress | /new]

Cost Estimation (Simple)

Gemini Flash: ~$0.10 per 1M input tokens
Cost per message ≈ context_tokens × $0.0000001

Examples:
  50K context  → $0.005/msg
  100K context → $0.01/msg
  200K context → $0.02/msg
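
The same estimate as a function, using the approximate $0.10 per 1M input tokens quoted above as the default price. Output tokens are ignored, as in the formula above. The function names are illustrative.

```python
PRICE_PER_MILLION_INPUT = 0.10   # USD, the approximate rate quoted above


def cost_per_message(context_tokens, price_per_million=PRICE_PER_MILLION_INPUT):
    """Input cost in USD of one message sent with the given context size."""
    return context_tokens * price_per_million / 1_000_000


def session_cost(contexts, price_per_million=PRICE_PER_MILLION_INPUT):
    """Input cost in USD of a whole session, given the context size at each message."""
    return sum(cost_per_message(c, price_per_million) for c in contexts)


for size in (50_000, 100_000, 200_000):
    print(f"{size // 1000}K context → ${cost_per_message(size):.3f}/msg")
```

Passing a different `price_per_million` adapts the estimate to other models.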

PART 10: CONFIGURATION

OpenClaw Config (Recommended)

Add to your openclaw.json under agents.defaults.compaction:

{
  "compaction": {
    "mode": "default",
    "reserveTokens": 920000,
    "reserveTokensFloor": 920000,
    "keepRecentTokens": 20000,
    "memoryFlush": {
      "enabled": true,
      "softThresholdTokens": 4000
    },
    "identifierPolicy": "strict"
  }
}

This sets up the SAFETY NET: system-level auto-compaction at ~80K, assuming a 1M-token context window (1,000,000 minus the 920,000 reserved tokens leaves 80,000).

The skill handles the INTELLIGENT layer on top.

Customization

Adjust these values based on your model's context window:

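| Model Window | reserveTokens | Yellow Zone | Red Zone |
|--------------|---------------|-------------|----------|
| 128K (GPT-4) | 80000 | 48K+ | 80K+ |
| 200K (Claude) | 130000 | 70K+ | 120K+ |
| 1M (Gemini) | 920000 | 80K+ | 130K+ |
| 2M (Gemini Pro) | 1800000 | 200K+ | 500K+ |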
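
In every row of the table, the Yellow Zone threshold equals the model window minus `reserveTokens`. The sketch below reproduces that arithmetic. Reading `reserveTokens` as "auto-compaction triggers at window minus reserve" is inferred from the table, not confirmed against OpenClaw's documentation, and the window sizes are taken as round decimal values (128K = 128,000).

```python
def compaction_trigger(window_tokens, reserve_tokens):
    """Context size at which auto-compaction would start, under the assumption above."""
    return window_tokens - reserve_tokens


ROWS = [
    ("128K (GPT-4)", 128_000, 80_000),
    ("200K (Claude)", 200_000, 130_000),
    ("1M (Gemini)", 1_000_000, 920_000),
    ("2M (Gemini Pro)", 2_000_000, 1_800_000),
]

for label, window, reserve in ROWS:
    trigger = compaction_trigger(window, reserve)
    print(f"{label:<16} reserveTokens={reserve:>9,} → compaction at ~{trigger // 1000}K")
```

For a model not in the table, picking the desired Yellow Zone threshold first and then setting `reserveTokens = window - threshold` follows the same pattern.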

CREDITS

Created by @thevibestack (github.com/thevibestack)

Methodology: Cornell-MapReduce Hybrid with Lossless Data Preservation.

Research: Recursive Summarization, Knowledge Distillation, Medical Shorthand.

License: MIT-0

> 💡 If this skill saved you money, star the repo and share it.
> The AI should work for everyone, not just those with big budgets.

Version history

1 version in total

  • v1.0.0 (current)
    2026-05-03 05:01
