
Shed

Context window hygiene for long-running LLM agents. Decision rules for when and how to compress, mask, switch, or delegate context — backed by research (JetB...
Author: compass-soul. Category: developer tools. Source: clawhub. Version: v1.0.0 (1 version). Key: none required.

Overview

Shed — Context Hygiene for Agents

Shed what you don't need. Keep what matters.

Named for molting — the process of shedding an outer layer to grow. Your context window is your skin. When it gets too heavy, shed the dead weight.

Core Principle

Tool outputs account for 84% of your context growth, yet they are the lowest-value tokens you carry (Lindenbauer et al., NeurIPS 2025 DL4C workshop, measured on SWE-agent). Everything flows from this.

The Rules

After Every Tool Call

  1. Extract, don't accumulate. When a tool returns large output (file contents, search results, logs, API responses), immediately write the key facts to a file or compress into bullets. The raw output is now disposable.
  2. Ask: "Will I need this verbatim later?" Almost never. The answer you extracted is what matters, not the 500 lines that contained it.
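A minimal sketch of the extract-then-discard step, assuming a simple keyword filter stands in for whatever extraction the agent actually performs. The function names `extract_facts` and `persist_facts`, the notes path, and the stub format are all hypothetical.

```python
import tempfile
from pathlib import Path


def extract_facts(raw_output: str, keywords: list[str], max_facts: int = 5) -> list[str]:
    """Keep only non-empty lines that mention a keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    facts: list[str] = []
    for line in raw_output.splitlines():
        stripped = line.strip()
        if stripped and any(k in stripped.lower() for k in lowered):
            facts.append(stripped)
            if len(facts) == max_facts:
                break
    return facts


def persist_facts(facts: list[str], notes_path: Path, source: str) -> str:
    """Append facts to a notes file; return the stub that replaces the raw output."""
    notes_path.parent.mkdir(parents=True, exist_ok=True)
    with notes_path.open("a", encoding="utf-8") as fh:
        fh.write(f"## {source}\n")
        fh.writelines(f"- {fact}\n" for fact in facts)
    return f"[{source}: {len(facts)} fact(s) saved to {notes_path}]"


# Demo: a 401-line log collapses to one fact and a one-line stub.
raw = "\n".join(["INFO ok"] * 200 + ["ERROR db timeout at step 7"] + ["INFO ok"] * 200)
facts = extract_facts(raw, ["error"])
with tempfile.TemporaryDirectory() as tmp:
    notes = Path(tmp) / "notes" / "build.md"
    stub = persist_facts(facts, notes, "build log")
    saved = notes.read_text(encoding="utf-8")
```

Only `stub` needs to stay in context; the extracted facts live in the notes file and can be re-read on demand.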

When Context Reaches ~70%

  1. Trigger condensation. Don't wait for the platform to compact you — that's losing control of your own memory. At 70%, actively shed.
  2. Mask old tool outputs first (free, no LLM calls). Keep your reasoning and action history intact — you need your decision chain, not the raw ls -la from 20 turns ago.
  3. Summarize reasoning only as backup. If masking isn't enough, compress old reasoning turns. But this is lossy and costs an LLM call — use sparingly.
  4. Never re-summarize a summary. If you've already condensed once and context is growing again, switch context or spawn a sub-agent. Recursive summarization compounds errors.
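The masking step could look roughly like the sketch below. It assumes the caller already tracks a token count per turn; the `Turn` type, the placeholder text, the 5-token placeholder cost, and the `keep_recent` parameter are illustrative assumptions, not part of any particular framework.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Turn:
    role: str      # "reasoning", "action", or "tool_output"
    content: str
    tokens: int    # token count supplied by the caller


MASK = "[tool output masked]"
MASK_TOKENS = 5  # assumed token cost of the placeholder


def usage(turns: list[Turn], window: int) -> float:
    """Fraction of the context window currently occupied."""
    return sum(t.tokens for t in turns) / window


def mask_old_tool_outputs(
    turns: list[Turn], window: int, threshold: float = 0.70, keep_recent: int = 3
) -> list[Turn]:
    """At or above the threshold, replace old tool outputs with a placeholder.

    Reasoning and action turns are never touched, and the last
    `keep_recent` turns are left intact whatever their role.
    """
    if usage(turns, window) < threshold:
        return list(turns)
    cutoff = len(turns) - keep_recent
    return [
        replace(t, content=MASK, tokens=MASK_TOKENS)
        if i < cutoff and t.role == "tool_output" and t.tokens > MASK_TOKENS
        else t
        for i, t in enumerate(turns)
    ]


history = [
    Turn("reasoning", "plan the fix", 50),
    Turn("action", "ls -la", 10),
    Turn("tool_output", "<long directory listing>", 600),
    Turn("reasoning", "inspect the config", 50),
    Turn("action", "cat config.toml", 10),
    Turn("tool_output", "<config contents>", 100),
]
masked = mask_old_tool_outputs(history, window=1000)
```

No LLM call is involved, which is why masking comes first; summarization would only be considered if `usage(masked, window)` were still above the threshold.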

When Completing a Task

  1. Write results to file, then switch context immediately. Stale completed-task context is anti-signal for your next task. Don't carry it.
  2. Leave breadcrumbs. Before switching: write what you did, what's next, and where the files are to memory/YYYY-MM-DD.md. Future-you needs a trailhead, not a transcript.
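As a sketch of the breadcrumb step: the helper below appends a short handoff note to `memory/YYYY-MM-DD.md`. The function name `write_breadcrumb` and the note layout are assumptions; only the file naming comes from the rule above.

```python
import tempfile
from datetime import date
from pathlib import Path
from typing import Optional


def write_breadcrumb(
    memory_dir: Path,
    done: str,
    next_step: str,
    files: list[str],
    today: Optional[date] = None,
) -> Path:
    """Append a handoff note to memory_dir/YYYY-MM-DD.md and return its path."""
    today = today or date.today()
    memory_dir.mkdir(parents=True, exist_ok=True)
    path = memory_dir / f"{today.isoformat()}.md"
    entry = (
        "## Handoff\n"
        f"Done: {done}\n"
        f"Next: {next_step}\n"
        "Files:\n" + "".join(f"- {name}\n" for name in files) + "\n"
    )
    # Append rather than overwrite: several tasks may finish on the same day.
    with path.open("a", encoding="utf-8") as fh:
        fh.write(entry)
    return path


with tempfile.TemporaryDirectory() as tmp:
    crumb = write_breadcrumb(
        Path(tmp) / "memory",
        done="Fixed flaky test",
        next_step="Open PR",
        files=["tests/test_io.py"],
        today=date(2026, 3, 29),
    )
    crumb_name = crumb.name
    crumb_text = crumb.read_text(encoding="utf-8")
```

The note holds pointers, not content: three short fields are enough for a fresh context to pick up the trail.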

When Delegating Work

  1. Spawn fresh-context sub-agents for complex sub-tasks. Your context is noise for their work. Give them a clean prompt with just what they need.
  2. Don't inherit parent context into children. The AutoGen pattern: each agent gets its own token budget. Inherited bloat = inherited degradation.
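One way to keep parent context out of a child is to build its prompt from an explicit task description only. The sketch below assumes a hypothetical `SubAgentTask` type and a rough 4-characters-per-token estimate; it does not call AutoGen or any other real framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubAgentTask:
    goal: str
    inputs: dict[str, str]   # only the facts and paths the child needs
    token_budget: int        # the child's own budget, independent of the parent


def estimate_tokens(text: str) -> int:
    """Rough heuristic (about 4 characters per token); an assumption, not a tokenizer."""
    return max(1, len(text) // 4)


def build_child_prompt(task: SubAgentTask) -> str:
    """Build a clean prompt from the task alone; parent history is never an argument."""
    lines = [f"Goal: {task.goal}", "Inputs:"]
    lines += [f"- {key}: {value}" for key, value in task.inputs.items()]
    lines.append("Return: a short summary and the paths of any files written.")
    prompt = "\n".join(lines)
    if estimate_tokens(prompt) > task.token_budget:
        raise ValueError("child prompt exceeds its token budget; trim the inputs")
    return prompt


task = SubAgentTask(
    goal="Find the failing test",
    inputs={"repo": "/tmp/repo", "command": "pytest -x"},
    token_budget=200,
)
child_prompt = build_child_prompt(task)
```

Because `build_child_prompt` takes no conversation history, inherited bloat is ruled out by construction rather than by discipline.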

Architecture (For Agent Builders)

  1. Structure context into typed blocks with hard size limits. Production frameworks tend to converge here; Letta, for example, uses labeled blocks (human, persona, knowledge) with character caps. A monolithic context is unmanageable.
  2. Separate working memory (in-context) from reference memory (file/DB). Your effective context is much smaller than your window size. Models lose information in the middle of long contexts.
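  3. Place critical information at the beginning or end of context, never the middle. Models attend more strongly to the edges of a long context than to its middle; Hsieh et al. (2024, "Found in the Middle") attribute this to positional attention bias and report gains of up to 15 percentage points from calibrating it away.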
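The three architecture rules can be sketched together: labeled blocks with hard character caps, rendered so that critical blocks sit at the edges. The `Block` class and `render` function are hypothetical and are not Letta's API; the labels are borrowed from the example above.

```python
from dataclasses import dataclass


@dataclass
class Block:
    label: str
    limit: int              # hard cap in characters
    value: str = ""
    critical: bool = False  # critical blocks are rendered at the edges

    def write(self, text: str) -> None:
        """Reject oversized content instead of silently truncating it."""
        if len(text) > self.limit:
            raise ValueError(f"{self.label}: {len(text)} chars exceeds limit {self.limit}")
        self.value = text


def render(blocks: list[Block]) -> str:
    """First critical block at the start, remaining critical blocks at the end."""
    critical = [b for b in blocks if b.critical]
    middle = [b for b in blocks if not b.critical]
    ordered = critical[:1] + middle + critical[1:]
    return "\n\n".join(f"[{b.label}]\n{b.value}" for b in ordered)


persona = Block("persona", limit=50, critical=True)
persona.write("You are a careful coding agent.")
knowledge = Block("knowledge", limit=100)
knowledge.write("Repo uses pytest.")
task = Block("task", limit=50, critical=True)
task.write("Fix the failing test.")

context = render([knowledge, persona, task])
```

Anything that does not fit a block's cap belongs in reference memory (a file or database), with only a pointer kept in context.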

The Complexity Trap

Don't assume sophisticated compression (LLM summarization) beats simple approaches (observation masking). The JetBrains "Complexity Trap" paper (2025) tested both across 5 model configurations on SWE-bench Verified:

  • Simple masking halved cost relative to the raw agent
  • Masking matched or exceeded LLM summarization solve rates
  • Example: Qwen3-Coder went from 53.8% → 54.8% with masking alone

The lesson: start simple. Mask tool outputs. Only add summarization if masking alone isn't enough.

Cost Model

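Without intervention, the cost of each turn grows linearly with the turn number, so the total cost of a run scales quadratically (each turn adds tokens and reprocesses all previous tokens). Periodic condensation caps the context size, which brings total cost back to roughly linear scaling. OpenHands measured a 2x cost reduction with their condenser.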
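A back-of-the-envelope model makes the scaling concrete. It assumes every turn adds the same number of tokens, that each turn reprocesses the whole context, and that condensation simply caps the context at `max_context` tokens; prompt caching and the cost of the condensation calls themselves are ignored.

```python
def cost_without_condensation(turns: int, tokens_per_turn: int) -> int:
    """Total tokens processed when turn t re-reads all t * tokens_per_turn tokens."""
    return sum(t * tokens_per_turn for t in range(1, turns + 1))


def cost_with_condensation(turns: int, tokens_per_turn: int, max_context: int) -> int:
    """Same model, but the context never exceeds max_context tokens."""
    return sum(min(t * tokens_per_turn, max_context) for t in range(1, turns + 1))


raw_100 = cost_without_condensation(100, 1_000)          # 5,050,000
raw_200 = cost_without_condensation(200, 1_000)          # 20,100,000
capped_100 = cost_with_condensation(100, 1_000, 20_000)  # 1,810,000
capped_200 = cost_with_condensation(200, 1_000, 20_000)  # 3,810,000
```

Doubling the run length multiplies the uncapped total by about 4, while the capped total only roughly doubles.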

Quick Reference

| Situation | Action |
|---|---|
| Tool returned big output | Extract facts → file → discard raw |
| Context at ~70% | Mask old tool outputs |
| Context still growing after masking | Summarize oldest reasoning turns |
| Task complete | Write results → switch context |
| Complex sub-task needed | Spawn fresh sub-agent |
| Already condensed, still growing | Switch context or spawn |
| Critical info to preserve | Put at start or end, not middle |
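The quick reference can also be read as a small decision function. The sketch below is one possible encoding; the flag names and the order in which the rules are checked are assumptions, since the rules themselves do not specify a priority.

```python
def next_action(
    usage: float,
    *,
    big_tool_output: bool = False,
    task_complete: bool = False,
    needs_subtask: bool = False,
    already_masked: bool = False,
    already_condensed: bool = False,
    threshold: float = 0.70,
) -> str:
    """Map the current situation to one action from the quick reference."""
    if big_tool_output:
        return "extract facts to file, discard raw output"
    if task_complete:
        return "write results, switch context"
    if needs_subtask:
        return "spawn fresh sub-agent"
    if usage >= threshold:
        if already_condensed:
            # Never re-summarize a summary.
            return "switch context or spawn sub-agent"
        if already_masked:
            return "summarize oldest reasoning turns"
        return "mask old tool outputs"
    return "continue"


first = next_action(0.72)
second = next_action(0.75, already_masked=True)
third = next_action(0.80, already_masked=True, already_condensed=True)
```

The three calls trace the escalation path: mask first, summarize as a backup, then switch context rather than condense a second time.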

Sources

  • Lindenbauer et al., "The Complexity Trap" (NeurIPS 2025 DL4C): https://arxiv.org/abs/2508.21433
  • OpenHands Context Condensation (2025): https://openhands.dev/blog/openhands-context-condensensation-for-more-efficient-ai-agents
  • Letta/MemGPT Memory Blocks: https://www.letta.com/blog/memory-blocks
  • LLMLingua-2 (ACL 2024): https://aclanthology.org/2024.acl-long.91/
  • Liu et al., "Lost in the Middle" (2023): https://arxiv.org/abs/2307.03172
  • Hsieh et al., "Found in the Middle" (2024): https://arxiv.org/abs/2406.16008
  • MEM1 Dynamic State Management (2025): https://arxiv.org/abs/2506.15841

Version history

1 version total

  • v1.0.0 (current)
    2026-03-29 08:06
