
compound-eng-orchestrating-swarms

Coordinate multi-agent swarms for parallel and pipeline workflows. Use when coordinating multiple agents, running parallel reviews, building pipeline workflows, or implementing divide-and-conquer patterns with subagents.
yjkj999999

Overview

Swarm orchestration

Primitives

Agents, teams, teammates, leaders, tasks, inboxes, messages, backends — see primitives.md for definitions and the file-system layout.


Two Ways to Spawn Agents

| Aspect | Task (subagent) | Task + team_name + name (teammate) |
|---|---|---|
| Lifespan | Until task complete | Until shutdown requested |
| Communication | Return value | Inbox messages |
| Task access | None | Shared task list |
| Team membership | No | Yes |
| Coordination | One-off | Ongoing |
| Best for | Searches, analysis, focused work | Parallel work, pipelines, collaboration |

Subagent (short-lived, returns result):

Task({ subagent_type: "Explore", description: "Find auth files", prompt: "..." })

Teammate (persistent, communicates via inbox):

Teammate({ operation: "spawnTeam", team_name: "my-project" })
Task({ team_name: "my-project", name: "worker", subagent_type: "general-purpose",
       prompt: "...", run_in_background: true })

For detailed agent type descriptions, see agent-types.md.

Parallel Fan-Out (for independent work)

When dispatching multiple read-only or worktree-isolated agents whose work is independent, issue all Task calls in a SINGLE assistant message. Sequential dispatch across separate messages serializes what should run concurrently. Opus 4.7 does not parallelize by default -- state it explicitly.

// Correct: one message, multiple Task tool uses
Task({ subagent_type: "security-sentinel", ... })
Task({ subagent_type: "performance-oracle", ... })
Task({ subagent_type: "architecture-strategist", ... })

Sequential dispatch (each Task in its own message, waiting on the previous to return) is a serialization bug, not a coordination pattern. If agents truly depend on each other's output, that is a pipeline -- see Coordination Models below.

Bounded parallelism when the harness caps active subagents. Single-message fan-out (above) tells Opus to dispatch in parallel; the harness then decides how many to run concurrently. When the harness accepts the dispatch but caps active execution, queue the overflow rather than failing. Dispatch as many as the harness accepts in the first batch, treat transient capacity-related spawn errors as backpressure (any retryable error indicating the limiter rejected the dispatch — exact wording varies across harness versions and platforms; do not pattern-match on a fixed string list), and re-dispatch queued agents as active ones complete. Record an agent as failed only after a successful dispatch times out or returns an error, or when dispatch fails for a non-capacity reason (bad tool name, malformed prompt, missing permission). The fan-out is still parallel — it is just rate-capped to whatever the harness can run concurrently.
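The queueing behavior described above can be sketched as a small dispatcher. This is a minimal illustration, not the harness's actual API: `CapacityError`, `BoundedDispatcher`, and the `spawn` callable are hypothetical names, and the sketch assumes the caller has already classified a spawn error as capacity-related before raising `CapacityError`.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable


class CapacityError(Exception):
    """Hypothetical: the caller raises this when the limiter rejects a dispatch."""


@dataclass
class BoundedDispatcher:
    # Hypothetical spawn hook: starts one agent, or raises on failure.
    spawn: Callable[[str], None]
    queued: deque = field(default_factory=deque)
    active: set = field(default_factory=set)
    failed: dict = field(default_factory=dict)

    def submit(self, agents):
        """Queue every agent, then dispatch as many as the harness accepts."""
        self.queued.extend(agents)
        self._fill()

    def on_complete(self, agent):
        """Call when an active agent finishes; frees a slot for the queue."""
        self.active.discard(agent)
        self._fill()

    def _fill(self):
        while self.queued:
            agent = self.queued[0]
            try:
                self.spawn(agent)
            except CapacityError:
                # Backpressure: keep the agent queued, retry after a completion.
                return
            except Exception as exc:
                # Non-capacity dispatch failure: record it and move on.
                self.queued.popleft()
                self.failed[agent] = str(exc)
                continue
            self.queued.popleft()
            self.active.add(agent)
```

The design choice worth noting is that a capacity rejection never marks an agent as failed; only a non-capacity error does. Timeouts and errors from agents that were dispatched successfully would be recorded separately and are left out of this sketch.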


Quick Reference

For copy-paste spawn/message/task/shutdown snippets, load quick-reference.md.


Dispatch Discipline

Rules for when and how to dispatch agents. Getting these wrong wastes tokens and creates hard-to-debug failures.

When to dispatch a team vs. do it yourself:

Assess 5 signals: file count, module span, dependency chain, risk surface, parallelism potential. If 3+ fall in the "complex" column, dispatch a team. Below 3, do it yourself. When in doubt, prefer the simple path -- team overhead is only justified when parallelism provides a real speedup.
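The threshold above can be written as a short check. This is a sketch under assumptions: the signal keys and the function name are illustrative, and what counts as "complex" for each signal is left to the caller, since the thresholds are not defined here.

```python
# The five signals named above; the key spellings are an assumption.
SIGNALS = (
    "file_count",
    "module_span",
    "dependency_chain",
    "risk_surface",
    "parallelism_potential",
)


def should_dispatch_team(is_complex: dict[str, bool]) -> bool:
    """Return True when 3 or more of the 5 signals are judged complex.

    Signals missing from the mapping count as simple, which matches the
    guidance to prefer the simple path when in doubt.
    """
    unknown = set(is_complex) - set(SIGNALS)
    if unknown:
        raise ValueError(f"unknown signals: {sorted(unknown)}")
    return sum(bool(is_complex.get(name, False)) for name in SIGNALS) >= 3
```

For example, a change that spans many files and modules but has a short dependency chain, low risk, and little parallelism scores 2 and stays with a single agent.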

Task description template (for every dispatched task):

Every task prompt must include these fields to prevent integration failures:

  • Objective: what to accomplish (one sentence)
  • Owned Files: files this agent creates or modifies (exclusive -- no file assigned to multiple agents)
  • Interface Contracts: what to import from other agents' work, what to export for downstream agents
  • Acceptance Criteria: how the agent knows the task is correct
  • Out of Scope: what NOT to touch, even if it looks related

Cardinal rule: one owner per file. When files must be shared, designate a single owner; other agents send change requests, owner applies sequentially. If an upstream dependency isn't ready yet, write a stub/mock so downstream work can continue unblocked.
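One way to keep every dispatched prompt complete is to render it from a structure that refuses to emit a prompt with an empty field. The following is a sketch only: `TaskSpec` and `render` are hypothetical names, and the plain `Field: value` layout is an assumption, not a format required by the tooling.

```python
from dataclasses import dataclass


@dataclass
class TaskSpec:
    objective: str
    owned_files: list[str]
    interface_contracts: str
    acceptance_criteria: str
    out_of_scope: str

    def render(self) -> str:
        """Render the five template fields, rejecting any that are empty."""
        fields = {
            "Objective": self.objective,
            "Owned Files": ", ".join(self.owned_files),
            "Interface Contracts": self.interface_contracts,
            "Acceptance Criteria": self.acceptance_criteria,
            "Out of Scope": self.out_of_scope,
        }
        missing = [name for name, value in fields.items() if not value.strip()]
        if missing:
            raise ValueError(f"task prompt is missing: {', '.join(missing)}")
        return "\n".join(f"{name}: {value}" for name, value in fields.items())
```

A spec with no owned files fails at render time, before any agent is dispatched with an ambiguous ownership boundary.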

No parallel implementation agents (without worktrees):

Implementation agents share state via git by default, so parallel dispatch causes overwrites. Use isolation: "worktree" to give each agent its own copy. Without worktrees, dispatch implementation agents sequentially. Review, research, and analysis agents are always safe to parallelize (read-only).

Pre-dispatch file-intersection check -- operationalize the one-owner-per-file rule with a runnable safety gate before every parallel dispatch:

  1. Collect each unit's declared Owned Files / Test Paths / Modify Paths from its task spec.
  2. Build a {file → unit} map. If any file appears under more than one unit, the dispatch is unsafe. Quick check on Markdown task specs:

```bash
grep -h "^Owned Files:" -A 20 tasks/*.md | grep -v "^Owned Files:" | grep -v "^--$" | sort | uniq -d
```

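Any file path in the output is declared by more than one unit and needs resolution. Because the check scans a 20-line window after each "Owned Files:" heading, it can also print repeated non-path lines (blank lines, other field labels); ignore those.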

  3. On overlap: either downgrade to serial (log the overlap and the reason), or assign worktree isolation (isolation: "worktree" per agent), or rewrite unit boundaries so files become exclusive.
  4. Even with no declared overlap, include this constraint verbatim in every parallel-dispatch prompt: "Do not run git add, git commit, or the project's test suite while other parallel agents are active -- you'd race on the git index or thrash the test cache. Stage changes for the orchestrator to commit after integration."

The intersection check catches silent conflicts the controller misses at plan time; the dispatch-prompt constraint catches them when a unit's file list was incomplete.
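When the task specs are already parsed, the {file → unit} map from the steps above can be built directly instead of grepping Markdown. This is a minimal sketch; `find_overlaps` and the shape of its input (unit name mapped to its declared file list) are assumptions made for illustration.

```python
from collections import defaultdict


def find_overlaps(owned: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each file claimed by more than one unit to the units claiming it."""
    owners: dict[str, set[str]] = defaultdict(set)
    for unit, files in owned.items():
        for path in files:
            owners[path].add(unit)
    return {
        path: sorted(units)
        for path, units in sorted(owners.items())
        if len(units) > 1
    }
```

An empty result means the declared file lists are disjoint. It says nothing about files a unit touches without declaring them, which is why the verbatim prompt constraint is still needed.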

Preset team compositions: Start from a named preset before designing a custom team. See team-compositions.md for the full table (Review / Debug / Feature / Fullstack / Migration / Security / Research), the cardinal subagent_type rule (read-only agents cannot implement), and custom-team guidelines. Use the smallest preset that covers all required dimensions — overlap between reviewers is a sizing signal to redefine focus areas, not add more agents.

Model selection by task complexity:

| Task shape | Model |
|---|---|
| 1-2 files, clear spec, mechanical | model: "haiku" |
| Multi-file integration, standard complexity | Default model |
| Architecture decisions, ambiguous scope, review | model: "opus" |

Handoff protocol -- structured agent-to-agent transfers:

When passing work between agents (leader→implementer, implementer→reviewer, reviewer→leader), include:

  1. Context: what was done, relevant files, constraints discovered
  2. Deliverable: specific output expected from the receiving agent
  3. Acceptance criteria: how the receiving agent knows the work is correct

The controller reads all tasks from the plan upfront and provides full task text directly to subagents. Never make subagents read plan files themselves -- they waste tokens navigating, may read different versions, and inherit unclear context. Paste the task content into the prompt. See handoff-templates.md for QA FAIL and Escalation Report formats.

Standardize implementer status signals:

Include the four statuses defined in ia-verification-before-completion (DONE, DONE_WITH_CONCERNS, BLOCKED, NEEDS_CONTEXT) in every teammate prompt so they know the reporting format. Expect teammates to report one. BLOCKED responses get further triage via the decision tree below.

BLOCKED triage decision tree -- when a teammate reports BLOCKED, classify the root cause before acting. Never retry the same prompt on the same model without changing a variable.

| Root cause | Signal | Response |
|---|---|---|
| Missing context | Agent asked for a file, spec, or decision it needed | Provide the missing context, re-dispatch same agent |
| Reasoning ceiling | Agent attempted, got stuck on a subtlety it cannot resolve | Escalate model (haiku → sonnet → opus) and re-dispatch |
| Task too large | Agent made partial progress but hit token/complexity limits | Split into smaller tasks with explicit interface contracts |
| Spec wrong | Agent surfaces a contradiction in the plan or a missing requirement | Escalate to the user -- do not re-dispatch |

Never ignore an escalation. Never force the same agent to retry without changing at least one variable (context, model, or task scope).
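The decision tree can be expressed as a function that always names the variable it changes. This is a sketch with hypothetical names (`RootCause`, `triage_blocked`). One branch is an assumption not stated above: when the reasoning ceiling is hit on the top model of the ladder, the sketch escalates to the user, because no further model change is available.

```python
from enum import Enum


class RootCause(Enum):
    MISSING_CONTEXT = "missing_context"
    REASONING_CEILING = "reasoning_ceiling"
    TASK_TOO_LARGE = "task_too_large"
    SPEC_WRONG = "spec_wrong"


MODEL_LADDER = ("haiku", "sonnet", "opus")


def triage_blocked(cause: RootCause, model: str) -> dict:
    """Pick a response to BLOCKED that changes context, model, or scope."""
    if cause is RootCause.MISSING_CONTEXT:
        return {"action": "redispatch", "changed": "context", "model": model}
    if cause is RootCause.REASONING_CEILING:
        if model not in MODEL_LADDER:
            raise ValueError(f"model not on the escalation ladder: {model}")
        rung = MODEL_LADDER.index(model)
        if rung + 1 < len(MODEL_LADDER):
            return {"action": "redispatch", "changed": "model",
                    "model": MODEL_LADDER[rung + 1]}
        # Assumption: nothing left to escalate to, so hand the decision to the user.
        return {"action": "escalate_to_user", "changed": None, "model": model}
    if cause is RootCause.TASK_TOO_LARGE:
        return {"action": "split_task", "changed": "task scope", "model": model}
    # SPEC_WRONG: never re-dispatch against a contradictory plan.
    return {"action": "escalate_to_user", "changed": None, "model": model}
```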

Two-stage review gate on subagent outputs:

Verify spec compliance first: does the output match what was requested? Only then evaluate quality. A beautifully written solution to the wrong problem is still wrong. Structure review as two explicit passes -- pass 1 rejects on spec mismatch without reading further, pass 2 assesses correctness and quality on spec-compliant outputs.
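The two passes can be kept separate in code so that quality checks never run on an output that misses the spec. This is an illustrative sketch: `two_stage_review`, the verdict strings, and the representation of checks as named predicates over the output text are all assumptions.

```python
from typing import Callable

Check = Callable[[str], bool]


def two_stage_review(
    output: str,
    spec_checks: dict[str, Check],
    quality_checks: dict[str, Check],
) -> tuple[str, list[str]]:
    """Pass 1 rejects on spec mismatch; pass 2 runs only on compliant output."""
    spec_failures = [name for name, check in spec_checks.items() if not check(output)]
    if spec_failures:
        # Stop here: quality is not evaluated for the wrong deliverable.
        return ("REJECTED_SPEC", spec_failures)
    quality_failures = [
        name for name, check in quality_checks.items() if not check(output)
    ]
    if quality_failures:
        return ("FAILED_QUALITY", quality_failures)
    return ("PASSED", [])
```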

QA retry loop:

Max 3 attempts per task. After each QA failure, pass structured feedback to the implementer using the QA FAIL template. After 3 failures, mark the task as blocked, continue the pipeline (don't halt everything), and let final integration catch remaining issues. Counter resets when advancing to the next task.
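A minimal sketch of this loop follows. The `implement` and `qa` callables are hypothetical stand-ins for dispatching the implementer and the reviewer; `qa` is assumed to return a pass flag plus feedback already formatted with the QA FAIL template.

```python
MAX_ATTEMPTS = 3


def run_with_qa(tasks, implement, qa):
    """Run each task through implement and QA, blocking it after 3 failures."""
    results = {}
    for task in tasks:
        feedback = None          # first attempt has no QA feedback
        status = "blocked"       # stays blocked unless an attempt passes
        # The attempt counter restarts for every task.
        for _attempt in range(MAX_ATTEMPTS):
            output = implement(task, feedback)
            passed, feedback = qa(task, output)
            if passed:
                status = "completed"
                break
        # A blocked task does not halt the pipeline; later tasks still run.
        results[task] = status
    return results
```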


Integration Rules

Post-integration verification -- after all agents return: check overlapping file edits, review for conflicting approaches, run full test suite.

Spawned-session behavior -- when a skill runs inside an orchestrated pipeline (as a subagent, not user-invoked), suppress interactive prompts: do not use AskUserQuestion, auto-choose the conservative/safe default, skip upgrade checks and telemetry. Focus on completing the task and reporting results via prose output. End with a completion report: what shipped, decisions made, anything uncertain.

Decision presentation -- never silently drop options. When the orchestrator surfaces a user-facing choice (team composition, an escalation path, a spec-wrong fork) via AskUserQuestion and the choice carries more than four viable options -- the tool's per-question cap -- split it into sequential rounds (D1.1, D1.2, ...) rather than truncating to the first four. Truncation hides viable choices the user never sees and silently narrows their decision space. Surface any cross-option dependency inline in the round that introduces it. (In spawned sessions, the rule above takes precedence: don't ask at all -- auto-pick the safe default.)
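Splitting an oversized choice into rounds can be sketched as simple chunking. The function name and return shape are hypothetical, and how a selection from one round carries into the next is not specified above, so the sketch only produces the labelled rounds.

```python
MAX_OPTIONS_PER_QUESTION = 4  # the per-question cap described above


def split_into_rounds(decision_id: str, options: list[str]) -> list[tuple[str, list[str]]]:
    """Chunk options into labelled rounds so none is silently dropped."""
    if len(options) <= MAX_OPTIONS_PER_QUESTION:
        return [(decision_id, list(options))]
    rounds = []
    for start in range(0, len(options), MAX_OPTIONS_PER_QUESTION):
        label = f"{decision_id}.{len(rounds) + 1}"
        rounds.append((label, options[start:start + MAX_OPTIONS_PER_QUESTION]))
    return rounds
```

With six options under decision `D1`, this yields round `D1.1` with the first four and round `D1.2` with the remaining two.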


Context Carry-Forward

After each turn, five strategies exist for moving context forward: Continue, Rewind, /compact, Subagent, /clear+brief. Choose deliberately — the default "Continue" is rarely best, and Rewind is strictly better than "correcting in place" after a failed attempt. See context-carry-forward.md for the full decision table and rationale.

Coordination Models

Two approaches to multi-agent coordination exist. Choose based on the work pattern:

| Aspect | Stateless (copy-paste outputs) | Stateful (file ownership + dependencies) |
|---|---|---|
| How agents share state | Leader copies full outputs between prompts | Agents read/write shared task files, claim ownership |
| Best for | Short pipelines, 2-3 agents, sequential handoffs | Parallel work, 4+ agents, complex dependency graphs |
| Failure mode | Context grows linearly with agent count | Concurrent modification conflicts |
| Mitigation | Summarize before passing (keep essentials, drop navigation) | Use worktrees or exclusive file ownership per agent |

For most work, start with stateless handoffs. Graduate to stateful coordination only when parallelism provides a real speedup and you have worktree isolation to prevent file conflicts.


Dispatch Anti-Patterns

Before designing any multi-agent workflow, check it against the four named failure modes in dispatch-anti-patterns.md: router persona, persona calls persona, sequential paraphraser, deep persona trees. Rule of thumb: if the proposed swarm has more coordinator roles than worker roles, collapse it.

Anti-Sycophancy and Resilience

When dispatching judge panels, running parallel reviewers, or iterating on subjective evaluations, load anti-sycophancy.md — cold-start isolation, fresh instances per round, label randomization, convergence detection.

When designing multi-agent workflows that must survive partial failure, load resilience-patterns.md — cascade prevention (timeouts, circuit breakers, bulkheads), failure classification (retry vs reassign vs escalate), mid-pipeline compensation for irreversible side effects, post-failure synthesis of partial results.

Verify

  • All tasks in terminal state (completed or blocked)
  • No orphaned teammates (git worktree list shows no stale entries)
  • Overlapping file edits reviewed and merged
  • Full test suite passes post-integration

References

| Document | When to load | What it covers |
|---|---|---|
| team-compositions.md | Sizing a team or choosing a preset | 7 preset compositions, subagent_type cardinal rule, custom-team guidelines |
| agent-types.md | Choosing which agent to spawn | Built-in and plugin agent types with examples |
| teammate-operations.md | Using TeammateTool for persistent agents | All 13 operations (spawnTeam, write, broadcast, requestShutdown, etc.) |
| task-system.md | Managing work items and dependencies | TaskCreate, TaskList, TaskGet, TaskUpdate, file structure |
| message-formats.md | Sending structured messages between agents | All JSON message examples (regular, shutdown, idle, plan approval) |
| orchestration-patterns.md | Designing a multi-agent workflow | 6 patterns + 3 complete workflow examples |
| spawn-backends.md | Troubleshooting agent spawn issues | Backend comparison, auto-detection, in-process/tmux/iterm2 |
| environment-config.md | Configuring team environment | Environment variables and team config structure |
| handoff-templates.md | Passing work between agents | QA FAIL and Escalation Report formats |
| context-carry-forward.md | Long sessions with orchestrated subagents | Continue / Rewind / compact / Subagent / clear+brief decision table |
| anti-sycophancy.md | Judge panels, parallel reviewers, subjective evals | Cold-start isolation, fresh instances per round, label randomization, convergence detection |
| resilience-patterns.md | Designing workflows that survive partial failure | Cascade prevention, failure classification, mid-pipeline compensation, post-failure synthesis |

Version history

1 version in total

  • v1.0.0 Migrated from ClawHub (current)
    2026-06-07 12:45 Safe Safe

