
LangGraph Architecture

Guides architectural decisions for LangGraph applications. Use when deciding between LangGraph vs alternatives, choosing state management strategies, designi...
Author: anderskev · Source: clawhub · Version: v1.0.1 (2 versions) · Key: not required

LangGraph Architecture Decisions

When to Use LangGraph

Use LangGraph When You Need:

  • Stateful conversations - Multi-turn interactions with memory
  • Human-in-the-loop - Approval gates, corrections, interventions
  • Complex control flow - Loops, branches, conditional routing
  • Multi-agent coordination - Multiple LLMs working together
  • Persistence - Resume from checkpoints, time travel debugging
  • Streaming - Real-time token streaming, progress updates
  • Reliability - Retries, error recovery, durability guarantees

Consider Alternatives When:

| Scenario | Alternative | Why |
|---|---|---|
| Single LLM call | Direct API call | Overhead not justified |
| Linear pipeline | LangChain LCEL | Simpler abstraction |
| Stateless tool use | Function calling | No persistence needed |
| Simple RAG | LangChain retrievers | Built-in patterns |
| Batch processing | Async tasks | Different execution model |

State Schema Decisions

TypedDict vs Pydantic

| TypedDict | Pydantic |
|---|---|
| Lightweight, faster | Runtime validation |
| Dict-like access | Attribute access |
| No validation overhead | Type coercion |
| Simpler serialization | Complex nested models |

Recommendation: Use TypedDict for most cases. Use Pydantic when you need validation or complex nested structures.

Reducer Selection

| Use Case | Reducer | Example |
|---|---|---|
| Chat messages | add_messages | Handles IDs, RemoveMessage |
| Simple append | operator.add | Annotated[list, operator.add] |
| Keep latest | None (LastValue) | field: str |
| Custom merge | Lambda | Annotated[list, lambda a, b: ...] |
| Overwrite list | Overwrite | Bypass reducer |
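
To make the difference between a reducer-backed field and a plain "keep latest" field concrete, here is a minimal standard-library sketch. It only emulates the merge behavior described in the table; `apply_update` is a hypothetical helper written for this illustration, not part of LangGraph, and the real library may differ in details.

```python
import operator
from typing import Annotated, TypedDict, get_type_hints


class State(TypedDict):
    log: Annotated[list, operator.add]  # reducer: new items are appended
    status: str                         # no reducer: the latest value wins


def apply_update(schema, state: dict, update: dict) -> dict:
    """Merge `update` into `state`, honoring per-field reducers (illustration only)."""
    hints = get_type_hints(schema, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        metadata = getattr(hints[key], "__metadata__", ())
        reducer = metadata[0] if metadata and callable(metadata[0]) else None
        if reducer is not None and key in merged:
            merged[key] = reducer(merged[key], value)  # combine old and new
        else:
            merged[key] = value  # overwrite with the latest value
    return merged


state = {"log": ["start"], "status": "new"}
state = apply_update(State, state, {"log": ["step 1"], "status": "running"})
```

After the update, `log` holds both entries while `status` holds only the most recent value, which is the practical distinction to keep in mind when choosing a reducer for each field.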

State Size Considerations

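from typing import Annotated
from typing_extensions import TypedDict
from langgraph.graph.message import add_messages
from langgraph.store.base import BaseStore

# SMALL STATE (< 1MB) - Put in state
class SmallState(TypedDict):
    messages: Annotated[list, add_messages]
    context: str

# LARGE DATA - Use Store
class State(TypedDict):
    messages: Annotated[list, add_messages]
    document_ref: str  # Reference to store

NAMESPACE = ("documents",)  # Illustrative namespace tuple; choose your own

def node(state: State, *, store: BaseStore):
    item = store.get(NAMESPACE, state["document_ref"])  # Item, or None if missing
    doc = item.value if item else None
    # Process without bloating checkpoints
    return {}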
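
The effect on checkpoint size can be sketched without LangGraph at all. In the example below, `document_store` is a plain dictionary standing in for a Store, `put_document` is a hypothetical helper, and JSON length is used as a rough proxy for the serialized checkpoint payload; none of these reflect how LangGraph actually serializes state.

```python
import json

document_store = {}  # hypothetical stand-in for a Store


def put_document(doc_id: str, text: str) -> str:
    """Save the document outside the state and return its reference."""
    document_store[doc_id] = text
    return doc_id


large_doc = "x" * 2_000_000  # roughly 2 MB of text

# Inline: the whole document would be written with every checkpoint.
inline_state = {"messages": [], "context": large_doc}

# By reference: only a short key travels with the state.
ref_state = {"messages": [], "document_ref": put_document("doc-1", large_doc)}

inline_size = len(json.dumps(inline_state))
ref_size = len(json.dumps(ref_state))
```

Because a checkpoint is written at every step, the inline variant pays the 2 MB cost repeatedly, while the reference variant stays at a few dozen bytes.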

Graph Structure Decisions

Single Graph vs Subgraphs

Single Graph when:

  • All nodes share the same state schema
  • Simple linear or branching flow
  • < 10 nodes

Subgraphs when:

  • Different state schemas needed
  • Reusable components across graphs
  • Team separation of concerns
  • Complex hierarchical workflows

Conditional Edges vs Command

| Conditional Edges | Command |
|---|---|
| Routing based on state | Routing + state update |
| Separate router function | Decision in node |
| Clearer visualization | More flexible |
| Standard patterns | Dynamic destinations |

from typing import Literal
from langgraph.types import Command

# Conditional Edge - when routing is the focus
def router(state) -> Literal["a", "b"]:
    # "condition" is a placeholder state field
    return "a" if state["condition"] else "b"
builder.add_conditional_edges("node", router)

# Command - when combining routing with updates
def node(state) -> Command[Literal["next"]]:
    return Command(goto="next", update={"step": state["step"] + 1})
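
The trade-off between the two styles can also be seen in plain Python. The sketch below does not use LangGraph: `Goto` is a hypothetical stand-in for a Command, and the node and field names are invented for illustration. It shows the same decision expressed once as a node plus a separate router, and once as a single node that returns both the update and the destination.

```python
from dataclasses import dataclass, field


@dataclass
class Goto:
    """Hypothetical stand-in for a Command: a destination plus a state update."""
    goto: str
    update: dict = field(default_factory=dict)


def classify(state: dict) -> dict:
    # Plain node: returns only a state update; routing lives in `router`.
    return {"kind": "long" if len(state["text"]) > 10 else "short"}


def router(state: dict) -> str:
    # Separate routing function, as with a conditional edge.
    return "summarize" if state["kind"] == "long" else "echo"


def classify_and_route(state: dict) -> Goto:
    # Command-style node: update and destination are decided together.
    kind = "long" if len(state["text"]) > 10 else "short"
    return Goto(goto="summarize" if kind == "long" else "echo", update={"kind": kind})


state = {"text": "a fairly long input"}

# Style 1: run the node, merge its update, then ask the router.
edge_state = {**state, **classify(state)}
edge_target = router(edge_state)

# Style 2: one call yields both the update and the destination.
command = classify_and_route(state)
command_state = {**state, **command.update}
```

Both styles reach the same destination with the same state; the difference is whether the routing rule is visible as its own function or folded into the node.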

Static vs Dynamic Routing

Static Edges (add_edge):

  • Fixed flow known at build time
  • Clearer graph visualization
  • Easier to reason about

Dynamic Routing (add_conditional_edges, Command, Send):

  • Runtime decisions based on state
  • Agent-driven navigation
  • Fan-out patterns

Persistence Strategy

Checkpointer Selection

| Checkpointer | Use Case | Characteristics |
|---|---|---|
| InMemorySaver | Testing only | Lost on restart |
| SqliteSaver | Development | Single file, local |
| PostgresSaver | Production | Scalable, concurrent |
| Custom | Special needs | Implement BaseCheckpointSaver |

Checkpointing Scope

# Full persistence (default)
graph = builder.compile(checkpointer=checkpointer)

# Subgraph options (pick one; a keyword argument cannot be repeated in one call)
subgraph = sub_builder.compile(checkpointer=None)   # Inherit from parent
subgraph = sub_builder.compile(checkpointer=True)   # Independent checkpointing
subgraph = sub_builder.compile(checkpointer=False)  # No checkpointing (runs atomically)

When to Disable Checkpointing

  • Short-lived subgraphs that should be atomic
  • Subgraphs with incompatible state schemas
  • Performance-critical paths without need for resume

Multi-Agent Architecture

Supervisor Pattern

Best for:

  • Clear hierarchy
  • Centralized decision making
  • Different agent specializations
          ┌─────────────┐
          │  Supervisor │
          └──────┬──────┘
    ┌────────┬───┴───┬────────┐
    ▼        ▼       ▼        ▼
┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐
│Agent1│ │Agent2│ │Agent3│ │Agent4│
└──────┘ └──────┘ └──────┘ └──────┘

Peer-to-Peer Pattern

Best for:

  • Collaborative agents
  • No clear hierarchy
  • Flexible communication
┌──────┐     ┌──────┐
│Agent1│◄───►│Agent2│
└──┬───┘     └───┬──┘
   │             │
   ▼             ▼
┌──────┐     ┌──────┐
│Agent3│◄───►│Agent4│
└──────┘     └──────┘

Handoff Pattern

Best for:

  • Sequential specialization
  • Clear stage transitions
  • Different capabilities per stage
┌────────┐    ┌────────┐    ┌────────┐
│Research│───►│Planning│───►│Execute │
└────────┘    └────────┘    └────────┘
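
As a rough illustration of the handoff idea, the following plain-Python sketch has each stage return the name of the next stage together with its state update. The stage functions, the `STAGES` table, and `run_handoff` are all hypothetical names for this example; in a real LangGraph application each stage would be a node or agent and the handoff would be expressed with edges or Command.

```python
def research(state: dict):
    return "planning", {"notes": f"facts about {state['task']}"}


def planning(state: dict):
    return "execute", {"plan": f"plan based on {state['notes']}"}


def execute(state: dict):
    return None, {"result": f"done: {state['plan']}"}  # None ends the chain


STAGES = {"research": research, "planning": planning, "execute": execute}


def run_handoff(task: str, start: str = "research"):
    """Run stages in sequence; each stage names its successor."""
    state, current, visited = {"task": task}, start, []
    while current is not None:
        visited.append(current)
        current, update = STAGES[current](state)
        state.update(update)
    return state, visited


final_state, visited = run_handoff("pricing")
```

Each stage only needs to know its immediate successor, which is what keeps stage transitions explicit in this pattern.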

Streaming Strategy

Stream Mode Selection

| Mode | Use Case | Data |
|---|---|---|
| updates | UI updates | Node outputs only |
| values | State inspection | Full state each step |
| messages | Chat UX | LLM tokens |
| custom | Progress/logs | Your data via StreamWriter |
| debug | Debugging | Tasks + checkpoints |
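
The difference between the updates and values rows is easiest to see side by side. The sketch below is a plain-Python emulation, not LangGraph's streaming implementation; `run_steps` and the two toy nodes are hypothetical and only reproduce the shape of data each mode is described as carrying.

```python
def run_steps(initial: dict, nodes, stream_mode: str = "updates"):
    """Yield one chunk per step in the shape the table describes (emulation only)."""
    state = dict(initial)
    for name, fn in nodes:
        update = fn(state)
        state.update(update)
        if stream_mode == "updates":
            yield {name: update}  # node output only
        elif stream_mode == "values":
            yield dict(state)     # full state after the step
        else:
            raise ValueError(f"mode not covered by this sketch: {stream_mode}")


nodes = [
    ("fetch", lambda s: {"doc": "raw text"}),
    ("summarize", lambda s: {"summary": s["doc"].upper()}),
]

updates = list(run_steps({"query": "q"}, nodes, "updates"))
values = list(run_steps({"query": "q"}, nodes, "values"))
```

The updates chunks stay small and name the node that produced them, which suits incremental UI updates; the values chunks grow with the state, which suits inspection and debugging.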

Subgraph Streaming

# Stream from subgraphs
async for chunk in graph.astream(
    input,
    stream_mode="updates",
    subgraphs=True  # Include subgraph events
):
    namespace, data = chunk  # namespace indicates depth

Human-in-the-Loop Design

Interrupt Placement

| Strategy | Use Case |
|---|---|
| interrupt_before | Approval before action |
| interrupt_after | Review after completion |
| interrupt() in node | Dynamic, contextual pauses |

Resume Patterns

# Simple resume (same thread)
graph.invoke(None, config)

# Resume with value
graph.invoke(Command(resume="approved"), config)

# Resume specific interrupt
graph.invoke(Command(resume={interrupt_id: value}), config)

# Modify state and resume
graph.update_state(config, {"field": "new_value"})
graph.invoke(None, config)

Gates (sequenced)

Complete in order before treating a LangGraph design as locked in. Each step has an objective pass condition (artifact or explicit “none”), not an honor-system “we considered it.”

  1. Alternatives. Pass: For the workload, either (a) at least one row from Consider Alternatives When was evaluated and rejected with a one-line reason, or (b) the use case clearly matches Use LangGraph When You Need and does not fit a “consider alternative” row.
  2. State contract. Pass: Every state field has an assigned reducer (or default/LastValue) documented in the same place as the schema; large payloads are references or Store-backed, not inlined blobs (see State Size Considerations).
  3. Checkpointer. Pass: The saver type is chosen for the target environment per Checkpointer Selection (e.g. production is not InMemorySaver unless explicitly test-only).
  4. Loops and flaky nodes. Pass: recursion_limit (or equivalent) is set for any graph that can cycle; per-node RetryPolicy or a documented “no retries” choice exists for external calls (see Retry Configuration).

Error Handling Strategy

Retry Configuration

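from langgraph.types import RetryPolicy

# Per-node retry
# APIError is a placeholder for your client library's exception class
RetryPolicy(
    initial_interval=0.5,
    backoff_factor=2.0,
    max_interval=60.0,
    max_attempts=3,
    retry_on=lambda e: isinstance(e, (APIError, TimeoutError))
)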

# Multiple policies (first match wins)
builder.add_node("node", fn, retry_policy=[
    RetryPolicy(retry_on=RateLimitError, max_attempts=5),
    RetryPolicy(retry_on=Exception, max_attempts=2),
])
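
To get a feel for what the interval parameters imply, the waits between attempts can be worked out by hand. The sketch below assumes plain capped exponential backoff with no jitter; `backoff_schedule` is a hypothetical helper, and the library's actual timing (for example any added jitter) may differ.

```python
def backoff_schedule(initial_interval: float, backoff_factor: float,
                     max_interval: float, max_attempts: int) -> list:
    """Waits in seconds between attempts, assuming capped exponential backoff."""
    waits = []
    for retry in range(max_attempts - 1):  # no wait after the final attempt
        waits.append(min(max_interval, initial_interval * backoff_factor ** retry))
    return waits


# Parameters from the per-node policy above: three attempts, two waits.
schedule = backoff_schedule(0.5, 2.0, 60.0, 3)

# With more attempts the cap starts to matter.
long_schedule = backoff_schedule(0.5, 2.0, 60.0, 10)
```

Under these assumptions, three attempts add at most 1.5 seconds of waiting, so raising max_attempts is what actually lengthens worst-case latency, up to the max_interval cap per wait.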

Fallback Patterns

from typing import Literal
from langgraph.graph import END

def node_with_fallback(state):
    try:
        return primary_operation(state)
    except PrimaryError:
        return fallback_operation(state)

# Or use conditional edges for complex fallback routing
def route_on_error(state) -> Literal["retry", "fallback", "__end__"]:
    if state.get("error") and state.get("attempts", 0) < 3:
        return "retry"
    elif state.get("error"):
        return "fallback"
    return END
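
Because a routing function is just a function of state, it can be checked against plain dictionaries before it is wired into a graph. The sketch below restates the router with a local `END` constant (an assumption that it matches the string value of langgraph.graph.END) and runs it over three hypothetical states.

```python
from typing import Literal

END = "__end__"  # assumption: same string value as langgraph.graph.END


def route_on_error(state: dict) -> Literal["retry", "fallback", "__end__"]:
    if state.get("error") and state.get("attempts", 0) < 3:
        return "retry"
    if state.get("error"):
        return "fallback"
    return END


# Hypothetical states covering each branch.
cases = {
    "first failure": {"error": "timeout", "attempts": 1},
    "exhausted": {"error": "timeout", "attempts": 3},
    "success": {"error": None, "attempts": 2},
}
decisions = {name: route_on_error(state) for name, state in cases.items()}
```

Checking each branch this way catches off-by-one mistakes in the attempt threshold without running any LLM calls.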

Scaling Considerations

Horizontal Scaling

  • Use PostgresSaver for shared state
  • Consider LangGraph Platform for managed infrastructure
  • Use stores for large data outside checkpoints

Performance Optimization

  1. Minimize state size - Use references for large data
  2. Parallel nodes - Fan out when possible
  3. Cache expensive operations - Use CachePolicy
  4. Async everywhere - Use ainvoke, astream

Resource Limits

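from typing_extensions import TypedDict
from langgraph.managed import RemainingSteps

# Set recursion limit
config = {"recursion_limit": 50}
graph.invoke(input, config)

# Track remaining steps in state
class State(TypedDict):
    remaining_steps: RemainingSteps

# Router: wrap up while there is still budget left to finish cleanly
def check_budget(state):
    if state["remaining_steps"] < 5:
        return "wrap_up"
    return "continue"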
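
The interaction between the recursion limit and the budget check can be approximated with a small simulation. This sketch assumes that the remaining-step count simply decreases by one per step from the configured limit; how LangGraph actually computes the managed value is not modeled, and `simulate` is a hypothetical helper.

```python
def check_budget(state: dict) -> str:
    # Same rule as above: wrap up once fewer than 5 steps remain.
    return "wrap_up" if state["remaining_steps"] < 5 else "continue"


def simulate(recursion_limit: int) -> int:
    """Count loop iterations before the router asks to wrap up (emulation only)."""
    iterations = 0
    for steps_used in range(recursion_limit):
        state = {"remaining_steps": recursion_limit - steps_used}
        if check_budget(state) == "wrap_up":
            break
        iterations += 1
    return iterations


iterations = simulate(50)
```

Under this simplified model, a limit of 50 allows 46 loop iterations before the wrap-up branch is taken, leaving a few steps of headroom so the graph can finish rather than hit the limit mid-task.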

Decision Checklist

After Gates (sequenced), before implementing:

  1. [ ] Is LangGraph the right tool? (vs simpler alternatives)
  2. [ ] State schema defined with appropriate reducers?
  3. [ ] Persistence strategy chosen? (dev vs prod checkpointer)
  4. [ ] Streaming needs identified?
  5. [ ] Human-in-the-loop points defined?
  6. [ ] Error handling and retry strategy?
  7. [ ] Multi-agent coordination pattern? (if applicable)
  8. [ ] Resource limits configured?

Version History

2 versions in total

  • v1.0.1 (current)
    2026-05-03 06:35 · Security scan: safe
  • v1.0.0
    2026-03-30 23:27
