Distributed Feasibility Checker

Evaluate whether a system should adopt distributed architecture by systematically checking against the 8 Fallacies of Distributed Computing and assessing tea...
quochungto
Uncategorized clawhub v1.0.0 100000 Key: not required

Overview

Distributed Architecture Feasibility Checker

When to Use

Someone is proposing or considering distributed architecture (microservices, service-based, event-driven) and you need to evaluate whether it's actually justified and feasible. Typical situations:

  • "Let's move to microservices" — but has anyone checked if the team is ready?
  • Growing pains with a monolith — but is distribution the right solution?
  • CTO or tech lead pushing for distribution based on industry hype
  • Pre-requisite sanity check before architecture-style-selector
  • After quantum analysis: you've identified multiple quanta, now check whether the team can actually operate a distributed system

Before starting, verify:

  • Is there a genuine architectural problem to solve? (If the monolith is working fine, distribution adds cost without benefit)
  • Has quantum analysis been done? If not, distribution may not even be needed (use architecture-quantum-analyzer first)

Context & Input Gathering

Input Sufficiency Check

This skill critically depends on TEAM context, not just technical requirements. A system that technically needs distribution may still fail if the team can't operate it.

Required Context (must have — ask if missing)

  • System description: What does the system do? What's the current architecture?

→ Check prompt for: system purpose, current state (monolith/distributed/greenfield)

→ If missing, ask: "What does your system do, and what's your current architecture? (monolith, some services, greenfield?)"

  • Team size and distributed experience: How many developers? Have they operated distributed systems before?

→ Check prompt for: team size, experience mentions, technology familiarity

→ If missing, ask: "How many developers do you have, and has your team operated distributed systems (microservices, message queues, service mesh) before?"

WHY this is critical: Team experience is the #1 predictor of distributed architecture success. A team that's never run microservices will struggle regardless of technical merit.

  • Motivation for considering distribution: Why are they thinking about this?

→ Check prompt for: scaling issues, deployment pain, team autonomy needs, hype

→ If missing, ask: "What specific problem is driving you toward distributed architecture? (a) scaling bottleneck, (b) deployment takes too long, (c) teams stepping on each other, (d) someone said we should, (e) other?"

Observable Context (gather from environment)

  • Current infrastructure: What deployment and monitoring tools exist?

→ Look for: docker-compose, k8s manifests, CI/CD configs, monitoring configs

→ Reveals: operational maturity level

  • Service communication: Are there already distributed calls?

→ Look for: HTTP client imports, message queue configs, gRPC definitions

→ Reveals: whether distribution has already started

Default Assumptions

  • If team experience is still unknown after asking → assume NO distributed experience (safer to overestimate the challenge)
  • If monitoring tools unknown → assume basic logging only (no distributed tracing)
  • If motivation unclear → probe before proceeding — distribution without clear motivation is the biggest risk

Sufficiency Threshold

SUFFICIENT: system description + team size + team experience + motivation are known
MUST ASK: team experience is unknown (this is NEVER safe to default)
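
The sufficiency rule above can be sketched as a small check. This is only an illustration: the `AssessmentContext` class and its field names are hypothetical and not part of the skill; `None` stands for "not found in the prompt".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AssessmentContext:
    """Hypothetical container for the required context; None means 'not provided'."""
    system_description: Optional[str] = None
    team_size: Optional[int] = None
    has_distributed_experience: Optional[bool] = None
    motivation: Optional[str] = None


def missing_required_context(ctx: AssessmentContext) -> List[str]:
    """Return the names of required inputs that still have to be asked for."""
    required = {
        "system_description": ctx.system_description,
        "team_size": ctx.team_size,
        "has_distributed_experience": ctx.has_distributed_experience,
        "motivation": ctx.motivation,
    }
    return [name for name, value in required.items() if value is None]


def is_sufficient(ctx: AssessmentContext) -> bool:
    """SUFFICIENT only when all four required inputs are known."""
    return not missing_required_context(ctx)


ctx = AssessmentContext(
    system_description="Order management monolith",
    team_size=12,
    motivation="deployment takes too long",
)
print(missing_required_context(ctx))  # ['has_distributed_experience']
```

Note that `False` is a known answer for team experience, so only a genuinely missing value triggers a question.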

Process

Step 1: Understand the Motivation

ACTION: Clarify WHY distribution is being considered. Categorize the motivation.

WHY: The most dangerous path to distributed architecture is "because everyone else is doing it." Valid motivations point to specific, measurable problems. Invalid motivations rest on hype, resume-driven development, or "Netflix does it." Categorizing the motivation early prevents wasted analysis.

| Motivation | Validity | Next step |
|-----------|:---:|-----------|
| Specific scaling bottleneck in one part | Valid | Quantify the bottleneck |
| Deployment takes too long (all-or-nothing) | Valid | Check if modular monolith solves it first |
| Teams blocking each other on shared code | Valid | Check if code ownership solves it first |
| "Everyone uses microservices now" | Invalid | Push back: this isn't a problem statement |
| "Our CTO read an article" | Invalid | Ask what specific problem they're trying to solve |
| Technology exploration / learning | Partially valid | Be honest about the cost of learning in production |
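
One way to keep the categorization consistent is a plain lookup. The sketch below assumes hypothetical category keys (`scaling_bottleneck`, `slow_deployment`, and so on); the validity labels and next steps are copied from the motivation table, and the fallback follows the "probe before proceeding" default stated earlier.

```python
from enum import Enum


class Validity(Enum):
    VALID = "Valid"
    INVALID = "Invalid"
    PARTIALLY_VALID = "Partially valid"


# Hypothetical keys; validity and next step come from the motivation table.
MOTIVATIONS = {
    "scaling_bottleneck": (Validity.VALID, "Quantify the bottleneck"),
    "slow_deployment": (Validity.VALID, "Check if modular monolith solves it first"),
    "teams_blocking": (Validity.VALID, "Check if code ownership solves it first"),
    "everyone_does_it": (Validity.INVALID, "Push back: this isn't a problem statement"),
    "cto_read_article": (Validity.INVALID, "Ask what specific problem they're trying to solve"),
    "exploration": (Validity.PARTIALLY_VALID, "Be honest about the cost of learning in production"),
}


def categorize(motivation_key):
    """Look up validity and next step; unknown motivations need probing first."""
    return MOTIVATIONS.get(motivation_key, (None, "Probe before proceeding"))


validity, next_step = categorize("everyone_does_it")
print(validity.value, "->", next_step)
```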

Step 2: Evaluate Against the 8 Fallacies

ACTION: Systematically evaluate the project against each of the 8 Fallacies of Distributed Computing. For each, assess: does the team understand this risk? Do they have mitigations?

WHY: The 8 fallacies are assumptions that developers make about distributed systems that are FALSE. Every distributed system must contend with all 8. Teams that haven't thought about them will be surprised — and surprises in distributed systems cause outages. Using these as a checklist transforms abstract knowledge into a concrete readiness assessment.

For each fallacy, evaluate:

| # | Fallacy | The false assumption | Reality check question |
|---|---------|----------------------|------------------------|
| 1 | The Network Is Reliable | Network calls always succeed | Do you have timeouts and circuit breakers? What happens when Service B is unreachable? |
| 2 | Latency Is Zero | Remote calls are as fast as local | What's your average and 95th-percentile latency? How many chained service calls per request? |
| 3 | Bandwidth Is Infinite | Send as much data as you want | Are you sending entire objects when you only need a few fields? (Stamp coupling) |
| 4 | The Network Is Secure | Internal network is safe | Does distribution multiply your attack surface? How many new network endpoints? |
| 5 | The Topology Never Changes | Network layout is fixed | What happens when ops upgrades routers on the weekend? Do your services use hardcoded IPs? |
| 6 | There Is Only One Administrator | One team controls everything | How many teams manage infrastructure? Who coordinates deployments? |
| 7 | Transport Cost Is Zero | Network calls are free | What's the actual infrastructure cost of service mesh, load balancers, API gateways? |
| 8 | The Network Is Homogeneous | All network equipment is the same | Do you run multi-cloud? Different hardware vendors? |
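
For Fallacy #1, the reality check asks about timeouts and circuit breakers. The following is a minimal sketch of a circuit breaker, not a production implementation: the `CircuitBreaker` class, its parameters, and `flaky_service_b` are all hypothetical names, and the remote call is simulated by a function that always raises `TimeoutError`.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker refuses to call the remote service."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_s:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: allow one trial call (half-open state).
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        return result


def flaky_service_b():
    # Stand-in for a remote call that exceeded its timeout.
    raise TimeoutError("Service B did not answer within the timeout")


breaker = CircuitBreaker(failure_threshold=2, reset_after_s=30.0)
outcomes = []
for _ in range(3):
    try:
        breaker.call(flaky_service_b)
    except CircuitOpenError:
        outcomes.append("failed fast")
    except TimeoutError:
        outcomes.append("timed out")
print(outcomes)  # ['timed out', 'timed out', 'failed fast']
```

After two consecutive failures the breaker opens, so the third request fails immediately instead of waiting on an unreachable Service B. That is the behavior that limits cascading failures.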

Step 3: Assess Operational Readiness

ACTION: Evaluate whether the team has the operational maturity to run distributed systems.

WHY: Distribution doesn't just change how you code — it fundamentally changes how you operate. Distributed logging, distributed tracing, distributed transactions, independent deployments, service discovery, contract versioning — these are operational capabilities that don't exist in monolith-land. A team without these capabilities will build a distributed system they can't debug, can't deploy safely, and can't monitor.

| Capability | Question | Ready if... | Not ready if... |
|-----------|----------|-------------|-----------------|
| Distributed logging | Can you correlate logs across services? | Have ELK/Datadog with correlation IDs | Console.log to stdout per service |
| Distributed tracing | Can you trace a request across service boundaries? | Have Jaeger/Zipkin/Datadog APM | No tracing infrastructure |
| CI/CD per service | Can you deploy one service without deploying all? | Per-service pipelines with independent versioning | Single pipeline deploying everything |
| Service discovery | How do services find each other? | Service mesh, DNS-based, or registry | Hardcoded URLs in config |
| Contract management | How do you handle API changes between services? | Versioned APIs, consumer-driven contract tests | No versioning strategy |
| Monitoring & alerting | Can you detect when one service is degrading? | Per-service health checks, SLO dashboards | Aggregate-only monitoring |
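
To make "correlation IDs" concrete, here is a rough sketch using only the Python standard library. It is an assumption-laden illustration: both "services" are loggers in one process writing to an in-memory stream, whereas in a real system the ID would travel between services (for example in a request header) and the logs would land in a shared sink such as ELK or Datadog. The names `CorrelationIdFilter`, `build_logger`, and `handle_request` are hypothetical.

```python
import contextvars
import io
import logging
import uuid

# Holds the ID of the request currently being handled.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class CorrelationIdFilter(logging.Filter):
    """Attach the current correlation ID to every log record."""

    def filter(self, record):
        record.correlation_id = correlation_id.get()
        return True


def build_logger(name, stream):
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(name)s cid=%(correlation_id)s %(message)s"))
    handler.addFilter(CorrelationIdFilter())
    logger.addHandler(handler)
    return logger


stream = io.StringIO()  # stands in for a shared log sink
orders = build_logger("orders", stream)
payments = build_logger("payments", stream)


def handle_request(incoming_id=None):
    # Reuse the caller's ID if present; otherwise mint one at the edge.
    correlation_id.set(incoming_id or uuid.uuid4().hex)
    orders.info("order received")
    payments.info("payment authorized")


handle_request("req-123")
print(stream.getvalue())
```

Every line emitted while handling the request carries `cid=req-123`, so a search on that one value returns the full path of the request across both loggers.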

Step 4: Check for Simpler Alternatives

ACTION: Before recommending distribution, verify that simpler solutions don't solve the problem.

WHY: Distribution is the most expensive solution to almost any problem. A modular monolith with good code boundaries solves many of the same problems (team autonomy, code organization, independent development) without the operational overhead. The source book (Fundamentals of Software Architecture) explicitly states that monolith advantages are REAL: simpler deployment, simpler testing, simpler debugging, lower cost. Distribution should be the LAST option, not the first.

| Problem | Simpler alternative | When it's NOT enough |
|---------|---------------------|----------------------|
| Deployment takes too long | Modular monolith with independent module builds | Different modules need different deployment frequencies |
| Teams stepping on each other | Code ownership + branch-by-abstraction | Teams need different technology stacks |
| One part needs to scale | Separate the hot path only (strangler fig) | 3+ parts need independent scaling |
| "It's too complex" | Better module boundaries, cleaner interfaces | Genuine bounded contexts with different data models |

Step 5: Produce the Feasibility Assessment

ACTION: Compile a structured go/no-go assessment with specific recommendations.

WHY: The value of this skill is the structured, honest assessment — not a blanket "yes" or "no" to microservices. Some teams are ready. Some aren't. Some should start with a single service extraction, not full distribution. The assessment should be specific enough to act on.

Inputs

  • System description and current architecture
  • Team size, experience, and operational capabilities
  • Motivation for considering distribution

Outputs

Distributed Architecture Feasibility Assessment

# Feasibility Assessment: {System Name}

## Motivation Analysis
**Stated motivation:** {what the team says}
**Validated motivation:** {Valid / Invalid / Partially valid}
**Underlying problem:** {the real problem, which may differ from stated motivation}

## 8 Fallacies Evaluation

| # | Fallacy | Team awareness | Mitigations in place | Risk level |
|---|---------|:---:|:---:|:---:|
| 1 | Network Is Reliable | Yes/No | {specific mitigations or "none"} | Low/Med/High |
| 2 | Latency Is Zero | Yes/No | {mitigations} | Low/Med/High |
| 3 | Bandwidth Is Infinite | Yes/No | {mitigations} | Low/Med/High |
| 4 | Network Is Secure | Yes/No | {mitigations} | Low/Med/High |
| 5 | Topology Never Changes | Yes/No | {mitigations} | Low/Med/High |
| 6 | Only One Administrator | Yes/No | {mitigations} | Low/Med/High |
| 7 | Transport Cost Is Zero | Yes/No | {mitigations} | Low/Med/High |
| 8 | Network Is Homogeneous | Yes/No | {mitigations} | Low/Med/High |

**Fallacy readiness score:** {X}/8 mitigated

## Operational Readiness

| Capability | Status | Gap |
|-----------|:---:|-----|
| Distributed logging | Ready/Not ready | {what's missing} |
| Distributed tracing | Ready/Not ready | {what's missing} |
| CI/CD per service | Ready/Not ready | {what's missing} |
| Service discovery | Ready/Not ready | {what's missing} |
| Contract management | Ready/Not ready | {what's missing} |
| Monitoring & alerting | Ready/Not ready | {what's missing} |

**Operational readiness score:** {X}/6 capabilities in place

## Simpler Alternatives Considered
| Alternative | Solves the problem? | Why/why not |
|------------|:---:|-------------|
| Modular monolith | Yes/No/Partially | {reasoning} |
| Single service extraction | Yes/No/Partially | {reasoning} |
| Better code boundaries | Yes/No/Partially | {reasoning} |

## Recommendation
**{Go / No-Go / Conditional Go}**
- {Primary reasoning}
- {If conditional: what must be done first}

## If Proceeding: Readiness Roadmap
1. {First capability to build before distributing}
2. {Second capability}
3. {Suggested first service to extract}
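
The two score lines in the template are simple counts. A possible way to compute them is sketched below; the `score_line` helper and the sample statuses are hypothetical, and the sketch deliberately does not derive a Go/No-Go verdict, since the skill defines no numeric thresholds for that.

```python
FALLACIES = [
    "Network Is Reliable", "Latency Is Zero", "Bandwidth Is Infinite",
    "Network Is Secure", "Topology Never Changes", "Only One Administrator",
    "Transport Cost Is Zero", "Network Is Homogeneous",
]
CAPABILITIES = [
    "Distributed logging", "Distributed tracing", "CI/CD per service",
    "Service discovery", "Contract management", "Monitoring & alerting",
]


def score_line(label, statuses, expected, suffix):
    """Count the items marked True and render one score line of the template."""
    missing = [name for name in expected if name not in statuses]
    if missing:
        raise ValueError(f"not evaluated: {missing}")
    count = sum(1 for name in expected if statuses[name])
    return f"**{label}:** {count}/{len(expected)} {suffix}"


# Hypothetical sample data for illustration only.
mitigated = dict.fromkeys(FALLACIES, False)
mitigated["Network Is Reliable"] = True
ready = dict.fromkeys(CAPABILITIES, False)
ready["CI/CD per service"] = True
ready["Monitoring & alerting"] = True

print(score_line("Fallacy readiness score", mitigated, FALLACIES, "mitigated"))
print(score_line("Operational readiness score", ready, CAPABILITIES, "capabilities in place"))
```

Raising on unevaluated items keeps a skipped fallacy from silently counting as "not mitigated".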

Key Principles

  • Distribution is a trade-off, not an upgrade — Distributed architecture gains scalability and team autonomy but pays with operational complexity, debugging difficulty, and infrastructure cost. It's not inherently better than monolith — it's different, with different trade-offs. The 8 fallacies are the price of admission.
  • Team readiness trumps technical need — A technically justified distributed architecture operated by an unprepared team produces worse outcomes than a monolith. Team experience with distributed operations, monitoring, and debugging is the #1 success predictor.
  • Check for simpler solutions first — A modular monolith with clean boundaries solves 80% of the problems people think require microservices, at 20% of the operational cost. Distribution should be the LAST option, not the first.
  • Monolith is not a dirty word — The book explicitly defends monolith advantages: simpler deployment, simpler testing, simpler debugging, lower operational cost. Many successful systems run as monoliths. Don't recommend distribution to be "modern."
  • The distributed monolith is the worst outcome — Adopting microservices but keeping all the coupling of a monolith gives you the operational overhead of distribution with none of the benefits. This is the most common result of premature distribution.
  • Latency is the deal-breaker — Fallacy #2 is the primary factor in whether distribution is feasible. If your request chains 10 service calls at 100ms each, you've added 1 second of latency. Know your numbers before committing.
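
The latency arithmetic in the last principle holds when the calls run one after another. A small worked example, with hypothetical helper names, contrasts that with the case where the calls are independent and can be issued concurrently (an assumption that often does not hold in a call chain):

```python
def sequential_latency_ms(call_latencies_ms):
    """Calls made one after another: latencies add up."""
    return sum(call_latencies_ms)


def parallel_latency_ms(call_latencies_ms):
    """Independent calls issued concurrently: the slowest call dominates."""
    return max(call_latencies_ms, default=0)


chain = [100] * 10  # 10 service calls at 100 ms each
print(sequential_latency_ms(chain))  # 1000
print(parallel_latency_ms(chain))    # 100
```

Counting how many calls in a request path are truly sequential is therefore part of "knowing your numbers".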

Examples

Scenario: Startup wanting microservices

Trigger: "We're 5 developers building a SaaS. Should we start with microservices?"

Process: Asked about motivation — "our CTO says it's best practice." Invalid motivation — no specific problem. Evaluated: team has no distributed experience, no monitoring beyond basic logging, single CI pipeline. Checked simpler alternatives: modular monolith solves all current needs. Fallacy check: 0/8 mitigated. Operational readiness: 0/6.

Output: No-Go. Recommended modular monolith with clean domain boundaries. Distribution adds operational cost the 5-person team can't absorb. Revisit when: team hits 15+ developers, or specific parts need independent scaling proven by data.

Scenario: Growing monolith with real pain

Trigger: "We have 40 developers, deployments take 2 hours, and the payment module keeps bringing down the whole site during Black Friday."

Process: Valid motivation — specific scaling bottleneck + deployment pain. Team has 3 years of monolith experience, basic CI/CD, Datadog for monitoring but no distributed tracing. Fallacy evaluation: aware of #1 and #2, unaware of #3-#8. Operational readiness: 2/6 (monitoring, basic CI/CD). Simpler alternatives: modular monolith partially solves deployment but not the Black Friday scaling.

Output: Conditional Go. Extract payment module as first service (strangler fig pattern). Before extracting: implement distributed tracing, per-service CI/CD pipeline, and circuit breakers. Don't attempt full microservices — start with 2-3 services maximum.

Scenario: Already distributed but struggling

Trigger: "We went microservices 6 months ago and everything is on fire. Can't trace bugs, deployments are a nightmare, and our latency tripled."

Process: Diagnosed against fallacies: Fallacy #2 (latency) — chaining 8 synchronous calls, 95th percentile at 2 seconds. Fallacy #1 — no circuit breakers, cascading failures. Operational readiness: 1/6 (only basic logging). The team fell into the distributed monolith anti-pattern — services share a database and deploy in lockstep.

Output: Feasibility assessment showing the team wasn't ready. Recommended: consolidate back to 3-4 larger services (from 12), implement distributed tracing and circuit breakers, establish per-service databases before attempting fine-grained services again.

License

This skill is licensed under CC-BY-SA-4.0.

Source: BookForge — Fundamentals of Software Architecture by Mark Richards, Neal Ford.

Related BookForge Skills

This skill is standalone. Browse more BookForge skills: bookforge-skills

Version history

1 version in total

  • v1.0.0 (current)
    2026-05-07 12:09 Safe
