Rust Systems & Services

Covers modern application-layer Rust (edition 2024): CLIs, web services, libraries. Not no_std/embedded.

Tooling

Tool	Purpose
------	---------
`cargo`	Build, dep management, script runner
`clippy`	Lint (`cargo clippy --workspace --all-targets -- -D warnings`)
`rustfmt`	Formatter (`cargo fmt --all`)
`cargo-nextest`	Test runner, noticeably faster than `cargo test`, better isolation
`cargo-deny`	License + advisory + duplicate-dep checks
`cargo-machete`	Find unused dependencies

Pin rust-toolchain.toml per repo so every contributor and CI uses the same compiler.
cargo update -p for single-package upgrades. cargo update rewrites everything — avoid in PR diffs.
Cargo.lock goes in version control for binaries and libraries (modern guidance; reproducibility wins).

Workspaces

Multi-crate projects use a workspace with layered crates. Dependencies point inward only.

Cargo.toml                  # [workspace] members + [workspace.dependencies]
crates/
  protocol/    # Shared types, no deps on other workspace crates
  storage/     # Persistence, depends on protocol
  service/    # Business logic, depends on protocol + storage
  cli/        # Binary, depends on everything

Centralize versions in [workspace.dependencies], reference as foo = { workspace = true } in members.
Keep the leaf-most crate (protocol / types) dependency-free so every other crate can depend on it without cycles.
Feature flags belong on the crate that introduces the dependency, not re-exported through the workspace root.
Library crates expose one stable facade: a thin lib.rs with a //! module doc comment stating purpose, followed by pub use re-exports of the public surface. Consumers learn one import path per concept; internal module layout can be reorganized without breaking callers.
Feature gates must error, never silently degrade. If runtime config requests a capability the binary wasn't compiled with (e.g. device = "gpu" on a non-CUDA build), fail at startup with a clear error. Silent fallback produces different behavior from what the operator configured, often without anyone noticing.
Centralize lints at the workspace root with [workspace.lints.*]. Every member crate inherits the same ruleset — no drift between crates, no per-crate #![deny(...)] stacks. Example:

```toml

[workspace.lints.rust]

unsafe_code = "warn"

missing_docs = "warn"

[workspace.lints.clippy]

all = { level = "warn", priority = -1 }

pedantic = { level = "warn", priority = -1 }

nursery = { level = "warn", priority = -1 }

module_name_repetitions = "allow"

must_use_candidate = "allow"

```

Each member crate opts in with [lints] workspace = true in its own Cargo.toml. Changing a lint in one place updates every crate.

Build Profiles

When tuning Cargo build profiles (release LTO, release-dbg symbols, release-min for distributable binaries) or adding dev-machine speedups (mold linker, target-cpu=native, share-generics), load build-profiles.md.

Error Handling

Split by crate role:

Libraries / lower crates: define typed errors with thiserror. Consumers can pattern-match.
Binaries / top-level crates: use anyhow::Result with .context("what was being attempted"). Human-readable error chains.
Never return Box from library APIs — it erases variant information.
Use ? liberally. Never .unwrap() or .expect() outside tests and main. An expect("...") is acceptable only when the invariant is provably upheld and the message explains why.
Convert at boundaries: #[from] on thiserror variants for auto-conversion; .map_err(MyError::from) when explicit.
bail!("...") / ensure!(cond, "...") in application code for early exits.
Prefer Result over panics for any recoverable error. Panics are for programmer bugs (broken invariants), not runtime failures.
#[must_use] on fallible APIs: annotate functions returning Result or newtype-wrapped results that callers frequently ignore. Catches let _ = validate(x); at compile time instead of shipping a silently-dropped error.
Make illegal call-sequences unrepresentable rather than returning a runtime error — the type-state pattern. When an API has a mandatory call order (configure → connect → use), encode each stage as a distinct type (Client → Client → Client) carrying PhantomData; a method only exists on the state that permits it. Calling send_request before connect then fails to compile instead of panicking at runtime — there is no error variant to handle because the bad sequence cannot be written.

Ownership Discipline

Take &str over &String, &[T] over &Vec in function signatures — accepts more call sites for free.
Return owned (String, Vec) from constructors and public APIs. Borrow in hot paths where lifetimes are obvious.
Reach for Arc only when sharing across threads. Single-threaded sharing uses Rc or references.
Cow<'_, str> when a function sometimes allocates and sometimes borrows (e.g. normalization).
Lifetime elision handles 90% of cases. If you're writing 'a in more than one signature, reconsider whether that type should own its data instead.
bytes::Bytes for zero-copy slicing of shared immutable buffers — network parsers, frame decoders, protocol handlers. BytesMut for building buffers that split_to / split_off into Bytes without reallocation. Prefer Bytes over Arc> when slicing is the dominant access pattern.
Reduce hot-path heap allocations with stack-or-inline collections when the typical size is small and known:
smallvec::SmallVec<[T; N]> — inline for ≤N items, spills to heap beyond. Good for "usually 1-8 items" cases like parsed tag lists, lookup keys, small event batches.
arrayvec::ArrayVec — fixed capacity, never heap-allocates. Returns an error when full. Good for bounded message buffers or per-request scratch space.
String interning for repeatedly-seen strings (enum-like values parsed from config, tenant IDs, route keys): dashmap::DashMap with Box::leak on miss gives &'static str comparisons without per-call allocations.

These are optimizations — profile first. Vec/String on a cold path isn't the bottleneck.

Async with Tokio

Default runtime: #[tokio::main] with features = ["full"] for apps; features = ["rt", "macros", "sync"] for libraries that need to stay slim.
tokio::spawn for independent tasks. JoinSet for a dynamic group you'll await together with cancellation.
tokio::select! for racing futures (timeouts, cancellation, first-wins).
Never block the runtime: tokio::task::spawn_blocking for sync CPU work or blocking I/O libs.
tokio::sync::Mutex only when the guard must be held across .await. Otherwise std::sync::Mutex is faster.
tokio::sync::RwLock when reads dominate writes (config snapshots, route tables, hot caches). Many readers proceed in parallel; Mutex serializes them. For snapshot-swap semantics (rarely-updated config), arc-swap::ArcSwap is faster still — no lock on the read path.
Cancellation: CancellationToken (from tokio-util) propagates shutdown. Long-running tasks must check it.
Backpressure via bounded mpsc channels — unbounded channels hide memory growth until OOM.
Semaphore for hard concurrency limits on spawn paths that don't fit a channel model (e.g. "at most 50 concurrent outbound HTTP calls"). let _permit = sem.acquire().await?; inside the task; dropping the permit releases the slot. Pair with Arc shared across spawners.
Don't mix async runtimes. Pick tokio and stick with it; async-std and smol don't interop cleanly.
Vectored writes (write_vectored + std::io::IoSlice) coalesce many buffers — interleaved headers and payloads — into a single syscall when flushing a batch of messages to a socket; the kernel does the gather. An optimization for measured syscall-bound flush paths — profile first; a single write_all is fine elsewhere.

CLI Tools (clap)

Use the derive API: #[derive(Parser)] + #[derive(Subcommand)]. Less boilerplate, types drive the help text.
One enum Commands variant per subcommand; flatten shared flags into a #[command(flatten)] struct CommonArgs.
--json flag on query commands for agent/pipe consumption. Emit via serde_json::to_string(&value)?.
Exit codes: 0 success, 1 for errors main returned, 2 for argparse (clap handles this), reserve 3+ for domain meanings documented in --help.
Provide --version automatically via #[command(version)].

See cli-tools.md for config layering, logging setup, progress reporting, and shell completions.

HTTP Services (axum)

Framework default: axum (tokio-native, tower middleware, extractor-based handlers). Pick actix-web only if an existing codebase uses it.
Handlers return Result. Implement IntoResponse for AppError to centralize error → status mapping.
Validate input at the boundary: axum::extract::Json where T: Deserialize + Validate (use validator crate). Internal services trust input was validated.
Share state via State> — not globals, not lazy_static.
Middleware via tower::ServiceBuilder: tracing → timeout → auth → CORS → handler. Order matters.
Resilience layer stack (outbound HTTP clients and shared services): ServiceBuilder::new().layer(TimeoutLayer).layer(RateLimitLayer).layer(ConcurrencyLimitLayer).layer(LoadShedLayer).layer(RetryLayer).service(client). Name each layer explicitly — LoadShedLayer sheds excess load, ConcurrencyLimitLayer caps in-flight requests, RateLimitLayer bounds request rate, RetryLayer retries classified transient errors. Combining LoadShedLayer + ConcurrencyLimitLayer produces proper backpressure instead of unbounded queueing.

See axum-service.md for project layout, extractors, error types, graceful shutdown, and OpenAPI generation.

Concurrency

Workload	Approach
----------	----------
Independent async I/O	`tokio::spawn` + `JoinSet` or `futures::join!`
Data-parallel CPU work	`rayon` with `par_iter`
Shared mutable state across threads	`Arc>` or `Arc>`, smallest scope possible
Single-producer pipelines	`tokio::sync::mpsc` (async) or `std::sync::mpsc` (sync)
Broadcast / fan-out	`tokio::sync::broadcast`

rayon and tokio coexist — use tokio::task::spawn_blocking to call a rayon pool from async code. Never call .block_on() from inside a tokio task; it deadlocks the runtime.

Testing

Built-in #[test]. Prefer cargo nextest run --workspace over cargo test — it runs tests in parallel processes with proper isolation.
Unit tests live in mod tests { ... } at the bottom of the file (access to private items).
Integration tests in tests/ directory. One file per public surface area.
#[tokio::test] for async tests. Add flavor = "multi_thread" when the code under test spawns tasks.
rstest for parametrized tests and fixtures. proptest / quickcheck for property-based tests on pure logic.
insta for snapshot testing CLI output, serialization, large structs. Review diffs with cargo insta review.
assert_cmd + predicates for CLI integration tests (invokes the binary, asserts on stdout/stderr/exit code).
Assert on error variants with matches!: assert!(matches!(result.unwrap_err(), MyError::Validation(_))). Cleaner than match arms when the test only cares whether the error is the right kind, and doesn't force updates when unrelated variants are added.
Coverage: cargo llvm-cov --workspace --html. Target 70%+ on application code, higher on library crates.
Fuzzing for parsers: cargo fuzz + libfuzzer-sys on any code that parses untrusted input (file formats, protocols, query languages). A short nightly fuzz run surfaces the panics and UB that unit tests miss.

For generic test discipline (anti-patterns, mock rules, rationalization resistance), see the ia-writing-tests skill.

Unsafe Discipline

Default: no unsafe. If clippy flags it, don't #[allow] it — refactor.
Every unsafe block gets a // SAFETY: comment above it explaining why each invariant holds. No comment = reviewer rejects.
Keep unsafe blocks minimal — wrap in a safe abstraction at module boundary, mark the module pub(crate).
Use miri (cargo +nightly miri test) on any crate containing unsafe or raw pointer arithmetic — catches UB that optimizers mask.
Prefer bytemuck, zerocopy, bytes over hand-rolled transmutes for zero-copy patterns.
std::env::set_var and remove_var are unsafe under edition 2024. Concurrent getenv from another thread is UB at the libc level; the unsafety can't be wrapped away by OnceLock::call_once or std::sync::Once — they ensure the closure runs once, not that it runs while no other thread is reading the environment. Pin every env-var write to single-threaded startup, before tokio::main or any std::thread::spawn. Common offender: native-library discovery paths (LD_LIBRARY_PATH, ORT_DYLIB_PATH, LIBTORCH, plugin loader paths) set lazily on first use — compute and set them in main (or a static initializer that runs before the runtime) so they're written before any concurrent reader exists.

Production Resilience

When productionizing a service (config validation, /health + /ready endpoints, graceful shutdown, retries/timeouts/jitter, connection pools, diagnostic secret redaction), load production-resilience.md.

Observability

For logging (tracing + tracing-subscriber with init recipe), #[instrument] spans, correlation IDs, metrics, and distributed tracing patterns, load observability.md. Never use println! or log:: in new code.

CI

General CI design lives with the ia-infrastructure-engineer agent. For Rust-specific callouts (rustsec/audit-check, cargo-llvm-cov, Swatinem/rust-cache, taiki-e/install-action, matrix coverage guidance, doc-test step), load ci-pipeline.md.

Discipline

Simplicity first — every change as simple as possible, impact minimal code.
Only touch what's necessary — avoid unrelated changes in a PR.
No #[allow(clippy::...)] as a shortcut — fix the underlying issue. Document exceptions with a rationale.
Before adding a trait or generic, verify it's used in 3+ places. Otherwise a concrete type is clearer.
Verify: see Verify section — pass all checks with zero warnings before declaring done.

Verify

cargo fmt --all -- --check passes with zero diffs
cargo clippy --workspace --all-targets --all-features -- -D warnings passes
cargo nextest run --workspace (or cargo test --workspace) passes with zero failures
cargo deny check passes (licenses, advisories, duplicates) for any crate going to production
No new unsafe without // SAFETY: comment

References

cli-tools.md — clap patterns, config layering, tracing setup, progress, shell completions
axum-service.md — project layout, extractors, error types, graceful shutdown, testing
build-profiles.md — release/release-dbg/release-min profiles, mold linker, dev compile speedups
ci-pipeline.md — Rust-specific CI steps (cargo audit, llvm-cov, rust-cache, matrix strategy, doc tests)
production-resilience.md — fail-fast config, health/ready endpoints, graceful shutdown, retries, timeouts, connection pools
observability.md — tracing init recipe, span instrumentation, correlation IDs, metrics, distributed tracing

ia-rust-systems

概述

Rust Systems & Services

Tooling

Workspaces

Build Profiles

Error Handling

Ownership Discipline

Async with Tokio

CLI Tools (clap)

HTTP Services (axum)

Concurrency

Testing

Unsafe Discipline

Production Resilience

Observability

CI

Discipline

Verify

References

版本历史

安全检测

腾讯云安全 (Keen)

腾讯云安全 (Sanbu)

🔗 相关推荐

CodeConductor.ai

Github

ia-brainstorming