
Observability Lgtm

Set up a full local LGTM observability stack (Loki + Grafana + Tempo + Prometheus + Alloy) for FastAPI apps. One Docker Compose, one Python import, unified d...
nissan
Data Analysis clawhub v1.2.2 2 versions 100000 Key: not required

Overview

Last used: 2026-03-24

Memory references: 7

Status: Active

observability-lgtm

Set up a full local observability stack (Loki + Grafana + Tempo + Prometheus + Alloy) for FastAPI apps on macOS (Apple Silicon) or Linux. One command to start, one import to instrument any app. Logs → Loki, metrics → Prometheus, traces → Tempo, all unified in Grafana.

When to use

  • User is building a FastAPI web app and wants logs, metrics, and traces
  • User wants a local Grafana dashboard without setting up ELK (too heavy)
  • User wants to correlate logs ↔ traces ↔ metrics in one UI
  • User has multiple local apps and wants universal observability

When NOT to use

  • Production cloud deployments (use managed Grafana Cloud or Datadog instead)
  • Non-Python apps (the Python lib only works for FastAPI; the stack itself is language-agnostic)
  • When Docker is not available

Prerequisites

  • Docker + Docker Compose v2 installed
  • Python 3.10+ (for the instrumentation lib)
  • FastAPI app to instrument

What gets installed

| Service | Port | Purpose |
|---|---|---|
| Grafana | 3000 | Dashboards (no login in dev mode) |
| Prometheus | 9091 | Metrics scraping (avoids 9090 if MinIO running) |
| Loki | 3300 | Log storage (avoids 3100 if Langfuse running) |
| Tempo gRPC | 4317 | OTLP trace receiver |
| Tempo HTTP | 4318 | OTLP HTTP alternative |
| Alloy UI | 12345 | Agent status |

Steps

Step 1 — Check for port conflicts

lsof -iTCP -sTCP:LISTEN -n -P 2>/dev/null | grep -E ":(3000|3300|9091|4317|4318|12345)" | awk '{print $9, $1}'

If any of the ports above are in use, update the relevant port in docker-compose.yml and the matching url: in config/grafana/provisioning/datasources/datasources.yml.

Common conflicts: Langfuse on 3100, MinIO on 9090.
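The same check can be run from Python, which helps on machines without lsof. This is a minimal sketch, not part of the skill: the port numbers are copied from the table above, and `port_in_use` and `find_conflicts` are hypothetical helper names. It only probes IPv4 loopback, so a listener bound solely to IPv6 would be missed.

```python
import socket

# Ports copied from the "What gets installed" table.
STACK_PORTS = {
    "Grafana": 3000,
    "Prometheus": 9091,
    "Loki": 3300,
    "Tempo gRPC": 4317,
    "Tempo HTTP": 4318,
    "Alloy UI": 12345,
}


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


def find_conflicts(ports: dict[str, int]) -> dict[str, int]:
    """Return the subset of ports that already have a listener."""
    return {name: port for name, port in ports.items() if port_in_use(port)}


if __name__ == "__main__":
    for name, port in find_conflicts(STACK_PORTS).items():
        print(f"{name}: port {port} is already in use")
```

Run it before `docker compose up -d`; no output means none of the six ports has a listener on 127.0.0.1.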

Step 2 — Copy the stack

Copy these files from the skill directory into a projects/observability/ folder in the workspace:

  • assets/docker-compose.yml
  • assets/config/ (entire directory tree)
  • assets/lib/observability.py
  • assets/scripts/register_app.sh

mkdir -p projects/observability
cp -r SKILL_DIR/assets/* projects/observability/
mkdir -p projects/observability/logs
touch projects/observability/logs/.gitkeep
chmod +x projects/observability/scripts/register_app.sh

Step 3 — Start the stack

cd projects/observability
docker compose up -d

Wait ~15 seconds for all services to start, then verify:

curl -s -o /dev/null -w "Grafana: %{http_code}\n"    http://localhost:3000/api/health
curl -s -o /dev/null -w "Prometheus: %{http_code}\n" http://localhost:9091/-/healthy
curl -s -o /dev/null -w "Loki: %{http_code}\n"       http://localhost:3300/ready
curl -s -o /dev/null -w "Tempo: %{http_code}\n"      http://localhost:4318/ready

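All should return 200. If Loki or Tempo return 503, wait 10 more seconds and retry (they start more slowly than Grafana and Prometheus). Port 4318 is Tempo's OTLP HTTP receiver; if the Tempo check returns 404 instead, look in docker-compose.yml for the port mapped to Tempo's own HTTP API, which is where /ready is normally served.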
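Instead of retrying by hand, the wait-and-retry loop can be scripted. The sketch below is an illustration under assumptions: the URLs are copied from the curl commands above, while `http_status` and `wait_until_ready` are hypothetical names. The `fetch` and `sleep` parameters exist only so the loop can be exercised without a running stack.

```python
import time
import urllib.error
import urllib.request
from typing import Callable

# Health URLs copied from the curl commands in this step.
HEALTH_URLS = {
    "Grafana": "http://localhost:3000/api/health",
    "Prometheus": "http://localhost:9091/-/healthy",
    "Loki": "http://localhost:3300/ready",
    "Tempo": "http://localhost:4318/ready",
}


def http_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status code for url, or 0 if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 503 while a service is still starting
    except (urllib.error.URLError, OSError):
        return 0  # connection refused, timeout, DNS failure


def wait_until_ready(
    url: str,
    attempts: int = 6,
    delay: float = 5.0,
    fetch: Callable[[str], int] = http_status,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll url until it returns 200 or the attempts run out."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)  # no sleep after the final attempt
    return False


if __name__ == "__main__":
    for name, url in HEALTH_URLS.items():
        state = "ready" if wait_until_ready(url) else "NOT ready"
        print(f"{name}: {state}")
```

With the defaults, each service gets up to about 25 seconds, which covers the 15 + 10 seconds suggested above.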

Step 4 — Install Python deps for the app

pip install \
  "prometheus-fastapi-instrumentator>=7.0.0" \
  "opentelemetry-sdk>=1.25.0" \
  "opentelemetry-exporter-otlp-proto-grpc>=1.25.0" \
  "opentelemetry-instrumentation-fastapi>=0.46b0" \
  "python-json-logger>=2.0.7"

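A quick way to confirm the install landed in the interpreter that runs the app is to look up each package's import name, which differs from its pip name. This is a sketch; `missing_modules` is a hypothetical helper, and the module names listed are the import paths these packages are normally exposed under.

```python
import importlib.util

# Import names for the packages installed above (not the pip names).
REQUIRED_MODULES = [
    "prometheus_fastapi_instrumentator",
    "opentelemetry.sdk",
    "opentelemetry.exporter.otlp.proto.grpc",
    "opentelemetry.instrumentation.fastapi",
    "pythonjsonlogger",
]


def missing_modules(names: list[str]) -> list[str]:
    """Return the module names that cannot be located in this environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. "opentelemetry") is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    gaps = missing_modules(REQUIRED_MODULES)
    print("all dependencies found" if not gaps else f"missing: {gaps}")
```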
Step 5 — Instrument the FastAPI app

Add to the app's app.py (or main.py), just after app = FastAPI(...):

import sys
sys.path.insert(0, "path/to/projects/observability/lib")
from observability import setup_observability
logger = setup_observability(app, service_name="my-service-name")

That's it. The app now:

  • Exposes /metrics for Prometheus
  • Writes JSON logs to projects/observability/logs/my-service-name/app.log
  • Sends traces to Tempo on localhost:4317
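To make the JSON log output concrete, here is a standard-library sketch of a logger that writes one JSON object per line to a per-service app.log. It is not the code in observability.py, which is not shown here and presumably relies on python-json-logger. `JsonLineFormatter`, `build_file_logger`, and the field names are assumptions for illustration.

```python
import json
import logging
from pathlib import Path


class JsonLineFormatter(logging.Formatter):
    """Hypothetical formatter: render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Fields passed via extra= (such as trace_id) become record attributes.
        trace_id = getattr(record, "trace_id", None)
        if trace_id is not None:
            payload["trace_id"] = trace_id
        return json.dumps(payload)


def build_file_logger(service_name: str, log_root: Path) -> logging.Logger:
    """Create a logger that writes to <log_root>/<service_name>/app.log."""
    log_dir = log_root / service_name
    log_dir.mkdir(parents=True, exist_ok=True)
    handler = logging.FileHandler(log_dir / "app.log")
    handler.setFormatter(JsonLineFormatter())
    logger = logging.getLogger(service_name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger
```

One object per line is what lets a file tailer such as Alloy ship each record to Loki as a separate log entry.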

Step 6 — Register with Prometheus

cd projects/observability
./scripts/register_app.sh my-service-name <port>
# e.g.: ./scripts/register_app.sh image-gen-studio 7860

Prometheus hot-reloads the target within 30 seconds. Verify:

curl -s "http://localhost:9091/api/v1/targets" | python3 -c "
import json, sys
data = json.load(sys.stdin)
for t in data['data']['activeTargets']:
    svc = t['labels'].get('service', '')
    print(svc, '->', t['health'])
"

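The verification one-liner can also be written as a reusable function that reports a single service. This sketch assumes, as the command above does, that registered targets carry a `service` label; `target_health` and `fetch_targets` are hypothetical names, and the payload shape follows the Prometheus /api/v1/targets response.

```python
import json
import urllib.request


def target_health(payload: dict) -> dict[str, str]:
    """Map each target's `service` label to its reported health."""
    result = {}
    for target in payload["data"]["activeTargets"]:
        # Targets without a service label are grouped under "".
        service = target["labels"].get("service", "")
        result[service] = target["health"]
    return result


def fetch_targets(base_url: str = "http://localhost:9091") -> dict:
    """Fetch the raw targets payload from the Prometheus HTTP API."""
    with urllib.request.urlopen(f"{base_url}/api/v1/targets", timeout=5) as resp:
        return json.load(resp)


if __name__ == "__main__":
    health = target_health(fetch_targets())
    print(health.get("my-service-name", "not registered"))
```

A result of "unknown" usually means Prometheus has picked up the target but has not scraped it yet, so check again after the next scrape.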
Step 7 — Open Grafana

Open http://localhost:3000

The "FastAPI: App Overview" dashboard is pre-loaded. Select your service from the dropdown at the top. You'll see:

  • Request rate (req/s)
  • Error rate (%)
  • Latency p50/p95/p99
  • Requests by endpoint
  • HTTP status codes
  • Live log panel (Loki)

To jump from a log line to its trace: click the trace_id link in the log detail panel. It opens the full trace in Tempo automatically (datasource pre-wired).

Step 8 — Import additional dashboards (optional)

In Grafana → Dashboards → Import:

  • 16110 — FastAPI Observability (richer alternative to the built-in)
  • 13407 — Loki Logs Overview
  • 16112 — Tempo Service Graph (service dependency map)

Useful commands

# Reload Prometheus config after registering a new app:
curl -s -X POST http://localhost:9091/-/reload

# Restart a single service without losing data:
docker compose -f projects/observability/docker-compose.yml restart grafana

# Stop everything (data volumes preserved):
docker compose -f projects/observability/docker-compose.yml down

# Nuclear reset (wipes all stored data):
docker compose -f projects/observability/docker-compose.yml down -v

# Check Alloy log shipping status:
open http://localhost:12345

Manual tracing (optional)

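from observability import get_tracer

# get_tracer comes from the skill's observability.py (on sys.path after Step 5).
tracer = get_tracer(__name__)

@app.get("/expensive-endpoint")
async def handler():
    # Wrap the slow section in a child span so it appears inside the request trace.
    with tracer.start_as_current_span("db-query") as span:
        span.set_attribute("db.table", "users")
        result = await db.query(...)  # db: placeholder for your own async client
    return result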

Log/trace correlation

The OTel instrumentation injects trace_id into every log record. The Loki datasource in Grafana is pre-configured with a derived field that turns "trace_id":"abc123" into a clickable link to the Tempo trace.

To manually include trace context in your own log calls:

from opentelemetry import trace

def trace_ctx() -> dict:
    ctx = trace.get_current_span().get_span_context()
    return {"trace_id": format(ctx.trace_id, "032x")} if ctx.is_valid else {}

logger.info("Processing request", extra=trace_ctx())
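The "032x" format in trace_ctx() matters because OpenTelemetry holds the trace ID as a 128-bit integer, while Tempo and the W3C trace context format expect 32 lowercase hex characters with leading zeros kept. A small sketch of just that conversion, with `format_trace_id` as a hypothetical helper name:

```python
def format_trace_id(trace_id: int) -> str:
    """Render a 128-bit trace ID as 32 lowercase, zero-padded hex characters."""
    return format(trace_id, "032x")


if __name__ == "__main__":
    # A small integer still yields a full-width ID.
    print(format_trace_id(0xABC))
```

Using hex(trace_id) instead would drop the leading zeros and add a "0x" prefix, so the value would no longer match the ID stored in Tempo.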

Notes

  • Logs are written to projects/observability/logs/<service-name>/app.log as JSON. Alloy tails these files and ships them to Loki, so no code changes are needed beyond setup_observability().

  • All observability is local — no data leaves the machine.
  • data_classification: LOCAL_ONLY is the default for all traces/logs.
  • The Alloy config drops DEBUG-level logs by default. Edit config/alloy/config.alloy to remove the stage.drop block if you need debug logs.

Version history

2 versions in total

  • v1.2.2 (current)
    2026-05-23 15:49 Security: safe
  • v1.2.0
    2026-03-29 19:57 Security: safe

Security check

Tencent Cloud Security (Keen): safe, no risk

Tencent Cloud Security (Sanbu): safe, no risk
