
Survival Analysis (KM)

Generates Kaplan-Meier survival curves, calculates survival statistics (log-rank test, median survival time), and estimates hazard ratios for clinical and bi...
Author: aipoch-ai
Category: Data analysis · clawhub v1.0.0 · API key: not required
Tags: data-analysis, latest, survival analysis

Overview

Survival Analysis (Kaplan-Meier)

Kaplan-Meier survival analysis tool for clinical and biological research. Generates publication-ready survival curves with statistical tests.

Features

  • Kaplan-Meier Curve Generation: Publication-quality survival plots with confidence intervals
  • Statistical Tests: Log-rank test, Wilcoxon test, Peto-Peto test
  • Hazard Ratios: Cox proportional hazards regression with 95% CI
  • Summary Statistics: Median survival time, restricted mean survival time (RMST)
  • Multi-group Analysis: Supports 2+ comparison groups
  • Risk Tables: Optional at-risk table below curves

Usage

Python Script

python scripts/main.py --input data.csv --time time_col --event event_col --group group_col --output results/

Arguments

| Argument | Description | Required |
|----------|-------------|----------|
| --input | Input CSV file path | Yes |
| --time | Column name for survival time | Yes |
| --event | Column name for event indicator (1=event, 0=censored) | Yes |
| --group | Column name for grouping variable | Optional |
| --output | Output directory for results | Yes |
| --conf-level | Confidence level (default: 0.95) | Optional |
| --risk-table | Include risk table in plot | Optional |
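
The source of scripts/main.py is not shown here, so the following is only a sketch of how a parser matching the table above might be declared with argparse. The function name build_parser is hypothetical, and treating --risk-table as a boolean switch is an assumption based on how the usage examples call it.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented arguments."""
    p = argparse.ArgumentParser(description="Kaplan-Meier survival analysis")
    p.add_argument("--input", required=True, help="Input CSV file path")
    p.add_argument("--time", required=True, help="Column name for survival time")
    p.add_argument("--event", required=True,
                   help="Column name for event indicator (1=event, 0=censored)")
    p.add_argument("--group", default=None, help="Column name for grouping variable")
    p.add_argument("--output", required=True, help="Output directory for results")
    p.add_argument("--conf-level", type=float, default=0.95, help="Confidence level")
    # Assumption: a bare switch, since the examples pass it without a value.
    p.add_argument("--risk-table", action="store_true",
                   help="Include risk table in plot")
    return p

# Parse an explicit list so the sketch does not depend on sys.argv.
args = build_parser().parse_args(
    ["--input", "data.csv", "--time", "time_col",
     "--event", "event_col", "--output", "results/"]
)
print(args.group, args.conf_level, args.risk_table)
```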

Input Format

CSV with columns:

  • Time column: Numeric, time to event or censoring
  • Event column: Binary (1 = event occurred, 0 = censored/right-censored)
  • Group column: Categorical variable for stratification

Example:

patient_id,time_months,death,treatment_group
P001,24.5,1,Drug_A
P002,36.2,0,Drug_A
P003,18.7,1,Placebo
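
Before running the analysis, it can help to confirm that a file follows these column rules. The sketch below uses only the standard library; load_rows is a hypothetical helper, not part of the tool, and rejecting negative times is an added assumption rather than a documented rule.

```python
import csv
import io

SAMPLE = """patient_id,time_months,death,treatment_group
P001,24.5,1,Drug_A
P002,36.2,0,Drug_A
P003,18.7,1,Placebo
"""

def load_rows(handle, time_col, event_col, group_col=None):
    """Parse rows into (time, event, group) tuples, enforcing the column rules."""
    rows = []
    # Data starts on line 2 because line 1 is the header.
    for line_no, rec in enumerate(csv.DictReader(handle), start=2):
        t = float(rec[time_col])  # raises ValueError if the time is not numeric
        if t < 0:                 # assumption: negative follow-up is invalid
            raise ValueError(f"line {line_no}: negative time {t}")
        event = rec[event_col].strip()
        if event not in ("0", "1"):
            raise ValueError(f"line {line_no}: event must be 0 or 1, got {event!r}")
        group = rec[group_col] if group_col else None
        rows.append((t, int(event), group))
    return rows

rows = load_rows(io.StringIO(SAMPLE), "time_months", "death", "treatment_group")
print(rows)
```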

Output Files

  • km_curve.png: Kaplan-Meier survival curve
  • km_curve.pdf: Vector version for publications
  • survival_stats.csv: Statistical summary (median survival, confidence intervals)
  • hazard_ratios.csv: Cox regression results with HR and 95% CI
  • logrank_test.csv: Pairwise comparison p-values
  • report.txt: Human-readable summary report

Technical Details

Statistical Methods

  1. Kaplan-Meier Estimator: Non-parametric maximum likelihood estimate of the survival function
    • Product-limit estimator: Ŝ(t) = Π(tᵢ≤t) (1 - dᵢ/nᵢ), where dᵢ is the number of events at time tᵢ and nᵢ is the number at risk just before tᵢ
    • Greenwood's formula for variance estimation
  2. Log-Rank Test: Most widely used test for comparing survival curves
    • Null hypothesis: No difference in survival between groups
    • Gives equal weight to every event time (the Wilcoxon variant weights by the number at risk instead)
  3. Cox Proportional Hazards: Semi-parametric regression model
    • h(t|X) = h₀(t) × exp(β₁X₁ + β₂X₂ + ...)
    • Proportional hazards assumption checked via Schoenfeld residuals
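
The first two methods above can be worked through in plain Python. This is a sketch for understanding the formulas, not the tool's implementation: the function names are hypothetical, the data is a toy example, and the log-rank function is the standard two-group form with weight 1 at every event time.

```python
import math

def kaplan_meier(times, events):
    """Return [(t, S(t), standard error)] at each distinct event time."""
    data = sorted(zip(times, events))
    n_at_risk, surv, gw_sum = len(data), 1.0, 0.0
    curve, i = [], 0
    while i < len(data):
        t, d, c = data[i][0], 0, 0
        while i < len(data) and data[i][0] == t:   # collect ties at time t
            d += data[i][1]                        # events
            c += 1 - data[i][1]                    # censored
            i += 1
        if d:
            surv *= 1 - d / n_at_risk              # product-limit step
            if n_at_risk > d:                      # Greenwood term d/(n(n-d))
                gw_sum += d / (n_at_risk * (n_at_risk - d))
            curve.append((t, surv, surv * math.sqrt(gw_sum)))
        n_at_risk -= d + c
    return curve

def median_survival(curve):
    """First time at which S(t) drops to 0.5 or below; None if not reached."""
    return next((t for t, s, _ in curve if s <= 0.5), None)

def logrank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test: returns (chi-square with 1 d.f., p-value)."""
    pooled = list(zip(times_a, events_a)) + list(zip(times_b, events_b))
    obs_minus_exp = var = 0.0
    for t in sorted({t for t, e in pooled if e}):
        n_a = sum(x >= t for x in times_a)         # at risk in group A
        n_b = sum(x >= t for x in times_b)         # at risk in group B
        d_a = sum(e for x, e in zip(times_a, events_a) if x == t)
        d = sum(e for x, e in pooled if x == t)
        n = n_a + n_b
        obs_minus_exp += d_a - d * n_a / n         # observed minus expected in A
        if n > 1:                                  # hypergeometric variance
            var += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / var
    return chi2, math.erfc(math.sqrt(chi2 / 2))    # chi-square tail, 1 d.f.

# Toy data for illustration only.
times, events = [6, 7, 10, 15, 19, 25], [1, 0, 1, 1, 0, 1]
curve = kaplan_meier(times, events)
print(curve)
print(median_survival(curve))  # 15
```

The standard error here is on the plain survival scale. Confidence bands are often built on a transformed scale such as log(-log S), which this sketch does not attempt.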

Dependencies

  • lifelines: Core survival analysis library
  • matplotlib, seaborn: Visualization
  • pandas, numpy: Data handling
  • scipy: Statistical tests

Technical Difficulty: High ⚠️

This skill involves advanced statistical modeling. Results should be reviewed by a biostatistician, especially for:

  • Proportional hazards assumption violations
  • Small sample sizes (< 30 per group)
  • Heavy censoring (> 50%)
  • Time-varying covariates
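
Two of these conditions, small groups and heavy censoring, can be screened for automatically before any model is fitted. The helper below is a hypothetical sketch using the thresholds quoted above; it does not address proportional hazards or time-varying covariates.

```python
from collections import Counter

def review_flags(events, groups, min_n=30, max_censored=0.5):
    """Return warnings for the small-sample and heavy-censoring thresholds."""
    flags = []
    for name, n in sorted(Counter(groups).items()):
        if n < min_n:
            flags.append(f"group {name!r}: only {n} subjects (< {min_n})")
    censored = 1 - sum(events) / len(events)  # fraction with event == 0
    if censored > max_censored:
        flags.append(f"censoring {censored:.0%} exceeds {max_censored:.0%}")
    return flags

# Toy input: four subjects, three of them censored.
for flag in review_flags([1, 0, 0, 0], ["A", "A", "B", "B"]):
    print(flag)
```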

References

See references/ folder for:

  • Kaplan EL, Meier P (1958) original paper
  • Cox DR (1972) regression models paper
  • Sample datasets for testing
  • Clinical reporting guidelines (ATN, CONSORT)

Parameters

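| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| --input | str | Required | Input CSV file path |
| --time | str | Required | Column name for survival time |
| --event | str | Required | Column name for event indicator (1=event, 0=censored) |
| --group | str | Optional | Column name for grouping variable |
| --output | str | Required | Output directory for results |
| --conf-level | float | 0.95 | Confidence level |
| --risk-table | flag | Optional | Include risk table in plot |
| --figsize | str | '10 | |
| --dpi | int | 300 | |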
Example

# Basic survival curve
python scripts/main.py \
  --input clinical_data.csv \
  --time overall_survival_months \
  --event death \
  --group treatment_arm \
  --output ./results/ \
  --risk-table

Output includes:

  • Survival curves with 95% confidence bands
  • Median survival: Drug A = 28.4 months (95% CI: 24.1-32.7), Placebo = 18.2 months (95% CI: 15.3-21.1)
  • Log-rank test p-value: 0.0023
  • Hazard ratio: 0.62 (95% CI: 0.45-0.85), p = 0.003

Risk Assessment

| Risk Indicator | Assessment | Level |
|----------------|------------|-------|
| Code Execution | Python/R scripts executed locally | Medium |
| Network Access | No external API calls | Low |
| File System Access | Read input files, write output files | Medium |
| Instruction Tampering | Standard prompt guidelines | Low |
| Data Exposure | Output files saved to workspace | Low |

Security Checklist

  • [ ] No hardcoded credentials or API keys
  • [ ] No unauthorized file system access (../)
  • [ ] Output does not expose sensitive information
  • [ ] Prompt injection protections in place
  • [ ] Input file paths validated (no ../ traversal)
  • [ ] Output directory restricted to workspace
  • [ ] Script execution in sandboxed environment
  • [ ] Error messages sanitized (no stack traces exposed)
  • [ ] Dependencies audited
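
The path-related items in this checklist (no ../ traversal, output restricted to the workspace) could be enforced with a check along these lines. This is a sketch under assumptions: resolve_inside is a hypothetical helper, and the notion of a single workspace root is inferred from the checklist rather than documented.

```python
from pathlib import Path

def resolve_inside(workspace: str, user_path: str) -> Path:
    """Resolve user_path against workspace and reject it if it escapes."""
    root = Path(workspace).resolve()
    # Joining an absolute user_path discards root, so it is rejected below too.
    target = (root / user_path).resolve()
    if target != root and root not in target.parents:
        raise ValueError(f"path escapes workspace: {user_path}")
    return target

print(resolve_inside("workspace", "results/km_curve.png"))
```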

Prerequisites

# Python dependencies
pip install -r requirements.txt

Evaluation Criteria

Success Metrics

  • [ ] Successfully executes main functionality
  • [ ] Output meets quality standards
  • [ ] Handles edge cases gracefully
  • [ ] Performance is acceptable

Test Cases

  1. Basic Functionality: Standard input → Expected output
  2. Edge Case: Invalid input → Graceful error handling
  3. Performance: Large dataset → Acceptable processing time

Lifecycle Status

  • Current Stage: Draft
  • Next Review Date: 2026-03-06
  • Known Issues: None
  • Planned Improvements:
    • Performance optimization
    • Additional feature support

Version History

1 version in total

  • v1.0.0 (current)
    2026-03-30 01:57 · Safe

Security Scan

Tencent Cloud Security (Keen): safe, no risk

Tencent Cloud Security (Sanbu): safe, no risk
