mrmrmr

LLM-powered automated Mendelian Randomization for causal discovery in biomedical research
Author: rqth123 · Source: clawhub · Version: v1.0.2 (latest) · API key: required

Overview

MRAgent - Automated Mendelian Randomization Analysis Agent

Description

MRAgent is an intelligent agent that automates the entire Mendelian randomization (MR) workflow for causal discovery in biomedical research. It offers two modes:

  1. Knowledge Discovery Mode (KD): Given a disease (outcome), MRAgent scans the PubMed literature for potentially modifiable exposures (risk factors) that are correlated with the outcome but have no established causal evidence, then runs a comprehensive Mendelian randomization analysis on OpenGWAS data to identify novel causal relationships.
  2. Causal Validation Mode (CV): MRAgent directly tests whether a user-specified exposure (e.g., "body mass index") has a causal effect on a specific outcome (e.g., "type 2 diabetes") using two-sample Mendelian randomization.

MRAgent handles all steps automatically:

  • PubMed literature crawling
  • LLM-based extraction of candidate exposure-outcome pairs
  • Check for existing MR studies via PubMed search
  • Optional STROBE-MR quality assessment of existing studies
  • UMLS medical synonym expansion
  • OpenGWAS database query for GWAS summary statistics
  • LLM selection of most appropriate GWAS datasets
  • Multiple MR methods: Inverse variance weighted, MR-Egger, weighted median, etc.
  • Heterogeneity analysis and pleiotropy sensitivity testing
  • MRlap correction for sample overlap
  • Automatic generation of publication-ready PDF reports with LLM-written interpretation
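To make the "inverse variance weighted" method in the list above concrete, here is a minimal sketch of the textbook fixed-effect IVW estimator computed from per-SNP summary statistics. It is an illustration only: MRAgent delegates the real analysis to R's TwoSampleMR package, and the function and variable names below are hypothetical.

```python
import math


def wald_ratios(beta_exposure, beta_outcome):
    """Per-SNP causal estimates: outcome effect divided by exposure effect."""
    return [by / bx for bx, by in zip(beta_exposure, beta_outcome)]


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate and its standard error.

    Equivalent to a weighted regression of SNP-outcome effects on
    SNP-exposure effects through the origin, with weights 1 / se_outcome^2.
    """
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(w * bx * by
                    for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx * bx for w, bx in zip(weights, beta_exposure))
    return numerator / denominator, math.sqrt(1.0 / denominator)


if __name__ == "__main__":
    # Made-up summary statistics for two SNPs, purely for illustration.
    bx = [0.1, 0.2]
    by = [0.05, 0.1]
    se = [0.01, 0.02]
    print(wald_ratios(bx, by))
    print(ivw_estimate(bx, by, se))
```

When every SNP gives the same Wald ratio, as in this toy input, the IVW estimate equals that ratio; disagreement between the ratios is what the heterogeneity and pleiotropy tests examine.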

Requirements

  • Python packages (install via pip install -r {baseDir}/requirements.txt)
  • R language (>= 4.3.4) with the following packages installed:
      • TwoSampleMR - core Mendelian randomization
      • ieugwasr - OpenGWAS interface
      • vcfR - required for MRlap (optional)
      • MRlap - sample overlap correction (optional)
      • jsonlite - JSON processing (required for MRlap)
  • OPENAI_API_KEY environment variable must be set (OpenAI API key)
  • OPENGWAS_JWT environment variable (optional, for OpenGWAS access token)
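Before a long run it can help to check the requirements above up front. The following is a minimal preflight sketch, not part of MRAgent itself: the `preflight` function is hypothetical, and it only checks that `Rscript` is on the PATH and that the documented environment variables are set. It does not verify the R version or the installed R packages.

```python
import os
import shutil


def preflight(env=None, which=shutil.which):
    """Return (problems, notes) for the documented MRAgent requirements.

    `env` and `which` are injectable so the check can be tested without
    touching the real environment.
    """
    env = os.environ if env is None else env
    problems, notes = [], []

    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    if which("Rscript") is None:
        problems.append("Rscript not found on PATH (R is required)")
    if not env.get("OPENGWAS_JWT"):
        notes.append("OPENGWAS_JWT is not set (optional OpenGWAS token)")

    return problems, notes


if __name__ == "__main__":
    problems, notes = preflight()
    for line in problems + notes:
        print(line)
    raise SystemExit(1 if problems else 0)
```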

Usage

Knowledge Discovery Mode (recommended for novel discoveries)

Discover novel causal relationships starting from a disease:

python {baseDir}/run_mragent.py --mode KD --outcome "back pain" --num-pubmed 100 --bidirectional

Discovery starting from an exposure rather than a disease is not available from the CLI, which currently implements KD from an outcome only. For exposure-based discovery, use the Python API directly.

Causal Validation Mode

Validate a specific hypothesis: does exposure causally affect outcome?

python {baseDir}/run_mragent.py --mode CV --exposure "body mass index" --outcome "type 2 diabetes"

Common Options

| Option | Description | Example |
|---|---|---|
| `--mode KD\|CV` | Required. KD = knowledge discovery, CV = causal validation | `--mode CV` |
| `--outcome NAME` | Required. Outcome (disease) to study | `--outcome "back pain"` |
| `--exposure NAME` | Required in CV mode. Exposure factor to test | `--exposure "osteoporosis"` |
| `--num-pubmed N` | Number of papers to fetch from PubMed (default: 100) | `--num-pubmed 50` |
| `--model MR\|MR_MOE` | MR model type. MR = standard, MR_MOE = mixture of experts | `--model MR_MOE` |
| `--bidirectional` | Perform bidirectional analysis (also test outcome → exposure) | `--bidirectional` |
| `--no-synonyms` | Disable synonym expansion (faster) | `--no-synonyms` |
| `--strobe-mr` | Enable STROBE-MR quality assessment of existing studies | `--strobe-mr` |
| `--mrlap` | Enable MRlap sample overlap correction | `--mrlap` |
| `--output-dir DIR` | Output directory (default: ./output) | `--output-dir /tmp/mragent-out` |
| `--steps 1,2,3` | Only run specific steps (for debugging/intervention) | `--steps 1,2` |
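When MRAgent is driven from another script, the options above can be assembled into an argument list programmatically. This is a hedged sketch: `build_command` is a hypothetical helper, it covers only a subset of the documented flags, and the script path is passed in by the caller because `{baseDir}` is a placeholder.

```python
import shlex


def build_command(script, mode, outcome, exposure=None, num_pubmed=None,
                  bidirectional=False, output_dir=None):
    """Assemble an argv list for run_mragent.py from documented options."""
    if mode not in ("KD", "CV"):
        raise ValueError("mode must be 'KD' or 'CV'")
    if mode == "CV" and not exposure:
        raise ValueError("--exposure is required in CV mode")

    cmd = ["python", script, "--mode", mode, "--outcome", outcome]
    if exposure:
        cmd += ["--exposure", exposure]
    if num_pubmed is not None:
        cmd += ["--num-pubmed", str(num_pubmed)]
    if bidirectional:
        cmd.append("--bidirectional")
    if output_dir:
        cmd += ["--output-dir", output_dir]
    return cmd


if __name__ == "__main__":
    cmd = build_command("run_mragent.py", "CV", "type 2 diabetes",
                        exposure="body mass index")
    # shlex.join quotes arguments that contain spaces for display.
    print(shlex.join(cmd))
```

Passing the result to `subprocess.run` as a list avoids shell quoting problems with multi-word exposure and outcome names.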

Environment Variables

| Variable | Required | Description |
|---|---|---|
| `OPENAI_API_KEY` | Yes | OpenAI API key |
| `OPENGWAS_JWT` | Optional | OpenGWAS JWT access token |
| `LLM_MODEL` | No | LLM model name (default: gpt-4o) |
| `LLM_PROVIDER` | No | openai or ollama (default: openai) |
| `OPENAI_BASE_URL` | No | Custom base URL for OpenAI-compatible API |
| `MRAGENT_SOURCE_PATH` | No | Path to original MRAgent source if not installed globally |

Output

MRAgent outputs a JSON summary to stdout with:

  • success: boolean indicating success
  • output_directory: directory containing all results
  • discovered_pairs: number of candidate exposure-outcome pairs found
  • selected_for_mr: number of pairs selected for MR analysis
  • reports: list of paths to generated PDF files (Report.pdf, Introduction.pdf, Conclusion.pdf, etc.)

Example output:

{
  "success": true,
  "mode": "CV",
  "outcome": "back pain",
  "exposure": "osteoarthritis",
  "output_directory": "./output/osteoarthritis_back_pain_gpt-4o",
  "discovered_pairs": 1,
  "selected_for_mr": 1,
  "reports": [
    "./output/osteoarthritis_back_pain_gpt-4o/osteoarthritis_back_pain/Introduction.pdf",
    "./output/osteoarthritis_back_pain_gpt-4o/osteoarthritis_back_pain/osteoarthritis_back_pain/MR_ieu-a-2_ieu-a-1008/Report.pdf",
    "./output/osteoarthritis_back_pain_gpt-4o/osteoarthritis_back_pain/Conclusion.pdf"
  ]
}
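A caller can consume this summary with a few lines of Python. The sketch below assumes that stdout contains only the JSON document (the source does not say whether other log lines are mixed in), and `summarize` is a hypothetical helper name.

```python
import json
from pathlib import PurePosixPath


def summarize(stdout_text):
    """Parse MRAgent's JSON summary and group report paths by file name."""
    summary = json.loads(stdout_text)
    if not summary.get("success"):
        raise RuntimeError("MRAgent reported failure")

    reports_by_name = {}
    for path in summary.get("reports", []):
        # Group e.g. every Report.pdf together, whatever its subdirectory.
        name = PurePosixPath(path).name
        reports_by_name.setdefault(name, []).append(path)
    return summary["output_directory"], reports_by_name


if __name__ == "__main__":
    sample = """
    {
      "success": true,
      "output_directory": "./output/osteoarthritis_back_pain_gpt-4o",
      "reports": [
        "./output/osteoarthritis_back_pain_gpt-4o/osteoarthritis_back_pain/Introduction.pdf",
        "./output/osteoarthritis_back_pain_gpt-4o/osteoarthritis_back_pain/Conclusion.pdf"
      ]
    }
    """
    out_dir, reports = summarize(sample)
    print(out_dir)
    print(sorted(reports))
```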

Workflow Steps

When running all steps (default):

| Step | Description |
|---|---|
| 1 | Crawl PubMed, extract candidate exposure-outcome pairs |
| 2 | Check if each pair already has MR studies in literature |
| 3 | Extract unique terms, expand with medical synonyms |
| 4 | Check which terms have available GWAS data in OpenGWAS |
| 5 | LLM selects most appropriate GWAS IDs for each term |
| 6 | Generate all combinations of exposure-outcome pairs |
| 7 | Check new combinations for existing MR studies |
| 8 | Select final set of novel pairs for analysis |
| 9 | Run MR analysis, generate plots and LLM-interpreted PDF reports |
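A value passed to `--steps` (such as `1,2`) can be validated against these nine steps before launching a run. The sketch below is illustrative only: how the real CLI parses `--steps` is not documented here, and `parse_steps` and the `STEPS` mapping are hypothetical names that paraphrase the workflow steps.

```python
# Short labels for the nine documented workflow steps.
STEPS = {
    1: "Crawl PubMed, extract candidate pairs",
    2: "Check pairs for existing MR studies",
    3: "Extract unique terms, expand synonyms",
    4: "Check GWAS availability in OpenGWAS",
    5: "LLM selects GWAS IDs",
    6: "Generate exposure-outcome combinations",
    7: "Check new combinations for existing MR studies",
    8: "Select final novel pairs",
    9: "Run MR analysis and generate reports",
}


def parse_steps(spec):
    """Turn a comma-separated spec like '1,2' into a sorted list of step numbers."""
    steps = sorted({int(part) for part in spec.split(",") if part.strip()})
    unknown = [s for s in steps if s not in STEPS]
    if unknown:
        raise ValueError(f"unknown step(s): {unknown}")
    return steps


if __name__ == "__main__":
    for step in parse_steps("1,2"):
        print(step, STEPS[step])
```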

Notes

  • This is a computationally intensive process. The full analysis can take tens of minutes to hours depending on the number of pairs.
  • All intermediate results are saved as CSV files in the output directory, allowing manual editing and intervention between steps.
  • MRAgent requires R to be installed with the necessary packages, because the actual MR analysis is performed by R's TwoSampleMR package.

Version History

1 version in total

  • v1.0.2 (current)
    2026-05-03 08:29, marked safe
Security Checks

  • Tencent Cloud Security (Keen): safe, no risk
  • Tencent Cloud Security (Sanbu): suspicious
