gamma-phase-associator

An overview of the python package for running the GaMMA earthquake phase association algorithm. The algorithm expects phase picks data and station data as in...
Author: wu-uk
Uncategorized · clawhub · v0.1.0 · 1 version · Key: not required

GaMMA Associator Library

What is GaMMA?

GaMMA is an earthquake phase association algorithm that treats association as an unsupervised clustering problem. It models the phase picks of each event with a multivariate Gaussian distribution and uses Expectation-Maximization to assign picks to events and estimate source parameters, i.e., earthquake location, origin time, and magnitude.
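
To make the clustering idea concrete, here is a minimal sketch of Expectation-Maximization on a one-dimensional, two-component Gaussian mixture over candidate origin times. It is not GaMMA's implementation: the real model is multivariate (time and amplitude), Bayesian, and also solves for location. The function name `em_two_gaussians` and the sample times are invented for illustration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_two_gaussians(times, n_iter=50):
    """Fit a 2-component 1-D Gaussian mixture to `times` with plain EM."""
    # Initialise the two means at the extremes of the data (illustrative choice).
    means = [min(times), max(times)]
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: responsibility of each component for each pick.
        resp = []
        for t in times:
            p = [w * gaussian_pdf(t, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weight, mean and variance of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(times)
            means[k] = sum(r[k] * t for r, t in zip(resp, times)) / nk
            var = sum(r[k] * (t - means[k]) ** 2 for r, t in zip(resp, times)) / nk
            variances[k] = max(var, 1e-3)  # floor keeps the density finite
    # Hard assignment: each pick goes to its most responsible component.
    assignments = [0 if r[0] >= r[1] else 1 for r in resp]
    return means, assignments


# Hypothetical origin-time estimates (seconds) from two separate events.
times = [10.1, 9.8, 10.3, 10.0, 42.2, 41.9, 42.0, 42.4]
means, assignments = em_two_gaussians(times)
```

Each component plays the role of one candidate event: the E-step is the pick assignment and the M-step is the source-parameter update.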

GaMMA is also the name of the Python library that implements the algorithm. The library does not pick phases from waveforms; it assumes P- and S-wave picks have already been extracted. This document describes its core API.

Zhu, W., McBrearty, I. W., Mousavi, S. M., Ellsworth, W. L., & Beroza, G. C. (2022). Earthquake phase association using a Bayesian Gaussian mixture model. Journal of Geophysical Research: Solid Earth, 127(5).

This skill is derived from the repository https://github.com/AI4EPS/GaMMA

Installing GaMMA

pip install git+https://github.com/wayneweiqiang/GaMMA.git

GaMMA core API

association

Function Signature

def association(picks, stations, config, event_idx0=0, method="BGMM", **kwargs)

Purpose

Associates seismic phase picks (P and S waves) to earthquake events using Bayesian or standard Gaussian Mixture Models. It clusters picks based on arrival time and amplitude information, then fits GMMs to estimate earthquake locations, times, and magnitudes.

1. Input Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| picks | DataFrame | required | Seismic phase pick data |
| stations | DataFrame | required | Station metadata with locations |
| config | dict | required | Configuration parameters |
| event_idx0 | int | 0 | Starting event index for numbering |
| method | str | "BGMM" | "BGMM" (Bayesian) or "GMM" (standard) |

2. Required DataFrame Columns

picks DataFrame

| Column | Type | Description | Example |
|---|---|---|---|
| id | str | Station identifier (must match stations) | network.station. or network.station.location.channel |
| timestamp | datetime/str | Pick arrival time (ISO format or datetime) | "2019-07-04T22:00:06.084" |
| type | str | Phase type: "p" or "s" (lowercase) | "p" |
| prob | float | Pick probability/weight (0-1) | 0.94 |
| amp | float | Amplitude in m/s (required if use_amplitude=True) | 0.000017 |

Notes:

  • Timestamps must be in UTC or converted to UTC
  • Phase types are forced to lowercase internally
  • Picks with amp == 0 or amp == -1 are filtered when use_amplitude=True
  • The DataFrame index is used to track pick identities in the output
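
The notes above can be applied when preparing the input. This is a sketch with made-up pick values; the library lowercases phase types and filters amplitudes itself, so doing it beforehand mainly makes the input easier to inspect.

```python
import pandas as pd

# Hypothetical picks; the column names follow the table above.
picks = pd.DataFrame(
    {
        "id": ["CI.CCC..BH", "CI.CCC..BH", "CI.CLC..BH"],
        "timestamp": [
            "2019-07-04T22:00:06.084",
            "2019-07-04T22:00:10.512",
            "2019-07-04T22:00:07.300",
        ],
        "type": ["P", "S", "p"],
        "prob": [0.94, 0.81, 0.88],
        "amp": [0.000017, 0.000032, -1.0],
    }
)

# Interpret timestamps as UTC, then drop the timezone to get naive UTC datetimes.
picks["timestamp"] = pd.to_datetime(picks["timestamp"], utc=True).dt.tz_convert(None)

# Phase types are expected in lowercase.
picks["type"] = picks["type"].str.lower()

# With use_amplitude=True, picks with amp == 0 or amp == -1 carry no usable amplitude.
use_amplitude = True
if use_amplitude:
    picks = picks[(picks["amp"] != 0) & (picks["amp"] != -1)]
```

The filter keeps the original index labels, which matters because the index is what identifies each pick in the output.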

stations DataFrame

| Column | Type | Description | Example |
|---|---|---|---|
| id | str | Station identifier | "CI.CCC..BH" |
| x(km) | float | X coordinate in km (projected) | -35.6 |
| y(km) | float | Y coordinate in km (projected) | 45.2 |
| z(km) | float | Z coordinate (elevation, typically negative) | -0.67 |

Notes:

  • Coordinates should be in a projected local coordinate system (e.g., you can use the pyproj package)
  • The id column must match the id values in the picks DataFrame (e.g., network.station. or network.station.location.channel)
  • Group stations by unique id: collapse identical attributes to a single value and preserve conflicting metadata as a sorted list.
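
The grouping rule in the last note can be sketched in plain Python as follows. The helper `group_stations` and the `component` field are hypothetical, chosen only to show one attribute that conflicts across rows sharing an id.

```python
from collections import defaultdict


def group_stations(records):
    """Group station records by id.

    Attributes with a single distinct value collapse to that value;
    attributes with conflicting values are kept as a sorted list.
    """
    grouped = defaultdict(lambda: defaultdict(set))
    for rec in records:
        for key, value in rec.items():
            if key != "id":
                grouped[rec["id"]][key].add(value)

    result = []
    for station_id, attrs in grouped.items():
        row = {"id": station_id}
        for key, values in attrs.items():
            ordered = sorted(values)
            row[key] = ordered[0] if len(ordered) == 1 else ordered
        result.append(row)
    return result


# Hypothetical per-channel rows that share a station id.
records = [
    {"id": "CI.CCC..BH", "x(km)": -35.6, "y(km)": 45.2, "z(km)": -0.67, "component": "E"},
    {"id": "CI.CCC..BH", "x(km)": -35.6, "y(km)": 45.2, "z(km)": -0.67, "component": "N"},
    {"id": "CI.CCC..BH", "x(km)": -35.6, "y(km)": 45.2, "z(km)": -0.67, "component": "Z"},
    {"id": "CI.CLC..BH", "x(km)": 12.3, "y(km)": -8.4, "z(km)": -0.77, "component": "Z"},
]
stations = group_stations(records)
```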

3. Config Dictionary Keys

Required Keys

| Key | Type | Description | Example |
|---|---|---|---|
| dims | list[str] | Location dimensions to solve for | ["x(km)", "y(km)", "z(km)"] |
| min_picks_per_eq | int | Minimum picks required per earthquake | 5 |
| max_sigma11 | float | Maximum allowed time residual in seconds | 2.0 |
| use_amplitude | bool | Whether to use amplitude in clustering | True |
| bfgs_bounds | tuple | Bounds for BFGS optimization | ((-35, 92), (-128, 78), (0, 21), (None, None)) |
| oversample_factor | float | Factor for oversampling initial GMM components | 5.0 for BGMM, 1.0 for GMM |

Notes on dims:

  • Options: ["x(km)", "y(km)", "z(km)"], ["x(km)", "y(km)"], or ["x(km)"]

Notes on bfgs_bounds:

  • Format: ((x_min, x_max), (y_min, y_max), (z_min, z_max), (None, None))
  • The last tuple is for time (unbounded)
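
One way to build bounds in this format is to pad the station extent by a margin. The sketch below is an assumption about workflow, not part of the library: `make_bfgs_bounds`, the 1 km margin, and the 0 to 20 km depth range are all illustrative.

```python
def make_bfgs_bounds(xs, ys, z_range=(0.0, 20.0), margin=1.0):
    """Return ((x_min, x_max), (y_min, y_max), (z_min, z_max), (None, None)).

    xs, ys: station coordinates in km. The final pair is for time and is left unbounded.
    """
    x_bounds = (min(xs) - margin, max(xs) + margin)
    y_bounds = (min(ys) - margin, max(ys) + margin)
    z_bounds = (z_range[0], z_range[1] + margin)
    return (x_bounds, y_bounds, z_bounds, (None, None))


# Hypothetical station coordinates (km) in the projected system.
bounds = make_bfgs_bounds(xs=[-34.0, 91.0, 10.0], ys=[-127.0, 77.0, 0.0])
```

With these inputs the result is `((-35.0, 92.0), (-128.0, 78.0), (0.0, 21.0), (None, None))`, matching the example in the table of required keys.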

Velocity Model Keys

| Key | Type | Default | Description |
|---|---|---|---|
| vel | dict | {"p": 6.0, "s": 3.47} | Uniform velocity model (km/s) |
| eikonal | dict/None | None | 1D velocity model for travel times |
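
Under a uniform velocity model, the predicted travel time is the straight-line distance divided by the phase velocity. The snippet below sketches that relation with the default `vel` values; `travel_time` is a hypothetical helper, not a library function, and the coordinates are made up.

```python
import math

vel = {"p": 6.0, "s": 3.47}  # km/s, the documented defaults


def travel_time(event_xyz, station_xyz, phase, vel=vel):
    """Straight-ray travel time in seconds for phase 'p' or 's'."""
    distance_km = math.dist(event_xyz, station_xyz)
    return distance_km / vel[phase]


# Hypothetical event and station, 50 km apart at the same z.
event = (0.0, 0.0, 10.0)
station = (30.0, 40.0, 10.0)

tp = travel_time(event, station, "p")  # 50 / 6.0
ts = travel_time(event, station, "s")  # 50 / 3.47
```

The S arrival is later than the P arrival at every station, and the S-P gap grows with distance.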

DBSCAN Pre-clustering Keys (Optional)

| Key | Type | Default | Description |
|---|---|---|---|
| use_dbscan | bool | True | Enable DBSCAN pre-clustering |
| dbscan_eps | float | 25 | Max time between picks (seconds) |
| dbscan_min_samples | int | 3 | Min samples in DBSCAN neighborhood |
| dbscan_min_cluster_size | int | 500 | Min cluster size for hierarchical splitting |
| dbscan_max_time_space_ratio | float | 10 | Max time/space ratio for splitting |

  • dbscan_eps can be obtained from the estimate_eps function

Filtering Keys (Optional)

| Key | Type | Default | Description |
|---|---|---|---|
| max_sigma22 | float | 1.0 | Max phase amplitude residual in log scale (required if use_amplitude=True) |
| max_sigma12 | float | 1.0 | Max covariance |
| max_sigma11 | float | 2.0 | Max phase time residual (s) |
| min_p_picks_per_eq | int | 0 | Min P-phase picks per event |
| min_s_picks_per_eq | int | 0 | Min S-phase picks per event |
| min_stations | int | 5 | Min unique stations per event |

Other Optional Keys

| Key | Type | Default | Description |
|---|---|---|---|
| covariance_prior | list[float] | auto | Prior for covariance [time, amp] |
| ncpu | int | auto | Number of CPUs for parallel processing |
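
Putting the tables together, a complete config for a BGMM run might look like the sketch below. The values are the documented examples and defaults, except `dbscan_eps`, which is a manual choice here. The helper `check_config` is hypothetical, and its bounds check assumes one `(min, max)` pair per entry in `dims` plus one for time.

```python
config = {
    # Required keys
    "dims": ["x(km)", "y(km)", "z(km)"],
    "min_picks_per_eq": 5,
    "max_sigma11": 2.0,
    "use_amplitude": True,
    "bfgs_bounds": ((-35, 92), (-128, 78), (0, 21), (None, None)),
    "oversample_factor": 5.0,  # 5.0 for BGMM, 1.0 for GMM
    # Velocity model
    "vel": {"p": 6.0, "s": 3.47},
    # DBSCAN pre-clustering
    "use_dbscan": True,
    "dbscan_eps": 15,  # seconds; manual value instead of estimate_eps
    "dbscan_min_samples": 3,
    "dbscan_min_cluster_size": 500,
    "dbscan_max_time_space_ratio": 10,
    # Filtering
    "max_sigma22": 1.0,
    "max_sigma12": 1.0,
    "min_p_picks_per_eq": 0,
    "min_s_picks_per_eq": 0,
    "min_stations": 5,
}

REQUIRED_KEYS = [
    "dims",
    "min_picks_per_eq",
    "max_sigma11",
    "use_amplitude",
    "bfgs_bounds",
    "oversample_factor",
]


def check_config(config):
    """Return a list of problems; an empty list means the basic checks passed."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in config]
    if not problems and len(config["bfgs_bounds"]) != len(config["dims"]) + 1:
        problems.append("bfgs_bounds needs one (min, max) pair per dim plus one for time")
    return problems


problems = check_config(config)
```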

4. Return Values

Returns a tuple (events, assignments):

events (list[dict])

List of dictionaries, each representing an associated earthquake:

| Key | Type | Description |
|---|---|---|
| time | str | Origin time (ISO 8601 with milliseconds) |
| magnitude | float | Estimated magnitude (999 if use_amplitude=False) |
| sigma_time | float | Time uncertainty (seconds) |
| sigma_amp | float | Amplitude uncertainty (log10 scale) |
| cov_time_amp | float | Time-amplitude covariance |
| gamma_score | float | Association quality score |
| num_picks | int | Total picks assigned |
| num_p_picks | int | P-phase picks assigned |
| num_s_picks | int | S-phase picks assigned |
| event_index | int | Unique event index |
| x(km) | float | X coordinate of hypocenter |
| y(km) | float | Y coordinate of hypocenter |
| z(km) | float | Z coordinate (depth) |

assignments (list[tuple])

List of tuples (pick_index, event_index, gamma_score):

  • pick_index: Index in the original picks DataFrame
  • event_index: Associated event index
  • gamma_score: Probability/confidence of assignment
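
A common next step is to turn both return values into DataFrames and attach the event index to each pick. The sketch below uses hand-written stand-ins shaped like the documented output rather than a real `association` call; marking unassociated picks with -1 is a convention chosen here, not something the library mandates.

```python
import pandas as pd

# Stand-ins shaped like the documented return values (values are made up).
events = [
    {
        "time": "2019-07-04T22:00:01.250",
        "magnitude": 2.3,
        "num_picks": 3,
        "event_index": 0,
        "x(km)": 1.2,
        "y(km)": -3.4,
        "z(km)": 8.0,
    }
]
assignments = [(0, 0, 0.97), (1, 0, 0.91), (3, 0, 0.88)]

# Hypothetical picks with the default 0..3 index.
picks = pd.DataFrame(
    {
        "id": ["CI.CCC..BH", "CI.CCC..BH", "CI.CLC..BH", "CI.CLC..BH"],
        "type": ["p", "s", "p", "s"],
    }
)

events_df = pd.DataFrame(events)
assignments_df = pd.DataFrame(
    assignments, columns=["pick_index", "event_index", "gamma_score"]
)

# pick_index refers to the picks index, so join on it.
picks = picks.join(assignments_df.set_index("pick_index"))

# Picks that were not assigned to any event get event_index -1.
picks["event_index"] = picks["event_index"].fillna(-1).astype(int)
```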

estimate_eps Function Documentation

Function Signature

def estimate_eps(stations, vp, sigma=2.0)

Purpose

Estimates an appropriate DBSCAN epsilon (eps) parameter for clustering seismic phase picks based on station spacing. The eps parameter controls the maximum time distance between picks that should be considered neighbors in the DBSCAN clustering algorithm.

1. Input Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| stations | DataFrame | required | Station metadata with 3D coordinates |
| vp | float | required | P-wave velocity in km/s |
| sigma | float | 2.0 | Number of standard deviations above the mean |

2. Required DataFrame Columns

stations DataFrame

| Column | Type | Description | Example |
|---|---|---|---|
| x(km) | float | X coordinate in km | -35.6 |
| y(km) | float | Y coordinate in km | 45.2 |
| z(km) | float | Z coordinate in km | -0.67 |

3. Return Value

| Type | Description |
|---|---|
| float | Epsilon value in seconds for use with DBSCAN clustering |
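
To give a feel for how station spacing can translate into a time threshold, here is one plausible reading of the parameters: take each station's nearest-neighbor distance, use mean plus `sigma` standard deviations as a characteristic spacing, and divide by `vp`. This is an illustrative assumption and not the library's implementation, which may compute spacing differently; `estimate_eps_sketch` is a hypothetical name.

```python
import math
import statistics


def estimate_eps_sketch(coords, vp, sigma=2.0):
    """Illustrative eps (seconds) from nearest-neighbor station spacing.

    coords: list of (x, y, z) station positions in km.
    vp: P-wave velocity in km/s.
    """
    nearest = []
    for i, a in enumerate(coords):
        # Distance from station i to its closest other station.
        nearest.append(min(math.dist(a, b) for j, b in enumerate(coords) if j != i))
    spacing_km = statistics.mean(nearest) + sigma * statistics.pstdev(nearest)
    return spacing_km / vp


# Four hypothetical stations on the corners of a 30 km x 40 km rectangle.
coords = [(0.0, 0.0, 0.0), (30.0, 0.0, 0.0), (0.0, 40.0, 0.0), (30.0, 40.0, 0.0)]
eps = estimate_eps_sketch(coords, vp=6.0)
```

Every station here has its nearest neighbor 30 km away, so the spread is zero and the sketch returns 30 / 6.0 = 5 seconds.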

4. Example Usage

from gamma.utils import estimate_eps

# Assuming stations DataFrame is already prepared with x(km), y(km), z(km) columns
vp = 6.0  # P-wave velocity in km/s

# Estimate eps automatically based on station spacing
eps = estimate_eps(stations, vp, sigma=2.0)

# Use in config
config = {
    "use_dbscan": True,
    "dbscan_eps": eps,  # or use estimate_eps(stations, config["vel"]["p"])
    "dbscan_min_samples": 3,
    # ... other config options
}

Typical Usage Pattern

from gamma.utils import association, estimate_eps

# Automatic eps estimation
config["dbscan_eps"] = estimate_eps(stations, config["vel"]["p"])

# Or manual override (common in practice)
config["dbscan_eps"] = 15  # seconds

5. Practical Notes

  • In example notebooks, the function is often commented out in favor of hardcoded values (10-15 seconds)
  • Practitioners may prefer manual tuning for specific networks/regions
  • Typical output values range from 10-20 seconds depending on station density
  • Useful when optimal eps is unknown or when working with new networks

6. Related Configuration

The output is typically used with these config parameters:

config["dbscan_eps"] = estimate_eps(stations, config["vel"]["p"])
config["dbscan_min_samples"] = 3
config["dbscan_min_cluster_size"] = 500
config["dbscan_max_time_space_ratio"] = 10

Version history

1 version in total

  • v0.1.0 (current)
    2026-05-07 19:45