Prerequisites
This skill requires hologres-cli to be installed first:
pip install hologres-cli
export HOLOGRES_SKILL=hologres-query-optimizer
All SQL execution and GUC parameter operations depend on hologres-cli commands (hologres sql run, hologres guc show/set).
Hologres Query Execution Plan Analyzer
This skill helps analyze and optimize Hologres SQL query execution plans using EXPLAIN and EXPLAIN ANALYZE commands.
> Version Note: This documentation is based on Hologres V1.3.4x+. Upgrade your instance for better execution plan readability.
Overview
| Command | Description |
|---|
| --------- | ------------- |
EXPLAIN | Shows estimated execution plan from Query Optimizer (QO). Reference only. |
EXPLAIN ANALYZE | Shows actual execution plan with real runtime metrics. Use for optimization. |
Quick Start
-- Estimated plan (no execution)
EXPLAIN SELECT * FROM my_table WHERE id > 100;
-- Actual plan with runtime metrics (executes query)
EXPLAIN ANALYZE SELECT * FROM my_table WHERE id > 100;
Reading EXPLAIN Output
Read execution plans bottom-up. Each arrow (->) represents a node/operator.
| Parameter | Description |
|---|
| ----------- | ------------- |
cost | Estimated cost: startup_cost..total_cost. Parent includes child costs. |
rows | Estimated output rows. rows=1000 indicates missing statistics — run ANALYZE .width | Estimated average output width (bytes). | Reading EXPLAIN ANALYZE OutputEXPLAIN ANALYZE includes four sections: Query Plan, ADVICE, Cost, and Resource. Query Plan MetricsFormat: [dop_in:dop_out id=X dop=N time=max/avg/min rows=total(max/avg/min) mem=max/avg/min open=X get_next=Y] | Metric | Description |
|---|
| -------- | ------------- | dop_in:dop_out | Parallelism ratio (e.g., 21:1 for gather, 21:21 for shuffle) | dop | Actual parallelism degree (matches shard count) | time | Total time = open + get_next (ms). Cumulative from children. | rows | Output rows: total(max/avg/min). Large variance = data skew. | mem | Memory: max/avg/min | open | Initialization time. Hash operators build tables here. | get_next | Data fetch time. Called repeatedly until complete. |
> Important: time is cumulative. Current operator time = current time - child time. ADVICE SectionSystem-generated suggestions: - Missing indexes:
Table xxx misses bitmap index - Missing statistics:
Table xxx Miss Stats! please run 'analyze xxx'; - Data skew:
shuffle data skew! max rows is X, min rows is Y
Cost Breakdown| Metric | Description |
|---|
| -------- | ------------- | | Total cost | Query total time (ms) | | Optimizer cost | QO plan generation time | | Start query cost | Pre-execution init (schema sync, locking) | | Get the first block cost | Time to first record batch | | Get result cost | Time to all results |
Resource ConsumptionFormat: total(max_worker/avg_worker/min_worker) | Metric | Description |
|---|
| -------- | ------------- | | Memory | Total and per-worker memory | | CPU time | Cumulative CPU time across cores | | Physical read bytes | Disk reads (cache miss) | | Read bytes | Total reads (disk + cache) |
Common OperatorsFor detailed operator reference, see references/operators.md. Scan Operators| Operator | Description |
|---|
| ---------- | ------------- | | Seq Scan | Full table scan | | Index Scan using Clustering_index | Column-store index scan | | Index Seek (pk_index) | Row-store primary key scan |
Filter Operators| Operator | Description |
|---|
| ---------- | ------------- | | Filter | No index hit — add indexes | | Segment Filter | Segment key hit | | Cluster Filter | Clustering key hit | | Bitmap Filter | Bitmap index hit |
Data Movement| Operator | Description |
|---|
| ---------- | ------------- | | Local Gather | Merge files within shard | | Gather | Merge shards to final result | | Redistribution | Data shuffle — check distribution_key | | Broadcast | Small table broadcast to all shards |
Join Operators| Operator | Description |
|---|
| ---------- | ------------- | | Hash Join | Hash-based join (ensure small table is hash table) | | Nested Loop | Nested loop join (avoid for large data) | | Cross Join | Optimized non-equi join (V3.0+) |
Aggregation| Operator | Description |
|---|
| ---------- | ------------- | | HashAggregate | Hash-based aggregation | | Partial/Final HashAggregate | Multi-stage aggregation |
Other| Operator | Description |
|---|
| ---------- | ------------- | | Sort | ORDER BY | | Limit | Row limit (check if pushed to scan) | | ExecuteExternalSQL | PQE execution — rewrite for HQE |
Optimization Workflow- Run
EXPLAIN ANALYZE on slow query - Check ADVICE section for immediate fixes
- Identify bottleneck operators (highest time)
- Apply targeted optimizations:
| Issue | Symptom | Solution |
|---|
| ------- | --------- | ---------- | | Missing stats | rows=1000 | ANALYZE | Data shuffle | Redistribution | Fix distribution_key | | Wrong hash table | Large table as hash | Update statistics | | No index | Filter only | Add clustering/bitmap index | | PQE execution | ExecuteExternalSQL | Rewrite to HQE functions | | Data skew | Large max/min variance | Review distribution |
Key GUC Parameters-- Multi-stage aggregation
SET optimizer_force_multistage_agg = on;
-- Join order control (for complex multi-table joins)
SET optimizer_join_order = 'query'; -- Follow SQL order
SET optimizer_join_order = 'greedy'; -- Greedy algorithm
-- Disable Cross Join
SET hg_experimental_enable_cross_join_rewrite = off;
To persist these settings at database level, use the CLI: hologres guc set optimizer_force_multistage_agg on
hologres guc set optimizer_join_order query
Best Practices- Always use
EXPLAIN ANALYZE for production analysis - Run
ANALYZE after significant data changes - Design
distribution_key based on JOIN/GROUP BY patterns - Set
clustering_key for range query columns - Use bitmap indexes for low-cardinality filters
- Ensure small table is hash table in joins
- Avoid non-equi joins when possible
- Rewrite PQE functions to HQE alternatives
Reference Links
版本历史
共 1 个版本
-
v0.2.0
当前
2026-05-21 15:47 安全 安全
安全检测
腾讯云安全 (Sanbu)
安全,无风险
查看报告
🔗 相关推荐
wenbingyu 使用动态表和 RoaringBitmap 在 Hologres 中进行 UV/PV 实时去重计算,适用于构建增量 UV/PV 流水线,Roaring...
★ 0
📥 252
wenbingyu Hologres BSI位切片索引画像分析技能,用于用户画像与标签计算,涵盖BSI表设计、数据导入、属性+行为标签联合人群圈选、GMV分析、标签分布统计、TopK查询及分桶并行计算等场景。
★ 0
📥 274
wenbingyu Hologres慢查询日志分析与诊断技能:用于分析慢查询、失败查询、查询性能诊断及日志管理(阿里云)。
★ 0
📥 247
|
|