← 返回
未分类 中文

Phy Sql Explainer

Analyze PostgreSQL, MySQL, or SQLite EXPLAIN ANALYZE output to identify bottlenecks, stale stats, missing indexes, N+1 issues, and suggest fixes.
分析 PostgreSQL、MySQL 或 SQLite 的 EXPLAIN ANALYZE 输出,定位瓶颈、陈旧统计、缺失索引、N+1 问题并提供修复建议。
phy041
未分类 clawhub v1.0.3 1 版本 100000 Key: 无需
★ 0
Stars
📥 167
下载
💾 0
安装
1
版本
#latest

概述

SQL Explainer

Diagnose slow SQL queries in seconds. Paste your EXPLAIN ANALYZE output and get a plain-English breakdown: which nodes are killing performance, why estimates are off, and exactly which indexes or rewrites will fix it.

Works with PostgreSQL, MySQL, SQLite. No API keys. No config.


Trigger Phrases

  • "slow query", "why is my query slow", "explain this query"
  • "explain analyze", "query plan", "execution plan"
  • "optimize sql", "add index", "missing index"
  • "query taking too long", "database slow"
  • "/sql-explainer"

How to Provide Input

Give the agent any combination of:

# Option 1: Just paste EXPLAIN ANALYZE output
/sql-explainer
[paste EXPLAIN ANALYZE output here]

# Option 2: Query + EXPLAIN output
SELECT * FROM orders WHERE user_id = 123 AND status = 'pending'

EXPLAIN ANALYZE output:
[paste here]

# Option 3: Just the query (agent will run EXPLAIN ANALYZE if DB is accessible)
/sql-explainer SELECT * FROM orders WHERE user_id = 123

# Option 4: Include table schema for better index suggestions
Table: orders (user_id INT, status VARCHAR, created_at TIMESTAMPTZ, total DECIMAL)
Indexes: PRIMARY KEY (id), INDEX (created_at)
Query: SELECT * FROM orders WHERE user_id = 123 AND status = 'pending'

Step 1: Run EXPLAIN ANALYZE

If the user provides a query but no EXPLAIN output, run it:

PostgreSQL

EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT * FROM orders WHERE user_id = 123 AND status = 'pending';

MySQL

EXPLAIN FORMAT=JSON
SELECT * FROM orders WHERE user_id = 123 AND status = 'pending';

-- Or for runtime stats:
EXPLAIN ANALYZE
SELECT * FROM orders WHERE user_id = 123 AND status = 'pending';

SQLite

EXPLAIN QUERY PLAN
SELECT * FROM orders WHERE user_id = 123 AND status = 'pending';

Step 2: Parse the Plan Tree

Read the EXPLAIN output as a tree. For each node, extract:

FieldWhat to Look For
------------------------
Node TypeSeq Scan = no index used; Index Scan = good; Bitmap Heap Scan = OK for bulk
Cost(cost=startup..total) — total cost is the bottleneck metric
Rowsrows=N (estimated) vs actual rows=M — big gap = stale statistics
LoopsNested Loop with loops=1000 = N+1 problem
Buffersshared read=N = disk reads; shared hit=N = cache; high read = cold cache or missing index
Timeactual time=start..end — where is the wall clock going?

Critical Signals to Flag

🔴 Seq Scan on large table (> 10K rows estimated)
   → Table is being scanned fully — missing index

🔴 Rows estimated << actual rows (off by 10x+)
   → Stale statistics → run ANALYZE tablename
   → Correlated columns not captured by statistics

🔴 Nested Loop with high loops count (> 100)
   → N+1 query pattern — join condition may be missing index on inner side

🔴 Hash Join with high hash batches (> 1)
   → Spilling to disk — work_mem too low, or query needs refactoring

🟠 Sort with "Sort Method: external merge Disk"
   → Sort spilled to disk → increase work_mem or add covering index

🟠 Filter with high "rows removed by filter" ratio (> 80%)
   → Index exists but predicate not selective enough — consider partial index

🟡 Index Scan with many "heap fetches"
   → Index-Only Scan would be faster — add covering index

Step 3: Identify Root Cause

After reading the tree, classify the problem:

Category A: Missing Index

Symptoms:

  • Seq Scan on a large table
  • Filter condition in WHERE clause not covered by any index

Diagnosis:

-- Check what indexes exist on the table
\d tablename  -- PostgreSQL
SHOW INDEXES FROM tablename;  -- MySQL

Typical patterns:

Query PatternRecommended Index
---------------------------------
WHERE col = valueCREATE INDEX ON t(col)
WHERE col1 = v AND col2 = vComposite: CREATE INDEX ON t(col1, col2)
WHERE col = v ORDER BY created_atCREATE INDEX ON t(col, created_at)
WHERE status IN ('a','b') AND user_id = vCREATE INDEX ON t(user_id, status)
WHERE email LIKE 'prefix%'B-tree works; WHERE email LIKE '%suffix' needs pg_trgm GIN
WHERE body @@ to_tsquery('word')GIN: CREATE INDEX ON t USING gin(to_tsvector('english', body))
WHERE deleted_at IS NULL (most rows)Partial: CREATE INDEX ON t(user_id) WHERE deleted_at IS NULL

Category B: Stale Statistics

Symptoms:

  • Planner estimates rows=5 but actual is rows=50000
  • Plan looks like it should be fast but isn't

Fix:

ANALYZE tablename;  -- PostgreSQL (updates statistics)
ANALYZE TABLE tablename;  -- MySQL

For correlated columns:

-- PostgreSQL 14+ extended statistics
CREATE STATISTICS s1 ON col1, col2 FROM tablename;
ANALYZE tablename;

Category C: N+1 / Nested Loop

Symptoms:

  • Nested Loop with loops=1000+
  • Hash Join or inner query repeated many times
  • Code fetches one row, then queries for related data in a loop

Fix:

  • Add index on the foreign key / join column in the inner table
  • Rewrite to use a single JOIN instead of a loop:
-- Before (N+1): fetch users, then query orders for each
SELECT * FROM users WHERE id = $1;
-- × N times:
SELECT * FROM orders WHERE user_id = $1;

-- After: single JOIN
SELECT u.*, o.*
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.id = $1;

Category D: Query Rewrite Needed

Symptoms:

  • IN (SELECT ...) subquery producing full scan
  • OR condition preventing index use
  • SELECT * fetching unnecessary columns (prevents Index-Only Scan)

Common rewrites:

-- ❌ Slow: correlated subquery
SELECT * FROM orders
WHERE user_id IN (SELECT id FROM users WHERE country = 'US');

-- ✅ Fast: JOIN
SELECT o.* FROM orders o
JOIN users u ON u.id = o.user_id
WHERE u.country = 'US';

-- ❌ Slow: OR prevents index use
SELECT * FROM events WHERE type = 'click' OR type = 'view';

-- ✅ Fast: UNION ALL
SELECT * FROM events WHERE type = 'click'
UNION ALL
SELECT * FROM events WHERE type = 'view';

-- ❌ Slow: function on indexed column
SELECT * FROM orders WHERE DATE(created_at) = '2026-03-18';

-- ✅ Fast: range scan uses index
SELECT * FROM orders
WHERE created_at >= '2026-03-18' AND created_at < '2026-03-19';

-- ❌ Slow: LIKE with leading wildcard
SELECT * FROM products WHERE name LIKE '%widget%';

-- ✅ Fast: full-text search
SELECT * FROM products WHERE name_vector @@ to_tsquery('widget');
-- Requires: CREATE INDEX ON products USING gin(to_tsvector('english', name));

Category E: Configuration Limits

Symptoms:

  • Sort Method: external merge Disk
  • Hash Batches: 4 (spilling to disk)
  • Large aggregations are slow

Fixes (PostgreSQL):

-- Increase per-query memory (default 4MB is often too low)
SET work_mem = '64MB';  -- session level
-- In postgresql.conf for permanent change:
-- work_mem = 64MB

-- Check current value
SHOW work_mem;

Step 4: Output Report

Always produce this report structure:

## SQL Explainer Report
Query: [first 60 chars of query...]
Database: PostgreSQL / MySQL / SQLite
Execution Time: [from EXPLAIN ANALYZE, if available]

### Diagnosis

| Severity | Node | Problem | Impact |
|----------|------|---------|--------|
| 🔴 Critical | Seq Scan on orders (50K rows) | No index on (user_id, status) | 2.1s → <5ms |
| 🟠 High | Stale statistics | Planner estimated 3 rows, got 48K | Wrong plan chosen |
| 🟡 Medium | SELECT * | 42 columns fetched, 3 used | Prevents Index-Only Scan |

---

### Root Cause

[1-2 sentence plain-English explanation of why the query is slow]

---

### Fix Plan (Prioritized)

**Fix 1 — Add composite index (estimated: 2.1s → <5ms)**

CREATE INDEX CONCURRENTLY idx_orders_user_status

ON orders(user_id, status)

WHERE deleted_at IS NULL; -- partial index if most rows are soft-deleted

Why: Eliminates Seq Scan. CONCURRENTLY means no table lock in production.

**Fix 2 — Update statistics**

ANALYZE orders;

Why: Planner estimated 3 rows, got 48K. Wrong estimate → wrong plan.

**Fix 3 — Select only needed columns**

-- Before:

SELECT * FROM orders WHERE user_id = 123 AND status = 'pending';

-- After (enables Index-Only Scan):

SELECT id, total, created_at FROM orders WHERE user_id = 123 AND status = 'pending';


---

### Expected After Fix

| Metric | Before | After |
|--------|--------|-------|
| Execution time | 2.1s | <5ms |
| Node type | Seq Scan | Index Only Scan |
| Rows scanned | 50,000 | 3 |
| Disk reads | 412 | 1 |

---

### Quick Copy-Paste Fix

-- Run these in order:

CREATE INDEX CONCURRENTLY idx_orders_user_status ON orders(user_id, status) WHERE deleted_at IS NULL;

ANALYZE orders;


Quick Mode

If user just wants fast feedback:

Quick Check: [query first 40 chars...]

🔴 Missing index on orders(user_id, status) — Seq Scan detected
🟠 Stale stats — run ANALYZE orders
🟡 SELECT * fetching 42 cols

Fix: CREATE INDEX ON orders(user_id, status);
Full analysis: /sql-explainer --full [paste EXPLAIN ANALYZE]

Index Type Reference

Use CaseIndex TypeSyntax
------------------------------
Equality, range, ORDER BYB-tree (default)CREATE INDEX ON t(col)
JSON, arrays, full-textGINCREATE INDEX ON t USING gin(col)
Geometric, PostGISGiSTCREATE INDEX ON t USING gist(col)
Exclude overlapping rangesBRINCREATE INDEX ON t USING brin(col) (huge tables only)
Many repeated values (few distinct)Don't index; use partial index instead
Only index subset of rowsPartialCREATE INDEX ON t(col) WHERE condition
Avoid heap fetch entirelyCoveringCREATE INDEX ON t(a) INCLUDE (b, c)

What EXPLAIN ANALYZE Nodes Mean

NodeMeaningRed Flag?
--------------------------
Seq ScanFull table scan✅ On tables > 10K rows
Index ScanUsed an index, then fetched heap rowsUsually fine
Index Only ScanUsed index, no heap fetch neededBest case
Bitmap Index Scan + Bitmap Heap ScanMultiple index ranges mergedOK for bulk
Nested LoopFor each outer row, scan inner✅ if loops >> 10
Hash JoinBuild hash table of smaller side✅ if Hash Batches > 1
Merge JoinBoth sides pre-sortedFine if Sort cost is low
SortExplicit sort step✅ if external merge Disk
AggregateGROUP BY / COUNT etc.Fine usually
LimitStops earlyWatch startup cost

Why This Doesn't Exist Elsewhere

Most database GUIs (pgAdmin, TablePlus, DBeaver) show you the EXPLAIN output as a tree or graph — they visualize it. But they don't tell you what to do about it.

This skill reads the plan like a senior DBA would: identifies the specific bottleneck, explains why it's happening (stale stats, missing index, wrong join type), and gives you the exact SQL to fix it — optimized for your specific query pattern, not a generic "add an index" suggestion.


Author

Canlah AI — Run performance marketing without breaking your brand.

版本历史

共 1 个版本

  • v1.0.3 当前
    2026-05-23 23:40

安全检测

腾讯云安全 (Keen)

队列中

腾讯云安全 (Sanbu)

队列中

🔗 相关推荐

developer-tools

Twitter Scrape

phy041
Scrape Twitter profiles and tweets via GraphQL, export to JSON or database
★ 0 📥 1,034
ai-intelligence

Phy Lenny Mentor

phy041
由300+期Lenny播客驱动的AI产品导师。提炼Brian Chesky、Shreyas Doshi、April Dunford等世界级领袖的智慧。T...
★ 1 📥 591
content-creation

Phy Prd Writer

phy041
多轮PRD(产品需求文档)创建助手,逐步提出澄清问题以收集需求,随后生成完整、结构化的PRD。
★ 2 📥 567