Overview

OpenSearch Vector Search Expert

> GitHub: norrishuang/opensearch-vector-search-skill
>
> Issues, PRs, and new reference contributions are welcome!

Safety Notes

- Pricing script (`scripts/get_opensearch_pricing.py`): Makes outbound HTTPS requests to the AWS Pricing API (pricing.us-east-1.amazonaws.com). Requires boto3 and valid AWS credentials. The script is read-only (it fetches public pricing data) and does not modify any AWS resources. Only run it when the user explicitly requests cost estimation.
- Reference examples: Code snippets in `references/` contain example API calls to localhost:9200 (standard OpenSearch endpoint). These are documentation examples only. Do NOT execute them automatically. Present them to the user as configuration references.
- Cluster analyzer (`scripts/analyze_cluster.py`): Connects to a user-provided OpenSearch cluster and performs read-only analysis. It NEVER creates, modifies, or deletes any indices or data. Only run it when the user explicitly provides cluster credentials (URL + username/password).

Knowledge Base Structure

Read the corresponding reference file based on the question type:

| Question Type | Reference File | Keywords |
|---------------|----------------|----------|
| Vector search, k-NN, HNSW, disk mode | `references/vector-search.md` | vector, knn, hnsw, warmup, disk mode, on_disk |
| Quantization techniques | `references/quantization-techniques.md` | quantization, compression, binary, byte, fp16, product quantization |
| Cost optimization, instance sizing, memory calc | `references/cost-optimization.md` | cost, pricing, instance, memory calculation, cluster sizing, budget |
| Cluster tuning, JVM, thread pools | `references/cluster-tuning.md` | JVM, heap, thread pool, node role, shard allocation |
| Performance benchmarks, dataset sizing | `references/performance-benchmarks.md` | benchmark, QPS, latency, recall, dataset size |
| Indexing strategies, mapping | `references/indexing-strategies.md` | index, mapping, shard, replica, lifecycle |
| Query optimization | `references/query-optimization.md` | query, filter, aggregation, cache, pagination |
| Optimized instances (OR1/OR2/OM2/OI2) | `references/optimized-instances.md` | optimized, OR1, OR2, OM2, OI2, S3 durability, indexing throughput |
| Live cluster analysis | `scripts/analyze_cluster.py` | analyze cluster, connect, diagnose, review config, health check |

Core Workflows

1. Answering Vector Search Configuration Questions

1. Read `references/vector-search.md`
2. Recommend in-memory mode or disk mode based on the user's scenario (latency requirements, data scale, QPS)
3. Provide a specific mapping JSON configuration
4. Recommend FAISS engine + cosine similarity + 7/8 series instances
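A mapping of the kind described in this workflow could be generated as in the sketch below. It is a minimal illustration, not the content of `references/vector-search.md`: the field name `embedding`, the dimension 768, and the helper `build_knn_mapping` are hypothetical, and the HNSW values are the defaults listed under Recommended Defaults. Whether the `faiss` engine accepts `cosinesimil` depends on the OpenSearch version, so verify this against the target cluster. The function only builds a dictionary for the user to review and does not send any request.

```python
import json


def build_knn_mapping(field: str, dimension: int,
                      m: int = 16, ef_construction: int = 512) -> dict:
    """Build an index body with one knn_vector field (FAISS + HNSW).

    `field` and `dimension` are placeholders supplied by the caller;
    m and ef_construction default to the values recommended in this document.
    """
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                field: {
                    "type": "knn_vector",
                    "dimension": dimension,
                    "method": {
                        "name": "hnsw",
                        "engine": "faiss",
                        # Assumption: the cluster version supports cosine with faiss.
                        "space_type": "cosinesimil",
                        "parameters": {"m": m, "ef_construction": ef_construction},
                    },
                }
            }
        },
    }


if __name__ == "__main__":
    # Print the JSON so it can be shown to the user as a configuration reference.
    print(json.dumps(build_knn_mapping("embedding", 768), indent=2))
```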

2. Capacity Planning & Instance Sizing (Most Common Scenario)

After the user provides the vector count and dimensions:

Read references/cost-optimization.md for memory calculation formulas and examples
Calculate using the standard HNSW memory formula (source: AWS official blog):

```
Unquantized (float32):
Memory = 1.1 × (4 × d + 8 × m) × num_vectors × (replicas + 1) bytes

Quantized (FAISS engine, compressed vectors in memory):
FP16 (2x):    Memory = 1.1 × (2 × d + 8 × m) × num_vectors × (replicas + 1)
Byte (4x):    Memory = 1.1 × (1 × d + 8 × m) × num_vectors × (replicas + 1)
Binary 4-bit: Memory = 1.1 × (d/2 + 8 × m) × num_vectors × (replicas + 1)
Binary 2-bit: Memory = 1.1 × (d/4 + 8 × m) × num_vectors × (replicas + 1)
Binary 1-bit: Memory = 1.1 × (d/8 + 8 × m) × num_vectors × (replicas + 1)

Where: d=vector dimensions, m=HNSW connections (default 16), num_vectors=total vector count
```
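The formulas above differ only in the number of bytes stored per dimension, so they can be folded into one function. The sketch below is one way to do that. The function name, the quantization labels, and the example workload (10 million vectors, 768 dimensions, 1 replica) are illustrative assumptions rather than values from the reference files.

```python
# Bytes stored per vector dimension for each option in the formulas above.
BYTES_PER_DIM = {
    "float32": 4.0,
    "fp16": 2.0,
    "byte": 1.0,
    "binary_4bit": 0.5,
    "binary_2bit": 0.25,
    "binary_1bit": 0.125,
}


def hnsw_memory_bytes(num_vectors: int, d: int, m: int = 16,
                      replicas: int = 1, quantization: str = "float32") -> float:
    """Memory = 1.1 x (bytes_per_dim x d + 8 x m) x num_vectors x (replicas + 1)."""
    per_vector = BYTES_PER_DIM[quantization] * d + 8 * m
    return 1.1 * per_vector * num_vectors * (replicas + 1)


def to_gib(num_bytes: float) -> float:
    """Convert bytes to GiB for comparison with instance memory."""
    return num_bytes / 1024 ** 3


if __name__ == "__main__":
    # Hypothetical workload: 10M vectors, 768 dimensions, m=16, 1 replica.
    for name in BYTES_PER_DIM:
        size = hnsw_memory_bytes(10_000_000, 768, quantization=name)
        print(f"{name:12s} {to_gib(size):8.2f} GiB")
```

Note that the graph term `8 × m` is not reduced by quantization, so the savings from the more aggressive options are smaller than the nominal compression ratio suggests.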

Apply OpenSearch node memory allocation rules:

```
JVM Heap = min(node_memory × 50%, 32GB)
Remaining memory = node_memory - JVM Heap
KNN available memory = remaining × 75% (with knn.memory.circuit_breaker.limit=70%, ~35% of node memory)
```
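The allocation rules above can be expressed as a small helper. In this sketch the k-NN share of the remaining memory is a required argument (`knn_fraction`, a hypothetical name) rather than a hard-coded constant, so the caller passes whichever fraction matches the cluster's `knn.memory.circuit_breaker.limit` setting. The node sizes in the example are illustrative.

```python
def knn_available_gb(node_memory_gb: float, knn_fraction: float) -> float:
    """Memory available to k-NN graphs on one node, in the same unit as the input.

    JVM heap takes half of node memory, capped at 32 GB; k-NN may use
    `knn_fraction` of whatever remains.
    """
    if not 0.0 < knn_fraction <= 1.0:
        raise ValueError("knn_fraction must be in (0, 1]")
    jvm_heap = min(node_memory_gb * 0.5, 32.0)
    remaining = node_memory_gb - jvm_heap
    return remaining * knn_fraction


if __name__ == "__main__":
    # Illustrative node sizes; 0.75 is the fraction used in the rule above.
    for node_gb in (32, 64, 128):
        print(node_gb, "GB node ->", knn_available_gb(node_gb, 0.75), "GB for k-NN")
```

Because the heap is capped at 32 GB, nodes larger than 64 GB leave more than half of their memory outside the heap, so the k-NN share grows faster than node size.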

Select instance type, ensuring total cluster KNN available memory > vector index memory requirement
Run pricing script for real-time pricing (see below)
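The instance selection step reduces to finding the smallest node count whose combined k-NN memory exceeds the index requirement. A minimal sketch follows; the function name and the `min_nodes` parameter are hypothetical, and it considers memory only, not QPS, storage, or availability-zone layout.

```python
import math


def nodes_required(index_memory_gb: float, knn_gb_per_node: float,
                   min_nodes: int = 1) -> int:
    """Smallest node count whose total k-NN memory is strictly greater
    than the vector index memory requirement."""
    if knn_gb_per_node <= 0:
        raise ValueError("knn_gb_per_node must be positive")
    # floor(...) + 1 keeps the inequality strict even when the division is exact.
    needed = math.floor(index_memory_gb / knn_gb_per_node) + 1
    return max(min_nodes, needed)


if __name__ == "__main__":
    # Illustrative: a 65.6 GB index on nodes that each offer 24 GB to k-NN.
    print(nodes_required(65.6, 24.0))
```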

3. Cost Estimation (with Real-Time Pricing)

When the user needs cost estimation:

Complete capacity planning above
Run pricing script for real-time prices:

```bash
python3 scripts/get_opensearch_pricing.py --region <region> --instance-type <instance-type>
```

Calculate monthly cost:

```
Instance cost = unit_price × node_count × (1 + replica_count)
EBS cost = capacity(GB) × $0.08 + additional IOPS charges
Total cost = Instance cost + EBS cost
```

Compare cost differences across quantization options
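The monthly cost formula can be sketched as below. It follows the formula as written, under two assumptions that are not stated in the source: `price_per_node_month` is the monthly price of one node (for example the `price_per_month_usd` field from the pricing script), and `node_count` is the number of nodes needed for primary data only, so that multiplying by `(1 + replica_count)` does not count replicas twice. The prices in the example are placeholders, not real quotes.

```python
def monthly_cost_usd(price_per_node_month: float, node_count: int,
                     replica_count: int, ebs_gb: float,
                     ebs_price_per_gb: float = 0.08,
                     extra_iops_usd: float = 0.0) -> dict:
    """Monthly cost broken down into instance, EBS, and total (USD)."""
    instance = price_per_node_month * node_count * (1 + replica_count)
    ebs = ebs_gb * ebs_price_per_gb + extra_iops_usd
    return {"instance": instance, "ebs": ebs, "total": instance + ebs}


if __name__ == "__main__":
    # Placeholder numbers only; obtain real prices from the pricing script.
    print(monthly_cost_usd(price_per_node_month=100.0, node_count=3,
                           replica_count=1, ebs_gb=500))
```

Running the same function with the node counts obtained for each quantization option gives the cost comparison called for in the last step.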

4. Live Cluster Analysis (When User Provides Cluster Credentials)

When the user provides an OpenSearch cluster URL and credentials, use the cluster analyzer to connect and review their vector search configuration. This is read-only: never modify the cluster.

Prerequisites: User must explicitly provide:

- Cluster URL (e.g., https://my-cluster.us-east-1.es.amazonaws.com)
- Username and password (basic auth), OR `--no-auth` for clusters without authentication

Workflow:

Ask for credentials if not provided: URL, username, password
Run cluster overview to get health, nodes, and k-NN index list:

```bash
python3 scripts/analyze_cluster.py --url <url> -u <user> -p <pass> --action cluster-overview -f pretty
```

Analyze specific index if user specifies one, or pick the most important k-NN index:

```bash
python3 scripts/analyze_cluster.py --url <url> -u <user> -p <pass> --action index-detail --index <name> -f pretty
```

Analyze shard distribution for the target index:

```bash
python3 scripts/analyze_cluster.py --url <url> -u <user> -p <pass> --action shard-analysis --index <name> -f pretty
```

Run all analyses at once (for a comprehensive report):

```bash
python3 scripts/analyze_cluster.py --url <url> -u <user> -p <pass> --action all --index <name> -f pretty
```

Interpret the JSON output and present findings to the user:

- Cluster health status and node resource utilization
- Vector field configurations (engine, dimensions, HNSW params, quantization)
- Memory estimates vs actual cluster capacity
- Auto-generated recommendations (from the script)

Provide actionable advice based on findings:

- Suggest a better engine/quantization if needed (provide example mapping JSON)
- Suggest instance resizing if memory is over/under-provisioned
- Suggest shard rebalancing if distribution is uneven
- NEVER execute write operations; only provide example configurations for the user to apply

Cluster Analyzer Script Reference:

Usage:
  python3 scripts/analyze_cluster.py --url <url> -u <user> -p <pass> [options]

Actions:
  --action cluster-overview   Cluster health, nodes, k-NN stats, and all k-NN index summary (default)
  --action index-detail       Deep dive into a specific index's vector config + memory estimates
  --action shard-analysis     Shard distribution and sizing for a specific index
  --action all                Run all analyses

Options:
  --index <name>     Target a specific index (required for index-detail and shard-analysis)
  --no-auth          Connect without authentication
  --verify-ssl       Verify SSL certificates (default: skip)
  --format pretty    Human-readable JSON output

Output: JSON with these top-level keys:
  - cluster_overview: health, version, nodes (memory/CPU/JVM), knn_stats
  - knn_indices: list of all k-NN enabled indices with vector field summaries
  - index_detail/index_details: vector field configs, memory estimates, search stats
  - shard_analysis/shard_analyses: shard distribution across nodes
  - recommendations: auto-generated optimization suggestions with severity levels
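When interpreting the analyzer output, a first pass might simply check which of the top-level keys listed above are present. The sketch below does only that. It assumes the script prints a single JSON object and that `recommendations` is a list; the inner structure of each section is not documented here, so the code does not look inside them. The function name `summarize_report` is hypothetical.

```python
import json

# Top-level keys named in the script reference above.
EXPECTED_KEYS = (
    "cluster_overview", "knn_indices",
    "index_detail", "index_details",
    "shard_analysis", "shard_analyses",
    "recommendations",
)


def summarize_report(raw_json: str) -> dict:
    """Report which documented sections are present in the analyzer output."""
    report = json.loads(raw_json)
    present = [key for key in EXPECTED_KEYS if key in report]
    # Assumption: recommendations is a list; treat a missing key as empty.
    recommendations = report.get("recommendations") or []
    return {"sections": present, "recommendation_count": len(recommendations)}


if __name__ == "__main__":
    sample = '{"cluster_overview": {}, "knn_indices": [], "recommendations": [{}]}'
    print(summarize_report(sample))
```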

Safety constraints for live cluster analysis:

- The script is strictly read-only (uses only GET/CAT APIs)
- NEVER create, update, or delete indices on the user's cluster
- NEVER change cluster settings or mappings
- Only provide example JSON configurations for the user to review and apply themselves
- If the user asks to apply changes, provide the exact API calls/JSON but let the user execute them

Pricing Script Usage

# Query all instance prices for a region
python3 scripts/get_opensearch_pricing.py --region us-east-1

# Query specific instance type (no .search suffix needed)
python3 scripts/get_opensearch_pricing.py --region us-east-1 --instance-type r7g.xlarge

# JSON format output (for calculations)
python3 scripts/get_opensearch_pricing.py --region us-east-1 --instance-type r7g.xlarge --format json

Output fields: instance_type, vcpu, memory_gib, price_per_hour_usd, price_per_month_usd, network
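The JSON output can feed directly into instance selection. The sketch below picks the cheapest instance that meets a memory floor. It assumes that `--format json` yields a list of records carrying the output fields named above; that shape is not confirmed by this document. The sample records and their prices are placeholders for illustration, not real AWS pricing.

```python
import json
from typing import Optional


def cheapest_fit(pricing_json: str, min_memory_gib: float) -> Optional[dict]:
    """Return the lowest-priced record with at least `min_memory_gib` of memory,
    or None if no instance qualifies."""
    records = json.loads(pricing_json)
    fits = [r for r in records if float(r["memory_gib"]) >= min_memory_gib]
    return min(fits, key=lambda r: float(r["price_per_hour_usd"]), default=None)


if __name__ == "__main__":
    # Placeholder records; real values come from get_opensearch_pricing.py.
    sample = json.dumps([
        {"instance_type": "example.large", "memory_gib": 16, "price_per_hour_usd": 0.10},
        {"instance_type": "example.xlarge", "memory_gib": 32, "price_per_hour_usd": 0.20},
        {"instance_type": "example.2xlarge", "memory_gib": 64, "price_per_hour_usd": 0.40},
    ])
    print(cheapest_fit(sample, min_memory_gib=30))
```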

Recommended Defaults

Always recommend these defaults unless the user has specific requirements:

- Engine: FAISS
- Similarity: cosine
- Instance family (Gen 7+ only, never recommend older generations):
  - Vector search (k-NN): r7g/r8g/r8gd (memory-optimized, lowest search latency; r8g Graviton4 ~30% faster than r7g)
  - Indexing-heavy + vector: OR2 (optimized, S3 durability, good memory-to-price ratio)
  - Indexing-heavy (no vector): OM2 (highest indexing throughput, 15% faster than OR1)
  - Large dataset with NVMe: OI2 (storage-optimized, no EBS needed)
  - Do NOT recommend: r6g, r5, m5, c5, i3, or any older instance families
- HNSW parameters: ef_construction=512, m=16
- Quantization preference: Byte (4x) for production, Binary (32x) for aggressive cost optimization
- Disk mode threshold: Consider when data > 50M vectors and 100-200ms latency is acceptable

Instance Selection Decision Tree

Is this primarily a vector search (k-NN) workload?
├─ YES → r7g/r8g/r8gd (best search latency, standard EBS; prefer r8g for Graviton4)
│        └─ Need S3 durability? → OR2 (accept 10s refresh interval tradeoff)
├─ Mixed (logs + vectors) → OR2 for log nodes, r7g/r8g for vector nodes
└─ NO (logs/observability/analytics)
   ├─ Write-heavy → OM2 (highest ingest throughput)
   ├─ Balanced → OR2 (good all-around with S3 durability)
   └─ Need NVMe IOPS → OI2
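The decision tree can also be written as a function, which makes the branch order explicit. This is a sketch of the tree above, not part of the skill's scripts. The workload labels (`"vector"`, `"mixed"`, `"logs"`) and flag names are hypothetical, and the choice to check write-heavy before NVMe within the non-vector branch is an assumption, since the tree does not rank those two conditions.

```python
def recommend_instance(workload: str, need_s3_durability: bool = False,
                       write_heavy: bool = False,
                       need_nvme_iops: bool = False) -> str:
    """Map a workload description to the instance family from the decision tree."""
    if workload == "vector":
        # OR2 trades a 10s refresh interval for S3 durability.
        return "OR2" if need_s3_durability else "r7g/r8g/r8gd"
    if workload == "mixed":
        return "OR2 for log nodes, r7g/r8g for vector nodes"
    if workload == "logs":
        if write_heavy:
            return "OM2"
        if need_nvme_iops:
            return "OI2"
        return "OR2"  # balanced default
    raise ValueError(f"unknown workload: {workload!r}")


if __name__ == "__main__":
    print(recommend_instance("vector"))
    print(recommend_instance("logs", write_heavy=True))
```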

Response Template

Organize cost/sizing answers in this structure:

1. Requirements confirmation: Vector count, dimensions, QPS, latency requirements
2. Memory calculation: Raw size → quantized size → required KNN memory
3. Cluster configuration: Instance type × count, shards, replicas
4. Cost estimation: Instance cost + EBS cost = monthly total
5. Optimization suggestions: Quantization comparison, Reserved Instance discounts

Version History

1 version in total

v1.3.2 (current), 2026-03-30 12:24, security status: safe

Security Check

Tencent Cloud Security (Keen): safe, no risk

Tencent Cloud Security (Sanbu): safe, no risk
