> GitHub: norrishuang/opensearch-vector-search-skill
> — Issues, PRs, and new reference contributions are welcome!
- Pricing script (scripts/get_opensearch_pricing.py): Makes outbound HTTPS requests to the AWS Pricing API (pricing.us-east-1.amazonaws.com). Requires boto3 and valid AWS credentials. The script is read-only (it fetches public pricing data) and does not modify any AWS resources. Only run it when the user explicitly requests cost estimation.
- Reference files: Files in references/ contain example API calls to localhost:9200 (the standard OpenSearch endpoint). These are documentation examples only; do NOT execute them automatically. Present them to the user as configuration references.
- Cluster analyzer (scripts/analyze_cluster.py): Connects to a user-provided OpenSearch cluster and performs read-only analysis. It NEVER creates, modifies, or deletes any indices or data. Only run it when the user explicitly provides cluster credentials (URL + username/password).

Read the corresponding reference file based on the question type:

| Question Type | Reference File | Keywords |
|---|---|---|
| Vector search, k-NN, HNSW, disk mode | references/vector-search.md | vector, knn, hnsw, warmup, disk mode, on_disk |
| Quantization techniques | references/quantization-techniques.md | quantization, compression, binary, byte, fp16, product quantization |
| Cost optimization, instance sizing, memory calc | references/cost-optimization.md | cost, pricing, instance, memory calculation, cluster sizing, budget |
| Cluster tuning, JVM, thread pools | references/cluster-tuning.md | JVM, heap, thread pool, node role, shard allocation |
| Performance benchmarks, dataset sizing | references/performance-benchmarks.md | benchmark, QPS, latency, recall, dataset size |
| Indexing strategies, mapping | references/indexing-strategies.md | index, mapping, shard, replica, lifecycle |
| Query optimization | references/query-optimization.md | query, filter, aggregation, cache, pagination |
| Optimized instances (OR1/OR2/OM2/OI2) | references/optimized-instances.md | optimized, OR1, OR2, OM2, OI2, S3 durability, indexing throughput |
| Live cluster analysis | scripts/analyze_cluster.py | analyze cluster, connect, diagnose, review config, health check |

- Read references/vector-search.md

After the user provides vector count and dimensions:

- Read references/cost-optimization.md for memory calculation formulas and examples

```
Unquantized (float32):
Memory = 1.1 × (4 × d + 8 × m) × num_vectors × (replicas + 1) bytes
Quantized (FAISS engine, compressed vectors in memory):
FP16 (2x): Memory = 1.1 × (2 × d + 8 × m) × num_vectors × (replicas + 1)
Byte (4x): Memory = 1.1 × (1 × d + 8 × m) × num_vectors × (replicas + 1)
Binary 4-bit: Memory = 1.1 × (d/2 + 8 × m) × num_vectors × (replicas + 1)
Binary 2-bit: Memory = 1.1 × (d/4 + 8 × m) × num_vectors × (replicas + 1)
Binary 1-bit: Memory = 1.1 × (d/8 + 8 × m) × num_vectors × (replicas + 1)
Where: d=vector dimensions, m=HNSW connections (default 16), num_vectors=total vector count
```
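
The formulas above can be sketched in Python as a single helper. This is a minimal illustration and not part of the repository's scripts: the function name `knn_memory_bytes`, the `BYTES_PER_DIM` table, and the example workload (10M vectors, 768 dimensions, 1 replica) are assumptions chosen for demonstration.

```python
# Bytes stored per vector dimension for each format, taken from the formulas above.
BYTES_PER_DIM = {
    "float32": 4.0,
    "fp16": 2.0,
    "byte": 1.0,
    "binary_4bit": 0.5,
    "binary_2bit": 0.25,
    "binary_1bit": 0.125,
}


def knn_memory_bytes(num_vectors, d, m=16, replicas=1, fmt="float32"):
    """Estimate memory in bytes: 1.1 * (bytes_per_dim * d + 8 * m) * num_vectors * (replicas + 1)."""
    per_vector = BYTES_PER_DIM[fmt] * d + 8 * m
    return 1.1 * per_vector * num_vectors * (replicas + 1)


# Hypothetical workload: compare every format for the same dataset.
for fmt in BYTES_PER_DIM:
    gib = knn_memory_bytes(10_000_000, 768, fmt=fmt) / 2**30
    print(f"{fmt:12s} {gib:8.1f} GiB")
```

Because the `8 × m` graph overhead is not compressed, the savings from quantization are smaller than the nominal compression ratio, which the per-format comparison makes visible.
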
```
JVM Heap = min(node_memory × 50%, 32GB)
Remaining memory = node_memory - JVM Heap
KNN available memory = remaining × 70% (with knn.memory.circuit_breaker.limit=70%; ~35% of node memory for nodes up to 64GB)
```
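
As a rough sketch of how the node memory budget above feeds into sizing, the helper below computes the per-node k-NN budget and a node count. The function names and the `nodes_needed` rounding step are assumptions added for illustration; the circuit breaker fraction is passed in explicitly so it can match whatever `knn.memory.circuit_breaker.limit` the cluster actually uses.

```python
import math


def knn_budget_gb(node_memory_gb, circuit_breaker_fraction):
    """Return (jvm_heap, remaining, knn_available) in GB for one data node."""
    jvm_heap = min(node_memory_gb * 0.5, 32.0)  # heap is capped at 32GB
    remaining = node_memory_gb - jvm_heap
    return jvm_heap, remaining, remaining * circuit_breaker_fraction


def nodes_needed(total_knn_memory_gb, node_memory_gb, circuit_breaker_fraction):
    """Smallest node count whose combined k-NN budget covers the memory estimate."""
    _, _, available = knn_budget_gb(node_memory_gb, circuit_breaker_fraction)
    return math.ceil(total_knn_memory_gb / available)


# Example: 64GB nodes with a 70% circuit breaker limit, 70.4GB of vectors to hold.
print(knn_budget_gb(64, 0.70))
print(nodes_needed(70.4, 64, 0.70))
```

Note that on nodes larger than 64GB the heap cap means more than half of the node memory is left over, so the k-NN share of total memory grows with node size.
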
When the user needs cost estimation:
```bash
python3 scripts/get_opensearch_pricing.py --region <region>
```
```
Instance cost = unit_price × node_count × (1 + replica_count)
EBS cost = capacity(GB) × $0.08 + additional IOPS charges
Total cost = Instance cost + EBS cost
```
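
The cost formulas above translate directly into a small helper. This is a sketch under stated assumptions: `unit_price` is taken to be a per-node monthly price (for example the `price_per_month_usd` field from the pricing script), the function and parameter names are hypothetical, and the numbers in the example are placeholders rather than real AWS prices.

```python
def monthly_cost_usd(unit_price, node_count, replica_count,
                     ebs_capacity_gb, ebs_price_per_gb=0.08, extra_iops_charges=0.0):
    """Apply the cost formulas above and return each component plus the total."""
    instance_cost = unit_price * node_count * (1 + replica_count)
    ebs_cost = ebs_capacity_gb * ebs_price_per_gb + extra_iops_charges
    return {"instance": instance_cost, "ebs": ebs_cost, "total": instance_cost + ebs_cost}


# Placeholder inputs: 3 nodes at 100 USD/month each, 1 replica, 1000 GB of EBS.
print(monthly_cost_usd(unit_price=100.0, node_count=3, replica_count=1, ebs_capacity_gb=1000))
```
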
When the user provides an OpenSearch cluster URL and credentials, use the cluster analyzer to
connect and review their vector search configuration. This is read-only — never modify the cluster.
Prerequisites: the user must explicitly provide:
- Cluster URL (e.g., https://my-cluster.us-east-1.es.amazonaws.com)
- Username and password, or --no-auth for clusters without authentication

Workflow:
```bash
python3 scripts/analyze_cluster.py --url
```
```bash
python3 scripts/analyze_cluster.py --url
```
```bash
python3 scripts/analyze_cluster.py --url
```
```bash
python3 scripts/analyze_cluster.py --url
```
Cluster Analyzer Script Reference:

Usage: `python3 scripts/analyze_cluster.py --url <url> -u <user> -p <pass> [options]`

Actions:
- `--action cluster-overview`: Cluster health, nodes, k-NN stats, and a summary of all k-NN indices (default)
- `--action index-detail`: Deep dive into a specific index's vector config and memory estimates
- `--action shard-analysis`: Shard distribution and sizing for a specific index
- `--action all`: Run all analyses

Options:
- `--index <name>`: Target a specific index (required for index-detail and shard-analysis)
- `--no-auth`: Connect without authentication
- `--verify-ssl`: Verify SSL certificates (default: skip)
- `--format pretty`: Human-readable JSON output

Output: JSON with these top-level keys:
- cluster_overview: health, version, nodes (memory/CPU/JVM), knn_stats
- knn_indices: list of all k-NN enabled indices with vector field summaries
- index_detail/index_details: vector field configs, memory estimates, search stats
- shard_analysis/shard_analyses: shard distribution across nodes
- recommendations: auto-generated optimization suggestions with severity levels
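
When post-processing the analyzer's output, a quick first step is to check which of the documented top-level keys a report actually contains. The sketch below assumes only the key names listed above; the helper name `present_sections` and the sample report are hypothetical, and nothing is assumed about the structure nested under each key.

```python
import json

# Documented top-level keys; singular/plural spellings are treated as alternatives.
EXPECTED_KEYS = [
    ("cluster_overview",),
    ("knn_indices",),
    ("index_detail", "index_details"),
    ("shard_analysis", "shard_analyses"),
    ("recommendations",),
]


def present_sections(report_json):
    """Return the documented top-level keys that appear in an analyzer report."""
    report = json.loads(report_json)
    found = []
    for variants in EXPECTED_KEYS:
        found.extend(key for key in variants if key in report)
    return found


# Hypothetical report containing only an overview and recommendations.
sample = '{"cluster_overview": {}, "knn_indices": [], "recommendations": []}'
print(present_sections(sample))
```
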
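Safety constraints for live cluster analysis: the analyzer is read-only and never creates, modifies, or deletes indices or data; run it only when the user explicitly provides the cluster URL and credentials.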

Pricing script usage:
- Query all instance prices for a region: `python3 scripts/get_opensearch_pricing.py --region us-east-1`
- Query a specific instance type (no .search suffix needed): `python3 scripts/get_opensearch_pricing.py --region us-east-1 --instance-type r7g.xlarge`
- JSON format output (for calculations): `python3 scripts/get_opensearch_pricing.py --region us-east-1 --instance-type r7g.xlarge --format json`

Output fields: instance_type, vcpu, memory_gib, price_per_hour_usd, price_per_month_usd, network
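
For memory-bound vector workloads, one way to use the JSON output is to rank instances by monthly price per GiB of memory. The sketch below assumes the JSON output is an array of objects carrying the fields listed above; that shape, the helper name `rank_by_memory_price`, and the sample records (placeholder names and prices, not real AWS data) are all assumptions.

```python
import json


def rank_by_memory_price(pricing_json):
    """Sort pricing records by monthly USD per GiB of memory, cheapest first."""
    records = json.loads(pricing_json)
    ranked = []
    for rec in records:
        usd_per_gib = rec["price_per_month_usd"] / rec["memory_gib"]
        ranked.append((rec["instance_type"], usd_per_gib))
    return sorted(ranked, key=lambda pair: pair[1])


# Placeholder records for illustration only.
sample = json.dumps([
    {"instance_type": "example.small", "memory_gib": 32, "price_per_month_usd": 320.0},
    {"instance_type": "example.large", "memory_gib": 64, "price_per_month_usd": 512.0},
])
print(rank_by_memory_price(sample))
```
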
Always recommend these defaults unless the user has specific requirements:
Is this primarily a vector search (k-NN) workload?
├─ YES → r7g/r8g/r8gd (best search latency, standard EBS; prefer r8g for Graviton4)
│ └─ Need S3 durability? → OR2 (accept 10s refresh interval tradeoff)
├─ Mixed (logs + vectors) → OR2 for log nodes, r7g/r8g for vector nodes
└─ NO (logs/observability/analytics)
├─ Write-heavy → OM2 (highest ingest throughput)
├─ Balanced → OR2 (good all-around with S3 durability)
└─ Need NVMe IOPS → OI2
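
The decision tree above can also be expressed as a small function, which is convenient when the recommendation needs to be produced programmatically. This is a sketch: the function name, the `workload` labels ("vector", "mixed", "other"), and the `profile` labels are hypothetical encodings of the branches, while the returned instance families come straight from the tree.

```python
def recommend_instance_family(workload, need_s3_durability=False, profile=None):
    """Walk the decision tree above.

    workload: "vector", "mixed", or "other" (logs/observability/analytics).
    profile:  for "other" workloads, one of "write-heavy", "balanced", "nvme".
    """
    if workload == "vector":
        # OR2 trades a 10s refresh interval for S3 durability.
        return "OR2" if need_s3_durability else "r7g/r8g/r8gd"
    if workload == "mixed":
        return "OR2 for log nodes, r7g/r8g for vector nodes"
    if workload == "other":
        choices = {"write-heavy": "OM2", "balanced": "OR2", "nvme": "OI2"}
        if profile in choices:
            return choices[profile]
        raise ValueError(f"unknown profile: {profile!r}")
    raise ValueError(f"unknown workload: {workload!r}")


print(recommend_instance_family("vector"))
print(recommend_instance_family("other", profile="write-heavy"))
```
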
Organize cost/sizing answers in this structure: