
TurboQuant Memory

Compress and accelerate vector search in memory/RAG systems using TurboQuant (ICLR 2026) — near-optimal vector quantization with 5-8x compression and 98%+ se...
sunnyztj · clawhub · v2.0.0 · Key: not required


Compress embedding vectors 5-8x using TurboQuant (Google, ICLR 2026). In the benchmark below, the default 5-bit setting reaches 98% R@1 at 6.4x compression; the 8x setting (4 bits) reaches 92%.

Quick Start

1. Run tests

python3 scripts/turboquant.py

Runs 15 built-in tests covering FWHT (Fast Walsh-Hadamard Transform) correctness, MSE distortion, inner-product (IP) correlation, recall, compression ratio, and determinism.

2. Validate on your data

python3 scripts/validate.py --db /path/to/memory.sqlite --auto-detect --bits 5

Auto-detects sqlite-vec vec0 tables, analyzes distribution, reports quantization quality and recall.
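
Recall figures such as R@1 compare the neighbours found with quantized vectors against those found with the original float32 vectors. The sketch below shows one common definition (the fraction of queries whose exact best match appears in the approximate top-k); validate.py may compute it differently. The function name `recall_at_k` and the coarse rounding used as a stand-in quantizer are hypothetical and only serve the illustration.

```python
import numpy as np

def recall_at_k(exact_scores, approx_scores, k):
    """Fraction of queries whose exact best match is in the approximate top-k."""
    true_top1 = np.argmax(exact_scores, axis=1)
    approx_topk = np.argsort(-approx_scores, axis=1)[:, :k]
    hits = (approx_topk == true_top1[:, None]).any(axis=1)
    return float(hits.mean())

rng = np.random.default_rng(0)
db = rng.standard_normal((200, 64)).astype(np.float32)
db /= np.linalg.norm(db, axis=1, keepdims=True)
queries = rng.standard_normal((50, 64)).astype(np.float32)

# Stand-in for a real quantizer: round every coordinate to a coarse grid.
db_approx = np.round(db * 16) / 16

exact = queries @ db.T          # scores against the original vectors
approx = queries @ db_approx.T  # scores against the degraded vectors
r1 = recall_at_k(exact, approx, k=1)
r10 = recall_at_k(exact, approx, k=10)
```

By construction `r10` can never be lower than `r1`, since the top-1 result is always part of the top-10.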

3. Quantize a memory database

python3 scripts/memory_quantize.py --db /path/to/memory.db --bits 5 --benchmark
python3 scripts/memory_quantize.py --db /path/to/memory.db --bits 5 --migrate

4. Integrate into code

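import numpy as np
from turboquant import TurboQuantMSE

# Initialize (deterministic: same seed = same quantization)
tq = TurboQuantMSE(dim=3072, bits=5)

# Quantize for storage (embedding_vector is your float32 array of length dim)
stored = tq.quantize(embedding_vector)  # float32 → compressed

# Reconstruct
reconstructed = tq.dequantize(stored)   # compressed → float32

# Search: query stays float32, database is quantized
q_rot = tq.rotation.apply(query)  # rotate the query once, reuse it for every document
scores = []
for doc in database:  # each doc provides 'norm', 'scale', and 'indices'
    score = doc['norm'] * doc['scale'] * np.dot(q_rot, tq.codebook[doc['indices']])
    scores.append(score)
best = int(np.argmax(scores))  # position of the highest-scoring document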
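
When the per-document index arrays are stacked into one matrix, the scoring loop can be replaced by a single gather and matrix-vector product. The sketch below uses synthetic stand-ins (a random `codebook`, `indices`, `norms`, and `scales`) rather than real TurboQuant output, and assumes one codebook index per coordinate, as the scoring expression suggests.

```python
import numpy as np

rng = np.random.default_rng(0)
n_docs, dim, bits = 1000, 256, 5

# Synthetic stand-ins for quantized storage (hypothetical layout).
codebook = np.sort(rng.standard_normal(2 ** bits)).astype(np.float32)
indices = rng.integers(0, 2 ** bits, size=(n_docs, dim), dtype=np.uint8)
norms = rng.uniform(0.5, 2.0, n_docs).astype(np.float32)
scales = rng.uniform(0.01, 0.1, n_docs).astype(np.float32)
q_rot = rng.standard_normal(dim).astype(np.float32)  # already-rotated query

# Loop form: one dot product per document.
loop_scores = np.array([
    norms[i] * scales[i] * np.dot(q_rot, codebook[indices[i]])
    for i in range(n_docs)
])

# Vectorized form: one gather plus one matrix-vector product.
vec_scores = norms * scales * (codebook[indices] @ q_rot)

top_k = np.argsort(-vec_scores)[:10]  # positions of the 10 best documents
```

The gather `codebook[indices]` materializes a float32 matrix of shape (n_docs, dim), so for large databases it is usually worth scoring in batches.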

Recommended Configuration

| Preset | Mode | Bits | R@1 | Compression | Use Case |
|---|---|---|---|---|---|
| Default | MSE | 5 | 98% | 6.4x | Most memory/RAG search |
| Conservative | MSE | 6 | 98%+ | 5.3x | High-fidelity retrieval |
| Aggressive | MSE | 4 | 92% | 8.0x | Large-scale, storage-constrained |

Parameters

| Parameter | Default | Description |
|---|---|---|
| dim | auto-detect | Embedding dimension (768, 1536, 3072, etc.) |
| bits | 5 | Bits per coordinate. See table above. |
| seed | 42 | Rotation seed. Same seed = reproducible quantization. |

Algorithm

Blockwise Hadamard Rotation → Lloyd-Max Scalar Quantization

  1. Split vector into power-of-2 blocks (e.g., 3072 = 3 × 1024)
  2. Per block: random sign flip + Fast Walsh-Hadamard Transform (fully invertible)
  3. Per-vector scale normalization
  4. Lloyd-Max optimal scalar quantizer per coordinate (precomputed codebook for N(0,1))
  5. Pack indices into compact bit representation
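
Steps 1 to 4 can be sketched in plain numpy as follows. This is an illustration under stated assumptions, not the code in scripts/turboquant.py: the block size is taken to be the largest power of two dividing the dimension (which matches 3072 = 3 × 1024), the codebook is approximated by Lloyd iterations on a dense grid instead of being precomputed, the per-vector scale is the RMS of the rotated coordinates, and all function names are hypothetical. Bit packing (step 5) is left out.

```python
import numpy as np

def fwht(x):
    """Orthonormal Fast Walsh-Hadamard Transform; len(x) must be a power of 2."""
    x = np.asarray(x, dtype=np.float64).copy()
    n, h = x.size, 1
    while h < n:
        x = x.reshape(-1, 2, h)
        x = np.stack([x[:, 0] + x[:, 1], x[:, 0] - x[:, 1]], axis=1)
        h *= 2
    return x.reshape(n) / np.sqrt(n)

def block_size(dim):
    """Largest power of two dividing dim (assumption; 3072 -> 1024)."""
    return dim & -dim

def signs_for(dim, seed):
    return np.random.default_rng(seed).choice([-1.0, 1.0], size=dim)

def rotate(v, seed=42):
    """Random sign flip, then FWHT on each block."""
    blocks = (v * signs_for(v.size, seed)).reshape(-1, block_size(v.size))
    return np.concatenate([fwht(b) for b in blocks])

def unrotate(r, seed=42):
    """Inverse of rotate: the orthonormal FWHT is its own inverse."""
    blocks = r.reshape(-1, block_size(r.size))
    return np.concatenate([fwht(b) for b in blocks]) * signs_for(r.size, seed)

def lloyd_max_gaussian(bits, iters=200):
    """Approximate Lloyd-Max levels for N(0,1) using a dense grid."""
    grid = np.linspace(-6.0, 6.0, 20001)
    pdf = np.exp(-0.5 * grid ** 2)
    levels = np.linspace(-2.5, 2.5, 2 ** bits)
    for _ in range(iters):
        edges = (levels[:-1] + levels[1:]) / 2          # nearest-level boundaries
        cell = np.searchsorted(edges, grid)
        mass = np.bincount(cell, weights=pdf, minlength=levels.size)
        moment = np.bincount(cell, weights=pdf * grid, minlength=levels.size)
        levels = moment / mass                           # centroid of each cell
    return levels

def quantize(v, levels, seed=42):
    norm = np.linalg.norm(v)                 # assumes v is not the zero vector
    r = rotate(v / norm, seed)
    scale = np.sqrt(np.mean(r ** 2))         # coordinates of r / scale are roughly N(0,1)
    edges = (levels[:-1] + levels[1:]) / 2
    idx = np.searchsorted(edges, r / scale).astype(np.uint8)
    return {"norm": norm, "scale": scale, "indices": idx}

def dequantize(q, levels, seed=42):
    return unrotate(levels[q["indices"]] * q["scale"], seed) * q["norm"]

rng = np.random.default_rng(0)
v = rng.standard_normal(3072).astype(np.float32)
levels = lloyd_max_gaussian(bits=5)
q = quantize(v, levels)
v_hat = dequantize(q, levels)
cosine = float(v @ v_hat / (np.linalg.norm(v) * np.linalg.norm(v_hat)))
```

The rotation spreads each vector's energy evenly across coordinates, which is what lets a single codebook fitted to N(0,1) serve every coordinate without any training data.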

Key properties:

  • Data-oblivious: no training or calibration needed
  • Fully invertible: the rotation is orthogonal, so it loses no information; all error comes from the scalar quantizer
  • Near-optimal: MSE distortion within about 2.7x of the Shannon information-theoretic lower bound
  • Deterministic: same seed = same output
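
The invertibility and determinism properties can be checked numerically on a small block. The sketch below builds the rotation as a dense matrix purely for verification (a real implementation would use the fast transform instead); `hadamard` and `rotation_matrix` are hypothetical helper names.

```python
import numpy as np

def hadamard(n):
    """Dense orthonormal Sylvester-Hadamard matrix (n must be a power of 2)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.kron(H, np.array([[1.0, 1.0], [1.0, -1.0]]))
    return H / np.sqrt(n)

def rotation_matrix(n, seed):
    """Hadamard transform applied after a seeded random sign flip."""
    signs = np.random.default_rng(seed).choice([-1.0, 1.0], size=n)
    return hadamard(n) * signs  # equivalent to hadamard(n) @ diag(signs)

n = 256
R = rotation_matrix(n, seed=42)
rng = np.random.default_rng(1)
x, y = rng.standard_normal(n), rng.standard_normal(n)

orthogonal = np.allclose(R @ R.T, np.eye(n))         # R is an orthogonal matrix
recovered = np.allclose(R.T @ (R @ x), x)            # the inverse restores x
ip_preserved = np.isclose((R @ x) @ (R @ y), x @ y)  # inner products are unchanged
same_seed = np.array_equal(R, rotation_matrix(n, seed=42))
```

Because inner products survive the rotation, a query only needs to be rotated once and can then be scored directly against quantized, rotated database vectors.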

See references/algorithm.md for full details.

Benchmark (Gemini embedding-001, 3072-dim, 112 vectors)

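| Bits | MSE | Cosine | R@1 | R@5 | R@10 | Bytes/vec | Compression |
|---|---|---|---|---|---|---|---|
| 3 | 1.1e-5 | 0.982 | 88% | 90% | 91% | 1,160 | 10.6x |
| 4 | 3.2e-6 | 0.995 | 92% | 93% | 93% | 1,544 | 8.0x |
| 5 | 8.2e-7 | 0.999 | 98% | 96% | 96% | 1,928 | 6.4x |
| 6 | 2.2e-7 | 1.000 | 96% | 98% | 98% | 2,312 | 5.3x |
| 7 | 8e-8 | 1.000 | 100% | 98% | 99% | 2,696 | 4.6x |
| 8 | 3e-8 | 1.000 | 98% | 98% | 99% | 3,080 | 4.0x |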
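
The Bytes/vec and Compression columns are consistent with dim × bits / 8 bytes of packed indices plus 8 bytes of per-vector overhead, compared against 4 bytes per float32 coordinate. The 8-byte overhead is inferred from the numbers (it would fit two float32 values such as norm and scale) and is not stated in the source. The sketch below reproduces the arithmetic and shows one possible way to pack indices with numpy; the helper names and the bit layout are hypothetical.

```python
import numpy as np

DIM = 3072
OVERHEAD_BYTES = 8  # inferred from the table, e.g. two float32 values

def bytes_per_vector(bits, dim=DIM):
    return dim * bits // 8 + OVERHEAD_BYTES

def compression_ratio(bits, dim=DIM):
    return dim * 4 / bytes_per_vector(bits, dim)  # float32 = 4 bytes per coordinate

def pack_indices(indices, bits):
    """Pack integer codes (< 2**bits) into a compact uint8 array."""
    bit_matrix = (indices[:, None] >> np.arange(bits)) & 1
    return np.packbits(bit_matrix.astype(np.uint8).ravel())

def unpack_indices(packed, bits, count):
    """Recover the original codes from a packed array."""
    flat = np.unpackbits(packed)[: count * bits].reshape(count, bits)
    return (flat.astype(np.uint16) << np.arange(bits)).sum(axis=1).astype(np.uint8)

# (bytes per vector, compression ratio) for each bit width in the table
table = {b: (bytes_per_vector(b), round(compression_ratio(b), 1)) for b in range(3, 9)}

rng = np.random.default_rng(0)
idx = rng.integers(0, 2 ** 5, size=DIM, dtype=np.uint8)
packed = pack_indices(idx, bits=5)
restored = unpack_indices(packed, bits=5, count=idx.size)
```

At 5 bits the packed indices occupy 1,920 bytes, which together with the assumed overhead gives the 1,928 bytes listed in the table.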

Compatibility

  • Python 3.9+, numpy only (no scipy, no GPU)
  • Any embedding dimension ≥ 128
  • Any embedding model (Gemini, OpenAI, Cohere, sentence-transformers, etc.)
  • SQLite / sqlite-vec vec0 tables (auto-detected)


Version History

1 version in total

  • v2.0.0 (current)
    2026-05-03 08:47
