Overview

TurboQuant Context Compression Skill

Maintains a compressed vector store in OpenClaw's memory directory (~/.openclaw/memory/turboquant-$OPENCLAW_SESSION_ID) and provides three lifecycle commands: ingest → assemble → compact.

Store path convention

The store path is automatically derived from the session:

STORE=~/.openclaw/memory/turboquant-$OPENCLAW_SESSION_ID

All commands default to this path when --store is omitted.
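Caller code that needs the same path (for logging or cleanup, say) can mirror the convention. The following is a minimal Python sketch; the helper name default_store_path is hypothetical and not part of the skill, which derives the path on its own.

```python
import os
from pathlib import Path


def default_store_path(session_id=None):
    """Mirror the store path convention described above.

    Hypothetical helper for callers; the skill itself does not expose it.
    """
    if session_id is None:
        # Raises KeyError if OPENCLAW_SESSION_ID is not set.
        session_id = os.environ["OPENCLAW_SESSION_ID"]
    memory_dir = Path(os.path.expanduser("~/.openclaw/memory"))
    return memory_dir / f"turboquant-{session_id}"
```

Passing the result to --store should be equivalent to omitting the flag, assuming the CLI expands the path the same way.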

Phase A — Ingest (after each turn)

After every message, save the embedding vector of that message text to the store.

Embeddings must be pre-computed and saved as a .npy file by the caller.

# Add one message to the store
uv run openclaw-turboquant ingest \
  --id "turn_$TURN_NUMBER" \
  --text "Full message text goes here" \
  --embedding /tmp/turn_embedding.npy

# Output:
# {"action":"ingest","entry_id":"turn_5","store_size":5,"store_path":"...","ok":true}

--store, --dim, --bit-width are inferred automatically on subsequent calls.

Set --dim only when creating a store for the first time.
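The ingest step can be wrapped in caller code roughly as follows. This is a sketch, not part of the skill: the function names are hypothetical, only the flags shown above are used, and it assumes the command prints exactly one JSON object on stdout.

```python
import json
import subprocess


def build_ingest_cmd(turn_number, text, embedding_path):
    """Build the argv for the ingest command shown above."""
    return [
        "uv", "run", "openclaw-turboquant", "ingest",
        "--id", f"turn_{turn_number}",
        "--text", text,
        "--embedding", str(embedding_path),
    ]


def parse_ingest_output(stdout):
    """Parse the JSON object printed by ingest and return store_size."""
    result = json.loads(stdout)
    if not result.get("ok"):
        raise RuntimeError(f"ingest failed: {result}")
    return result["store_size"]


def ingest_turn(turn_number, text, embedding_path):
    """Run ingest and return the new store size (needs uv and the skill installed)."""
    proc = subprocess.run(
        build_ingest_cmd(turn_number, text, embedding_path),
        capture_output=True, text=True, check=True,
    )
    return parse_ingest_output(proc.stdout)
```

Passing the arguments as a list, rather than a shell string, avoids quoting problems when the message text contains quotes or newlines.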

Phase B — Assemble (before building the prompt)

Retrieve the most relevant past entries that fit within the token budget.

uv run openclaw-turboquant assemble \
  --query /tmp/current_query.npy \
  --token-budget 4096

# Output (one JSON object per line, sorted by relevance):
# {"role":"context","content":"...","entry_id":"turn_3","score":0.912}
# {"role":"context","content":"...","entry_id":"turn_1","score":0.743}

Include the returned content fields as context messages in the system prompt.
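A caller might turn the assemble output into prompt text along these lines. This is a minimal sketch under stated assumptions: the function names and the "Relevant earlier context" heading are illustrative, and the input is assumed to be one JSON object per line, as in the sample output above.

```python
import json


def parse_assemble_output(stdout):
    """Parse JSON Lines output into a list of dicts, preserving order."""
    entries = []
    for line in stdout.splitlines():
        line = line.strip()
        if line:  # skip blank lines
            entries.append(json.loads(line))
    return entries


def build_system_prompt(base_prompt, entries):
    """Append the content of each retrieved entry to the base prompt."""
    if not entries:
        return base_prompt
    context = "\n\n".join(entry["content"] for entry in entries)
    return f"{base_prompt}\n\nRelevant earlier context:\n\n{context}"
```

Because the output is already sorted by relevance, the sketch keeps the order as returned instead of re-sorting by score.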

Phase C — Compact (when the context window is filling up)

Remove the least relevant entries when the store grows too large.

Trigger this when store_size exceeds a threshold (e.g., 50).

uv run openclaw-turboquant compact \
  --query /tmp/current_query.npy \
  --keep-ratio 0.5

# Output:
# {"action":"compact","before":50,"after":25,"removed":25,"ok":true}
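The trigger described above could be checked in caller code using the store_size field that ingest reports. The sketch below is illustrative only: the names are hypothetical, the threshold of 50 is the example value from the text, and "exceeds" is read as strictly greater than.

```python
COMPACT_THRESHOLD = 50  # example value from the text; tune per deployment


def should_compact(store_size, threshold=COMPACT_THRESHOLD):
    """Return True once the store holds more entries than the threshold."""
    return store_size > threshold


def build_compact_cmd(query_path, keep_ratio=0.5):
    """Build the argv for the compact command shown above."""
    return [
        "uv", "run", "openclaw-turboquant", "compact",
        "--query", str(query_path),
        "--keep-ratio", str(keep_ratio),
    ]
```

A caller would typically evaluate should_compact after each ingest and pass the current query embedding, so that the entries least relevant to the ongoing conversation are the ones removed.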

Inspect the store

uv run openclaw-turboquant store-info

# Output:
# {"path":"/...","size":25,"dim":1536,"bit_width":4,"memory_bytes":12288,"memory_kb":12.0}

Low-level commands (batch file operations)

# Compress a batch of embeddings from a .npy file
uv run openclaw-turboquant compress \
  --input embeddings.npy --output compressed.npz --bit-width 4

# Retrieve top-k from a compressed file index
uv run openclaw-turboquant retrieve \
  --query query.npy --index compressed.npz --top-k 5

# Benchmark distortion quality
uv run openclaw-turboquant benchmark --dim 128 --bit-width 4 --n-vectors 1000

Notes

Embeddings are not generated by this skill. The caller must produce .npy vectors using any embedding model (e.g., text-embedding-3-small).
The store persists across turns inside ~/.openclaw/memory/ so context survives session restarts.
bit-width 4 is the recommended default: ~6× compression with negligible quality loss.
Use prod mode (default) for retrieval; it provides unbiased inner-product estimation.
Requires uv on $PATH; run uv sync once in the project directory before first use.

Version history

1 version in total

v0.1.0 (current)

2026-05-03 09:00
