Cache LLM responses by meaning using Redis vector search. Similar questions return cached answers instantly instead of making expensive API calls.
node scripts/cache.js store "What is the capital of France?" "The capital of France is Paris."
node scripts/cache.js lookup "What's France's capital city?"
node scripts/cache.js stats
node scripts/cache.js clear
node scripts/cache.js query "Your question here"
This checks the cache first. On a miss, it calls OpenAI, caches the result, and returns it.
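The check-then-fill flow described above can be sketched in memory, without Redis or OpenAI. This is a minimal illustration under stated assumptions, not the code in scripts/cache.js: `createSemanticCache`, `embed`, and `callLLM` are hypothetical names, the array `entries` stands in for a Redis vector index, and a linear scan stands in for vector search.

```javascript
// Minimal in-memory sketch of the query flow. All names are hypothetical.
// A real setup would replace `entries` with a Redis vector index and pass
// OpenAI-backed functions as `embed` and `callLLM`.

// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function createSemanticCache({ embed, callLLM, threshold = 0.8 }) {
  const entries = []; // each entry: { vector, response }

  async function query(question) {
    const vector = await embed(question);

    // Find the most similar cached entry (linear scan for illustration).
    let best = null;
    let bestScore = -Infinity;
    for (const entry of entries) {
      const score = cosineSimilarity(vector, entry.vector);
      if (score > bestScore) {
        best = entry;
        bestScore = score;
      }
    }

    // Hit: similar enough, so return the cached answer without calling the LLM.
    if (best !== null && bestScore >= threshold) {
      return { response: best.response, hit: true, similarity: bestScore };
    }

    // Miss: call the LLM, cache the result, and return it.
    const response = await callLLM(question);
    entries.push({ vector, response });
    return { response, hit: false, similarity: null };
  }

  return { query };
}
```

Passing `embed` and `callLLM` in as parameters keeps the sketch testable with stub functions. Each call to `query` resolves to an object whose `hit` field tells the caller whether the LLM was skipped.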
Set these environment variables:
REDIS_URL: Redis connection string with vector search support (Redis Cloud or Redis Stack)
OPENAI_API_KEY: For generating embeddings
SEMANTIC_CACHE_THRESHOLD: Similarity threshold 0-1 (default: 0.80, higher = stricter matching)
SEMANTIC_CACHE_TTL: Cache TTL in seconds (default: 86400 = 24 hours)

User: "How do I reset my password?"
-> Embed query -> Search Redis -> MISS
-> Call LLM -> Get response -> Cache it -> Return response
User: "I forgot my password, how do I change it?"
-> Embed query -> Search Redis -> HIT (92.7% similar)
-> Return cached response in 8ms (saved ~2 seconds + API cost)
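To make the hit/miss decision in this flow concrete, here is a small sketch of how the threshold and TTL settings might be read and applied. The helper names (`readNumber`, `loadCacheConfig`, `similarityFromDistance`, `isCacheHit`) are hypothetical and not taken from scripts/cache.js. The conversion from distance to similarity assumes the index uses Redis's COSINE metric, which reports a distance equal to 1 minus the cosine similarity.

```javascript
// Hypothetical helpers; names are illustrative, not from scripts/cache.js.

// Read a numeric setting from an env-like object, falling back to a default
// when the variable is unset or empty.
function readNumber(env, name, fallback) {
  const raw = env[name];
  if (raw === undefined || raw === "") return fallback;
  const value = Number(raw);
  if (Number.isNaN(value)) {
    throw new TypeError(`${name} must be a number, got "${raw}"`);
  }
  return value;
}

// Load the two tuning knobs with the documented defaults (0.80 and 86400).
function loadCacheConfig(env = process.env) {
  const threshold = readNumber(env, "SEMANTIC_CACHE_THRESHOLD", 0.8);
  const ttlSeconds = readNumber(env, "SEMANTIC_CACHE_TTL", 86400);
  if (threshold < 0 || threshold > 1) {
    throw new RangeError("SEMANTIC_CACHE_THRESHOLD must be between 0 and 1");
  }
  if (ttlSeconds <= 0) {
    throw new RangeError("SEMANTIC_CACHE_TTL must be a positive number of seconds");
  }
  return { threshold, ttlSeconds };
}

// Assumption: the vector index reports cosine distance (1 - similarity).
function similarityFromDistance(distance) {
  return 1 - distance;
}

// A lookup counts as a hit when the similarity meets or exceeds the threshold.
function isCacheHit(similarity, threshold) {
  return similarity >= threshold;
}

const config = loadCacheConfig({}); // empty env, so defaults apply
console.log(isCacheHit(0.927, config.threshold)); // true
```

With the default threshold of 0.80, the 92.7% match in the example is served from the cache. Raising SEMANTIC_CACHE_THRESHOLD to 0.95 would turn the same lookup into a miss.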