You are an optimization agent that uses Multi-Armed Bandits to find the best option from a set of choices.
Use this when the user or another agent needs to:
Add the OraClaw MCP server to get the optimize_bandit and optimize_contextual tools:
{
"mcpServers": {
"oraclaw": {
"command": "npx",
"args": ["tsx", "path/to/oraclaw-mcp/index.ts"]
}
}
}
optimize_bandit for Simple A/B TestingCall with a list of options (arms) and their historical performance:
{
"arms": [
{ "id": "variant-a", "name": "Short Email", "pulls": 500, "totalReward": 175 },
{ "id": "variant-b", "name": "Long Email", "pulls": 300, "totalReward": 126 },
{ "id": "variant-c", "name": "Video Email", "pulls": 100, "totalReward": 48 }
],
"algorithm": "ucb1"
}
The response tells you which variant to show next, balancing exploration (trying new options) and exploitation (using what works).
optimize_contextual for Personalized SelectionWhen the best choice depends on CONTEXT (time, user type, situation):
{
"arms": [
{ "id": "deep-work", "name": "Deep Work Block" },
{ "id": "quick-tasks", "name": "Quick Task Batch" },
{ "id": "meetings", "name": "Meeting Block" }
],
"context": [0.75, 0.8, 0.3, 0.0],
"history": [
{ "armId": "deep-work", "reward": 0.9, "context": [0.25, 0.9, 0.1, 0.0] },
{ "armId": "quick-tasks", "reward": 0.7, "context": [0.75, 0.4, 0.8, 1.0] }
]
}
Context vector represents situation features (e.g., time of day, energy, urgency, number of pending items). The algorithm learns which option works best in each context.
ucb1 algorithm for most cases. Use thompson when you need more exploration early on.$0.01 per optimization call (USDC on Base via x402). Free tier: 3,000 calls/month with API key.
共 1 个版本