
Agent Learner

Benchmark and compare agent prompts and evaluation results. Use when tuning strategies, evaluating outputs, or comparing configurations.
xueyetianya
AI Intelligence clawhub v2.0.2 4 versions 100000 Key: not required
★ 0 Stars
📥 741 Downloads
💾 11 Installs
4 Versions
#chinese #latest #productivity

Overview

An AI toolkit for configuring, benchmarking, comparing, and optimizing agent prompts and evaluation results. Agent Learner provides persistent, file-based logging for each command category with timestamped entries, summary statistics, multi-format export, and full-text search across all records.

Commands

Command    Description
---------  ----------------------
configure  Configure agent settings: log configuration entries or view recent ones
benchmark  Benchmark agent performance: log benchmark results or view history
compare    Compare agent outputs: log comparison data or view recent comparisons
prompt     Prompt management: log prompt variations or view recent prompts
evaluate   Evaluate agent outputs: log evaluation results or view history
fine-tune  Fine-tune parameters: log fine-tuning sessions or view recent ones
analyze    Analyze agent behavior: log analysis entries or view recent analyses
cost       Cost tracking: log cost data or view recent cost entries
usage      Usage monitoring: log usage metrics or view recent usage data
optimize   Optimize configurations: log optimization runs or view history
test       Test agent behavior: log test results or view recent tests
report     Report generation: log report entries or view recent reports
stats      Show summary statistics across all log categories (entry counts, data size, first entry date)
export     Export all data in json, csv, or txt format to the data directory
search     Full-text search across all log files (case-insensitive)
recent     Show the 20 most recent entries from the activity history log
status     Health check: show version, data directory, total entries, disk usage, and last activity
help       Show the full help message with all available commands
version    Print the current version string

Each data command (configure, benchmark, compare, etc.) works in two modes:

  • Without arguments: displays the 20 most recent entries from that category
  • With arguments: saves the input as a new timestamped entry and reports the total count
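
The two modes can be pictured with a short shell sketch. It is an illustration under stated assumptions, not the tool's actual source: the function name log_or_show, the DATA_DIR environment override, the timestamp format, and the confirmation message are all hypothetical. Only the timestamp|value layout, the 20-entry window, and the default directory come from this page.

```shell
# Hypothetical sketch of a two-mode data command; not the tool's real code.
# Assumption: DATA_DIR may be overridden through the environment.
DATA_DIR="${DATA_DIR:-$HOME/.local/share/agent-learner}"

# log_or_show CATEGORY [TEXT...]
#   no TEXT -> print the 20 most recent entries of CATEGORY
#   TEXT    -> append "timestamp|TEXT" and report the running total
log_or_show() {
  local category file stamp total
  category="$1"
  shift
  file="$DATA_DIR/$category.log"
  mkdir -p "$DATA_DIR"

  if [ "$#" -eq 0 ]; then
    # Read mode: an empty category is not an error.
    if [ -f "$file" ]; then
      tail -n 20 "$file"
    fi
    return 0
  fi

  # Write mode. The timestamp format below is an assumption.
  stamp="$(date '+%Y-%m-%d %H:%M:%S')"
  printf '%s|%s\n' "$stamp" "$*" >> "$file"
  total="$(grep -c '' "$file")"   # count lines, i.e. entries
  echo "Saved to $category ($total total entries)"
}
```

Calling log_or_show benchmark "run A" appends one entry, while log_or_show benchmark alone lists the latest ones.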

Data Storage

All data is stored in plain text files under the data directory:

  • Category logs: $DATA_DIR/.log, one file per command (e.g., configure.log, benchmark.log, prompt.log); each entry has the form timestamp|value
  • History log: $DATA_DIR/history.log — audit trail of every command executed with timestamps
  • Export files: $DATA_DIR/export. — generated by the export command in json, csv, or txt format

Default data directory: ~/.local/share/agent-learner/
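
Because each category entry is a single timestamp|value line, the logs are easy to read back with plain shell. The sketch below is a hedged illustration: print_entries is a hypothetical helper, not part of agent-learner, and it assumes the first | on a line separates the timestamp from the value.

```shell
# print_entries FILE: show each "timestamp|value" line as "[timestamp] value".
# Hypothetical helper; splitting on the FIRST "|" keeps any later "|" in the value.
print_entries() {
  local line
  while IFS= read -r line || [ -n "$line" ]; do
    [ -n "$line" ] || continue            # skip blank lines
    printf '[%s] %s\n' "${line%%|*}" "${line#*|}"
  done < "$1"
}
```

Splitting with parameter expansion rather than a field separator means a value that itself contains | is printed intact.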

Requirements

  • Bash (with set -euo pipefail support)
  • Standard Unix utilities: grep, cat, date, echo, wc, du, head, tail, basename
  • No external dependencies or API keys required
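
A quick way to confirm the utilities listed above are available is to probe each one with command -v. The helper name missing_tools is hypothetical and the check is only a sketch; agent-learner itself is not known to ship such a command.

```shell
# missing_tools NAME...: print every NAME that cannot be found on PATH.
# Hypothetical helper for checking the requirements by hand.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Prints nothing when every listed utility is present.
missing_tools grep cat date wc du head tail basename
```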

When to Use

  1. Benchmarking agent performance — When you need to track and compare benchmark results across different agent configurations, models, or prompt strategies
  2. Prompt engineering iteration — When you're testing multiple prompt variations and want to log each version with results for later comparison
  3. Cost and usage tracking — When you need to monitor API costs and usage metrics over time to optimize spending
  4. Fine-tuning experiments — When running fine-tuning sessions and you want to log parameters, results, and observations for reproducibility
  5. Cross-category analysis — When you need to search across all logged data (benchmarks, prompts, evaluations, costs) to find patterns or specific entries

Examples

# Initialize and check status
agent-learner status

# Log a benchmark result
agent-learner benchmark "GPT-4o on MMLU: 88.7% accuracy, 1.2s avg latency"

# Log a prompt variation
agent-learner prompt "System: You are a helpful coding assistant. Always explain your reasoning step by step."

# Compare two configurations
agent-learner compare "GPT-4o vs Claude-3.5: GPT-4o 12% faster, Claude 5% more accurate on code tasks"

# Track costs
agent-learner cost 'March batch: 12,450 tokens input, 3,200 tokens output, $0.47 total'

# View all recent benchmarks
agent-learner benchmark

# Search across all logs for a specific term
agent-learner search "accuracy"

# Export all data as JSON
agent-learner export json

# View summary statistics
agent-learner stats

# Show recent activity
agent-learner recent
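
The search example above can be approximated with grep alone, which is useful for inspecting the logs without the tool. This is a sketch under assumptions: search_logs is a hypothetical name, and treating the term as a literal string (-F) is a choice made here, since this page only states that the search is case-insensitive.

```shell
# search_logs TERM DIR: case-insensitive search over every *.log file in DIR.
# Hypothetical sketch; -F (literal match) is an assumption, -i matches the docs.
search_logs() {
  local term dir
  term="$1"
  dir="$2"
  # The extra /dev/null operand forces grep to prefix each hit with its file name,
  # even when only one log file exists. No match (exit 1) is not treated as an error.
  grep -iF -- "$term" "$dir"/*.log /dev/null 2>/dev/null || true
}
```

For instance, search_logs accuracy ~/.local/share/agent-learner prints matching lines as file:timestamp|value.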

Output

All commands return output to stdout. Export files are written to the data directory:

agent-learner export json   # → ~/.local/share/agent-learner/export.json
agent-learner export csv    # → ~/.local/share/agent-learner/export.csv
agent-learner export txt    # → ~/.local/share/agent-learner/export.txt

Every command execution is logged to $DATA_DIR/history.log for auditing purposes.
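
The internal layout of the export files is not documented on this page. As a rough idea of what a CSV export of timestamp|value logs could look like, here is a sketch under assumptions: the function name export_csv, the column names, the quoting, and the choice to skip history.log (whose line format is not described) are all hypothetical.

```shell
# export_csv DIR: print every category log in DIR as CSV on stdout.
# Hypothetical sketch; columns and quoting are assumptions, not the tool's format.
export_csv() {
  local dir file category line stamp value
  dir="$1"
  echo 'category,timestamp,value'
  for file in "$dir"/*.log; do
    [ -f "$file" ] || continue              # no logs yet: the glob stays literal
    category="$(basename "$file" .log)"
    if [ "$category" = "history" ]; then
      continue                              # assumption: audit trail is not exported
    fi
    while IFS= read -r line || [ -n "$line" ]; do
      [ -n "$line" ] || continue
      stamp="${line%%|*}"
      # Double any embedded quotes so the value stays a single CSV field.
      value="$(printf '%s' "${line#*|}" | sed 's/"/""/g')"
      printf '%s,"%s","%s"\n' "$category" "$stamp" "$value"
    done < "$file"
  done
}
```

Quoting both text columns keeps commas inside an entry, such as "12,450 tokens", from being read as column breaks.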


Powered by BytesAgain | bytesagain.com | hello@bytesagain.com

Version History

4 versions in total

  • v2.0.2 (current)
    2026-03-29 11:34 Safe
  • v2.3.6
    2026-03-27 21:20
  • v1.0.0
    2026-03-26 22:25
  • v1.0.1
    2026-03-14 02:07

Security Checks

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
