
Autoresearch Pilot

Guide for setting up and running Karpathy's autoresearch — autonomous AI-driven LLM training experiments. Helps write program.md, interpret results, and opti...
Author: tommot2
Uncategorized · clawhub · v1.0.0 · 1 version · API key: not required

Overview

Autoresearch Pilot v1.0

Install: clawhub install autoresearch-pilot

Your co-pilot for Karpathy's autoresearch — autonomous AI-driven LLM training experiments on a single GPU.

Language

Detect the response language from the user's message. Default: English.

How It Works

Autoresearch lets an AI agent modify train.py, run a 5-minute experiment, check whether val_bpb improved, and iterate. This skill helps you set it up, write an effective program.md, and interpret the results.
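The loop described above can be sketched as a greedy keep-if-better search. This is a minimal illustration, not the repo's implementation: `run_search`, `propose`, and `evaluate` are hypothetical names, `propose` stands in for the agent editing train.py, `evaluate` stands in for one 5-minute run, and the sketch assumes that a change which does not lower val_bpb is simply discarded.

```python
from typing import Callable, List, Tuple


def run_search(
    propose: Callable[[dict], dict],
    evaluate: Callable[[dict], float],
    baseline: dict,
    n_experiments: int,
) -> Tuple[dict, float, List[float]]:
    """Greedy keep-if-better loop over candidate configs (lower val_bpb wins)."""
    best_cfg = dict(baseline)
    best_bpb = evaluate(best_cfg)
    history = [best_bpb]
    for _ in range(n_experiments):
        candidate = propose(best_cfg)   # stand-in for the agent editing train.py
        bpb = evaluate(candidate)       # stand-in for one 5-minute training run
        history.append(bpb)
        if bpb < best_bpb:              # keep the change only if val_bpb improved
            best_cfg, best_bpb = candidate, bpb
    return best_cfg, best_bpb, history


# Toy stand-ins (hypothetical): each proposal adds one layer, and the
# made-up score is lowest at depth 6.
def toy_propose(cfg: dict) -> dict:
    return {**cfg, "depth": cfg["depth"] + 1}


def toy_evaluate(cfg: dict) -> float:
    return 1.0 + abs(cfg["depth"] - 6) * 0.01


best_cfg, best_bpb, history = run_search(toy_propose, toy_evaluate, {"depth": 4}, 4)
print(best_cfg, best_bpb, len(history))
```

In the toy run the search climbs from depth 4 to depth 6 and then rejects the depth-7 proposals, which is the behavior the keep-if-better rule is meant to produce.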

The Three Files

| File | Role | Modified by |
| --- | --- | --- |
| prepare.py | Data prep, tokenizer, utilities | Never (fixed) |
| train.py | Model, optimizer, training loop | The AI agent |
| program.md | Instructions for the AI agent | You (the human) |

Key Concepts

  • val_bpb: Validation bits per byte. Lower is better. The metric is independent of vocabulary size, so runs with different tokenizers stay comparable.
  • Time budget: Each experiment runs for exactly 5 minutes of wall-clock time, which is about 12 experiments per hour, or roughly 100 overnight.
  • Muon optimizer: Included. Often outperforms AdamW for small models.
  • DEPTH: Primary model complexity knob (default 8). Lower it for smaller GPUs.
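To make val_bpb concrete for users who are new to it, here is a small sketch of the standard bits-per-byte conversion: total cross-entropy in nats, divided by ln 2 to get bits, divided by the number of bytes of text covered. The function name and the numbers are illustrative assumptions; how the repo itself computes the metric is not shown in this document.

```python
import math


def bits_per_byte(total_loss_nats: float, total_bytes: int) -> float:
    """Convert summed cross-entropy (in nats) over a text into bits per byte."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return total_loss_nats / (math.log(2) * total_bytes)


# Hypothetical numbers: 1000 tokens at a mean loss of 2.0 nats per token,
# covering 4000 bytes of UTF-8 text (4 bytes per token on average).
mean_loss_nats = 2.0
n_tokens = 1000
n_bytes = 4000
bpb = bits_per_byte(mean_loss_nats * n_tokens, n_bytes)
print(f"{bpb:.4f}")
```

Because the denominator counts bytes rather than tokens, a tokenizer that packs more bytes into each token does not get an unfair advantage, which is why the metric survives changes to vocab_size.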

Setup Guide

Walk the user through these steps when they want to start:

  1. Prerequisites: Python 3.10+, NVIDIA GPU (H100 recommended), uv package manager
  2. Clone repo: git clone https://github.com/karpathy/autoresearch
  3. Install: uv sync inside the repo
  4. Prepare data: uv run prepare.py (one-time, ~2 min)
  5. Test run: uv run train.py (should complete in ~5 min)
  6. Point your AI agent at program.md and let it experiment
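Before walking through the steps above, the user can run a quick check like the following on their own machine. It is a rough sketch under stated assumptions: `check_prerequisites` is a hypothetical helper, and finding `nvidia-smi` on the PATH is only a proxy for a usable NVIDIA GPU, not proof of one.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 10)) -> dict:
    """Report which setup prerequisites appear to be present on this machine."""
    return {
        "python": sys.version_info[:2] >= min_python,
        "git": shutil.which("git") is not None,
        "uv": shutil.which("uv") is not None,
        # nvidia-smi on PATH is only a rough proxy for a usable NVIDIA GPU
        "nvidia-smi": shutil.which("nvidia-smi") is not None,
    }


status = check_prerequisites()
for name, ok in status.items():
    print(f"{name:<12}{'found' if ok else 'MISSING'}")
```

Anything reported as MISSING should be installed before step 2; the script only inspects the environment and does not start any training.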

Small GPU Tips (RTX 3090, MacBook, etc.)

When the user has a smaller GPU, suggest these changes to prepare.py and train.py:

  • Use the TinyStories dataset (lower entropy, works with small models)
  • Lower vocab_size to 4096 or 2048 (or 256 for byte-level)
  • Lower MAX_SEQ_LEN to 256
  • Lower DEPTH to 4 in train.py
  • Use WINDOW_PATTERN of "L" only
  • Lower TOTAL_BATCH_SIZE to 2**14
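The suggested values can be collected in one place and sanity-checked before the user edits anything. This is an illustrative sketch: the dictionary and helper names are hypothetical, and it assumes TOTAL_BATCH_SIZE counts tokens per optimizer step, which should be confirmed against the repo.

```python
# Values taken from the list above; which file each constant lives in
# should be checked against the repo.
SMALL_GPU_SUGGESTIONS = {
    "vocab_size": 4096,         # or 2048; 256 for byte-level
    "MAX_SEQ_LEN": 256,
    "DEPTH": 4,                 # in train.py
    "WINDOW_PATTERN": "L",
    "TOTAL_BATCH_SIZE": 2**14,  # assumed to be tokens per optimizer step
}


def is_power_of_two(n: int) -> bool:
    """True for 1, 2, 4, 8, ... using the single-set-bit trick."""
    return n > 0 and (n & (n - 1)) == 0


def sequences_per_step(total_batch_size: int, max_seq_len: int) -> int:
    """Full-length sequences per step, assuming the batch size counts tokens."""
    if total_batch_size % max_seq_len != 0:
        raise ValueError("total_batch_size must be a multiple of max_seq_len")
    return total_batch_size // max_seq_len


cfg = SMALL_GPU_SUGGESTIONS
n_seqs = sequences_per_step(cfg["TOTAL_BATCH_SIZE"], cfg["MAX_SEQ_LEN"])
print(cfg["TOTAL_BATCH_SIZE"], n_seqs)
```

Under that assumption, 2**14 tokens at a sequence length of 256 works out to 64 sequences per step, a quick way to judge whether a proposed combination is plausible for the user's memory budget.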

Writing program.md

When the user asks for help with program.md, help them define:

  1. Research goal — What to optimize for (speed, quality, efficiency)
  2. Experiment strategy — What to try first, what to vary
  3. Success criteria — Target val_bpb or improvement threshold
  4. Safety guardrails — What the agent should NOT change

Example structure for program.md:

  • State the goal clearly
  • List allowed modifications (architecture, hyperparams, optimizer)
  • Define experiment logging format
  • Set a stopping condition (e.g., "stop after 50 experiments with no improvement")
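One way to turn the structure above into a first draft is a small template function. This is only a sketch: `draft_program_md`, the section titles, and the example guardrails are hypothetical choices, not a format the repo requires. The function returns text for the user to review and save; it writes no files itself.

```python
from typing import Sequence


def draft_program_md(
    goal: str,
    allowed: Sequence[str],
    forbidden: Sequence[str],
    log_format: str,
    stop_rule: str,
) -> str:
    """Render a program.md skeleton; section titles are illustrative only."""

    def bullets(items: Sequence[str]) -> list:
        return [f"- {item}" for item in items]

    lines = ["# Research program", "", "## Goal", goal, ""]
    lines += ["## Allowed modifications", *bullets(allowed), ""]
    lines += ["## Do not change", *bullets(forbidden), ""]
    lines += ["## Logging format", log_format, ""]
    lines += ["## Stopping condition", stop_rule]
    return "\n".join(lines) + "\n"


text = draft_program_md(
    goal="Minimize val_bpb within the fixed 5-minute budget.",
    allowed=["architecture", "hyperparameters", "optimizer"],
    forbidden=["prepare.py", "the evaluation code"],  # example guardrails
    log_format="One line per experiment: change made, val_bpb, kept or discarded.",
    stop_rule="Stop after 50 experiments with no improvement.",
)
print(text)
```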

Interpreting Results

When the user shares experiment logs:

| Observation | Likely meaning | Note or action |
| --- | --- | --- |
| val_bpb decreasing | Model is learning | If it is not decreasing, check for bugs |
| val_bpb plateaued | May need architecture change | Normal for small models |
| Training loss << val loss | Overfitting | Increase regularization |
| NaN loss | Learning rate too high or instability | Lower LR, check gradients |
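The table can be applied mechanically to a shared log as a first pass. The sketch below is illustrative only: `diagnose` is a hypothetical helper, the `gap_ratio` threshold is an arbitrary assumption, and the default `patience` of 50 simply mirrors the example stopping condition from the program.md section.

```python
import math
from typing import List, Optional


def diagnose(
    val_bpb: List[float],
    train_loss: Optional[float] = None,
    val_loss: Optional[float] = None,
    patience: int = 50,
    gap_ratio: float = 0.8,
) -> List[str]:
    """Return notes for one sequence of per-experiment val_bpb values."""
    notes = []
    has_nan = any(math.isnan(x) for x in val_bpb)
    if train_loss is not None and math.isnan(train_loss):
        has_nan = True
    if has_nan:
        notes.append("NaN loss: lower the learning rate and check gradients")

    finite = [x for x in val_bpb if not math.isnan(x)]
    if finite:
        # index of the first occurrence of the best (lowest) value
        best_idx = min(range(len(finite)), key=finite.__getitem__)
        since_best = len(finite) - 1 - best_idx
        if since_best >= patience:
            notes.append(f"plateau: no improvement in {since_best} experiments")

    # A train loss far below val loss suggests overfitting (threshold assumed).
    if train_loss is not None and val_loss is not None:
        if train_loss < gap_ratio * val_loss:
            notes.append("overfitting: consider more regularization")
    return notes


print(diagnose([1.20, 1.10, 1.15, 1.12], patience=2))
print(diagnose([1.20, float("nan")]))
```

The notes are prompts for discussion with the user, not verdicts; a plateau in particular is often normal for small models.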

Quick Commands

| User says | Action |
| --- | --- |
| "set up autoresearch" | Walk through setup steps |
| "help me write program.md" | Draft research instructions |
| "my val_bpb is X" | Evaluate and suggest next steps |
| "optimize for small GPU" | Suggest parameter changes |
| "what should I try next" | Analyze recent experiments, propose new direction |

Guidelines for Agent

  1. Read-only guidance — suggest changes, let the user apply them
  2. Check GPU capability — ask what GPU they have before recommending parameters
  3. Start simple — recommend TinyStories + DEPTH 4 for first-time users
  4. Explain val_bpb — many users are new to this metric
  5. Refer to autoresearch repo — it's the source of truth for all defaults
  6. No exec — guide only, never run training commands

What This Skill Does NOT Do

  • Does NOT run training commands or experiments
  • Does NOT modify train.py or prepare.py directly
  • Does NOT require an NVIDIA GPU (guidance works for any platform)
  • Does NOT access credentials or private data
  • Does NOT write any files — pure advisory

More by TommoT2

  • setup-doctor — Diagnose and fix OpenClaw setup issues
  • context-brief — Persistent context survival across sessions
  • model-pilot — Intelligent model routing and cost optimization

Install the full suite:

clawhub install autoresearch-pilot setup-doctor context-brief model-pilot

Version History

1 version in total

  • v1.0.0 (current)
    2026-05-03 10:47, safe

Security Scan

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

suspicious