llama.cpp Benchmark

Run llama.cpp benchmarks on GGUF models to measure prompt processing (pp) and token generation (tg) performance. Use when the user wants to benchmark an LLM.
alexhegit

llamacpp-bench

Run standardized benchmarks on GGUF models using llama.cpp's llama-bench tool.

Quick Start

# Basic benchmark
llama-bench -m model.gguf -p 512,1024,2048 -n 128,256 -ngl 99

# With specific backend
LLAMA_BACKEND=vulkan llama-bench -m model.gguf -p 512,1024,2048 -n 128,256 -ngl 99

Benchmark Parameters

| Parameter | Description | Default |
|-----------|-------------|---------|
| -m | Model path (GGUF file) | required |
| -p | Prompt sizes to test | 512 |
| -n | Generation lengths to test | 128 |
| -ngl | GPU layers to offload | 99 |
| -t | CPU threads | auto |
| -dev | Device selection | auto |

Standard Test Suite

For consistent comparisons across models, use:

-p 512,1024,2048 -n 128,256 -ngl 99

This tests:

  • Prompt processing: 512, 1024, 2048 tokens
  • Token generation: 128, 256 tokens
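As a rough illustration of what the suite produces, llama-bench reports each prompt size and each generation length as its own test row, so the flags above yield five rows per model rather than a cross product. The helper below is a hypothetical sketch (the name `suite_tests` is not part of llama.cpp or this skill) that lists the expected test names for a given `-p` and `-n` value:

```shell
# Hypothetical helper: list the test names a -p/-n combination should report.
suite_tests() {
    # $1: comma-separated prompt sizes, $2: comma-separated generation lengths
    for p in $(printf '%s' "$1" | tr ',' ' '); do
        printf 'pp%s\n' "$p"
    done
    for n in $(printf '%s' "$2" | tr ',' ' '); do
        printf 'tg%s\n' "$n"
    done
}
```

Calling `suite_tests 512,1024,2048 128,256` prints one name per line, which is handy for checking that a results file contains every expected row.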

Interpreting Results

| Metric | Meaning | Good Performance |
|--------|---------|------------------|
| pp512 | Prompt processing speed at 512 tokens | >1000 t/s |
| pp1024 | Prompt processing speed at 1024 tokens | >1000 t/s |
| pp2048 | Prompt processing speed at 2048 tokens | >1000 t/s |
| tg128 | Token generation speed (128 tokens) | >50 t/s |
| tg256 | Token generation speed (256 tokens) | >50 t/s |
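The thresholds in the table can be applied mechanically. The following is a minimal sketch, assuming you have already extracted a test name and its tokens-per-second value from the output; `rate_metric` is a hypothetical name, and the thresholds are simply the ones listed above, which will not suit every model size or hardware class:

```shell
# Hypothetical helper: compare one measurement against the table's thresholds.
rate_metric() {
    # $1: test name such as pp512 or tg128, $2: measured tokens per second
    case "$1" in
        pp*) threshold=1000 ;;
        tg*) threshold=50 ;;
        *)   echo "unknown"; return 1 ;;
    esac
    # awk performs the floating-point comparison that POSIX test cannot do
    awk -v v="$2" -v t="$threshold" \
        'BEGIN { if (v + 0 > t + 0) print "good"; else print "below" }'
}
```

For example, `rate_metric tg128 42.1` compares 42.1 against the 50 t/s token generation threshold.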

Backend Selection

llama-bench auto-detects available backends. Priority order:

  1. CUDA (NVIDIA GPUs)
  2. ROCm (AMD GPUs)
  3. Vulkan (cross-platform GPU)
  4. CPU (fallback)

To force a backend, set the environment variable as shown in Quick Start, or check which backends your build includes:

# Check available backends
llama-bench --help | grep -i "backend\|cuda\|rocm\|vulkan"
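The priority order listed above can be sketched as a small selection function. This is an illustration only: `pick_backend` is a hypothetical name, and it assumes you have already worked out which backends your build supports and pass them in as a space-separated list.

```shell
# Hypothetical helper: choose the highest-priority backend from those available.
pick_backend() {
    # $1: space-separated list of backends present in the build
    for want in cuda rocm vulkan cpu; do
        for have in $1; do
            if [ "$have" = "$want" ]; then
                echo "$want"
                return 0
            fi
        done
    done
    return 1   # none of the known backends was listed
}
```

For instance, `pick_backend "cpu vulkan"` selects Vulkan because it ranks above the CPU fallback.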

Batch Benchmarking

Use the provided script for benchmarking multiple models:

./scripts/benchmark_models.sh /path/to/models/*.gguf
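The script itself is not reproduced here. As a rough idea of what such a batch run involves, the sketch below loops over model files and applies the standard test suite to each. The function name `bench_models` and the `LLAMA_BENCH_BIN` override are assumptions made for this example, not features of the provided script.

```shell
# Hypothetical sketch of a batch loop; not the skill's actual script.
bench_models() {
    # LLAMA_BENCH_BIN is an assumed override for the llama-bench location
    bin="${LLAMA_BENCH_BIN:-llama-bench}"
    for model in "$@"; do
        if [ ! -f "$model" ]; then
            echo "skipping $model: not a file" >&2
            continue
        fi
        echo "== $model"
        "$bin" -m "$model" -p 512,1024,2048 -n 128,256 -ngl 99
    done
}
```

Invoke it as `bench_models /path/to/models/*.gguf`. Missing files are reported on stderr and skipped so one bad path does not abort the run.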

Saving Results

Output can be redirected to a file:

llama-bench -m model.gguf -p 512,1024,2048 -n 128,256 -ngl 99 > results.txt

Alternatively, use the benchmark script, which automatically saves results to timestamped files.
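If you redirect output by hand, a timestamped file name keeps successive runs from overwriting each other. The naming scheme below is an assumption made for illustration (the script's real scheme is not documented here), and `result_file` is a hypothetical helper:

```shell
# Hypothetical helper: build a per-model, per-run results file name.
result_file() {
    # $1: model path; the .gguf suffix is stripped from the base name
    name="$(basename "$1" .gguf)"
    printf 'results_%s_%s.txt\n' "$name" "$(date +%Y%m%d_%H%M%S)"
}
```

A run could then be saved with `llama-bench -m model.gguf -p 512,1024,2048 -n 128,256 -ngl 99 > "$(result_file model.gguf)"`.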

Common Issues

  1. Out of memory: Reduce -ngl (the number of layers offloaded to the GPU) or test smaller prompt sizes.
  2. Slow CPU performance: Set -t to the number of physical CPU cores.
  3. Backend not found: Check that llama.cpp was built with the desired backend.
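For the thread-count issue, a starting value for -t can be read from the system. The sketch below is one portable way to do that, with `cpu_count` as a hypothetical name. Note that these tools report logical CPUs, so on machines with SMT or hyper-threading the physical core count may be half the number printed.

```shell
# Hypothetical helper: report the number of online logical CPUs.
cpu_count() {
    if command -v nproc >/dev/null 2>&1; then
        nproc                              # GNU coreutils (most Linux systems)
    elif n="$(getconf _NPROCESSORS_ONLN 2>/dev/null)"; then
        echo "$n"                          # fallback where getconf supports it
    else
        sysctl -n hw.ncpu                  # macOS and BSD
    fi
}
```

The result can be passed straight through, for example `llama-bench -m model.gguf -t "$(cpu_count)"`, and then lowered if throughput improves with fewer threads.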

Building / Updating llama.cpp

Check Current Version

./scripts/build_llamacpp.sh -v

Shows:

  • Current Git commit and branch
  • Build date
  • Whether behind upstream
  • Available backends

Build or Update

# Interactive mode (prompts for backend selection)
./scripts/build_llamacpp.sh -u

# Specify backend directly
./scripts/build_llamacpp.sh -u -b vulkan   # Vulkan (AMD/Intel GPUs)
./scripts/build_llamacpp.sh -u -b cuda     # CUDA (NVIDIA GPUs)
./scripts/build_llamacpp.sh -u -b rocm     # ROCm (AMD GPUs)
./scripts/build_llamacpp.sh -u -b cpu      # CPU only

# Clean rebuild
./scripts/build_llamacpp.sh -c -b vulkan

# Custom build directory
./scripts/build_llamacpp.sh -u -b cuda -d /custom/path

Build Options

| Flag | Description |
|------|-------------|
| -v | Show version info and exit |
| -u | Update to latest from GitHub |
| -c | Clean build (remove existing) |
| -b | Backend: vulkan, cuda, rocm, cpu |
| -d | Build directory path |
| -j | Parallel jobs (default: CPU count) |
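How the script translates -b into a build configuration is not shown here. For a manual build, the sketch below maps each backend name to the CMake option used by recent upstream llama.cpp. This is an assumption about current upstream naming, not a description of the script, and older checkouts used different option names. `backend_cmake_flag` is a hypothetical helper.

```shell
# Hypothetical helper: map a backend name to an upstream CMake option.
backend_cmake_flag() {
    case "$1" in
        cuda)   echo "-DGGML_CUDA=ON" ;;
        rocm)   echo "-DGGML_HIP=ON" ;;      # ROCm builds go through HIP
        vulkan) echo "-DGGML_VULKAN=ON" ;;
        cpu)    echo "" ;;                   # CPU-only needs no extra option
        *)      echo "unknown backend: $1" >&2; return 1 ;;
    esac
}
```

A manual equivalent of a Vulkan build would then look like `cmake -B build $(backend_cmake_flag vulkan)` followed by `cmake --build build --config Release -j 8`, run from the llama.cpp source directory.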

Finding llama-bench

The benchmark script auto-detects llama-bench in these locations:

  • /DATA/Benchmark/llama.cpp/build/bin/llama-bench
  • ~/Repo/llama.cpp/build/bin/llama-bench
  • ~/lab/build/bin/llama-bench

If llama-bench is not found in any of these locations, the script searches your home directory. Alternatively, build it with the build script described above.
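The lookup can be approximated as "first executable candidate wins, then fall back to PATH". The following is a minimal sketch of that idea, not the script's actual code: `find_llama_bench` is a hypothetical name, and the home-directory search is left out.

```shell
# Hypothetical helper: return the first executable candidate, else search PATH.
find_llama_bench() {
    for candidate in "$@"; do
        if [ -x "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    # Falls through to whatever llama-bench is on PATH; fails if there is none
    command -v llama-bench
}
```

It would be called with the locations listed above, for example `find_llama_bench /DATA/Benchmark/llama.cpp/build/bin/llama-bench "$HOME/Repo/llama.cpp/build/bin/llama-bench" "$HOME/lab/build/bin/llama-bench"`.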

Version History

1 version in total

  • v1.0.0 (current)
    2026-05-07 06:17
