Overview

paperbanana-dashscope

Native TypeScript CLI for generating academic figures from paper text. Zero Python dependencies. Powered by Alibaba Cloud DashScope.

Install & Update

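npm install -g paperbanana-dashscope    # first-time install
npm update -g paperbanana-dashscope     # update an existing install
paperbanana-dashscope --version         # confirm the installed version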
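
Some fixes listed under Troubleshooting require v1.0.2 or later, so a script may want to compare the installed version against a minimum. The sketch below is one way to do that in plain shell. `version_ge` is a hypothetical helper name, and it assumes bare three-part versions such as `1.0.5`; whether `paperbanana-dashscope --version` prints exactly that format is an assumption to check on your machine.

```shell
# Hypothetical helper: succeeds if dotted version $1 is >= $2.
# Assumes bare numeric versions like 1.0.5 (no leading "v").
version_ge() {
  # Sort both versions numerically field by field; the lower one comes first.
  lowest=$(printf '%s\n%s\n' "$1" "$2" \
    | sort -t . -k 1,1n -k 2,2n -k 3,3n \
    | head -n 1)
  # $1 >= $2 exactly when the lower of the two is $2.
  [ "$lowest" = "$2" ]
}
```

If the version output does match that format, a call such as `version_ge "$(paperbanana-dashscope --version)" 1.0.2 || npm update -g paperbanana-dashscope` would update only when needed.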

Prerequisites

You must configure a DashScope API key before generating figures. Check the current status:

paperbanana-dashscope info

If no API key is configured, set one of:

# Option 1: Environment variable (simplest)
export OPENAI_API_KEY="sk-xxx"

# Option 2: Global config file
mkdir -p ~/.paperbanana-dashscope
cat > ~/.paperbanana-dashscope/config.yaml << 'YAML'
defaults:
  main_model_name: "qwen-vl-max"
  image_gen_model_name: "wanx2.1-t2i-turbo"
api_keys:
  openai_api_key: "sk-xxx"
YAML
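
Before starting a long run, a script can check that one of the two options above is in place. This is a minimal sketch, assuming the key comes only from `OPENAI_API_KEY` or from `~/.paperbanana-dashscope/config.yaml` as shown above. `has_api_key` and `PB_CONFIG` are hypothetical names introduced for illustration, and `paperbanana-dashscope info` remains the authoritative check.

```shell
# Hypothetical helper: succeeds if a key appears to be configured by
# either option above. PB_CONFIG is an illustrative override for the path.
has_api_key() {
  config="${PB_CONFIG:-$HOME/.paperbanana-dashscope/config.yaml}"
  # Option 1: non-empty environment variable.
  if [ -n "${OPENAI_API_KEY:-}" ]; then
    return 0
  fi
  # Option 2: config file that contains an openai_api_key entry.
  if [ -f "$config" ] && grep -q 'openai_api_key:' "$config"; then
    return 0
  fi
  return 1
}
```

For example, `has_api_key || echo "No API key found; run: paperbanana-dashscope info" >&2` prints a hint when neither option is set. The check only looks for the presence of a key, not its validity.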

Basic Usage

Generate a single figure from text:

paperbanana-dashscope generate \
  --content "Method section text describing the architecture..." \
  --caption "Figure 1: System overview" \
  --output ~/Downloads/figure.png \
  --num-candidates 1
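
Method sections are usually too long to paste inline, so it may be easier to read `--content` from a file. The wrapper below is a sketch that uses only the three required options. `gen_from_file` and the `PB_BIN` override are hypothetical names, not part of the CLI; setting `PB_BIN=echo` prints the arguments instead of calling the API.

```shell
# Hypothetical wrapper: read the paper text from a file and pass it to
# `generate`. PB_BIN is an illustrative override used for dry runs.
gen_from_file() {
  content_file="$1"
  caption="$2"
  output="$3"
  if [ ! -r "$content_file" ]; then
    echo "cannot read $content_file" >&2
    return 1
  fi
  # $(cat ...) expands to the file contents as a single argument.
  "${PB_BIN:-paperbanana-dashscope}" generate \
    --content "$(cat "$content_file")" \
    --caption "$caption" \
    --output "$output"
}
```

A call might look like `gen_from_file method.txt "Figure 1: System overview" ~/Downloads/figure.png`.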

Key Options

| Option | Description | Default |
| --- | --- | --- |
| `--content` | Paper text describing the method | required |
| `--caption` | Figure caption | required |
| `--output` | Output PNG file path | required |
| `--task` | `diagram` or `plot` | `diagram` |
| `--num-candidates` | Number of candidates to generate | `1` |
| `--max-critic-rounds` | Critic refinement iterations | `3` |
| `--aspect-ratio` | `1:1`, `16:9`, `4:3`, `21:9`, etc. | `21:9` |
| `--main-model-name` | VLM for planning/critic | `qwen-vl-max` |
| `--image-gen-model-name` | Image generation model | `wanx2.1-t2i-turbo` |

Available Image Models

DashScope offers several text-to-image models, grouped here into four sets:

Wanxiang legacy (fast, cheap):

wanx2.1-t2i-turbo (default, fastest)
wanx2.1-t2i-plus (better quality)

Wanxiang 2.7 (latest, highest quality):

wan2.7-image-pro (professional, supports 4K output in text-to-image)
wan2.7-image (standard, supports up to 2K, same pricing as wan2.6)

Wanxiang 2.x (previous generation):

wan2.6-t2i (flagship of 2.6 series)
wan2.5-t2i-preview
wan2.2-t2i-flash / wan2.2-t2i-plus

Qwen-Image (best for figures with text labels):

qwen-image-plus (recommended for diagrams with English/Chinese labels)
qwen-image-max (top-tier text rendering)

Switch models inline:

paperbanana-dashscope generate \
  --content "..." \
  --caption "..." \
  --image-gen-model-name wan2.6-t2i \
  --output figure.png
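
To compare models on the same input, the command above can be repeated once per model. The loop below is a sketch under that assumption. `compare_models`, the `figure_<model>.png` naming scheme, and the `PB_BIN` dry-run override are all hypothetical choices made for illustration.

```shell
# Hypothetical helper: render the same figure once per model so the
# results can be compared side by side.
# Usage: compare_models CONTENT CAPTION MODEL [MODEL...]
compare_models() {
  content="$1"
  caption="$2"
  shift 2
  for model in "$@"; do
    # One output file per model; stop at the first failing run.
    "${PB_BIN:-paperbanana-dashscope}" generate \
      --content "$content" \
      --caption "$caption" \
      --image-gen-model-name "$model" \
      --output "figure_${model}.png" || return 1
  done
}
```

For example, `compare_models "..." "..." wanx2.1-t2i-turbo wan2.6-t2i qwen-image-plus` would write three files. Each run is billed separately, so it is worth previewing with `PB_BIN=echo` first.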

Pipeline Modes

Use --exp-mode to control which agents run:

| Mode | Agents | Use case |
| --- | --- | --- |
| `vanilla` | Vanilla only | Fastest, no refinement |
| `dev_planner` | Planner only | Just generate description |
| `dev_planner_critic` | Planner + Critic | With refinement loop |
| `dev_full` | Planner + Stylist + Visualizer + Critic | Full pipeline |
| `demo_full` | Same as dev_full + retriever | Default, best quality |
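
When the mode comes from a script variable, a typo can be caught before any API call is made. The sketch below checks a value against the five modes in the table above. `is_valid_exp_mode` is a hypothetical helper, and the list is taken from this table only; the CLI itself may accept other values.

```shell
# Hypothetical helper: succeeds if $1 is one of the modes listed in the
# table above. The CLI may accept modes that are not listed here.
is_valid_exp_mode() {
  case "$1" in
    vanilla|dev_planner|dev_planner_critic|dev_full|demo_full)
      return 0 ;;
    *)
      return 1 ;;
  esac
}
```

A script could guard a run with `is_valid_exp_mode "$mode" || { echo "unknown mode: $mode" >&2; exit 1; }` before passing `--exp-mode "$mode"`.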

Common Workflows

Quick draft (fast, low cost):

paperbanana-dashscope generate \
  --content "..." \
  --caption "..." \
  --output draft.png \
  --exp-mode vanilla \
  --image-gen-model-name wanx2.1-t2i-turbo

High-quality figure for paper submission:

paperbanana-dashscope generate \
  --content "..." \
  --caption "..." \
  --output paper_fig.png \
  --image-gen-model-name wan2.6-t2i \
  --num-candidates 3 \
  --max-critic-rounds 5

Figure with English/Chinese text labels:

paperbanana-dashscope generate \
  --content "..." \
  --caption "..." \
  --output labeled.png \
  --image-gen-model-name qwen-image-plus

Troubleshooting

"未检测到任何 API Key" ("No API key detected"): Run paperbanana-dashscope info and follow the configuration guide.
"size is not in the correct format": Fixed in v1.0.2 and later. Run npm update -g paperbanana-dashscope.
"url error": The installed version is too old. Upgrade to v1.0.2 or later, which adds support for the newer wan2.6 and qwen-image models.

Resources

npm: https://www.npmjs.com/package/paperbanana-dashscope
GitHub: https://github.com/TashanGKD/PaperBanana-DashScope

Version History

1 version in total

v1.0.5 (current)

2026-05-03 09:59
