Native TypeScript CLI for generating academic figures from paper text. Zero Python dependencies. Powered by Alibaba Cloud DashScope.
npm install -g paperbanana-dashscope
paperbanana-dashscope --version
You must configure a DashScope API key before generating figures. Check the current status:
paperbanana-dashscope info
If no API key is configured, use one of the following options:
# Option 1: Environment variable (simplest)
export OPENAI_API_KEY="sk-xxx"
# Option 2: Global config file
mkdir -p ~/.paperbanana-dashscope
cat > ~/.paperbanana-dashscope/config.yaml << 'YAML'
defaults:
  main_model_name: "qwen-vl-max"
  image_gen_model_name: "wanx2.1-t2i-turbo"
api_keys:
  openai_api_key: "sk-xxx"
YAML
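Before running a long generation job, it can help to confirm which of the two options above would supply the key. The following is a minimal sketch, not part of the CLI: `api_key_source` is a hypothetical helper, and the order in which it checks (environment variable first, then config file) is an assumption rather than documented behaviour. It only looks for the presence of a key, it does not validate it.

```shell
# Hypothetical helper: report where an API key appears to be configured.
# Prints "env", "config", or "none". The check order is an assumption.
api_key_source() {
  # Optional first argument: path to the config file (defaults to the global one).
  config_file="${1:-$HOME/.paperbanana-dashscope/config.yaml}"
  if [ -n "${OPENAI_API_KEY:-}" ]; then
    echo "env"
  elif [ -f "$config_file" ] && grep -q 'openai_api_key:' "$config_file"; then
    echo "config"
  else
    echo "none"
  fi
}
```

Calling `api_key_source` with no arguments inspects the global config path shown above. `paperbanana-dashscope info` remains the authoritative check.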
Generate a single figure from text:
paperbanana-dashscope generate \
--content "Method section text describing the architecture..." \
--caption "Figure 1: System overview" \
--output ~/Downloads/figure.png \
--num-candidates 1
| Option | Description | Default |
|---|---|---|
| --content | Paper text describing the method | required |
| --caption | Figure caption | required |
| --output | Output PNG file path | required |
| --task | diagram or plot | diagram |
| --num-candidates | Number of candidates to generate | 1 |
| --max-critic-rounds | Critic refinement iterations | 3 |
| --aspect-ratio | 1:1, 16:9, 4:3, 21:9, etc. | 21:9 |
| --main-model-name | VLM for planning/critic | qwen-vl-max |
| --image-gen-model-name | Image generation model | wanx2.1-t2i-turbo |
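The options in the table can be combined freely. As an illustration, the sketch below wraps a plot-style invocation that overrides the default `--task` and `--aspect-ratio`. The function name `generate_plot` and the `PAPERBANANA_BIN` variable are hypothetical conveniences introduced here (the variable lets you substitute `echo` for a dry run); only the flags themselves come from the table.

```shell
# Hypothetical wrapper: generate a 16:9 plot instead of the default 21:9 diagram.
# $1 = paper text, $2 = caption, $3 = output PNG path.
# PAPERBANANA_BIN is an assumption of this sketch, not a variable the CLI reads.
generate_plot() {
  "${PAPERBANANA_BIN:-paperbanana-dashscope}" generate \
    --content "$1" \
    --caption "$2" \
    --output "$3" \
    --task plot \
    --aspect-ratio 16:9
}
```

Setting `PAPERBANANA_BIN=echo` before calling `generate_plot` prints the arguments instead of running the tool, which is a cheap way to check quoting.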
DashScope offers the following groups of text-to-image models:

Wanxiang legacy (fast, cheap):

- wanx2.1-t2i-turbo (default, fastest)
- wanx2.1-t2i-plus (better quality)

Wanxiang 2.7 (latest, highest quality):

- wan2.7-image-pro (professional, supports 4K output in text-to-image)
- wan2.7-image (standard, supports up to 2K, same pricing as wan2.6)

Wanxiang 2.x (previous generation):

- wan2.6-t2i (flagship of the 2.6 series)
- wan2.5-t2i-preview
- wan2.2-t2i-flash / wan2.2-t2i-plus

Qwen-Image (best for figures with text labels):

- qwen-image-plus (recommended for diagrams with English/Chinese labels)
- qwen-image-max (top-tier text rendering)

Switch models inline:
paperbanana-dashscope generate \
--content "..." \
--caption "..." \
--image-gen-model-name wan2.6-t2i \
--output figure.png
Use --exp-mode to control which agents run:
| Mode | Agents | Use case |
|---|---|---|
| vanilla | Vanilla only | Fastest, no refinement |
| dev_planner | Planner only | Just generate description |
| dev_planner_critic | Planner + Critic | With refinement loop |
| dev_full | Planner + Stylist + Visualizer + Critic | Full pipeline |
| demo_full | Same as dev_full + retriever | Default, best quality |
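To see how much the extra agents change the result for a given paper, one option is to run the same input through several modes and compare the images side by side. The loop below is a rough sketch under that assumption: `compare_modes` and `PAPERBANANA_BIN` are hypothetical names, and `dev_planner` is left out because the table describes it as producing only a description.

```shell
# Hypothetical helper: render the same content once per mode into one directory.
# $1 = paper text, $2 = caption, $3 = output directory.
# PAPERBANANA_BIN is an assumption of this sketch, not a variable the CLI reads.
compare_modes() {
  content="$1"
  caption="$2"
  outdir="$3"
  mkdir -p "$outdir" || return 1
  for mode in vanilla dev_planner_critic demo_full; do
    # Keep going if one mode fails so the remaining modes still run.
    "${PAPERBANANA_BIN:-paperbanana-dashscope}" generate \
      --content "$content" \
      --caption "$caption" \
      --exp-mode "$mode" \
      --output "$outdir/$mode.png" || echo "mode $mode failed" >&2
  done
}
```

Each mode writes to its own file (for example `vanilla.png`), so the outputs never overwrite each other. Note that every mode makes its own API calls, so cost grows with the number of modes.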
Quick draft (fast, low cost):
paperbanana-dashscope generate \
--content "..." \
--caption "..." \
--output draft.png \
--exp-mode vanilla \
--image-gen-model-name wanx2.1-t2i-turbo
High-quality figure for paper submission:
paperbanana-dashscope generate \
--content "..." \
--caption "..." \
--output paper_fig.png \
--image-gen-model-name wan2.6-t2i \
--num-candidates 3 \
--max-critic-rounds 5
Figure with English/Chinese text labels:
paperbanana-dashscope generate \
--content "..." \
--caption "..." \
--output labeled.png \
--image-gen-model-name qwen-image-plus
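The three recipes above differ mainly in the image model they select. If you script these calls, a small lookup can keep that choice in one place. This is only a sketch: `pick_image_model` and the purpose names `draft`, `quality`, and `labels` are hypothetical labels for the recipes, while the model names are the ones used in the commands above.

```shell
# Hypothetical lookup: map a purpose to the image model used in the recipes above.
# Prints the model name, or returns 1 for an unknown purpose.
pick_image_model() {
  case "$1" in
    draft)   echo "wanx2.1-t2i-turbo" ;;  # quick draft: fast, low cost
    quality) echo "wan2.6-t2i" ;;         # high-quality figure for submission
    labels)  echo "qwen-image-plus" ;;    # figures with English/Chinese text labels
    *)
      echo "unknown purpose: $1 (expected draft, quality, or labels)" >&2
      return 1
      ;;
  esac
}
```

The result can be passed straight to the CLI, for example `--image-gen-model-name "$(pick_image_model labels)"`.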
For configuration problems, run paperbanana-dashscope info and follow the configuration guide.
To update, run npm update -g paperbanana-dashscope.