
nano-banana-pdf-edit

Edit PDF files visually using natural language with the nano-pdf CLI tool, powered by Google's Gemini 3 Pro Image (Nano Banana). Use this skill whenever the...
ps06756
Content creation · clawhub · v1.0.1 · 1 version · 99923 · Key: required

Overview

Nano PDF Editing Skill

Edit PDF files with natural language prompts using the nano-pdf CLI tool.

Nano-PDF converts PDF pages to images, sends them to Google's Gemini 3 Pro Image with your edit instructions, then stitches the AI-edited pages back into the PDF — preserving searchable text via OCR re-hydration.

Source: https://github.com/gavrielc/Nano-PDF

Prerequisites

Before running any nano-pdf command, ensure the following dependencies are installed. If any are missing, install them before proceeding:

  1. nano-pdf: pip install nano-pdf (or use uvx nano-pdf to run without installing)
  2. poppler: PDF-to-image rendering (brew install poppler on macOS / sudo apt-get install poppler-utils on Debian or Ubuntu Linux)
  3. tesseract: OCR for text layer restoration (brew install tesseract on macOS / sudo apt-get install tesseract-ocr on Debian or Ubuntu Linux)
  4. GEMINI_API_KEY: a paid Google Gemini API key (the free tier does not support image generation). Get one at https://aistudio.google.com/api-keys, then run export GEMINI_API_KEY="your_key"
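The prerequisites above can be checked with a small preflight script. This is a minimal sketch, not part of nano-pdf itself: the function names missing_cmds and preflight are hypothetical, and it assumes that finding pdftoppm (one of the binaries shipped with poppler) on PATH is a reasonable proxy for a working poppler install. If you run the tool through uvx instead of installing it, drop nano-pdf from the list.

```shell
#!/bin/sh
# Print the name of every command in "$@" that is not found on PATH.
missing_cmds() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Hypothetical preflight check for the four prerequisites listed above.
# Assumption: pdftoppm stands in for poppler, tesseract for the OCR engine.
preflight() {
  missing=$(missing_cmds nano-pdf pdftoppm tesseract)
  if [ -n "$missing" ]; then
    echo "Missing commands:" $missing >&2
    return 1
  fi
  if [ -z "${GEMINI_API_KEY:-}" ]; then
    echo "GEMINI_API_KEY is not set" >&2
    return 1
  fi
  echo "All prerequisites found"
}
```

Calling preflight before any nano-pdf command gives a single place to stop and tell the user what to install.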

Two Commands

nano-pdf edit — Modify existing pages

nano-pdf edit <file.pdf> <page> "<prompt>" [<page> "<prompt>" ...] [options]

Pages are 1-indexed. Multiple page+prompt pairs can be provided and are processed in parallel.
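Because pages are 1-indexed and arguments come in page+prompt pairs, it can help to validate the positional arguments before spending an API call. The helper names below (is_valid_page, check_edit_args) are hypothetical and only illustrate the argument shape described above; they do not handle options such as --output, which would need to be passed separately.

```shell
#!/bin/sh
# Succeed only if $1 is a valid 1-indexed page number (a positive integer).
is_valid_page() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "$1" -ge 1 ]
}

# Validate "<file> <page> <prompt> [<page> <prompt> ...]" (positional args only).
check_edit_args() {
  [ "$#" -ge 3 ] || { echo "usage: <file.pdf> <page> <prompt> ..." >&2; return 2; }
  shift   # drop the file name, leaving only page+prompt pairs
  while [ "$#" -gt 0 ]; do
    [ "$#" -ge 2 ] || { echo "page $1 has no prompt" >&2; return 2; }
    is_valid_page "$1" || { echo "invalid page: $1" >&2; return 2; }
    shift 2
  done
}
```

A typical use would be check_edit_args slides.pdf 1 "first prompt" 2 "second prompt" followed by the real nano-pdf edit call with the same arguments.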

nano-pdf add — Insert new AI-generated slides

nano-pdf add <file.pdf> <position> "<prompt>" [options]

Position 0 inserts at the beginning. The new slide automatically matches the visual style of the existing deck. Document context is enabled by default for add.
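The position argument is easy to get off by one. Assuming, as the examples in this document suggest, that position N means "insert after page N" (with 0 meaning the beginning), the page number the new slide lands on is simply N + 1. The function name new_slide_page is hypothetical:

```shell
#!/bin/sh
# Page number of the inserted slide, assuming position N means "insert after page N".
new_slide_page() {
  echo $(( $1 + 1 ))
}

new_slide_page 0   # prints 1: the new slide becomes the first page
new_slide_page 5   # prints 6: the new slide follows the original page 5
```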

Options Reference

For full details on all available flags, read references/options.md in this skill directory.

Key flags:

  • --output "new.pdf": Output filename (default: edited_<filename>.pdf)
  • --resolution "4K": 4K (default), 2K, or 1K
  • --style-refs "1,5": Pages to use as style references
  • --use-context / --no-use-context: Include (or exclude) the full PDF text as model context
  • --disable-google-search: Prevent the model from using Google Search

Workflow

When a user asks to edit a PDF:

  1. Check dependencies: Ensure nano-pdf, poppler, tesseract, and GEMINI_API_KEY are available. If any are missing, tell the user what to install and stop.
  2. Identify the edit: Determine which page(s) need changes and what the prompt should be.
  3. Choose the right command: edit for modifying existing pages, add for inserting new ones.
  4. Pick appropriate options:
    • Use --style-refs if the user wants a specific visual style from certain pages
    • Use --use-context when editing multiple pages that need to be consistent
    • Use --resolution "2K" if speed matters more than quality
  5. Run nano-pdf and present the output PDF to the user.
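Steps 3 and 4 of the workflow can be sketched as a dry-run helper that prints the command it would run instead of running it. Everything here other than the nano-pdf subcommands and flags named in this document is an assumption: build_cmd and the environment variables STYLE_PAGES, CONSISTENT, and FAST are hypothetical names chosen for illustration, and prompts containing double quotes would need extra escaping.

```shell
#!/bin/sh
# Print (do not run) a nano-pdf command for the chosen mode and options.
# Hypothetical inputs: STYLE_PAGES="1,3" names style reference pages,
# CONSISTENT=1 asks for full-document context, FAST=1 prefers speed.
build_cmd() {
  mode=$1; file=$2; target=$3; prompt=$4
  case "$mode" in
    edit|add) ;;
    *) echo "mode must be edit or add" >&2; return 2 ;;
  esac
  cmd="nano-pdf $mode \"$file\" $target \"$prompt\""
  if [ -n "${STYLE_PAGES:-}" ]; then
    cmd="$cmd --style-refs \"$STYLE_PAGES\""
  fi
  if [ "${CONSISTENT:-0}" = 1 ]; then
    cmd="$cmd --use-context"
  fi
  if [ "${FAST:-0}" = 1 ]; then
    cmd="$cmd --resolution \"2K\""
  fi
  echo "$cmd"
}
```

Printing the command first lets you confirm the page, prompt, and flags with the user before any paid API call is made.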

Prompt Writing Tips

The quality of the edit depends heavily on the prompt. Follow these guidelines:

  • Be specific: "Change the title from 'Overview' to 'Q3 Summary'" beats "update the title"
  • Reference visible elements: "The bar chart on the left side" helps the model locate what to change
  • One focused change per prompt: For complex edits, use multiple page+prompt pairs
  • Mention what to preserve: "Keep the layout the same but change the header color to blue"
  • Use style refs for consistency: When updating branding across pages, point at a reference page

Examples

For a comprehensive set of examples covering common use cases (typos, charts, branding, adding slides, batch edits), read references/examples.md in this skill directory.

Quick reference:

# Fix a typo on page 3
nano-pdf edit report.pdf 3 "Fix 'recieve' to 'receive'"

# Update chart data
nano-pdf edit deck.pdf 12 "Update the revenue chart to show Q3 at $2.5M"

# Multi-page branding update
nano-pdf edit slides.pdf \
  1 "Change header background to dark blue, text to white" \
  2 "Update the logo to show 'NewCorp' instead of 'OldCorp'" \
  --style-refs "1" --output branded.pdf

# Add a new title slide at the beginning
nano-pdf add deck.pdf 0 "Title slide: 'Annual Review 2025' with subtitle 'Building the Future'"

# Add a summary slide after page 5 using document context
nano-pdf add deck.pdf 5 "Summary slide with key takeaways as bullet points"

Troubleshooting

Issue | Solution
--- | ---
Missing system dependencies | Install missing deps (see Prerequisites above), restart terminal
GEMINI_API_KEY not found | export GEMINI_API_KEY="your_key"
PAID API key required | Enable billing at https://aistudio.google.com/api-keys
Style mismatch | Use --style-refs "1,3" pointing at pages with desired style
Slow processing | Use --resolution "2K" or "1K"
Bad OCR / text layer | Use --resolution "4K" for better OCR accuracy
Model ignores part of prompt | Break into smaller, focused edits across multiple runs

Version History

1 version in total

  • v1.0.1 (current)
    2026-03-29 08:54 · Safe · Safe

Security Check

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
