
Image to Editable PowerPoint

Convert static images (slides, posters, infographics) to editable PowerPoint files. OCR detects text, classical CV textmask detects ink pixels, mask-clip pre...
Source: minutemighty
Category: uncategorized · clawhub · v1.0.1 · 1 version · API key: not required
Tags: image processing, inpainting, latest, ocr, pptx

Overview

image2pptx: Image to Editable PowerPoint

What It Does

Converts a static image into an editable .pptx file where every text element is a selectable, editable text box over a clean inpainted background.

  1. OCR (PaddleOCR PP-OCRv5) — detects text regions with bounding boxes and content
  2. Textmask (classical CV) — finds text ink pixels via adaptive thresholding
  3. Mask-clip — ANDs textmask with OCR bboxes to preserve non-text elements
  4. Inpaint (LAMA) — reconstructs masked regions with neural inpainting
  5. Assemble — places editable text boxes with auto-scaled fonts and detected colors
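
The mask-clip step (step 3) can be illustrated with a small sketch. This is not the project's implementation: it assumes the textmask is a row-major grid of 0/1 values and that OCR boxes are (x0, y0, x1, y1) pixel rectangles with exclusive upper bounds. The function name `clip_mask_to_boxes` is hypothetical, and a real pipeline would more likely operate on NumPy arrays.

```python
def clip_mask_to_boxes(mask, boxes):
    """Keep ink pixels only where they fall inside an OCR bounding box.

    mask  -- list of rows, each a list of 0/1 ints (1 = ink pixel)
    boxes -- iterable of (x0, y0, x1, y1); upper bounds exclusive (assumed)
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    # Rasterize the boxes into a second mask of the same size.
    inside = [[0] * width for _ in range(height)]
    for x0, y0, x1, y1 in boxes:
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                inside[y][x] = 1
    # Logical AND: a pixel survives only if both masks mark it.
    return [
        [ink & box for ink, box in zip(mask_row, box_row)]
        for mask_row, box_row in zip(mask, inside)
    ]


# Ink in the right-hand column (say, the edge of an icon) lies outside
# the single OCR box, so it is dropped and will not be inpainted.
textmask = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 1],
]
clipped = clip_mask_to_boxes(textmask, [(0, 0, 2, 2)])
```

Only the pixels that are both ink and inside an OCR box remain in `clipped`, which is how non-text elements would be kept out of the inpainting mask.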

When to Use

| Scenario | Recommendation |
|----------|----------------|
| Slide with text on solid/flat background | Best results |
| Slide with photo background | Good: uses inpainting (warn about overlap areas) |
| Slide with solid background | Good: use --skip-inpaint for speed |
| Chinese/multilingual slide | Good: ch OCR handles both Chinese and English |
| Poster or infographic with text | Good: works well if text is separate from graphics |
| Dense chart with axis labels on bars | Caution: line grouping may over-merge crowded labels |
| Very thick/large decorative fonts | Caution: may exceed standard mask dilation range |
| Extract individual assets as PNGs | No: use px-asset-extract |
| Read text without creating PPTX | No: use OCR directly |
| Edit an existing .pptx file | No: use the pptx skill |

Installation

git clone https://github.com/JadeLiu-tech/px-image2pptx.git
cd px-image2pptx
pip install -e ".[all]"

Usage

CLI

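# Basic conversion
px-image2pptx slide.png -o output.pptx
# Use the ch OCR model (handles both Chinese and English)
px-image2pptx slide.png -o output.pptx --lang ch
# Skip LAMA inpainting for speed
px-image2pptx slide.png -o output.pptx --skip-inpaint
# Reuse pre-computed OCR results (skips OCR)
px-image2pptx slide.png -o output.pptx --ocr-json text_regions.json
# Save intermediate files for debugging
px-image2pptx slide.png -o output.pptx --work-dir ./debug/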
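
When the CLI is driven from a script, the invocations above can be assembled programmatically. The sketch below is illustrative only: `build_command` and `convert` are hypothetical helper names, and it uses just the flags shown in the examples above. It assumes `px-image2pptx` is on the PATH when `convert` is called.

```python
import subprocess


def build_command(image, output, lang=None, skip_inpaint=False,
                  ocr_json=None, work_dir=None):
    """Assemble an argv list using only the flags shown above."""
    cmd = ["px-image2pptx", str(image), "-o", str(output)]
    if lang:
        cmd += ["--lang", lang]
    if skip_inpaint:
        cmd.append("--skip-inpaint")
    if ocr_json:
        cmd += ["--ocr-json", str(ocr_json)]
    if work_dir:
        cmd += ["--work-dir", str(work_dir)]
    return cmd


def convert(image, output, **options):
    """Run the CLI once; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_command(image, output, **options), check=True)


print(build_command("slide.png", "output.pptx", lang="ch"))
```

Passing an argument list rather than a shell string avoids quoting problems with file names that contain spaces.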

Python API

from px_image2pptx import image_to_pptx

report = image_to_pptx("slide.png", "output.pptx")

# With options
report = image_to_pptx(
    "slide.png", "output.pptx",
    lang="ch",
    skip_inpaint=False,
    work_dir="./debug/",
)
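
For a folder of images, the call above can be wrapped in a loop. The following is a minimal sketch, not part of the package: `convert_folder` is a hypothetical helper, and the converter is passed in as an argument so the loop can be exercised without the models installed. The structure of the returned report is not documented here, so the sketch simply stores whatever the converter returns.

```python
import tempfile
from pathlib import Path


def convert_folder(src_dir, out_dir, convert, **options):
    """Convert every PNG in src_dir; return ({name: report}, {name: error}).

    convert -- callable shaped like image_to_pptx(input, output, **options)
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    reports, failures = {}, {}
    for image in sorted(Path(src_dir).glob("*.png")):
        target = out / (image.stem + ".pptx")
        try:
            reports[image.name] = convert(str(image), str(target), **options)
        except Exception as exc:  # keep going; record what failed
            failures[image.name] = str(exc)
    return reports, failures


def fake_convert(src, dst, **options):
    """Stand-in converter so the example runs without the package."""
    if Path(src).name == "bad.png":
        raise ValueError("unreadable image")
    return {"output": dst}


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp, "in")
    src.mkdir()
    (src / "a.png").touch()
    (src / "bad.png").touch()
    reports, failures = convert_folder(src, Path(tmp, "out"), fake_convert)
```

In real use, `image_to_pptx` would be passed in place of `fake_convert`, for example `convert_folder("slides/", "decks/", image_to_pptx, lang="ch")`.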

CLI Options

| Option | Default | Description |
|--------|---------|-------------|
| -o, --output | output.pptx | Output PPTX path |
| --ocr-json | | Pre-computed OCR JSON (skips OCR) |
| --lang | auto | OCR language: auto, en, ch |
| --sensitivity | 16 | Textmask sensitivity (lower = more) |
| --dilation | 12 | Textmask dilation pixels |
| --min-font | 8 | Min font size in points |
| --max-font | 72 | Max font size in points |
| --skip-inpaint | | Skip LAMA inpainting |
| --work-dir | | Save intermediate files |
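
The --min-font and --max-font options bound the auto-scaled font size. The scaling formula itself is not documented here, so the following is only an illustrative sketch under stated assumptions: the font size is estimated from the OCR box height as a fraction of the image height, mapped onto a slide of hypothetical height 7.5 inches, then clamped to the configured range. `estimate_font_pt` and its parameters are invented for illustration.

```python
def estimate_font_pt(box_height_px, image_height_px,
                     slide_height_in=7.5, min_font=8, max_font=72):
    """Map a text box height in pixels to a clamped font size in points."""
    if image_height_px <= 0:
        raise ValueError("image_height_px must be positive")
    # Fraction of the image the box occupies, scaled to slide inches,
    # then to points (72 points per inch).
    raw_pt = box_height_px / image_height_px * slide_height_in * 72
    # Clamp to the [min_font, max_font] range (CLI defaults: 8 and 72).
    return max(min_font, min(max_font, raw_pt))


body_pt = estimate_font_pt(54, 1080)     # mid-sized line, inside the range
tiny_pt = estimate_font_pt(5, 1080)      # raised to min_font
banner_pt = estimate_font_pt(400, 1080)  # capped at max_font
```

Clamping keeps tiny footnotes legible and stops oversized title boxes from producing unusable font sizes, at the cost of not matching the original proportions at the extremes.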

Models

Downloaded automatically on first use (~370 MB total). All models are from official open-source repositories.

| Model | Size | License | Source |
|-------|------|---------|--------|
| PP-OCRv5_server_det | 84 MB | Apache 2.0 | PaddlePaddle/PaddleOCR |
| PP-OCRv5_server_rec | 81 MB | Apache 2.0 | PaddlePaddle/PaddleOCR |
| big-lama | 196 MB | Apache 2.0 | advimman/lama |

Models are cached locally after first download (~/.paddlex/official_models/ for OCR, ~/.cache/torch/hub/checkpoints/ for LAMA). To skip model downloads entirely, use --ocr-json with pre-computed OCR and --skip-inpaint.
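
To see whether a first run will trigger downloads, the two cache locations named above can be inspected. This is a minimal sketch: `cache_status` is a hypothetical helper, and it only checks that each directory exists and is non-empty, which is an assumption about what a populated cache looks like rather than a guarantee that every model file is present.

```python
import tempfile
from pathlib import Path

# Cache locations as stated above, relative to the user's home directory.
CACHE_DIRS = {
    "ocr": Path(".paddlex") / "official_models",
    "lama": Path(".cache") / "torch" / "hub" / "checkpoints",
}


def cache_status(home=None):
    """Report which model cache directories exist and contain entries."""
    home = Path(home) if home is not None else Path.home()
    status = {}
    for name, rel in CACHE_DIRS.items():
        path = home / rel
        status[name] = path.is_dir() and any(path.iterdir())
    return status


# Demo against a throwaway "home" so the example does not touch real caches.
with tempfile.TemporaryDirectory() as fake_home:
    ocr_dir = Path(fake_home) / CACHE_DIRS["ocr"]
    ocr_dir.mkdir(parents=True)
    (ocr_dir / "placeholder").touch()  # stands in for a downloaded model
    demo = cache_status(fake_home)
```

Calling `cache_status()` with no argument checks the current user's home directory.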

Limitations — When to Warn the User

| Input | Impact | What to tell the user |
|-------|--------|-----------------------|
| Text on solid/flat background | Best results | No caveats needed |
| Text on textured background | Good results | LAMA handles repeating textures well |
| Text overlapping photos | Inpainting artifacts likely | "Areas where text covers photos may show blurring" |
| Dense chart with many labels | Over-merged labels | "Crowded labels may be grouped incorrectly" |
| Very thick/large fonts | Incomplete mask coverage | "Large fonts may exceed dilation range — try increasing --dilation" |
| Light text on dark background | Blockier inpainting | "White-on-dark text uses box masks instead of tight ink masks" |
| WebP image | OCR fails (0 regions) | Convert to PNG first: Image.open("in.webp").save("in.png") |
| Very large image (>4000px) | Slow inpainting | Suggest --skip-inpaint or downscaling |
| Decorative/handwritten fonts | Typeface won't match | "Fonts are reconstructed as Arial/Helvetica" |
| Centered/justified text | Left-aligned output | "Text alignment is not preserved" |
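
Two of the rows above (WebP input and very large images) can be detected before conversion starts. The sketch below is an assumption-laden illustration: `preflight_warnings` and `webp_to_png` are hypothetical helpers, and ">4000px" is interpreted here as the longest side exceeding 4000 pixels. The WebP conversion uses the Pillow call quoted in the table.

```python
from pathlib import Path


def preflight_warnings(filename, width, height, max_side=4000):
    """Return user-facing warnings for two inputs from the table above."""
    warnings = []
    if Path(filename).suffix.lower() == ".webp":
        warnings.append("WebP input: convert to PNG first, OCR may find 0 regions")
    # Assumption: the size limit applies to the longest side.
    if max(width, height) > max_side:
        warnings.append("Very large image: consider --skip-inpaint or downscaling")
    return warnings


def webp_to_png(src, dst):
    """Convert with Pillow, as the table suggests (requires Pillow)."""
    from PIL import Image  # imported lazily so the checks above need no Pillow
    Image.open(src).save(dst)


print(preflight_warnings("poster.webp", 5000, 3000))
```

The other rows (photo overlap, dense labels, decorative fonts) depend on image content and cannot be checked from the file name and dimensions alone.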

Version History

1 version in total

  • v1.0.1 (current)
    2026-05-07 07:03 Security status: safe

Security Check

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
