PPTtranslator

Translate Chinese PPTX presentations to English with full-context awareness. Use this skill when the user wants to translate a Chinese PPT/PPTX file to English, convert Chinese slides to English, or localize a presentation from Chinese to English. It handles both text content and text embedded in images (OCR + overlay replacement) and supports automatic style detection (business/academic/technical). Trigger phrases: 翻译PPT, PPT翻译, 中文翻译英文, 幻灯片翻译, PPT转英文, 翻译演示文稿, translate ppt, translate slides, chinese to english ppt, 翻译PPTX
user_d95913fd
Uncategorized · community · v1.0.0 · 1 version · 99386.5 · Key: not required

PPTX Translator

Overview

Translate Chinese PPTX presentations to English with full-context awareness. This skill extracts all text and images from a PPTX, performs whole-document context-aware translation (not page-by-page), preserves original formatting, and overlays translated text on images where Chinese text is detected.

Core Principle: Whole-Document Translation

CRITICAL: Do NOT translate slide by slide. The entire PPT content must be reviewed first to build a consistent glossary and translation style, then all text is translated in one pass to ensure term consistency across slides.

Workflow

Phase 1: Extract Content

  1. Ensure dependencies are installed:

```bash
pip install python-pptx Pillow
```

  2. Run the extraction script:

```bash
python scripts/extract_content.py
```

This generates:

  • content.json: structured JSON with all text, formatting metadata, and shape positions
  • images/: directory with all extracted images
  3. Read content.json to understand the full document structure and content
  4. Review all extracted images in the images/ directory
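
As a rough illustration of what the extraction step collects, the sketch below pulls the text runs out of each slide using only the Python standard library rather than python-pptx. It is not the skill's extract_content.py. The function name `extract_slide_texts` is hypothetical, and the code relies only on the fact that a PPTX file is a ZIP archive whose slides are stored as `ppt/slides/slideN.xml` with text in DrawingML `a:t` elements.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# DrawingML namespace used for text runs (<a:t>) inside slide XML.
A_NS = "{http://schemas.openxmlformats.org/drawingml/2006/main}"
SLIDE_RE = re.compile(r"ppt/slides/slide(\d+)\.xml$")


def extract_slide_texts(pptx_path):
    """Return {slide_number: [text, ...]} for every slide in the archive.

    Hypothetical helper: it reads raw slide XML and ignores formatting,
    shape positions, and images.
    """
    texts = {}
    with zipfile.ZipFile(pptx_path) as archive:
        for name in archive.namelist():
            match = SLIDE_RE.match(name)
            if not match:
                continue
            root = ET.fromstring(archive.read(name))
            runs = [node.text for node in root.iter(A_NS + "t") if node.text]
            texts[int(match.group(1))] = runs
    # Sort numerically so slide10 comes after slide2.
    return dict(sorted(texts.items()))
```

Reading every slide into one structure like this is what makes the whole-document pass possible: all text is available before any of it is translated.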

Phase 2: Whole-Document Translation

  1. Read ALL text content from content.json before translating anything
  2. Identify the PPT's domain and style:
    • Business/corporate → Use formal business English
    • Academic/research → Use academic English with precise terminology
    • Technical/engineering → Use technical English with standard industry terms
    • Mixed → Apply the dominant style, adjust per section as needed
  3. Build a glossary of key Chinese terms and their English translations:
    • Technical terms, product names, proper nouns
    • Domain-specific jargon
    • Abbreviations and their expansions
  4. Translate all text maintaining glossary consistency
  5. Generate two JSON files:

translations.json (for text in shapes and tables):

```json
{
  "translations": [
    {
      "slide_index": 0,
      "shape_index": 0,
      "paragraph_index": 0,
      "run_translations": [
        {"run_index": 0, "translated_text": "Annual Revenue Report"}
      ]
    }
  ],
  "table_translations": [
    {
      "slide_index": 1,
      "shape_index": 3,
      "row": 0,
      "col": 0,
      "paragraph_index": 0,
      "run_translations": [
        {"run_index": 0, "translated_text": "Q1 Revenue"}
      ]
    }
  ]
}
```
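
Before handing translations.json to the apply step, it can help to check that each entry carries the index fields shown above. The following is a minimal sketch, assuming the schema is exactly what the example shows. The function `validate_translations` is hypothetical and not part of the skill's scripts.

```python
import json

SHAPE_KEYS = {"slide_index", "shape_index", "paragraph_index", "run_translations"}
TABLE_KEYS = SHAPE_KEYS | {"row", "col"}


def validate_translations(data):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for section, required in (("translations", SHAPE_KEYS),
                              ("table_translations", TABLE_KEYS)):
        for i, entry in enumerate(data.get(section, [])):
            missing = required - set(entry)
            if missing:
                problems.append(f"{section}[{i}] missing {sorted(missing)}")
                continue
            for run in entry["run_translations"]:
                if not isinstance(run.get("run_index"), int):
                    problems.append(f"{section}[{i}] has a run without an integer run_index")
                if not isinstance(run.get("translated_text"), str):
                    problems.append(f"{section}[{i}] has a run without translated_text")
    return problems


# Example input: the table entry deliberately omits "col".
sample = json.loads("""
{"translations": [{"slide_index": 0, "shape_index": 0, "paragraph_index": 0,
  "run_translations": [{"run_index": 0, "translated_text": "Annual Revenue Report"}]}],
 "table_translations": [{"slide_index": 1, "shape_index": 3, "row": 0,
  "paragraph_index": 0, "run_translations": []}]}
""")
print(validate_translations(sample))  # ["table_translations[0] missing ['col']"]
```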

overlays.json (for text in images):

```json
{
  "overlays": [
    {
      "slide_index": 0,
      "shape_index": 2,
      "image_filename": "slide_0_shape_2.png",
      "text_overlays": [
        {
          "x_pct": 10.0,
          "y_pct": 20.0,
          "width_pct": 30.0,
          "height_pct": 8.0,
          "original_text": "销售额",
          "translated_text": "Revenue",
          "font_size": 16
        }
      ]
    }
  ]
}
```
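
The percentage coordinates are resolution-independent, so the overlay step has to turn them into pixels for each image. Below is a small sketch of that arithmetic. It assumes `x_pct` and `y_pct` mark the top-left corner of the text area relative to the image's width and height; the source does not say whether they mark the corner or the center. The name `overlay_to_pixel_box` is hypothetical.

```python
def overlay_to_pixel_box(overlay, image_width, image_height):
    """Convert one text_overlays entry to (left, top, right, bottom) pixels.

    Assumes x_pct/y_pct are the top-left corner; clamps to the image bounds.
    """
    left = round(image_width * overlay["x_pct"] / 100)
    top = round(image_height * overlay["y_pct"] / 100)
    right = round(image_width * (overlay["x_pct"] + overlay["width_pct"]) / 100)
    bottom = round(image_height * (overlay["y_pct"] + overlay["height_pct"]) / 100)
    return (max(0, left), max(0, top),
            min(image_width, right), min(image_height, bottom))


entry = {"x_pct": 10.0, "y_pct": 20.0, "width_pct": 30.0, "height_pct": 8.0}
print(overlay_to_pixel_box(entry, 1280, 720))  # (128, 144, 512, 202)
```

A box in this form matches the `[x0, y0, x1, y1]` shape that Pillow's `ImageDraw.rectangle` accepts, which is presumably how the white cover rectangle would be drawn.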

Translation rules (see references/translation_guidelines.md for full details):

  • Merge adjacent runs that form a single Chinese phrase before translating; put the full translation in the first run, clear others
  • English text is typically 20-40% longer than Chinese — factor this in for layout
  • Keep proper nouns in their established English form (e.g., company names, product names)
  • Preserve numbers, dates, and units — only translate surrounding text
  • Maintain the original tone and register
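
One way to enforce glossary consistency after translating is to scan source/translation pairs and flag any segment where a glossary term appears in the Chinese but its agreed English rendering is missing from the translation. The sketch below is illustrative only: the glossary entries, the pair format, and `find_glossary_violations` are assumptions, not part of the skill.

```python
def find_glossary_violations(glossary, pairs):
    """Return (source, translation, term) triples that break the glossary.

    glossary: {chinese_term: english_term}
    pairs: iterable of (source_text, translated_text)
    Matching is a case-insensitive substring test, which is crude but
    enough to catch a term translated two different ways.
    """
    violations = []
    for source, translated in pairs:
        lowered = translated.lower()
        for zh, en in glossary.items():
            if zh in source and en.lower() not in lowered:
                violations.append((source, translated, zh))
    return violations


glossary = {"销售额": "Revenue", "毛利率": "Gross Margin"}  # example terms
pairs = [
    ("年度销售额报告", "Annual Revenue Report"),
    ("第一季度销售额", "Q1 Sales"),  # inconsistent: should use "Revenue"
]
for source, translated, term in find_glossary_violations(glossary, pairs):
    print(f"{term!r} not rendered consistently in: {translated}")
```

Running a check like this over every slide at once, rather than per slide, is what catches a term that drifted between the first and last sections.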

Phase 3: Apply Text Translations

python scripts/apply_translations.py <input.pptx> translations.json --output intermediate.pptx

This script:

  • Replaces text in each run while preserving all formatting (font size, color, bold, italic, alignment)
  • Merges multi-run Chinese text and places the full English translation in the first run
  • Auto-expands text box width if English text is likely to overflow
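
The run-merging behaviour described above can be pictured with plain dictionaries standing in for text runs. This is a simplified sketch, not the actual apply_translations.py: `merge_runs_into_first` is a hypothetical helper, and real code would operate on python-pptx run objects instead.

```python
def merge_runs_into_first(runs, full_translation):
    """Place the full translation in the first run and clear the rest.

    runs: list of dicts like {"text": "...", "bold": True}; each dict stands
    in for a text run, so formatting keys other than "text" are untouched.
    Returns the original Chinese phrase that the runs spelled out.
    """
    if not runs:
        return ""
    original = "".join(run["text"] for run in runs)
    runs[0]["text"] = full_translation
    for run in runs[1:]:
        run["text"] = ""  # keep the run (and its formatting), drop its text
    return original


runs = [{"text": "年度", "bold": True}, {"text": "销售额", "bold": True},
        {"text": "报告", "bold": False}]
original = merge_runs_into_first(runs, "Annual Revenue Report")
print(original)                       # 年度销售额报告
print([run["text"] for run in runs])  # ['Annual Revenue Report', '', '']
```

One consequence of this approach is that the whole English phrase inherits the first run's formatting, so mixed formatting within a single Chinese phrase is flattened.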

Phase 4: Image Text Overlay

  1. Analyze each extracted image for Chinese text:
    • Use visual capabilities to identify Chinese characters in images
    • For each detected text, estimate its position as percentage coordinates (x_pct, y_pct, width_pct, height_pct)
    • Translate the text and determine appropriate font size
  2. Generate overlays.json with all detected image text translations and coordinates
  3. Run the overlay script:

```bash
python scripts/overlay_image_text.py intermediate.pptx overlays.json --output final.pptx
```

This script:

  • Draws white rectangles over the Chinese text areas in images
  • Renders the English translation text on top
  • Replaces the original images in the PPTX with modified versions

Phase 5: Finalize Output

  1. Create the output directory:

```
/translated/
```

  2. Save the final translated PPTX:

```
/translated/_en.pptx
```

  3. Clean up temporary files (intermediate.pptx, extraction directory)
  4. Report completion with:
    • Output file path
    • Number of slides translated
    • Number of images with text overlays applied
    • Any warnings (e.g., images that could not be processed)

Error Handling

  • If python-pptx or Pillow is not installed, install them automatically with pip install python-pptx Pillow
  • If an image format is unsupported (e.g., EMF/WMF), log a warning and skip that image's overlay
  • If a shape has no text frame or is not translatable, skip it silently
  • If text expansion causes layout issues, note it in the completion report
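
The image-format rule can be sketched as a simple partition by file extension. The set of skipped extensions below covers only the EMF/WMF case the text names. `split_overlay_candidates` and the logger name are hypothetical, and the real script may detect formats differently.

```python
import logging
from pathlib import Path

logger = logging.getLogger("pptx_translator")  # hypothetical logger name

UNSUPPORTED_SUFFIXES = {".emf", ".wmf"}  # vector formats named in the text


def split_overlay_candidates(filenames):
    """Return (processable, skipped); log a warning for each skipped image."""
    processable, skipped = [], []
    for name in filenames:
        if Path(name).suffix.lower() in UNSUPPORTED_SUFFIXES:
            logger.warning("Skipping overlay for unsupported image: %s", name)
            skipped.append(name)
        else:
            processable.append(name)
    return processable, skipped


ok, skipped = split_overlay_candidates(
    ["slide_0_shape_2.png", "slide_3_shape_1.EMF", "slide_4_shape_0.wmf"]
)
print(ok)       # ['slide_0_shape_2.png']
print(skipped)  # ['slide_3_shape_1.EMF', 'slide_4_shape_0.wmf']
```

The skipped list can then feed the warnings section of the completion report.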

Important Notes

  • The scripts directory also contains requirements.txt for dependency management
  • For EMF/WMF images (common in Windows-generated PPTs), overlay may not work — these are logged as warnings
  • Always use the absolute path of scripts when invoking them: /scripts/