> A 4-stage, audit-trailed Chinese→English deck globalization system with a swappable project profile.
This skill bundles the generic deck-globalization engine (originally upstream
DeckGlobalizer v2.1.1) and an editable PROFILE block (palette, fonts,
glossary, style preferences). The two are separated by section so the
profile can be swapped per project / brand without touching the engine.
For a marketing-style overview, see README.md in this directory.
For implementation, see scripts/ and the per-stage runbooks below.
| Mode | Trigger | Stages |
|---|---|---|
| Full pipeline | CN deck (± EN draft) + user wants English output | 1 → 2 → 3 → 4 |
| Polish-only | Single-language deck + "layout / format only / skip translation" | 1 → 3 → 4 |
| Reverse-sync only | User hand-edited a PPT after a comparison Excel was generated | 3.5 (sync sub-routine) |
Detect the mode in the first turn. If ambiguous, ask one either/or question
("This deck is already in EN — should I just polish layout, or also rewrite
McKinsey-style?"). Do not guess silently.
Edit this block to retarget the skill for your project / brand. Everything
below this block is profile-agnostic.
PROFILE:
  # ---- L1 Tokens ----
  palette:
    # Replace with your brand colors.
    ink: "#1A1A1A"
    primary: "#000000"  # accent / brand primary
    soft: "#FFFFFF"  # soft fill behind banners
    page_bg: "#FFFFFF"
  fonts:
    # Choose a serif title face + a sans-serif body face for best contrast.
    title: "Georgia"
    body: "Verdana"
    title_bold: true
  unit_table:
    # Chinese number magnitudes → English. 亿 is 100M, NOT "billion".
    "百万": "M"
    "千万": "10M"
    "亿": "100M"
    "十亿": "1B"
    "百亿": "10B"
    "千亿": "100B"
    "万亿": "1T"
    # Currency suffix is left to the user; append "$" / "RMB" / "€" as appropriate.
  # ---- L2 Constants ----
  size_ladder: [22, 14, 10, 8, 6, 4]  # H1, H2, body, caption, footnote, source
  floors:
    body: 7
    caption: 6
    source: 4
  compression_step: 0.1  # discrete -0.1pt iterations only
  line_height_default: 1.25
  line_height_fallback: 1.15  # used before sub-floor compression
  quote_style: "single"  # 'McKinsey' single quotes
  footer_format: "Confidential · For Intended Recipients Only · {month} {year}"
  separator_in_footer: "·"  # middle dot, NOT em-dash
  # ---- L1 Glossary (extensible) ----
  # Replace the example entries below with your project's locked terms.
  # Categories are illustrative; you can rename / add / remove.
  glossary:
    locked:
      people_orgs:
        # "<source term>": "<canonical translation>"
        # e.g. "John Smith": "John Smith"
        # e.g. "Acme Capital": "Acme Capital"
        {}
      business_terms:
        # Common Chinese business-deck idioms with industry-standard
        # English mappings. Edit / extend as needed.
        "流水": "gross revenue"
        "私域": "owned audience"
        "出海": "global expansion"
      domain_specific:
        # Project / industry / domain terms.
        # "<source term>": "<canonical translation>"
        {}
    rejected_rewrites:
      # Entries the user vetoed during prior sessions.
      # Format: { source: "...", proposed: "...", reason: "..." }
      []
    pending: []
    session_added: []
  # ---- Style rules ----
  # McKinsey is the default baseline. Additional style references can be
  # uploaded and distilled via scripts/style_distill.py; their rules layer
  # ON TOP of the McKinsey base.
  style_baseline: "mckinsey"
  mckinsey:
    title_is_takeaway: true  # title = the so-what, not the topic
    lead_with_so_what: true
    parallel_structure: true  # bullets share tense, opening part-of-speech
    strong_action_verbs: true  # cut "is/has", prefer concrete verb
    cut_filler:
      - "in order to → to"
      - "a number of → many"
      - "due to the fact that → because"
      - "at this point in time → now"
    case: "sentence"  # lowercase unless proper noun or locked term
    em_dash_policy: "use em-dash for parentheticals; use · (middle dot) in lists/footers"
  style_references:
    # Each entry is a PDF / .pptx reference. style_distill.py reads it and
    # emits rules (cadence, signature phrases, paragraph length, tone) that
    # layer on top of the McKinsey base. Conflicts: more recent entry wins;
    # user is asked at first conflict.
    # Example:
    # - path: "/path/to/sample.pdf"
    #   weight: 0.7
    []
  # ---- Structural anchor heuristics ----
  anchor_detection:
    min_pages: 3  # appears on ≥3 slides
    match_on:  # signature components
      - position_xy
      - fill_color
      - font_size_class
    auto_protect: true
  # ---- Overflow estimator ----
  overflow:
    severity:
      high: 1.5
      med: 1.15
      low: 1.0
    surface_only: "high"  # surface MED/LOW only when explicitly asked
    defer_to_user_threshold: 10  # if HIGH > 10 → ask user to render externally
  # ---- CN ↔ EN slide alignment ----
  # Default is 1:1 (EN slide N maps to CN slide N).
  # Set overrides only when the two decks have been restructured.
  # Pass this config to excel_sync.py via `--cn-offset <yaml>`.
  cn_en_slide_offset:
    default: 0  # offset added to EN slide number (0 = 1:1)
    overrides: {}  # e.g. {"9-26": -1, "20": null}
    # int = relative offset; null = no CN counterpart
> Profile-agnostic note: all sections below treat PROFILE as an
> opaque dict. Do not hardcode project-specific values anywhere outside the
> PROFILE block.
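
As an illustration of how the `cn_en_slide_offset` settings might be resolved, here is a minimal sketch. The function name `cn_slide_for` is hypothetical, and the precedence rule (a single-slide key such as `"20"` beats a range key such as `"9-26"` that also covers it) is an assumption; the actual behavior of `excel_sync.py` is not documented here.

```python
from typing import Optional


def cn_slide_for(en_slide: int, offset_cfg: dict) -> Optional[int]:
    """Map an EN slide number to its CN slide, or None if it has no counterpart."""
    overrides = offset_cfg.get("overrides", {})
    exact_key = str(en_slide)
    if exact_key in overrides:
        # Assumption: an exact single-slide key wins over any covering range.
        value = overrides[exact_key]
    else:
        value = offset_cfg.get("default", 0)
        for key, candidate in overrides.items():
            low, sep, high = str(key).partition("-")
            # Only keys of the form "first-last" are treated as inclusive ranges.
            if sep and int(low) <= en_slide <= int(high):
                value = candidate
                break
    # null (None) in the profile means "no CN counterpart".
    return None if value is None else en_slide + value


cfg = {"default": 0, "overrides": {"9-26": -1, "20": None}}
for slide in (5, 9, 20, 27):
    print(slide, "->", cn_slide_for(slide, cfg))
```

With the example overrides from the profile comment, slides outside the range map 1:1, slides 9 to 26 shift back by one, and slide 20 reports no CN counterpart.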
Each stage has: inputs · what it does · outputs · stop-and-ask conditions.
Inputs: one or two .pptx paths (CN, optional EN draft)
What it does:
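- Run scripts/sense_pass.py to extract design DNA, font usage, and palette.
- Compare the sensed values against PROFILE.palette / PROFILE.fonts. If a sensed font is NOT in the whitelist AND NOT in SKIP_POLLUTION, record it as font pollution.
- Propose a term as a glossary candidate if it appears ≥2 times and isn't already in glossary.locked.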
Outputs:
- Style_Manifest.md (in-memory; not written to disk unless requested)
- pollution_report (slide → font → count)
- candidate_glossary (term → count → sample context)

Stop-and-ask:
- Candidate term with no locked translation → ask user, write answer to glossary.session_added.
- Sensed primary color differs from PROFILE.palette.primary → ask whether to update profile or keep existing.
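
The font-pollution rule from this stage (a font counts as pollution when it is neither whitelisted nor in SKIP_POLLUTION) can be sketched as follows. This is not the code in `sense_pass.py`; the function name, the `(slide, font)` input shape, and the sample SKIP_POLLUTION contents are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_pollution_report(runs, whitelist, skip_pollution):
    """Build slide -> font -> count from (slide_number, font_name) pairs."""
    report = defaultdict(Counter)
    for slide, font in runs:
        if font is None:
            # No explicit font on the run; it inherits from the theme.
            continue
        if font in whitelist or font in skip_pollution:
            continue
        report[slide][font] += 1
    return {slide: dict(fonts) for slide, fonts in report.items()}


whitelist = {"Georgia", "Verdana"}   # PROFILE.fonts title / body
skip_pollution = {"Wingdings"}       # hypothetical contents of SKIP_POLLUTION
runs = [(1, "Georgia"), (1, "Arial"), (1, "Arial"), (2, "Wingdings"), (3, "SimSun"), (3, None)]
print(build_pollution_report(runs, whitelist, skip_pollution))
```

The result has the same shape as the `pollution_report` output listed above, so it could feed the later cleanup pass directly.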
Inputs: Stage 1 outputs + the CN deck (and optional EN draft for diff context) + any uploaded style_references.
Style layering: McKinsey base rules (PROFILE.mckinsey) apply first. If
PROFILE.style_references is non-empty, run scripts/style_distill.py on
each reference before translation begins; the distilled rules (cadence,
signature phrases, paragraph length, tone) layer on top. More recent entry
wins on conflict; ask user at the first conflict.
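
One part of the McKinsey base that is easy to show in code is the `cut_filler` list from PROFILE. The sketch below parses each `"wordy → concise"` entry and applies it with whole-phrase, case-insensitive matching; the function name and the capital-preserving behavior are assumptions, not a description of what the translation stage actually does.

```python
import re

CUT_FILLER = [
    "in order to → to",
    "a number of → many",
    "due to the fact that → because",
    "at this point in time → now",
]


def apply_cut_filler(text, rules=CUT_FILLER):
    """Replace each wordy phrase with its concise form."""
    for rule in rules:
        wordy, concise = (part.strip() for part in rule.split("→"))
        pattern = re.compile(r"\b" + re.escape(wordy) + r"\b", re.IGNORECASE)

        def swap(match, concise=concise):
            # Keep a leading capital when the filler phrase opened the sentence.
            if match.group(0)[0].isupper():
                return concise[0].upper() + concise[1:]
            return concise

        text = pattern.sub(swap, text)
    return text


print(apply_cut_filler("In order to grow, we cut costs due to the fact that margins fell."))
```

Rules distilled from style references would layer on top of a pass like this one, with the conflict handling described above.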
Page-by-page execution (hard requirement):
- Show the per-page edit count + a sample of the style rules in effect; wait for "go".
- Extract each slide's text with scripts/extract.py. Rewrite with the filler-word table applied and glossary.locked terms (plus session_added) enforced inline.
- Run scripts/apply.py with the slide's batch → writes that slide's changes into the deck AND appends rows to the comparison Excel immediately.
- After each page, wait for user OK before moving to P{n+1}.
Why per-page (not all-at-once):
Inputs: the post-translation deck (or, in polish-only mode, the raw deck)
What it does:
Run scripts/layout_audit.py --fix (Stage 3a):
- If font.name is NOT in the title/body whitelist OR ends in a style suffix (Bold / Regular / Italic / Light):
  - Reset font.name to the pure family.
  - Set the font.bold / font.italic attributes accordingly.
- Skip fonts in the SKIP_POLLUTION set.

Run scripts/anchor_detect.py (Stage 3b):
- Compute a signature per shape: (rounded_position, fill_color, font_size_class).
- A signature that appears on ≥ PROFILE.anchor_detection.min_pages pages becomes an anchor.
- Record per_page_protect[page] = [anchor_shape_ids...].

Run scripts/overflow_recheck.py (Stage 3c):
- Check auto_size (skip if SHAPE_TO_FIT_TEXT or TEXT_TO_FIT_SHAPE).
- Account for margin_*.
- Use PROFILE.line_height_default = 1.25 initially. If a shape is flagged, try 1.15 as a what-if before flagging as HIGH.
- Estimate text width per character class (narrow iIl, wide MW, digits, upper, space).
- Surface only HIGH (ratio > PROFILE.overflow.severity.high) by default.
- If HIGH count > PROFILE.overflow.defer_to_user_threshold: ask the user to render the deck externally and report back which "… pages look broken, I'll fix those targeted pages."
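
The severity tiers and the defer-to-user rule for the overflow check can be sketched like this. Only the HIGH comparison (`ratio > severity.high`) is stated in this document; using the same strict comparison for MED and LOW, and treating `ratio` as estimated text extent over available box extent, are assumptions. The function names are hypothetical.

```python
def classify(ratio, severity):
    """Return the severity tier for one shape's overflow ratio."""
    if ratio > severity["high"]:
        return "HIGH"
    if ratio > severity["med"]:      # assumption: same strict comparison as HIGH
        return "MED"
    if ratio > severity["low"]:      # assumption: same strict comparison as HIGH
        return "LOW"
    return "OK"


def triage(ratios, overflow_cfg):
    """ratios: {shape_id: overflow ratio}. Returns (tiers, high_ids, defer_to_user)."""
    tiers = {sid: classify(r, overflow_cfg["severity"]) for sid, r in ratios.items()}
    high_ids = [sid for sid, tier in tiers.items() if tier == "HIGH"]
    # More HIGH shapes than the threshold: ask the user to render externally.
    defer_to_user = len(high_ids) > overflow_cfg["defer_to_user_threshold"]
    return tiers, high_ids, defer_to_user


overflow_cfg = {"severity": {"high": 1.5, "med": 1.15, "low": 1.0}, "defer_to_user_threshold": 10}
print(triage({"s1": 1.6, "s2": 1.2, "s3": 0.8}, overflow_cfg))
```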
For each shape needing fix (Stage 3d):
- In per_page_protect[page]? → SKIP (it's an anchor).
- Widen width until ratio < 1.0 OR shape collides.
- Then apply font.size -= PROFILE.compression_step (0.1pt) until the floor in PROFILE.floors is hit.
- If the floor is hit and the shape still overflows, show the current size, the calculated ratio, and ask whether to break the floor.
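
A minimal sketch of the discrete compression loop follows, assuming the overflow ratio can be re-estimated at any candidate size through a callable (`ratio_at`, a hypothetical stand-in for the overflow estimator). Sizes are tracked in integer tenths of a point so that repeated 0.1pt steps do not accumulate floating-point error.

```python
def compress_font(size_pt, floor_pt, ratio_at, step_pt=0.1):
    """Shrink in discrete steps until the text fits or the floor is reached.

    ratio_at: callable giving the estimated overflow ratio at a font size.
    Returns (final_size_pt, fits).
    """
    size = round(size_pt * 10)
    floor = round(floor_pt * 10)
    step = max(1, round(step_pt * 10))   # never let the step round down to zero
    while ratio_at(size / 10) >= 1.0 and size - step >= floor:
        size -= step
    final = size / 10
    return final, ratio_at(final) < 1.0


# Toy estimator (assumption): text fits once the font drops below 9.5pt.
print(compress_font(10, 7, lambda s: s / 9.5))
# Toy estimator that never fits: the loop stops at the floor and reports fits=False.
print(compress_font(8, 7, lambda s: s / 5))
```

When `fits` comes back False the floor has been reached, which is the point where the stage stops and asks the user whether to break it.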
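Run scripts/glossary_audit.py (Stage 3e):
- For each term in glossary.locked: a non-canonical translation appears → flag.
- One source term translated more than one way (wavering) → flag.
- … translation; ask otherwise.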
Run scripts/excel_sync.py --reverse (Stage 3f):
- Sync hand-edited PPT text back into the Excel's en_optimized column.
- Match edited paragraphs with difflib.get_close_matches against same-slide texts.

Outputs:
- -final-.pptx (full pipeline) or -final-.pptx (polish-only)

Inputs: all prior-stage outputs
What it does:
- Write HANDOFF.md to the same directory as the deck (see scripts/handoff.py).

Stop-and-ask: none.
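
For the reverse-sync matching in Stage 3f, `difflib.get_close_matches` from the standard library can pair a hand-edited paragraph with its closest original on the same slide. The sketch below is an assumption about how that call might be wrapped; the function name is hypothetical and the 0.6 cutoff is simply difflib's default.

```python
import difflib


def match_edited_paragraph(edited, same_slide_texts, cutoff=0.6):
    """Return the closest same-slide text to `edited`, or None if nothing is close."""
    hits = difflib.get_close_matches(edited, same_slide_texts, n=1, cutoff=cutoff)
    return hits[0] if hits else None


slide_texts = ["Revenue grew 12% year over year", "Costs fell across all regions"]
print(match_edited_paragraph("Revenue grew 12% year-over-year", slide_texts))
print(match_edited_paragraph("QQQQ", slide_texts))
```

Restricting candidates to the same slide keeps a lightly edited paragraph from being matched to similar wording elsewhere in the deck.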
Before writing any .pptx or .xlsx:
- Check for a ~$ lock file in the same directory. If one exists, stop and tell the user: "… is open in PowerPoint/Excel. Save and close it, then say 'go' to continue."
- Verify the Excel header is [page, kind, cn, en_original, en_optimized, notes]. If columns missing → rebuild header before writing data.
- Confirm max_column ≥ 7 and header is intact.
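
The pre-write checks can be sketched with the standard library alone. The lock-file naming rule used here (`~$` followed by the target's name, or a trailing part of it) is an assumption about how Office names its lock files, and both function names are hypothetical. The header check works on a plain list of cell values, so it is independent of whichever Excel library the scripts use.

```python
from pathlib import Path

EXPECTED_HEADER = ["page", "kind", "cn", "en_original", "en_optimized", "notes"]


def find_lock_file(target):
    """Return the Office lock file that appears to guard `target`, or None."""
    target = Path(target)
    for candidate in target.parent.glob("~$*"):
        remainder = candidate.name[2:]
        # Assumption: the lock file is "~$" plus the target name or its tail.
        if remainder and target.name.endswith(remainder):
            return candidate
    return None


def header_is_intact(first_row):
    """True when the first row starts with the expected comparison columns."""
    return list(first_row[: len(EXPECTED_HEADER)]) == EXPECTED_HEADER


lock = find_lock_file("deck.pptx")
if lock is not None:
    print(f"{lock.name[2:]} is open in PowerPoint/Excel. Save and close it, then say 'go' to continue.")
```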
See Stage 3d. The single rule: never bulk-reduce font sizes.
Always discrete -0.1pt, always after exhausting widening + line-height
fallback, always with anchor protection.
- Terms the user confirms mid-session are kept in session_added for the rest of the session.
- Write session_added to a glossary_proposed_additions.yaml file next to the deck. The user can copy them into PROFILE for the next run.
Any number with a CN magnitude word (百万 / 千万 / 亿 / 十亿 / 百亿 / 千亿 / 万亿)
must be re-verified against PROFILE.unit_table before being written to EN.
Treat this as a HARD CHECK; do NOT take prior-session translations on faith.
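
To make the hard check concrete, the sketch below recomputes the English magnitude from PROFILE.unit_table so it can be compared with whatever a draft or earlier session produced. The function name is hypothetical, and normalizing to the largest fitting suffix (so 12亿 becomes 1.2B instead of 1200M) is an assumption about the desired output style. `Decimal` is used to avoid binary floating-point artifacts.

```python
import re
from decimal import Decimal

UNIT_TABLE = {"百万": "M", "千万": "10M", "亿": "100M", "十亿": "1B",
              "百亿": "10B", "千亿": "100B", "万亿": "1T"}
SUFFIX_VALUE = {"M": Decimal(10) ** 6, "B": Decimal(10) ** 9, "T": Decimal(10) ** 12}

# Longest magnitude words first, so "万亿" is never read as a bare "亿".
_WORDS = sorted(UNIT_TABLE, key=len, reverse=True)
_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*(" + "|".join(_WORDS) + ")")


def to_english_magnitude(cn_number):
    """Convert e.g. '3.5亿' to '350M' using the unit table."""
    match = _PATTERN.search(cn_number)
    if match is None:
        raise ValueError(f"no known magnitude word in {cn_number!r}")
    digits, word = match.groups()
    entry = UNIT_TABLE[word]                      # e.g. "100M"
    multiplier = Decimal(entry[:-1] or "1")       # "M" alone means 1M
    value = Decimal(digits) * multiplier * SUFFIX_VALUE[entry[-1]]
    for suffix in ("T", "B", "M"):
        if value >= SUFFIX_VALUE[suffix]:
            scaled = (value / SUFFIX_VALUE[suffix]).normalize()
            return f"{scaled:f}{suffix}"
    return f"{value.normalize():f}"


for sample in ("3.5亿", "12亿", "3千万", "2万亿"):
    print(sample, "->", to_english_magnitude(sample))
```

A draft translation that disagrees with the recomputed value is exactly the case the check is meant to catch, for example 亿 rendered as "billion".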
When auto-aligning the Excel's cn column by paragraph ordinal:
- Flag any row whose match is uncertain in the notes column as needs-review.

Scripts (in scripts/):

| Script | Role | Stage |
|---|---|---|
| sense_pass.py | extract design DNA, font usage, palette | 1 |
| extract.py | paragraph-level text extraction | 1, 2, 3 |
| apply.py | apply EN edits + write Excel with highlight | 2 |
| layout_audit.py | font pollution cleanup, suffix audit | 3a |
| anchor_detect.py | cross-page anchor signature detection | 3b |
| overflow_recheck.py | overflow estimator with severity tiers | 3c |
| glossary_audit.py | late-stage glossary re-scan + wavering | 3e |
| excel_sync.py | bidirectional PPT ↔ Excel sync (configurable slide offset) | 3f |
| handoff.py | write HANDOFF.md | 4 |
| style_distill.py | distill style fingerprint from a reference PDF/.pptx | pre-2 |

Each script is invokable standalone; the skill wires them together.
Full pipeline (4 files):
- -en-polished-.pptx
- -final-.pptx
- -bilingual-diff-.xlsx
- HANDOFF.md

Polish-only (3 files):
- -final-.pptx
- -layout-changes-.xlsx
- HANDOFF.md

Mode 3.5 (reverse-sync only):
- -bilingual-diff-.xlsx

Limitations:
- Visual verification requires external rendering (Keynote / PowerPoint export to PDF). python-pptx cannot render slides. There is no built-in preview.
- CN ↔ EN alignment can break on restructured pages; configure PROFILE.cn_en_slide_offset.overrides for known cases.
- … propagate to EN unless the user catches them.
- … the ~$xxx check is the only line of defense.

See CHANGELOG.md.
Generic deck-globalization engine derived from upstream DeckGlobalizer v2.1.1
by tinadu-ai. The three-phase architecture (Visual Audit / Semantic Alignment /
Page-by-Page Execution) is credited and retained.