Convert Markdown (.md) documents to Word (.docx) with embedded images, proper heading styles, tables, and formatting preservation.
All scripts are in the scripts/ directory relative to this skill file. Always use these scripts instead of writing inline code.
| Script | Purpose |
|---|---|
| scripts/install_pandoc.py | Auto-install pandoc on Windows/macOS/Linux |
| scripts/check_images.py | Verify all image paths resolve correctly before conversion |
| scripts/md_to_docx.py | Convert markdown to docx with image embedding and caption fixing |
| scripts/fix_captions.py | Fix image captions in an existing docx: center align, 8pt, no italic |
Check whether pandoc is installed:
pandoc --version
If pandoc is not found, run the install script:
py scripts/install_pandoc.py
After installation, verify:
pandoc --version
> IMPORTANT: After successfully installing pandoc, you MUST tell the user explicitly:
>
> "Pandoc has been automatically installed. You may need to restart your terminal for the PATH changes to take effect. If pandoc --version still fails, please restart your terminal and try again."
If auto-install fails, show this message:
> "Could not automatically install pandoc. Please install it manually:
> - Windows: Download from https://github.com/jgm/pandoc/releases/latest
> - macOS: brew install pandoc
> - Linux: sudo apt install pandoc"
Before converting, verify all image references resolve correctly:
py scripts/check_images.py input.md
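This checks that all `![alt](path)` references point to existing files. Pandoc resolves relative image paths against its resource path (by default the working directory), so run pandoc from the markdown file's directory or pass --resource-path.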
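The check described above can be sketched in a few lines. This is an illustrative sketch, not the code of scripts/check_images.py: the function name `find_missing_images` and the regular expression are assumptions, and it assumes image paths are written relative to the markdown file's directory.

```python
import re
from pathlib import Path

# Matches inline images such as ![alt](target) or ![alt](target "title").
# Reference-style images are not handled in this sketch.
IMAGE_REF = re.compile(r'!\[[^\]]*\]\(\s*([^)\s]+)[^)]*\)')


def find_missing_images(md_path):
    """Return the image targets in md_path that do not exist on disk."""
    md_file = Path(md_path)
    text = md_file.read_text(encoding="utf-8")
    missing = []
    for target in IMAGE_REF.findall(text):
        # Remote images are fetched by pandoc itself, so skip them here.
        if target.startswith(("http://", "https://")):
            continue
        # Assumption: paths are relative to the markdown file's directory.
        if not (md_file.parent / target).exists():
            missing.append(target)
    return missing
```

Calling `find_missing_images("input.md")` returns an empty list when every local image reference resolves.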
Recommended: use the bundled script (handles conversion + caption fixing automatically):
py scripts/md_to_docx.py input.md output.docx
This runs pandoc and then automatically fixes image captions (center aligned, 8pt font, no italic).
Manual conversion with pandoc:
pandoc input.md -o output.docx
With reference document (preserves styling from an existing .docx):
pandoc input.md -o output.docx --reference-doc=template.docx
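If the manual pandoc call needs to be driven from Python, a thin wrapper is enough. The sketch below is hypothetical (the names `build_pandoc_command` and `convert` are not part of the bundled scripts) and uses only the `-o` and `--reference-doc` options shown above.

```python
import subprocess


def build_pandoc_command(md_path, docx_path, reference_doc=None):
    """Build the pandoc argument list for a markdown-to-docx conversion."""
    cmd = ["pandoc", str(md_path), "-o", str(docx_path)]
    if reference_doc is not None:
        # Optional styling template, as in the command line above.
        cmd.append(f"--reference-doc={reference_doc}")
    return cmd


def convert(md_path, docx_path, reference_doc=None):
    """Run pandoc; raises CalledProcessError if pandoc exits non-zero."""
    subprocess.run(build_pandoc_command(md_path, docx_path, reference_doc), check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.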
If you used pandoc directly (not the script), fix image captions afterwards:
py scripts/fix_captions.py output.docx
If your markdown has image alt text like ![image2.png] (with file extension), pandoc will use the full filename including .png as the caption text. Fix this before conversion:
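import re
# Read the markdown source
with open('input.md', 'r', encoding='utf-8') as f:
    content = f.read()
# Drop the file extension from alt text: ![image2.png]( becomes ![image2](
content = re.sub(r'!\[(image\d+)\.\w+\]\(', r'![\1](', content)
# Write the cleaned markdown back in place
with open('input.md', 'w', encoding='utf-8') as f:
    f.write(content)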
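Because the snippet above overwrites input.md in place, it can help to preview the change first. The following sketch wraps the same pattern in a function that returns the rewritten text together with the number of replacements; the function name `strip_alt_extensions` is hypothetical.

```python
import re

# Alt text of the form imageN.ext, followed by the opening "](" of the path.
ALT_WITH_EXT = re.compile(r'!\[(image\d+)\.\w+\]\(')


def strip_alt_extensions(text):
    """Return (new_text, count) with file extensions removed from image alt text."""
    return ALT_WITH_EXT.subn(r'![\1](', text)


new_text, count = strip_alt_extensions("![image2.png](img/image2.png)")
print(count, new_text)
```

A count of zero means the file needs no rewriting and can be left untouched.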
Pandoc inserts image alt text as a plain Normal style paragraph after each image. The bundled scripts reformat these captions to be center aligned, 8pt, and not italic.
If using scripts/md_to_docx.py, this is done automatically. Otherwise run:
py scripts/fix_captions.py output.docx
Check that images were properly embedded:
python -c "
import zipfile
with zipfile.ZipFile('output.docx', 'r') as z:
    images = [f for f in z.namelist() if 'word/media/' in f]
print(f'Embedded images: {len(images)}')
for img in sorted(images):
    print(f'  {img}')
"
Check document structure:
python -c "
from docx import Document
doc = Document('output.docx')
headings = [p for p in doc.paragraphs if p.style and p.style.name.startswith('Heading')]
print(f'Headings: {len(headings)}')
print(f'Tables: {len(doc.tables)}')
print(f'Paragraphs: {len(doc.paragraphs)}')
"
When converting docx → md → docx, be aware of:
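- Images use `./img/` relative paths. All 32 images in a typical document convert successfully.
- `## Heading` in markdown becomes Heading 2 style in docx.
- Bold (`**text**`) and italic (`*text*`) are preserved.
- `![alt](...)` references into `img/` work when `img/` is a sibling of the .md file.
- Image URLs (`http://...`) are downloaded and embedded by pandoc automatically.
- To catch unresolved image paths, run check_images.py first.

Using --reference-doc allows you to control the styling of the output document, which is taken from an existing .docx.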
Create a reference docx by:
1. Converting once without --reference-doc
2. Adjusting the styles in the resulting document and saving it as template.docx
3. Passing --reference-doc=template.docx for future conversions

project/
├── input.md          # Source markdown
├── output.docx       # Converted Word document
└── img/              # Images referenced by markdown
    ├── image1.png
    ├── image2.jpeg
    └── ...
- Use py script.py, not inline code in PowerShell (escaping issues).
- .emf images (Windows metafile) are embedded but may not render on non-Windows systems.
- Make sure the images are in place (./img/) before conversion.
- Pandoc's default styles are used unless --reference-doc is provided.