A comprehensive file format conversion toolkit supporting documents, images, and spreadsheets.
Activate this skill when any of these scenarios occur:
| Source Format | Target Format | Script | Notes |
|--------------|---------------|--------|-------|
| .pdf | .docx | scripts/pdf_to_word.py | Extracts text, tables, layout |
| .docx, .doc | .pdf | scripts/word_to_pdf.py | Requires MS Word on Windows |
| PNG/JPEG/WebP/BMP/GIF/TIFF | Any other image format | scripts/image_converter.py | Quality control, resize support |
| .csv | .xlsx | scripts/excel_csv_converter.py | Custom delimiter/encoding |
| .xlsx, .xls | .csv | scripts/excel_csv_converter.py | Multi-sheet support |
Install required packages before first use:
pip install Pillow pdf2docx docx2pdf openpyxl xlrd
Word → PDF: requires Microsoft Word on Windows (docx2pdf uses COM automation)
PDF → Word: uses pdf2docx
Image conversion: Pillow only, works on all platforms
Determine what the user wants to convert:
User says "convert this PDF" → PDF to Word
User says "turn this into PDF" → Word to PDF
User says "change this PNG to JPG" → Image conversion
User says "export this Excel as CSV" → Excel to CSV
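The mapping from a request to a script can be sketched as a lookup on the source and target extensions. This is a minimal illustration built from the conversion table above; the `pick_script` helper and the `ROUTES` and `IMAGE_EXTS` names are hypothetical and are not part of the toolkit.

```python
from pathlib import Path
from typing import Optional

# Image extensions taken from the supported-conversions table.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp", ".bmp", ".gif", ".tiff", ".tif"}

# (source extension, target extension) -> script, as listed in the table.
ROUTES = {
    (".pdf", ".docx"): "scripts/pdf_to_word.py",
    (".docx", ".pdf"): "scripts/word_to_pdf.py",
    (".doc", ".pdf"): "scripts/word_to_pdf.py",
    (".csv", ".xlsx"): "scripts/excel_csv_converter.py",
    (".xlsx", ".csv"): "scripts/excel_csv_converter.py",
    (".xls", ".csv"): "scripts/excel_csv_converter.py",
}


def pick_script(source: str, target_ext: str) -> Optional[str]:
    """Return the script for this conversion, or None if it is unsupported."""
    src_ext = Path(source).suffix.lower()
    tgt_ext = "." + target_ext.lower().lstrip(".")
    # Any image-to-image pair goes to the image converter.
    if src_ext in IMAGE_EXTS and tgt_ext in IMAGE_EXTS:
        return "scripts/image_converter.py"
    return ROUTES.get((src_ext, tgt_ext))
```

A return value of `None` signals that the request falls outside the table and should be clarified with the user rather than guessed.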
Before running any script, verify required packages are installed:
python -c "import PIL; import pdf2docx; import docx2pdf; import openpyxl; import xlrd"
If imports fail, prompt user to run:
pip install Pillow pdf2docx docx2pdf openpyxl xlrd
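As an alternative to the one-line import test, the check can report exactly which packages are missing. The sketch below uses only the standard library; `missing_packages` and `REQUIRED` are illustrative names, not part of the toolkit's scripts.

```python
import importlib.util

# Import name -> pip package name (the two differ for Pillow).
REQUIRED = {
    "PIL": "Pillow",
    "pdf2docx": "pdf2docx",
    "docx2pdf": "docx2pdf",
    "openpyxl": "openpyxl",
    "xlrd": "xlrd",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be located."""
    return [
        pip_name
        for module, pip_name in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages. Run: pip install " + " ".join(missing))
    else:
        print("All required packages are installed.")
```

Using `find_spec` locates each module without importing it, so the check stays fast and has no side effects.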
Single file:
python scripts/pdf_to_word.py <input.pdf> [output.docx]
Batch mode:
python scripts/pdf_to_word.py --batch ./pdf_folder --output-dir ./output_folder
Options:
--start N: Start from page N (default: 0)
--end N: End at page N (default: all pages)
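Batch mode pairs every PDF in the input folder with an output file in the output folder. The sketch below shows one plausible pairing, assuming each output keeps the input's file stem; the naming actually used by scripts/pdf_to_word.py is not documented here, and `plan_batch` is a hypothetical helper.

```python
from pathlib import Path


def plan_batch(input_dir, output_dir, source_ext=".pdf", target_ext=".docx"):
    """Pair each matching input file with an output path of the same stem."""
    in_dir, out_dir = Path(input_dir), Path(output_dir)
    return [
        (src, out_dir / (src.stem + target_ext))
        for src in sorted(in_dir.iterdir())
        # Match the extension case-insensitively and skip subdirectories.
        if src.is_file() and src.suffix.lower() == source_ext
    ]
```

Printing the plan before converting makes it easy to spot name collisions or unexpected files in the folder.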
Single file:
python scripts/word_to_pdf.py <input.docx> [output.pdf]
Batch mode:
python scripts/word_to_pdf.py --batch ./docs_folder --output-dir ./pdfs_folder
Check requirements:
python scripts/word_to_pdf.py --check
Single file:
python scripts/image_converter.py <input_image> --format <target_format>
Batch mode:
python scripts/image_converter.py --batch ./images_dir --format webp --output-dir ./webp_dir
Target formats: png, jpg/jpeg, webp, bmp, gif, tiff/tif
Options:
-q/--quality N: Quality for lossy formats (1-100, default: 95)
--resize WIDTH HEIGHT: Resize dimensions
--no-optimize: Disable optimization
--info: Display image info without converting
Examples:
# High-quality JPEG compression
python scripts/image_converter.py photo.png --format jpg -q 90
# Convert to WebP with smaller size
python scripts/image_converter.py photo.png --format webp -q 80
# Resize while converting
python scripts/image_converter.py large.png --format jpg --resize 1920 1080
# View image info
python scripts/image_converter.py image.png --info
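The format aliases and the quality option can be pictured as a small translation into the keyword arguments that Pillow's `Image.save` accepts. This is a sketch, not the converter's actual code: `save_options` is a hypothetical helper, and treating only JPEG and WebP as the lossy formats is an assumption.

```python
# Alias -> Pillow format name, covering the target formats listed above.
PILLOW_FORMATS = {
    "png": "PNG", "jpg": "JPEG", "jpeg": "JPEG", "webp": "WEBP",
    "bmp": "BMP", "gif": "GIF", "tiff": "TIFF", "tif": "TIFF",
}

# Assumption: these are the formats for which the quality option matters.
LOSSY = {"JPEG", "WEBP"}


def save_options(target_format, quality=95):
    """Build keyword arguments for Image.save from CLI-style inputs."""
    fmt = PILLOW_FORMATS.get(target_format.lower().lstrip("."))
    if fmt is None:
        raise ValueError(f"Unsupported target format: {target_format!r}")
    if not 1 <= quality <= 100:
        raise ValueError("quality must be between 1 and 100")
    options = {"format": fmt}
    if fmt in LOSSY:
        options["quality"] = quality
    return options
```

With Pillow installed, the result could be passed along as `img.save(output_path, **save_options("jpg", 90))`.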
CSV → Excel:
python scripts/excel_csv_converter.py data.csv --format xlsx [options]
Excel → CSV:
python scripts/excel_csv_converter.py data.xlsx --format csv [options]
Batch mode:
python scripts/excel_csv_converter.py --batch ./data_dir --format csv
Options:
-d/--delimiter CHAR: Custom separator (default: comma)
-e/--encoding ENC: File encoding (default: utf-8)
--sheet NAME: Specific sheet name (Excel→CSV)
--sheet-name NAME: Output sheet name (CSV→Excel)
--no-header: Skip header row
--info: Show Excel structure info
Examples:
# European-style CSV (semicolon delimited)
python scripts/excel_csv_converter.py data.csv --format xlsx -d ";"
# Chinese encoding support
python scripts/excel_csv_converter.py data.xlsx --format csv -e gbk
# Export specific sheet
python scripts/excel_csv_converter.py workbook.xlsx --format csv --sheet "Sales Report"
# View Excel info
python scripts/excel_csv_converter.py data.xlsx --info
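To illustrate what the delimiter and encoding options control, the sketch below reads a CSV with a given separator and encoding and rewrites it as comma-delimited UTF-8 using only the standard library. It is not the implementation of scripts/excel_csv_converter.py; `rewrite_csv` is a hypothetical helper.

```python
import csv


def rewrite_csv(src, dst, delimiter=",", encoding="utf-8"):
    """Read src with the given delimiter and encoding, then write
    comma-delimited UTF-8 to dst. Returns the number of rows copied."""
    count = 0
    # newline="" lets the csv module handle line endings itself.
    with open(src, newline="", encoding=encoding) as fin, \
         open(dst, "w", newline="", encoding="utf-8") as fout:
        writer = csv.writer(fout)
        for row in csv.reader(fin, delimiter=delimiter):
            writer.writerow(row)
            count += 1
    return count
```

A mismatched encoding usually surfaces here as a `UnicodeDecodeError`, while a mismatched delimiter tends to fail silently by producing one wide column per row.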
After each conversion:
| Error | Cause | Solution |
|-------|-------|----------|
| ModuleNotFoundError | Missing package | Run pip install Pillow pdf2docx docx2pdf openpyxl xlrd |
| COMError (Word→PDF) | MS Word not installed | Install Microsoft Office or use an alternative converter |
| UnicodeDecodeError | Wrong encoding | Specify the correct --encoding |
| FileNotFoundError | Wrong path | Verify that the input path is correct (absolute, or relative to the current working directory) |
| Permission denied | Read-only directory | Use writable output directory |
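Two of the errors in the table, a wrong path and a read-only output directory, can be caught before any script runs. The following is a minimal pre-flight sketch using only the standard library; `preflight` is a hypothetical helper and not part of the toolkit.

```python
import os
from pathlib import Path


def preflight(input_path, output_dir):
    """Return a list of human-readable problems; empty means ready to run."""
    problems = []
    src, out = Path(input_path), Path(output_dir)
    if not src.is_file():
        # resolve() shows the absolute path that was actually checked.
        problems.append(f"Input file not found: {src.resolve()}")
    if not out.is_dir():
        problems.append(f"Output directory does not exist: {out.resolve()}")
    elif not os.access(out, os.W_OK):
        problems.append(f"Output directory is not writable: {out.resolve()}")
    return problems
```

Reporting the resolved absolute path helps when a relative path was interpreted against an unexpected working directory.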
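When dealing with non-UTF-8 files, these encodings are common:
gbk / gb18030 (Chinese)
latin-1 / cp1252 (Western European)
shift-jis / cp932 (Japanese)
When the encoding is unknown, try utf-8 first, then fall back to the encodings above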
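The "utf-8 first, then fallback" approach can be sketched as a loop over candidate encodings. The candidate order below is an assumption and `read_with_fallback` is a hypothetical helper; note that latin-1 accepts any byte sequence, so it only makes sense as the last entry.

```python
def read_with_fallback(
    path,
    encodings=("utf-8", "gb18030", "cp932", "cp1252", "latin-1"),
):
    """Decode a file with the first encoding that raises no error.

    Returns a (text, encoding) tuple.
    """
    with open(path, "rb") as f:
        data = f.read()
    for enc in encodings:
        try:
            return data.decode(enc), enc
        except UnicodeDecodeError:
            continue  # Try the next candidate.
    raise ValueError(f"None of {encodings} could decode {path}")
```

The first encoding that decodes without error is not necessarily the right one, since bytes from one legacy encoding can decode cleanly but incorrectly under another. When the source of a file is known, passing the matching encoding explicitly is more reliable than guessing.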