This skill provides a pipeline for transforming web-based slide decks (HTML/CSS) into professional, native PowerPoint files. It uses a hybrid approach: Python for content extraction (BeautifulSoup) and Node.js (PptxGenJS) for layout reconstruction.
# Python dependencies
pip install beautifulsoup4 lxml
# Node.js dependencies (installed locally so require("pptxgenjs") resolves)
npm install pptxgenjs
Parse the HTML to extract semantic slide data (titles, bullet points, metrics, tables, and layouts).
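from bs4 import BeautifulSoup
import json

def parse_slides(html_path):
    # Read the deck as UTF-8 and close the file once it is parsed
    with open(html_path, encoding='utf-8') as f:
        soup = BeautifulSoup(f, 'lxml')
    slides = []
    for div in soup.find_all('div', class_='slide'):
        # Extract based on classes like 'cover-slide', 'content-box', 'two-column'
        # Assumed markup: an h1/h2 title and li bullets; adjust to the real deck
        title = div.find(['h1', 'h2'])
        slides.append({
            'classes': div.get('class', []),
            'title': title.get_text(strip=True) if title else '',
            'bullets': [li.get_text(strip=True) for li in div.find_all('li')],
        })
    # Return a structured, JSON-serializable list
    return slides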
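One way to fill in the extraction step is to derive a slide type from the `class` list of each `div.slide` and hand the result to Node.js as JSON. The sketch below assumes the class names mentioned in the comment above (`cover-slide`, `content-box`, `two-column`). The type labels, the `classify_slide` and `to_json` helpers, the record fields, and the sample text are hypothetical illustrations, not part of BeautifulSoup or PptxGenJS.

```python
import json

# Assumed mapping from CSS class to a slide type label; the labels are
# illustrative and should match whatever the Node.js step switches on.
CLASS_TO_TYPE = {
    "cover-slide": "cover",
    "two-column": "two_column",
    "content-box": "content",
}


def classify_slide(classes):
    """Return the first matching slide type for a list of CSS classes."""
    for css_class, slide_type in CLASS_TO_TYPE.items():
        if css_class in classes:
            return slide_type
    return "content"  # fallback for slides with no recognized class


def to_json(slides):
    """Serialize slide records; ensure_ascii=False keeps non-ASCII text readable."""
    return json.dumps(slides, ensure_ascii=False, indent=2)


# Made-up example record, shaped like what parse_slides might append per slide
record = {
    "type": classify_slide(["slide", "two-column"]),
    "title": "Quarterly Results",
    "bullets": ["Revenue up", "Costs flat"],
}
print(to_json([record]))
```

Keeping the classification in a plain function that takes a list of class names means it can be checked without parsing any HTML, and the `type` field gives the generator a single value to branch on.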
Map the extracted JSON data to PptxGenJS objects using a 10" x 5.625" coordinate system.
const pptxgen = require("pptxgenjs");
// Parsed slide data from the Python step; the file name slides.json is an assumption
const data = require("./slides.json");
let pres = new pptxgen();
pres.layout = "LAYOUT_16x9"; // 10" x 5.625", the PptxGenJS default
// Define a consistent color palette
const COLORS = { PRIMARY: "0F3460", ACCENT: "00D4FF", TEXT: "FFFFFF" };
// Iterate through parsed data and add slides
data.forEach(item => {
  let slide = pres.addSlide();
  // Use slide.addText, slide.addShape, slide.addTable based on item.type
});
pres.writeFile({ fileName: "output.pptx" });
Use addShape and addText instead of screenshots so the PPT stays editable.
Convert grid or flex layouts to manual X/Y coordinates (e.g., Two-column = X:0.5 and X:5.2).
Give each slide a background-color or linear-gradient equivalent.
Write hex colors without the # prefix (e.g., 00D4FF).
For tables, advance currentY based on row count.
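The layout and color rules above reduce to a little arithmetic. The sketch below is written in Python to match the extraction step, though the same logic ports directly to the Node.js generator. The 10" slide width and the X:0.5 / X:5.2 columns come from the text above. The 0.4" gutter is an assumption chosen so that the second column lands at X:5.2, and the 0.4" row height, 0.2" gap, and all helper names are likewise hypothetical.

```python
import re

SLIDE_W = 10.0   # inches, matching the 10" x 5.625" coordinate system
MARGIN = 0.5     # left/right margin, from the X:0.5 example
GUTTER = 0.4     # assumed gap between columns; yields X:5.2 for column 2


def normalize_hex(color):
    """Convert a CSS hex color such as '#00d4ff' or '#0af' to '00D4FF' form."""
    value = color.strip().lstrip("#")
    if len(value) == 3:  # expand CSS shorthand, e.g. 0af -> 00aaff
        value = "".join(ch * 2 for ch in value)
    if not re.fullmatch(r"[0-9a-fA-F]{6}", value):
        raise ValueError(f"not a 6-digit hex color: {color!r}")
    return value.upper()


def column_box(index, columns=2):
    """Return (x, width) in inches for a zero-based column index."""
    width = (SLIDE_W - 2 * MARGIN - GUTTER * (columns - 1)) / columns
    x = MARGIN + index * (width + GUTTER)
    return round(x, 2), round(width, 2)


def advance_y(current_y, row_count, row_h=0.4, gap=0.2):
    """Return the Y position just below a table with row_count rows.

    row_h and gap are assumed values; tune them to the table's font size.
    """
    return round(current_y + row_count * row_h + gap, 2)


print(normalize_hex("#00d4ff"))   # 00D4FF
print(column_box(0), column_box(1))
print(advance_y(1.2, 4))
```

Computing column positions from a margin and gutter, instead of hard-coding each X value, keeps a three-column variant one argument away (`column_box(i, columns=3)`).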