Use the bundled scripts to extract media, download it, transcribe it offline, and render DOCX output.
Prefer the deterministic scripts before hand-rolling new code.
Use {baseDir} when constructing file paths inside this skill so the instructions stay portable across agents and marketplaces.
Run python {baseDir}/scripts/bootstrap_env.py once in the target environment.
Run python {baseDir}/scripts/pipeline_web_to_docx.py --output-dir .
Run python {baseDir}/scripts/download_url.py and then python {baseDir}/scripts/transcribe_sensevoice.py --input --output-txt --output-docx .
Run python {baseDir}/scripts/transcribe_sensevoice.py --input --output-txt --output-docx .
.txt, and then render it with python {baseDir}/scripts/transcript_to_docx.py.
Referer header for media download. Extract the media stream and convert it to DOCX."
Run python {baseDir}/scripts/pipeline_web_to_docx.py first. If the page is especially stubborn and it is a Toutiao page, python {baseDir}/scripts/pipeline_toutiao_to_docx.py and python {baseDir}/scripts/extract_toutiao_media.py remain available as site-specific fallbacks.
Run python {baseDir}/scripts/download_url.py, then transcribe.
.txt.
Use python {baseDir}/scripts/transcript_to_docx.py for generic TXT-to-DOCX rendering.
scripts/bootstrap_env.py: Install or verify the Python packages used by this skill.
scripts/extract_web_media.py: Open a generic web page in a real browser, capture likely media URLs plus common download headers, and emit a JSON manifest.
scripts/extract_toutiao_media.py: Open a Toutiao page in a real browser, capture audio/video URLs, and emit a JSON manifest with the same schema as the generic extractor.
scripts/download_url.py: Download a direct media URL to disk with a stable user agent, optional headers, and HLS/DASH handling.
scripts/transcribe_sensevoice.py: Download the SenseVoice model on demand, segment media, run offline ASR, and emit TXT and optional DOCX.
scripts/transcript_to_docx.py: Render timestamped transcripts or chapterized notes into a Word document.
scripts/pipeline_web_to_docx.py: Run the generic end-to-end pipeline: extract, download, transcribe, and render.
scripts/pipeline_toutiao_to_docx.py: Run the Toutiao-specialized end-to-end pipeline for cases where the generic extractor is not preferred.
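The local-file transcription step can be sketched as a small argv builder. This is a minimal sketch, assuming {baseDir} has already been resolved to a real directory; the helper name build_transcribe_cmd and the sample paths are hypothetical, and only the --input, --output-txt, and --output-docx flags are taken from the text above.

```python
import sys
from pathlib import Path


def build_transcribe_cmd(base_dir, media_path, txt_path, docx_path=None):
    """Build the argv list for scripts/transcribe_sensevoice.py.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    script = Path(base_dir) / "scripts" / "transcribe_sensevoice.py"
    cmd = [
        sys.executable,
        str(script),
        "--input", str(media_path),
        "--output-txt", str(txt_path),
    ]
    # DOCX output is described as optional, so add the flag only on request.
    if docx_path is not None:
        cmd += ["--output-docx", str(docx_path)]
    return cmd


if __name__ == "__main__":
    # Placeholder paths for illustration only.
    cmd = build_transcribe_cmd("/path/to/skill", "talk.mp4", "talk.txt", "talk.docx")
    print(" ".join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True). Building the path from base_dir rather than a hard-coded location keeps the call portable in the same way {baseDir} does.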
Run python {baseDir}/scripts/bootstrap_env.py before first use in a fresh environment.
Validate the skill with skill-creator/scripts/quick_validate.py.
Test --help and one representative happy path after changing the scripts.
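The --help check after script changes could be automated along these lines. This is a sketch under assumptions: it assumes each bundled script accepts --help and exits with status 0, which the text implies but does not state, and the function name check_help is hypothetical.

```python
import subprocess
import sys
from pathlib import Path

# Script names as listed in this skill.
SCRIPT_NAMES = [
    "bootstrap_env.py",
    "extract_web_media.py",
    "extract_toutiao_media.py",
    "download_url.py",
    "transcribe_sensevoice.py",
    "transcript_to_docx.py",
    "pipeline_web_to_docx.py",
    "pipeline_toutiao_to_docx.py",
]


def check_help(base_dir, names=SCRIPT_NAMES):
    """Return the names of scripts that are missing or whose --help run fails."""
    failures = []
    for name in names:
        script = Path(base_dir) / "scripts" / name
        if not script.is_file():
            failures.append(name)
            continue
        result = subprocess.run(
            [sys.executable, str(script), "--help"],
            capture_output=True,
            text=True,
        )
        # Assumption: a healthy script exits with 0 on --help.
        if result.returncode != 0:
            failures.append(name)
    return failures


if __name__ == "__main__":
    # Placeholder location; substitute the resolved {baseDir}.
    failed = check_help("/path/to/skill")
    print("failed:", failed if failed else "none")
```

An empty result only shows that argument parsing still loads; a representative happy-path run is still needed to exercise the pipeline itself.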