Compare two versions of a document with structure-aware precision. SoMark parses both files into clean Markdown first, then a diff is generated at the text level. The result tells you exactly what changed between two versions of a contract, report, policy document, or any other file.
Raw PDF/Word binary diffing is meaningless. By parsing both documents into clean Markdown first, the diff captures semantic changes — actual content additions, deletions, and modifications — not binary noise.
In short: parse both documents with SoMark, then diff the structured output.
Example requests:
Important: Before starting, tell the user that SoMark will parse both documents into clean Markdown first, enabling an accurate content-level diff rather than a raw binary comparison.
python document_diff.py \
-f1 <original_file> \
-f2 <new_file> \
-o <output_dir> \
--output-formats '["markdown", "json"]' \
--element-formats '{"image": "url", "formula": "latex", "table": "html", "cs": "image"}' \
--feature-config '{"enable_text_cross_page": false, "enable_table_cross_page": false, "enable_title_level_recognition": false, "enable_inline_image": true, "enable_table_image": true, "enable_image_understanding": true, "keep_header_footer": false}'
Script location: document_diff.py in the same directory as this SKILL.md
Supported formats: .pdf .png .jpg .jpeg .bmp .tiff .webp .heic .heif .gif .doc .docx .ppt .pptx
--output-formats (Optional)This argument is optional in the current script. Pass a JSON array of one or more output formats.
If omitted, the default value is:
["markdown", "json"]
Supported values:
| Value | Description |
|---|---|
| ------------ | -------------------------------------------------- |
markdown | Save the parsed document as a Markdown file |
json | Save the parsed document as a JSON output |
Example:
--output-formats '["markdown", "json"]'
--element-formats (Optional)This argument controls how specific element types are rendered during parsing. The same configuration is applied to both documents so the comparison stays consistent.
If omitted, the default value is:
{ "image": "url", "formula": "latex", "table": "html", "cs": "image" }
If you provide this argument, pass the full JSON object.
Supported keys, allowed values, and defaults:
| Key | Allowed values | Default |
|---|---|---|
| ------- | --------------------- | ------- |
| image | url, base64, none | url |
| formula | latex, mathml, ascii | latex |
| table | html, image, markdown | html |
| cs | image | image |
Example:
--element-formats '{"image": "base64", "formula": "latex", "table": "html", "cs": "image"}'
--feature-config (Optional)This argument controls parser feature switches. The same feature configuration is applied to both documents before diffing.
If omitted, the default value is:
{
"enable_text_cross_page": false,
"enable_table_cross_page": false,
"enable_title_level_recognition": false,
"enable_inline_image": true,
"enable_table_image": true,
"enable_image_understanding": true,
"keep_header_footer": false
}
If you provide this argument, pass the full JSON object. All values must be boolean (true or false).
Supported keys and defaults:
| Key | Default | Description |
|---|---|---|
| -------------------------------- | ------- | ----------------------------------------- |
enable_text_cross_page | false | Merge text content across page boundaries |
enable_table_cross_page | false | Merge tables across page boundaries |
enable_title_level_recognition | false | Recognize heading and title levels |
enable_inline_image | true | Include inline image output |
enable_table_image | true | Include table image output |
enable_image_understanding | true | Enable image understanding features |
keep_header_footer | false | Preserve header and footer content |
Example:
--feature-config '{"enable_text_cross_page": false, "enable_table_cross_page": false, "enable_title_level_recognition": false, "enable_inline_image": true, "enable_table_image": true, "enable_image_understanding": true, "keep_header_footer": false}'
The script writes these files to the output directory:
diff_report.md — unified diff with added/removed/unchanged line counts.md — parsed Markdown of the original document.md — parsed Markdown of the new documentdiff_summary.json — metadata (file paths, elapsed time)After the script finishes, read diff_report.md and both parsed Markdown files, then provide a human-readable summary:
Present the summary in this structure:
## 文档对比结果
### 变更概览
- 新增:X 行
- 删除:Y 行
- 未变更:Z 行
### 主要变更内容
[按重要性列出关键变更,引用具体文本]
### 需要关注的变更
[标注可能影响权利义务、金额、日期、条款的变更]
### 未变更的主要部分
[简要说明哪些重要章节保持不变]
If the user has not configured an API key, follow the same setup steps as the somark-document-parser skill.
Step 1: Ask whether it is already configured — do not ask the user to paste the key in chat.
Step 2: Direct them to https://somark.tech/login to create a key in the format sk-.
Step 3: Ask them to run:
export SOMARK_API_KEY=your_key_here
Step 4: Mention free quota is available at https://somark.tech/workbench/purchase.
--output-formats, --element-formats, or --feature-config: ask the user to provide valid JSON syntax.markdown, json.image, formula, table, and cs.feature-config values must be booleans.1107 / Invalid API Key: ask the user to verify SOMARK_API_KEY.共 1 个版本