Word DOCX Processing

Create, inspect, and edit Microsoft Word documents and DOCX files with reliable styles, numbering, tracked changes, tables, sections, and compatibility checks.
Author: linbo405
Uncategorized · clawhub v1.0.0 · 1 version · 99871.3 · Key: not required

Overview

When to Use

Use when the main artifact is a Microsoft Word document or .docx file, especially when tracked changes, comments, headers, numbering, fields, tables, templates, or compatibility matter.

Core Rules

1. Treat DOCX as OOXML, not plain text

  • A .docx file is a ZIP of XML parts, so structure matters as much as visible text.
  • The critical parts are usually word/document.xml, styles.xml, numbering.xml, headers, footers, and relationship files.
  • Text may be split across multiple runs; never assume one word or sentence lives in one XML node.
  • Use different workflows on purpose: structured extraction for quick reading, style-driven generation for new files, and OOXML-aware editing for fragile existing documents.
  • If the job is mainly reading, extracting, or reviewing, prefer a structure-preserving read path before touching OOXML.
  • For deep edits, inspect the package layout instead of relying only on rendered output.
  • Reading, generating, and preserving an existing reviewed document are different jobs even when the format is the same.
  • Legacy .doc inputs usually need conversion before you can trust modern .docx assumptions.
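
The package view described above can be sketched with the Python standard library alone. This is a minimal illustration, not a full reader: make_demo_docx is a hypothetical helper that builds a one-part package in memory (not a valid Word file), and paragraph_texts only shows why text has to be joined across runs.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace in ElementTree's "{uri}tag" form.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def list_parts(docx):
    """Return every part name in the package; docx is a path or file-like object."""
    with zipfile.ZipFile(docx) as pkg:
        return pkg.namelist()


def paragraph_texts(docx):
    """Return one string per w:p, joining the text of all w:t nodes inside it."""
    with zipfile.ZipFile(docx) as pkg:
        root = ET.fromstring(pkg.read("word/document.xml"))
    return [
        "".join(t.text or "" for t in p.iter(W + "t"))
        for p in root.iter(W + "p")
    ]


def make_demo_docx():
    """Hypothetical helper: a tiny in-memory package used only for this demo.

    It contains just word/document.xml, so Word itself would not open it.
    """
    body = (
        '<w:document xmlns:w="http://schemas.openxmlformats.org/'
        'wordprocessingml/2006/main"><w:body><w:p>'
        "<w:r><w:t>Hel</w:t></w:r><w:r><w:t>lo</w:t></w:r>"
        "</w:p></w:body></w:document>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as pkg:
        pkg.writestr("word/document.xml", body)
    buf.seek(0)
    return buf


print(list_parts(make_demo_docx()))       # ['word/document.xml']
print(paragraph_texts(make_demo_docx()))  # ['Hello'], one word spread over two runs
```

On a real file, pass the path instead of the in-memory buffer. Searching the raw XML for "Hello" would fail here, which is the practical reason to join runs before matching text.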

2. Preserve styles and direct formatting deliberately

  • Prefer named styles over direct formatting so the document stays editable.
  • Formatting is layered: paragraph styles, character styles, and direct formatting apply at different levels, and the more specific layer generally wins.
  • Removing direct formatting is often safer than stacking more inline formatting on top.
  • When editing an existing file, extend the current style system instead of inventing a parallel one.
  • Copying content between documents can silently import foreign styles, theme settings, and numbering definitions.
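
One way to check a style system before extending it is to compare the style IDs the body references with the ones styles.xml defines, and to count runs that carry direct formatting. The sketch below assumes the two XML parts have already been read from the package (for example with zipfile); the function names are illustrative, not part of any library.

```python
import xml.etree.ElementTree as ET

W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def undefined_style_ids(document_xml, styles_xml):
    """Return style IDs referenced in the body but missing from styles.xml."""
    doc = ET.fromstring(document_xml)
    styles = ET.fromstring(styles_xml)
    defined = {s.get(W + "styleId") for s in styles.iter(W + "style")}
    used = set()
    # pStyle, rStyle, and tblStyle point at paragraph, character, and table styles.
    for tag in ("pStyle", "rStyle", "tblStyle"):
        used.update(el.get(W + "val") for el in doc.iter(W + tag))
    used.discard(None)
    return used - defined


def directly_formatted_runs(document_xml):
    """Count runs whose w:rPr holds anything besides a character-style reference."""
    doc = ET.fromstring(document_xml)
    count = 0
    for run in doc.iter(W + "r"):
        props = run.find(W + "rPr")
        if props is not None and any(c.tag != W + "rStyle" for c in props):
            count += 1
    return count
```

A non-empty result from undefined_style_ids often points at content pasted from another document; a high direct-formatting count suggests the file will be hard to restyle.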

3. Lists and numbering are their own system

  • Bullets and numbering belong to Word's numbering definitions, not pasted Unicode characters.
  • abstractNum, num, and paragraph numbering properties all matter, so restart behavior is rarely "visual only".
  • Indentation and numbering are related but not identical; a list can have broken numbering even if the indent looks right.
  • A list that looks correct in one editor can restart, flatten, or renumber itself later if the underlying numbering state is wrong.
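
The relationship between num, abstractNum, and paragraph numbering properties can be inspected directly. The following sketch assumes numbering.xml and document.xml have already been read from the package; it ignores numbering inherited through paragraph styles, so treat it as a starting point rather than a complete resolver.

```python
import xml.etree.ElementTree as ET

W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def num_to_abstract(numbering_xml):
    """Map each w:num numId to the abstractNumId it instantiates."""
    root = ET.fromstring(numbering_xml)
    mapping = {}
    for num in root.iter(W + "num"):
        ref = num.find(W + "abstractNumId")
        if ref is not None:
            mapping[num.get(W + "numId")] = ref.get(W + "val")
    return mapping


def list_paragraphs(document_xml):
    """Return (numId, ilvl, text) for every paragraph with its own w:numPr."""
    root = ET.fromstring(document_xml)
    rows = []
    for p in root.iter(W + "p"):
        num_pr = p.find(f"{W}pPr/{W}numPr")
        if num_pr is None:
            continue  # not a list item, or numbering comes from its style
        num_id = num_pr.find(W + "numId")
        ilvl = num_pr.find(W + "ilvl")
        text = "".join(t.text or "" for t in p.iter(W + "t"))
        rows.append((
            num_id.get(W + "val") if num_id is not None else None,
            ilvl.get(W + "val") if ilvl is not None else None,
            text,
        ))
    return rows
```

Two paragraphs that look like one continuous list but carry different numId values belong to separate list instances, which is a common cause of unexpected restarts.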

4. Page layout lives in sections

  • Margins, orientation, headers, footers, and page numbering are section-level behavior.
  • First-page and odd/even headers can differ inside the same document, so one header fix may not fix the document.
  • Set page size explicitly because A4 and US Letter defaults change pagination and table widths.
  • Use section breaks for layout changes; manual spacing and stray page breaks usually create drift.
  • Header and footer media use part-specific relationships, so copied IDs often break images or links.
  • Tables, page breaks, and headers often drift together, so treat layout fixes as document-wide, not local cosmetic edits.
  • Table geometry depends on page width, margins, and fixed widths, so "close enough" table edits often break later in Google Docs or LibreOffice.
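
Because layout is stored per section, it helps to list every sectPr with its page size and header references before changing anything. The sketch below assumes document.xml has already been read from the package; the KNOWN_SIZES table covers only A4 and US Letter and is an illustrative shortcut, not an exhaustive list.

```python
import xml.etree.ElementTree as ET

W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"
TWIPS_PER_INCH = 1440  # page measurements are stored in twentieths of a point

# Portrait (width, height) in twips for the two sizes named in the text.
KNOWN_SIZES = {(11906, 16838): "A4", (12240, 15840): "US Letter"}


def section_layouts(document_xml):
    """Return one dict per w:sectPr describing page size and header types."""
    root = ET.fromstring(document_xml)
    layouts = []
    for sect in root.iter(W + "sectPr"):
        size = sect.find(W + "pgSz")
        if size is None or size.get(W + "w") is None or size.get(W + "h") is None:
            continue  # no explicit page size recorded for this section
        width = int(size.get(W + "w"))
        height = int(size.get(W + "h"))
        portrait = (min(width, height), max(width, height))
        layouts.append({
            "page": KNOWN_SIZES.get(portrait, "custom"),
            "orientation": size.get(W + "orient", "portrait"),
            "width_in": round(width / TWIPS_PER_INCH, 2),
            "height_in": round(height / TWIPS_PER_INCH, 2),
            # default / first / even: each type points at a separate header part
            "header_types": [
                ref.get(W + "type") for ref in sect.iter(W + "headerReference")
            ],
        })
    return layouts
```

A document that reports more than one layout, or several header types in one section, needs each of them checked after a header or margin fix.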

5. Track changes, comments, and fields need precise edits

  • Visible text is not the full document when tracked changes are enabled.
  • Insertions, deletions, and comments carry metadata such as author, date, and IDs, and that metadata can persist in the file after a careless edit.
  • Deleted text may still exist in the XML even when it no longer appears on screen.
  • Comment anchors and review ranges can break if edits move text without preserving the surrounding structure.
  • Comment markers and review wrappers do not behave like inline formatting, so moving text carelessly can orphan or misplace them.
  • Comments, footnotes, bookmarks, and linked media may live in separate parts, not only in the main document body.
  • Tables of contents, page numbers, dates, cross-references, and mail merge placeholders are fields.
  • Edit the field source carefully and expect cached display values to lag until refresh.
  • Hyperlinks, bookmarks, and references can break if IDs or relationships stop matching.
  • Bookmarks, footnotes, comment ranges, and cross-references depend on stable anchors even when the visible text seems untouched.
  • A document can look correct while still containing stale field output that refreshes later into something different.
  • For review workflows, make minimal replacements instead of rewriting whole paragraphs.
  • In tracked-change workflows, only the changed span should look changed; broad rewrites create noisy reviews and can destroy the original formatting context.
  • For legal, academic, or business review documents, default to review-style edits over wholesale paragraph rewrites unless the user explicitly wants a rewrite.
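
To see what a reviewed document really contains, the revision wrappers can be listed separately from the visible text. This is a minimal sketch over document.xml, assuming it has already been read from the package; it handles w:ins and w:del only and does not treat moved text (w:moveFrom / w:moveTo) specially.

```python
import xml.etree.ElementTree as ET

W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def tracked_changes(document_xml):
    """Return (kind, author, text) for each w:ins / w:del, in document order."""
    root = ET.fromstring(document_xml)
    changes = []
    for el in root.iter():
        if el.tag == W + "ins":
            text = "".join(t.text or "" for t in el.iter(W + "t"))
            changes.append(("ins", el.get(W + "author"), text))
        elif el.tag == W + "del":
            # Deleted runs store their text in w:delText, not w:t.
            text = "".join(t.text or "" for t in el.iter(W + "delText"))
            changes.append(("del", el.get(W + "author"), text))
    return changes


def visible_text(document_xml):
    """Join all w:t text; w:delText is skipped, so deletions do not appear."""
    root = ET.fromstring(document_xml)
    return "".join(t.text or "" for t in root.iter(W + "t"))
```

If tracked_changes returns anything for a file that is supposed to be final, the deleted text is still recoverable by the recipient even though visible_text does not show it.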

6. Verify round-trip compatibility before delivery

  • Complex documents can shift between Word, LibreOffice, Google Docs, and conversion tools.
  • Tables, headers, embedded fonts, and copied styles are common sources of layout drift.
  • Treat .docm as macro-bearing and higher risk; treat .doc as legacy input that may need conversion first.
  • When layout matters, explicit table widths are safer than auto-fit or percentage-style behavior that different editors reinterpret.
  • A document that passes a text check can still fail on pagination, table widths, or reference refresh after the recipient opens it.
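
Two of the checks above are easy to automate: whether a package carries a VBA project, and whether tables rely on auto or percentage widths. The sketch below assumes the part names and document.xml have already been read from the package; treating every non-"dxa" width as risky is a deliberately conservative heuristic, not a rule from the format.

```python
import xml.etree.ElementTree as ET

W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def has_macros(part_names):
    """True if the package lists a VBA project part (as macro-enabled files do)."""
    return any(name.endswith("vbaProject.bin") for name in part_names)


def table_width_report(document_xml):
    """Return (width_type, width_value) for each table's w:tblW, in order."""
    root = ET.fromstring(document_xml)
    report = []
    for tbl in root.iter(W + "tbl"):
        tbl_w = tbl.find(f"{W}tblPr/{W}tblW")
        if tbl_w is None:
            report.append((None, None))
        else:
            report.append((tbl_w.get(W + "type"), tbl_w.get(W + "w")))
    return report


def risky_tables(document_xml):
    """Indexes of tables whose width is not a fixed value in twips ('dxa')."""
    return [
        i for i, (kind, _) in enumerate(table_width_report(document_xml))
        if kind != "dxa"
    ]
```

Tables flagged here are the ones worth opening in a second editor before delivery.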

Common Traps

  • Copy-paste can import unwanted styles and numbering definitions.
  • Header or footer images use part-specific relationships, so reusing IDs blindly breaks them.
  • Empty paragraphs used as spacing make templates fragile; spacing belongs in paragraph settings.
  • A clean-looking export can still hide unresolved revisions, comments, or stale field values.
  • Restarting lists "by eye" usually fails because numbering state lives outside the paragraph text.
  • One visible phrase can be split across several runs, bookmarks, revision tags, or field boundaries.
  • Replacing a whole paragraph to change one clause often breaks review quality, bookmarks, comments, or nearby inline formatting.
  • Deleting all visible text from a paragraph or list item can still leave behind an empty paragraph mark, empty bullet, or unstable numbering.
  • Table auto-fit and percentage-like width behavior can look acceptable in Word and still drift in Google Docs or LibreOffice.
  • LibreOffice and Google Docs can shift complex tables, section behavior, and embedded fonts even when Word looks perfect.
  • Compatibility mode can silently cap newer features or change pagination behavior.
  • A single change in page size or margin defaults can ripple through tables, headers, TOC, and cross-references.
  • A revision workflow can look accepted on screen while leftover metadata, comments, or field caches still make the file unstable later.
  • TOC entries, footnotes, and cross-references can look correct until the recipient updates fields and exposes broken anchors.
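
Several of these traps leave countable traces in document.xml, so a rough pre-delivery audit is possible. The sketch below is a heuristic under stated assumptions: it only looks at the main document part, and it counts a paragraph as empty when it contains no runs at all, which will also match some legitimate paragraphs.

```python
import xml.etree.ElementTree as ET

W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def delivery_audit(document_xml):
    """Count leftovers that a clean-looking export can still hide."""
    root = ET.fromstring(document_xml)
    counts = {
        "revisions": 0,
        "comment_anchors": 0,
        "fields": 0,
        "empty_paragraphs": 0,
    }
    for el in root.iter():
        name = el.tag.replace(W, "")
        if name in ("ins", "del", "moveFrom", "moveTo"):
            counts["revisions"] += 1
        elif name == "commentRangeStart":
            counts["comment_anchors"] += 1
        elif name == "fldSimple":
            counts["fields"] += 1
        elif name == "fldChar" and el.get(W + "fldCharType") == "begin":
            # Complex fields open with a "begin" marker; count each field once.
            counts["fields"] += 1
        elif name == "p" and next(el.iter(W + "r"), None) is None:
            counts["empty_paragraphs"] += 1
    return counts
```

Non-zero revision or comment counts mean the review is not fully resolved; a non-zero field count is a reminder that cached values may change when the recipient refreshes fields.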

Related Skills

Install with clawhub install if the user confirms:

  • documents — General document handling and format conversion.
  • brief — Concise business writing and structured summaries.
  • article — Long-form drafting and editorial structure.

Feedback

  • If useful: clawhub star word-docx
  • Stay updated: clawhub sync

Version History

1 version in total

  • v1.0.0 (current)
    2026-05-02 01:53 Safe Safe

Security Checks

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
