Document DOCX Skill - Quick Reference

This skill enables creation, editing, and analysis of .docx files for reports, contracts, proposals, documentation, and template-driven outputs.

Modern best practices (2026):

Prefer templates + styles over manual formatting. Treat .docx as the editable source; treat PDF as a release artifact. If distributing externally, include basic accessibility hygiene (headings, table headers, alt text). Quick Reference Task Tool/Library Language When to Use Create DOCX python-docx Python Reports, contracts, proposals Create DOCX docx Node.js Server-side document generation Convert to HTML mammoth.js Node.js Web display, content extraction Parse DOCX python-docx Python Extract text, tables, metadata Template fill docxtpl Python Mail merge, template-based generation Review workflow Word compare, comments/highlights Any Human review without OOXML surgery Tracked changes OOXML inspection, docx4j/OpenXML SDK/Aspose Any True redlines or parsing tracked changes Tool Selection Prefer docxtpl when non-developers must edit layout/design in Word. Prefer python-docx for structural edits (paragraphs/tables/headers/footers) when formatting complexity is moderate. Prefer docx (Node.js) for server-side generation in TypeScript-heavy stacks. Prefer mammoth for text-first extraction or DOCX-to-HTML (best effort; may drop some layout fidelity). Known Limits (Plan Around These) .doc (legacy) is not supported by these libraries; convert to .docx first (e.g., LibreOffice). python-docx cannot reliably create true tracked changes; use Word compare or specialized OOXML tooling. Tables of Contents and many fields are placeholders until opened/updated in Word. Core Operations Create Document (Python - python-docx) from docx import Document from docx.shared import Inches, Pt from docx.enum.text import WD_ALIGN_PARAGRAPH

doc = Document()

Title

title = doc.add_heading('Document Title', 0) title.alignment = WD_ALIGN_PARAGRAPH.CENTER

Paragraph with formatting

para = doc.add_paragraph() run = para.add_run('Bold and ') run.bold = True run = para.add_run('italic text.') run.italic = True

Table

table = doc.add_table(rows=3, cols=3) table.style = 'Table Grid' for i, row in enumerate(table.rows): for j, cell in enumerate(row.cells): cell.text = f'Row {i+1}, Col {j+1}'

Image

doc.add_picture('image.png', width=Inches(4))

Save

doc.save('output.docx')

Create Document (Node.js - docx) import { Document, Packer, Paragraph, TextRun, Table, TableRow, TableCell } from 'docx'; import * as fs from 'fs';

const doc = new Document({ sections: [{ properties: {}, children: [ new Paragraph({ children: [ new TextRun({ text: 'Bold text', bold: true }), new TextRun({ text: ' and normal text.' }), ], }), new Table({ rows: [ new TableRow({ children: [ new TableCell({ children: [new Paragraph('Cell 1')] }), new TableCell({ children: [new Paragraph('Cell 2')] }), ], }), ], }), ], }], });

Packer.toBuffer(doc).then((buffer) => { fs.writeFileSync('output.docx', buffer); });

Template-Based Generation (Python - docxtpl) from docxtpl import DocxTemplate

doc = DocxTemplate('template.docx') context = { 'company_name': 'Acme Corp', 'date': '2025-01-15', 'items': [ {'name': 'Widget A', 'price': 100}, {'name': 'Widget B', 'price': 200}, ] } doc.render(context) doc.save('filled_template.docx')

Extract Content (Python - python-docx) from docx import Document

doc = Document('input.docx')

Extract all text

full_text = [] for para in doc.paragraphs: full_text.append(para.text)

Extract tables

for table in doc.tables: for row in table.rows: row_data = [cell.text for cell in row.cells] print(row_data)

Styling Reference Element Python Method Node.js Class Heading 1 add_heading(text, 1) HeadingLevel.HEADING_1 Bold run.bold = True TextRun({ bold: true }) Italic run.italic = True TextRun({ italics: true }) Font size run.font.size = Pt(12) TextRun({ size: 24 }) (half-points) Alignment WD_ALIGN_PARAGRAPH.CENTER AlignmentType.CENTER Page break doc.add_page_break() new PageBreak() Do / Avoid (Dec 2025) Do Use consistent heading levels and a table of contents for long docs. Capture decisions and action items with owners and due dates. Store docs in a versioned, searchable system. Avoid Manual formatting instead of styles (breaks consistency). Docs with no owner or review cadence (stale quickly). Copy/pasting without updating definitions and links. Output Quality Checklist Structure: consistent heading hierarchy, styles, and (when needed) an auto-generated table of contents. Decisions: decisions/actions captured with owner + due date (not buried in prose). Versioning: doc ID + version + change summary; review cadence defined. Accessibility hygiene: headings/reading order are correct; table headers are marked; alt text for non-decorative images. Reuse: use assets/doc-template-pack.md for decision logs and recurring doc types. Optional: AI / Automation

Use only when explicitly requested and policy-compliant.

Summarize meeting notes into decisions/actions; humans verify accuracy. Draft first-pass docs from outlines; do not invent facts or quotes. Navigation

Resources

references/docx-patterns.md - Advanced formatting, styles, headers/footers references/template-workflows.md - Mail merge, batch generation references/tracked-changes.md - Tracked changes: what is feasible, and what is not data/sources.json - Library documentation links

Scripts

scripts/docx_inspect_ooxml.py - Dependency-free OOXML inspection (including tracked changes signals) scripts/docx_extract.py - Extract text/tables to JSON (requires python-docx) scripts/docx_render_template.py - Render a docxtpl template (requires docxtpl) scripts/docx_to_html.mjs - Convert .docx to HTML (requires mammoth)

Templates

assets/report-template.md - Standard report structure assets/contract-template.md - Legal document structure assets/doc-template-pack.md - Decision log, meeting notes, changelog templates

Related Skills

../document-pdf/SKILL.md - PDF generation and conversion ../docs-codebase/SKILL.md - Technical writing patterns

document-docx

安装