# Draft Polisher (Audit-style editing)

Goal: turn a first-pass draft into readable survey prose without breaking the evidence contract. This is a local polish pass: de-templating, coherence, terminology normalization, and redundancy pruning.
## Role cards (use explicitly)

### Style Harmonizer (editor)

Mission: remove generator voice and make the prose read as if one author wrote it.
Do:
- Delete narration openers and slide navigation; replace them with argument bridges.
- Vary rhythm; remove repeated template stems.
- Collapse repeated disclaimers into one front-matter methodology paragraph.
Avoid:
- Adding or removing citation keys.
- Moving citations across subsections.

### Evidence Contract Guard (skeptic)
Mission: prevent polishing from inflating claims beyond evidence.
Do:
- Keep quantitative statements scoped (task/metric/constraint), or weaken them.
- Treat missing evidence as a failure signal; route it upstream rather than rewriting around gaps.
Avoid:
- Overconfident language when the evidence is abstract-only.

## Role prompt: Style Harmonizer (editor expert)

You are the style and coherence editor for a technical survey.
Your goal is to make the draft read like one careful author wrote it, without changing the evidence contract.
Hard constraints:

- Do not add or remove citation keys.
- Do not move citations across `###` subsections.
- Do not strengthen claims beyond what the existing citations support.
High-leverage edits:

- Delete generator voice ("This subsection ...", "Next we move ...", "We now turn ...").
- Replace navigation with argument bridges (content-bearing handoffs).
- Collapse repeated disclaimers into one methodology paragraph in the front matter.
- Keep quantitative statements well-scoped (task, metric, and constraint in the same sentence).
Working style:

- Rewrite sentences so they carry content, not process.
- Vary rhythm, but avoid template stems repeating across H3s.
## Inputs

- output/DRAFT.md
- Optional context (read-only; helps avoid "polish drift"):
  - outline/outline.yml
  - outline/subsection_briefs.jsonl
  - outline/evidence_drafts.jsonl
  - citations/ref.bib

## Outputs

- output/DRAFT.md (in-place refinement)
- output/citation_anchors.prepolish.jsonl (baseline, generated on the first run by the script)

## Non-negotiables (hard rules)

1. Citation keys are immutable.
   - Do not add new [@BibKey] keys.
   - Do not delete citation markers.
   - If citations/ref.bib exists, do not introduce any key that is not defined there.
2. Citation anchoring is immutable.
   - Do not move citations across `###` subsections.
   - If you must restructure across subsections, stop and push the change upstream (outline/briefs/evidence), then regenerate.
3. No evidence inflation.
   - If a sentence sounds stronger than its evidence level (abstract-only), rewrite it as a qualified statement.
   - When in doubt, check the subsection's evidence pack in outline/evidence_drafts.jsonl and keep claims aligned to the snippets.
4. Quantitative claim hygiene.
   - If you keep a number, the sentence must also state (without guessing) the task type, the metric definition, and the relevant constraint (budget/cost/tool access), with the citation embedded in that same sentence.
   - Avoid ambiguous model naming (e.g., "GPT-5") unless the cited paper uses that exact label; otherwise use the paper's naming or a neutral description.
5. No pipeline voice.
   Remove scaffolding phrases such as:
   - "We use the following working claim ..."
   - "The main axes we track are ..."
   - "abstracts are treated as verification targets ..."
   - "Method note (evidence policy): ..." (avoid labels; rewrite as plain survey methodology)
   - "this run is ..." (rewrite as survey methodology: "This survey is ...")
   - "Scope and definitions / Design space / Evaluation practice ..."
   - "Next, we move from ..." and "We now turn to ..."
   - "From ... to ..., ..." (title narration; rewrite as an argument bridge)
   - "In the next section/subsection ..."
   - "Therefore/As a result, survey synthesis/comparisons should ..." (rewrite as a literature-facing observation)

   Also remove generator-like thesis openers that read like outline narration:
   - "This subsection surveys ..."
   - "This subsection argues ..."

## Three passes (recommended)

### Pass 1 — Subsection polish (structure + de-template)
Role split:
- Editor: rewrites sentences for clarity and flow.
- Skeptic: deletes any generic or template sentence.
Targets:
- Each H3 reads like: tension → contrast → evidence → limitation.
- Remove repeated disclaimer paragraphs; keep the evidence policy in one place (prefer a single paragraph in the Introduction or Related Work, phrased as survey methodology, not as pipeline/execution logs).
- Use outline/outline.yml (if present) to avoid heading drift during edits.
- If present, use outline/subsection_briefs.jsonl to keep each H3's scope/RQ consistent while improving flow.
Do a quick “pattern sweep” (semantic, not mechanical):
- Delete outline narration: "This subsection ...", "In this subsection ..."
- Delete slide navigation: "Next, we move from ...", "We now turn to ...", "In the next section ..."
- Delete title narration: "From ... to ..." openers.
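The sweep itself should stay semantic, but a small helper can surface candidate lines for human judgment. A minimal sketch (the stem list is illustrative, not the skill's actual tooling):

```python
import re

# Illustrative generator-voice stems; extend the list from what you
# actually see in the draft.
TEMPLATE_STEMS = [
    r"^This subsection\b",
    r"^In this subsection\b",
    r"^Next, we move\b",
    r"^We now turn\b",
    r"^In the next (section|subsection)\b",
]

def flag_template_lines(draft_text):
    """Return (line_number, line) pairs that look like generator voice.

    Flags candidates for review only; deletion stays a human decision.
    """
    patterns = [re.compile(stem, re.IGNORECASE) for stem in TEMPLATE_STEMS]
    hits = []
    for number, line in enumerate(draft_text.splitlines(), start=1):
        stripped = line.strip()
        if any(p.search(stripped) for p in patterns):
            hits.append((number, stripped))
    return hits
```

Run it over output/DRAFT.md and review each hit in context; a flagged line that carries real content should be rewritten into an argument bridge, not deleted.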
Rewrite recipe for subsection openers (paper voice, no new facts):

- Delete: "This subsection surveys/argues ..." and "In this subsection, we ..."
- Replace with a compact opener that does 2–3 of the following (no labels; vary across subsections):
  - Content claim: state the subsection-specific tension or trade-off (optionally with 1–2 embedded citations).
  - Why it matters: link the claim to evaluation or engineering constraints (benchmark/protocol/cost/tool access).
  - Preview: say what you will contrast next and through what lens (A vs. B; then evaluation anchors; then limitations).

Example skeletons (paraphrase; don't reuse verbatim):

- Tension-first: "A central tension is ...; ...; we contrast ..."
- Decision-first: "For builders, the crux is ...; ..."
- Lens-first: "Seen through the lens of ..., ..."

### Pass 2 — Terminology normalization
Role split:
- Taxonomist: chooses canonical terms and the synonym policy.
- Integrator: applies consistent replacements across the draft.
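The Integrator's replacement step can be sketched as a whole-word synonym map. The terms below are hypothetical examples, since the real canonical vocabulary comes from the Taxonomist's read of the draft:

```python
import re

# Hypothetical synonym policy: every surface form maps to one canonical name.
CANONICAL = {
    "tool usage": "tool use",
    "tool-use": "tool use",
    "LLM-based agent": "language-model agent",
    "LLM agent": "language-model agent",
}

def normalize_terms(text):
    """Apply whole-word replacements, longest synonym first so a longer
    phrase is never clobbered by a shorter one it contains."""
    for synonym in sorted(CANONICAL, key=len, reverse=True):
        pattern = re.compile(r"\b" + re.escape(synonym) + r"\b")
        text = pattern.sub(CANONICAL[synonym], text)
    return text
```

Headings and table cells should go through the same pass as body prose, so all three end up on the canonical terms.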
Targets:
- One concept = one name across sections.
- Headings, tables, and prose use the same canonical terms.

### Pass 3 — Redundancy pruning (global repetition)
Role split:
- Compressor: collapses repeated boilerplate.
- Narrative keeper: ensures that removing repetition does not break the argument chain.
Targets:
- Cross-section repeated intros/outros are removed.
- Only subsection-specific content remains inside subsections.
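Cross-section repetition can be surfaced mechanically before the Compressor decides what to collapse. A rough sketch, assuming subsection bodies are already split out (the sentence splitting is naive and only meant for flagging):

```python
import re
from collections import Counter

def repeated_sentences(section_texts, min_sections=3):
    """Return normalized sentences that appear in at least
    `min_sections` different subsection bodies."""
    counts = Counter()
    for body in section_texts:
        # Normalize whitespace and case so near-identical copies collide.
        sentences = {
            re.sub(r"\s+", " ", s).strip().lower()
            for s in re.split(r"(?<=[.!?])\s+", body)
        }
        sentences.discard("")
        counts.update(sentences)
    return [s for s, n in counts.items() if n >= min_sections]
```

Each hit is a candidate for the single front-matter methodology paragraph; the Narrative keeper then checks that deleting the repeats does not break any argument chain.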
## Script

### Quick Start

```shell
python .codex/skills/draft-polisher/scripts/run.py --help
python .codex/skills/draft-polisher/scripts/run.py --workspace workspaces/
```

First polish pass (creates the anchoring baseline output/citation_anchors.prepolish.jsonl):

```shell
python .codex/skills/draft-polisher/scripts/run.py --workspace workspaces/
```
Reset the anchoring baseline (only if you intentionally accept citation drift): delete output/citation_anchors.prepolish.jsonl, then rerun the polisher.

## Acceptance checklist

- No TODO/TBD/FIXME/(placeholder) markers.
- No "…" or "..." truncation.
- No boilerplate sentence repeated across many subsections.
- Citation anchoring passes (no cross-subsection drift).
- Each H3 has at least one cross-paper synthesis paragraph (>= 2 citations).

## Troubleshooting

Issue: polishing causes citation drift across subsections.
Fix:
Keep citations inside the same `###` subsection; if the restructuring is intentional, delete output/citation_anchors.prepolish.jsonl and regenerate a new baseline.

Issue: draft polishing is requested before writing approval.
Fix:
Record the relevant approval in DECISIONS.md (typically "Approve C2") before doing prose-level edits.
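The anchoring baseline and drift check can be sketched roughly as follows. This assumes pandoc-style @Key citation markers and `###` subsection headings, and is illustrative only; the shipped script may implement the check differently:

```python
import re

def citation_anchors(draft_text):
    """Map each ### subsection title to the set of citation keys it contains."""
    anchors = {}
    current = None
    for line in draft_text.splitlines():
        if line.startswith("### "):
            current = line[4:].strip()
            anchors.setdefault(current, set())
        elif current is not None:
            # Pandoc-style keys: letters, digits, and a few punctuation chars.
            anchors[current].update(re.findall(r"@([A-Za-z0-9_:-]+)", line))
    return anchors

def drifted(baseline, polished):
    """Return subsection titles whose citation sets changed after polishing."""
    titles = set(baseline) | set(polished)
    return sorted(t for t in titles
                  if baseline.get(t, set()) != polished.get(t, set()))
```

Comparing the pre-polish baseline against the polished draft this way reports exactly which subsections gained or lost keys, which is the signal for either reverting the edit or intentionally resetting the baseline.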