- Wiki Lint — Health Audit
- You are performing a health check on an Obsidian wiki. Your goal is to find and fix structural issues that degrade the wiki's value over time.
- Before scanning anything:
- follow the Retrieval Primitives table in
- llm-wiki/SKILL.md
- . Prefer frontmatter-scoped greps and section-anchored reads over full-page reads. On a large vault, blindly reading every page to lint it is exactly what this framework is built to avoid.
- Before You Start
- Read
- .env
- to get
- OBSIDIAN_VAULT_PATH
- Read
- index.md
- for the full page inventory
- Read
- log.md
- for recent activity context
- Lint Checks
- Run these checks in order. Report findings as you go.
- 1. Orphaned Pages
- Find pages with zero incoming wikilinks. These are knowledge islands that nothing connects to.
- How to check:
- Glob all
- .md
- files in the vault
- For each page, Grep the rest of the vault for
- [[page-name]]
- references
- Pages with zero incoming links (except
- index.md
- and
- log.md
- ) are orphans
- How to fix:
- Identify which existing pages should link to the orphan
- Add wikilinks in appropriate sections
- 2. Broken Wikilinks
- Find
- [[wikilinks]]
- that point to pages that don't exist.
- How to check:
- Grep for
- [[.*?]]
- across all pages
- Extract the link targets
- Check if a corresponding
- .md
- file exists
- How to fix:
- If the target was renamed, update the link
- If the target should exist, create it
- If the link is wrong, remove or correct it
- 3. Missing Frontmatter
- Every page should have: title, category, tags, sources, created, updated.
- How to check:
- Grep frontmatter blocks (scope to
- ^---
- at file heads) instead of reading every page in full
- Flag pages missing required fields
- How to fix:
- Add missing fields with reasonable defaults
- 3a. Missing Summary (soft warning)
- Every page
- should
- have a
- summary:
- frontmatter field — 1–2 sentences, ≤200 chars. This is what cheap retrieval (e.g.
- wiki-query
- 's index-only mode) reads to avoid opening page bodies.
- How to check:
- Grep frontmatter for
- ^summary:
- across the vault
- Flag pages without it,
- but as a soft warning, not an error
- — older pages predating this field are fine; the check exists to nudge ingest skills into filling it on new writes.
- Also flag pages whose summary exceeds 200 chars.
- How to fix:
- Re-ingest the page, or manually write a short summary (1–2 sentences of the page's content).
- 4. Stale Content
- Pages whose
- updated
- timestamp is old relative to their sources.
- How to check:
- Compare page
- updated
- timestamps to source file modification times
- Flag pages where sources have been modified after the page was last updated
- 5. Contradictions
- Claims that conflict across pages.
- How to check:
- This requires reading related pages and comparing claims
- Focus on pages that share tags or are heavily cross-referenced
- Look for phrases like "however", "in contrast", "despite" that may signal existing acknowledged contradictions vs. unacknowledged ones
- How to fix:
- Add an "Open Questions" section noting the contradiction
- Reference both sources and their claims
- 6. Index Consistency
- Verify
- index.md
- matches the actual page inventory.
- How to check:
- Compare pages listed in
- index.md
- to actual files on disk
- Check that summaries in
- index.md
- still match page content
- 7. Provenance Drift
- Check whether pages are being honest about how much of their content is inferred vs extracted. See the Provenance Markers section in
- llm-wiki
- for the convention.
- How to check:
- For each page with a
- provenance:
- block or any
- ^[inferred]
- /
- ^[ambiguous]
- markers, count sentences/bullets and how many end with each marker
- Compute rough fractions (
- extracted
- ,
- inferred
- ,
- ambiguous
- )
- Apply these thresholds:
- AMBIGUOUS > 15%
-
- flag as "speculation-heavy" — even 1-in-7 claims being genuinely uncertain is a signal the page needs tighter sourcing or should be moved to
- synthesis/
- INFERRED > 40% with no
- sources:
- in frontmatter
-
- flag as "unsourced synthesis" — the page is making connections but has nothing to cite
- Hub pages
- (top 10 by incoming wikilink count) with INFERRED > 20%: flag as "high-traffic page with questionable provenance" — errors on hub pages propagate to every page that links to them
- Drift
-
- if the page has a
- provenance:
- frontmatter block, flag it when any field is more than 0.20 off from the recomputed value
- Skip
- pages with no
- provenance:
- frontmatter and no markers — treated as fully extracted by convention
- How to fix:
- For ambiguous-heavy: re-ingest from sources, resolve the uncertain claims, or split speculative content into a
- synthesis/
- page
- For unsourced synthesis: add
- sources:
- to frontmatter or clearly label the page as synthesis
- For hub pages with INFERRED > 20%: prioritize for re-ingestion — errors here have the widest blast radius
- For drift: update the
- provenance:
- frontmatter to match the recomputed values
- 8. Fragmented Tag Clusters
- Checks whether pages that share a tag are actually linked to each other. Tags imply a topic cluster; if those pages don't reference each other, the cluster is fragmented — knowledge islands that should be woven together.
- How to check:
- For each tag that appears on ≥ 5 pages:
- n
- = count of pages with this tag
- actual_links
- = count of wikilinks between any two pages in this tag group (check both directions)
- cohesion = actual_links / (n × (n−1) / 2)
- Flag any tag group where cohesion < 0.15 and n ≥ 5
- How to fix:
- Run the
- cross-linker
- skill targeted at the fragmented tag — it will surface and insert the missing links
- If a tag group is large (n > 15) and still fragmented, consider splitting it into more specific sub-tags
- 9. Visibility Tag Consistency
- Checks that
- visibility/
- tags are applied correctly and aren't silently missing where they matter.
- How to check:
- Untagged PII patterns:
- Grep page bodies for patterns that commonly indicate sensitive data — lines containing
- password
- ,
- api_key
- ,
- secret
- ,
- token
- ,
- ssn
- ,
- email:
- ,
- phone:
- followed by an actual value (not a field description). If a page matches and lacks
- visibility/pii
- or
- visibility/internal
- , flag it as a likely mis-classification.
- visibility/pii
- without
- sources:
- :
- A page tagged
- visibility/pii
- should always have a
- sources:
- frontmatter field — if there's no provenance, there's no way to verify the classification. Flag any
- visibility/pii
- page missing
- sources:
- .
- Visibility tags in taxonomy:
- visibility/
- tags are system tags and must
- not
- appear in
- _meta/taxonomy.md
- . If found there, flag as misconfigured — they'd be counted toward the 5-tag limit on pages that include them.
- How to fix:
- For untagged PII patterns: add
- visibility/pii
- (or
- visibility/internal
- if it's team-context rather than personal data) to the page's frontmatter tags
- For missing
- sources:
- add provenance or escalate to the user — don't auto-fill For taxonomy contamination: remove the visibility/ entries from _meta/taxonomy.md Output Format Report findings as a structured list:
Wiki Health Report
Orphaned Pages (N found)
concepts/foo.md
— no incoming links
Broken Wikilinks (N found)
entities/bar.md:15
— links to [[nonexistent-page]]
Missing Frontmatter (N found)
skills/baz.md
— missing: tags, sources
Stale Content (N found)
references/paper-x.md
— source modified 2024-03-10, page last updated 2024-01-05
Contradictions (N found)
concepts/scaling.md
claims "X" but
synthesis/efficiency.md
claims "not X"
Index Issues (N found)
concepts/new-page.md
exists on disk but not in index.md
Missing Summary (N found — soft)
concepts/foo.md
— no
summary:
field
-
entities/bar.md
— summary exceeds 200 chars
Provenance Issues (N found)
concepts/scaling.md
— AMBIGUOUS > 15%: 22% of claims are ambiguous (re-source or move to synthesis/)
-
entities/some-tool.md
— drift: frontmatter says inferred=0.10, recomputed=0.45
-
concepts/transformers.md
— hub page (31 incoming links) with INFERRED=28%: errors here propagate widely
-
synthesis/speculation.md
— unsourced synthesis: no
sources:
field, 55% inferred
Fragmented Tag Clusters (N found)
**
systems
** — 7 pages, cohesion=0.06 ⚠️ — run cross-linker on this tag - **
databases
** — 5 pages, cohesion=0.10 ⚠️
Visibility Issues (N found)
entities/user-records.md
— contains
email:
value pattern but no
visibility/pii
tag
-
concepts/auth-flow.md
— tagged
visibility/pii
but missing
sources:
frontmatter
-
_meta/taxonomy.md
— contains
visibility/internal
entry (system tag must not be in taxonomy)
After Linting
Append to
log.md
:
- [TIMESTAMP] LINT issues_found=N orphans=X broken_links=Y stale=Z contradictions=W prov_issues=P missing_summary=S fragmented_clusters=F visibility_issues=V
Offer to fix issues automatically or let the user decide which to address.