canghe-url-to-markdown

安装量: 102
排名: #8178

安装

npx skills add https://github.com/freestylefly/canghe-skills --skill canghe-url-to-markdown
URL to Markdown
Fetches any URL via Chrome CDP and converts HTML to clean markdown.
Script Directory
Important
All scripts are located in the scripts/ subdirectory of this skill. Agent Execution Instructions : Determine this SKILL.md file's directory path as SKILL_DIR Script path = ${SKILL_DIR}/scripts/.ts Replace all ${SKILL_DIR} in this document with the actual path Script Reference : Script Purpose scripts/main.ts CLI entry point for URL fetching Preferences (EXTEND.md) Use Bash to check EXTEND.md existence (priority order):

Check project-level first

test -f .canghe-skills/canghe-url-to-markdown/EXTEND.md && echo "project"

Then user-level (cross-platform: $HOME works on macOS/Linux/WSL)

test
-f
"
$HOME
/.canghe-skills/canghe-url-to-markdown/EXTEND.md"
&&
echo
"user"
┌────────────────────────────────────────────────────────┬───────────────────┐
│ Path │ Location │
├────────────────────────────────────────────────────────┼───────────────────┤
│ .canghe-skills/canghe-url-to-markdown/EXTEND.md │ Project directory │
├────────────────────────────────────────────────────────┼───────────────────┤
│ $HOME/.canghe-skills/canghe-url-to-markdown/EXTEND.md │ User home │
└────────────────────────────────────────────────────────┴───────────────────┘
┌───────────┬───────────────────────────────────────────────────────────────────────────┐
│ Result │ Action │
├───────────┼───────────────────────────────────────────────────────────────────────────┤
│ Found │ Read, parse, apply settings │
├───────────┼───────────────────────────────────────────────────────────────────────────┤
│ Not found │ Use defaults │
└───────────┴───────────────────────────────────────────────────────────────────────────┘
EXTEND.md Supports
Default output directory | Default capture mode | Timeout settings Features Chrome CDP for full JavaScript rendering Two capture modes: auto or wait-for-user Clean markdown output with metadata Handles login-required pages via wait mode Usage

Auto mode (default) - capture when page loads

npx -y bun ${SKILL_DIR} /scripts/main.ts < url

Wait mode - wait for user signal before capture

npx -y bun ${SKILL_DIR} /scripts/main.ts < url

--wait

Save to specific file

npx
-y
bun
${SKILL_DIR}
/scripts/main.ts
<
url
>
-o
output.md
Options
Option
Description
URL to fetch
-o
Output file path (default: auto-generated)
--wait
Wait for user signal before capturing
--timeout
Page load timeout (default: 30000)
Capture Modes
Mode
Behavior
Use When
Auto (default)
Capture on network idle
Public pages, static content
Wait (
--wait
)
User signals when ready
Login-required, lazy loading, paywalls
Wait mode workflow
:
Run with
--wait
→ script outputs "Press Enter when ready"
Ask user to confirm page is ready
Send newline to stdin to trigger capture
Output Format
YAML front matter with
url
,
title
,
description
,
author
,
published
,
captured_at
fields, followed by converted markdown content.
Output Directory
url-to-markdown//.md
From page title or URL path (kebab-case, 2-6 words)
Conflict resolution: Append timestamp
-YYYYMMDD-HHMMSS.md
Environment Variables
Variable
Description
URL_CHROME_PATH
Custom Chrome executable path
URL_DATA_DIR
Custom data directory
URL_CHROME_PROFILE_DIR
Custom Chrome profile directory
Troubleshooting
Chrome not found → set URL_CHROME_PATH . Timeout → increase --timeout . Complex pages → try --wait mode. Extension Support Custom configurations via EXTEND.md. See Preferences section for paths and supported options.
返回排行榜