csctf

安装

npx skills add https://github.com/dicklesworthstone/agent_flywheel_clawdbot_skills_and_integrations --skill csctf

CSCTF — Chat Shared Conversation To File

A Bun-native CLI that turns public ChatGPT, Gemini, Grok, and Claude share links into clean Markdown + HTML transcripts with preserved code fences, stable filenames, and optional GitHub Pages publishing.

Why This Exists

Copy/pasting AI share links often:

Breaks fenced code blocks — loses formatting and structure Loses language hints — no syntax highlighting Produces messy filenames — random or unreadable names Requires manual cleanup — inconsistent formatting

CSCTF fixes this with:

Stable slugs — deterministic, collision-proof filenames Language-preserving fences — code blocks retain syntax hints Normalized whitespace — clean, consistent output Static HTML twin — no JS, ready for hosting/archiving One-command GitHub Pages — instant shareable microsite Quick Start

Install

curl -fsSL https://raw.githubusercontent.com/Dicklesworthstone/chat_shared_conversation_to_file/main/install.sh | bash

Convert any share link

csctf https://chatgpt.com/share/69343092-91ac-800b-996c-7552461b9b70 csctf https://gemini.google.com/share/66d944b0e6b9 csctf https://grok.com/share/bGVnYWN5_d5329c61-f497-40b7-9472-c555fa71af9c csctf https://claude.ai/share/549c846d-f6c8-411c-9039-a9a14db376cf

Output:

.md — Clean Markdown with preserved code fences .html — Styled static HTML (zero JavaScript) Supported Providers Provider URL Pattern Method Notes ChatGPT chatgpt.com/share/ Headless Chromium Public shares only Gemini gemini.google.com/share/ Headless Chromium Public shares only Grok grok.com/share/ Headless Chromium Public shares only Claude claude.ai/share/ Your Chrome session Requires login Claude.ai Special Handling

Claude.ai uses Cloudflare protection that blocks standard browser automation. CSCTF handles this automatically:

Copies your Chrome session cookies to a temporary profile Launches Chrome with remote debugging enabled Connects via Chrome DevTools Protocol to extract conversation If Chrome is running, offers to save tabs, restart, and restore afterward

Requirements: Chrome installed + logged into claude.ai in your regular Chrome session.

Design Principles Principle Implementation Determinism Explicit slugging and collision handling Minimal network Only share URL fetched (update checks/publish opt-in) Safety Static HTML (inline CSS/HLJS), no scripts emitted Clarity Colorized step-based logging, confirmation gates Atomicity Temp+rename writes prevent partial files How It Works ChatGPT, Gemini, Grok (End-to-End) 1. Launch headless Playwright Chromium with stealth config (spoofed navigator properties, realistic headers) 2. Navigate twice (domcontentloaded → networkidle) for late-loading assets 3. Detect provider from URL hostname 4. Wait for provider-specific selectors with retry/fallback 5. Extract each role's inner HTML (assistant/user), traverse Shadow DOM 6. Clean pills/metadata, run Turndown with fenced-code rule 7. Normalize whitespace and newlines 8. Write Markdown to temp file, rename atomically 9. Render HTML twin with inline CSS/TOC/HLJS

Claude.ai 1. Copy Chrome session cookies to temporary profile 2. Launch Chrome with remote debugging 3. Connect via Chrome DevTools Protocol 4. Extract conversation HTML 5. Process through same Turndown/normalization pipeline 6. Clean up temporary profile

Processing Algorithms Selector Strategy

Provider-specific selectors with fallback chains:

ChatGPT: article [data-message-author-role] Gemini: Custom web components (share-turn-viewer, response-container) Grok: Flexible data-testid patterns Claude: [data-testid="user-message"] and streaming indicators

Each has multiple fallbacks tried with short timeouts.

Turndown Customization Injects fenced code blocks Detects language via class="language-*" Strips citation pills and data-start/data-end attributes Normalization Converts newlines to \n Removes Unicode LS/PS characters Collapses excessive blank lines Slugging Algorithm Title → lowercase → non-alphanumerics → "_" → trim → max 120 chars → Windows reserved-name suffix → collision suffix (_2, _3, ...)

HTML Rendering Markdown-it + highlight.js Heading slug de-dupe for TOC Inline CSS for light/dark/print Zero JavaScript Command Reference csctf [options]

Output Options Flag Default Description --outfile auto Override output path --no-html / --md-only off Skip HTML output --html-only off Skip Markdown output --quiet off Minimal logging --timeout-ms 60000 Navigation + selector timeout GitHub Pages Publishing Flag Default Description --publish-to-gh-pages off Publish to GitHub Pages --gh-pages-repo my_shared_conversations Target repo --gh-pages-branch gh-pages Target branch --gh-pages-dir

csctf Subdirectory in repo --remember off Save GH settings --forget-gh-pages off Clear saved settings --dry-run off Simulate publish (build index, no push) --yes / --no-confirm off Skip PROCEED confirmation prompt --gh-install off Auto-install gh CLI Other Flag Description --check-updates Print latest release tag --version Print version and exit Output Format Markdown Structure

Conversation: </h1> <p><strong>Source:</strong> https://chatgpt.com/share/... <strong>Retrieved:</strong> 2026-01-08T15:30:00Z</p> <h2 id="user">User</h2> <p>How do I sort an array in Python?</p> <h2 id="assistant">Assistant</h2> <p>Here's how to sort an array in Python:</p> <p>```python</p> <h1 id="sort-in-place">Sort in place</h1> <p>my_list.sort()</p> <h1 id="return-new-sorted-list">Return new sorted list</h1> <p>sorted_list = sorted(my_list)</p> <h3 id="html-features">HTML Features</h3> <ul> <li><strong>Standalone</strong> — No external dependencies</li> <li><strong>Zero JavaScript</strong> — Safe for any hosting</li> <li><strong>Inline CSS</strong> — Light/dark mode via <code>prefers-color-scheme</code></li> <li><strong>Syntax highlighting</strong> — highlight.js themes inline</li> <li><strong>Table of contents</strong> — Auto-generated from headings</li> <li><strong>Language badges</strong> — Code block language indicators</li> <li><strong>Print-friendly</strong> — Optimized print styles</li> </ul> <h2 id="filename-generation">Filename Generation</h2> <p>"How to Build a REST API" → how_to_build_a_rest_api.md "Python Tips & Tricks!" → python_tips_tricks.md "File exists already" → file_exists_already_2.md</p> <p>Rules: - Lowercase - Non-alphanumerics → <code>_</code> - Trimmed leading/trailing <code>_</code> - Max 120 characters - Windows reserved names suffixed - Collisions: <code>_2</code>, <code>_3</code>, ...</p> <h2 id="github-pages-publishing">GitHub Pages Publishing</h2> <h3 id="quick-recipe">Quick Recipe</h3> <p>```bash</p> <h1 id="publish-with-defaults">Publish with defaults</h1> <p>csctf <url> --publish-to-gh-pages --yes</p> <h1 id="creates-my_shared_conversations-repo">Creates: <gh-username>/my_shared_conversations repo</h1> <h1 id="branch-gh-pages">Branch: gh-pages</h1> <h1 id="directory-csctf">Directory: csctf/</h1> <p>Remembered Settings</p> <h1 id="first-time-save-settings">First time: save settings</h1> <p>csctf <url> --publish-to-gh-pages --remember --yes</p> <h1 id="subsequent-just-use-yes">Subsequent: just use --yes</h1> <p>csctf <url> --yes</p> <h1 id="clear-remembered-settings">Clear remembered settings</h1> <p>csctf --forget-gh-pages</p> <p>Custom Configuration csctf <url> --publish-to-gh-pages \ --gh-pages-repo myuser/my-chats \ --gh-pages-branch main \ --gh-pages-dir exports \ --yes</p> <p>Requirements GitHub CLI (gh) installed and authenticated Verify with: gh auth status Publish Flow Resolve repo/branch/dir (use remembered or defaults) Clone (or create via gh) Copy MD + HTML files Regenerate manifest.json and index.html Commit + push Print viewer URL Recipes Quiet CI Scrape (MD only) csctf <url> --md-only --quiet --outfile /tmp/chat.md</p> <p>HTML-only for Embedding csctf <url> --html-only --outfile site/chat.html</p> <p>Slow/Large Conversations csctf <url> --timeout-ms 90000</p> <p>Custom Browser Cache PLAYWRIGHT_BROWSERS_PATH=/opt/ms-playwright csctf <url></p> <p>Batch Archive for url in $URLS; do csctf "$url" --outfile ~/archive/ --quiet done</p> <p>Security & Privacy Network Behavior Only fetches: The share URL itself Opt-in: Update checks, GitHub publish flows Auth: GitHub CLI (gh) for publishing—no tokens stored HTML Safety Zero JavaScript in output Inline styles only Citation pills and data attributes stripped highlight.js used statically Filesystem Temp+rename write pattern (atomic) Collision-proof naming Config: ~/.config/csctf/config.json Claude.ai Cookies Copied to temporary directory only Used for single scraping session Original Chrome profile never modified Performance Phase Time First run (Chromium download) 30-60s Subsequent runs 5-15s Claude.ai (uses local Chrome) 5-10s Characteristics Playwright browsers cached after first run 60s default timeout, 3-attempt backoff Single page/context, linear processing Atomic writes prevent partial outputs Failure Modes & Remedies Symptom Fix "No messages found" Link is private or layout changed; verify public share, retry with --timeout-ms 90000 Bot detection / challenge page Stealth techniques used; retry or verify link in browser Timeout or blank page Raise --timeout-ms, verify connectivity Publish fails (auth) Ensure gh auth status passes Publish fails (branch/dir) Pass --gh-pages-branch / --gh-pages-dir; use --remember Filename collisions Expected; tool appends _2, _3, ... Claude.ai Cloudflare challenge Complete verification in Chrome window, press Enter Claude.ai won't load Ensure logged into claude.ai in Chrome; close Chrome if prompted Environment Variables Runtime Variable Description PLAYWRIGHT_BROWSERS_PATH Reuse cached Chromium bundle Installer Variable Description Default VERSION Pin release tag latest DEST Install directory ~/.local/bin CHECKSUM_URL Override checksum location — File Locations Path Purpose ~/.local/bin/csctf Binary ~/.config/csctf/config.json GitHub Pages settings ~/.cache/ms-playwright/ Playwright Chromium cache Installation</p> <h1 id="one-liner-recommended">One-liner (recommended)</h1> <p>curl -fsSL https://raw.githubusercontent.com/Dicklesworthstone/chat_shared_conversation_to_file/main/install.sh | bash</p> <h1 id="pin-version">Pin version</h1> <p>VERSION=v1.0.0 curl -fsSL .../install.sh | bash</p> <h1 id="custom-directory">Custom directory</h1> <p>DEST=/opt/bin curl -fsSL .../install.sh | bash</p> <h1 id="verify-checksum">Verify checksum</h1> <p>curl -fsSL .../install.sh | bash -s -- --verify</p> <p>From Source bun install bun run build</p> <h1 id="binary-at-distcsctf">Binary at dist/csctf</h1> <p>Comparison Feature Copy/Paste csctf Code blocks preserved Often broken Always preserved Language hints Lost Detected and kept Filenames Random/messy Deterministic slugs HTML output None Styled, no-JS twin GitHub Pages Manual One command Collision handling Overwrite Auto-suffix Limitations Requires public share links (except Claude.ai which uses your session) Provider layouts may change (selectors maintained with fallbacks) Markdown/HTML exports require share to be available at scrape time Claude.ai requires Chrome installed with active login session First run downloads Playwright Chromium (~200MB) Integration with Flywheel Tool Integration CASS Archive conversations for session search CM Extract procedural memory from exported chats Agent Mail Attach conversation exports to agent messages NTM Export multi-agent session transcripts</p> </article> <a href="/" class="back-link">← <span data-i18n="detail.backToLeaderboard">返回排行榜</span></a> </div> <aside class="sidebar"> <section class="related-skills" id="relatedSkillsSection"> <h2 class="related-title" data-i18n="detail.relatedSkills">相关 Skills</h2> <div class="related-list" id="relatedSkillsList"> <div class="skeleton-card"></div> <div class="skeleton-card"></div> <div class="skeleton-card"></div> </div> </section> </aside> </div> </div> <script src="https://unpkg.com/i18next@23.11.5/i18next.min.js" defer></script> <script src="https://unpkg.com/i18next-browser-languagedetector@7.2.1/i18nextBrowserLanguageDetector.min.js" defer></script> <script defer> // Language resources - same pattern as index page const resources = { 'zh-CN': null, 'en': null, 'ja': null, 'ko': null, 'zh-TW': null, 'es': null, 'fr': null }; // Load language files (only current + fallback for performance) async function loadLanguageResources() { const savedLang = localStorage.getItem('i18nextLng') || 'en'; const langsToLoad = new Set([savedLang, 'en']); // current + fallback await Promise.all([...langsToLoad].map(async (lang) => { try { const response = await fetch(`/locales/${lang}.json`); if (response.ok) { resources[lang] = { translation: await response.json() }; } } catch (error) { console.warn(`Failed to load ${lang} language file:`, error); } })); } // Load a single language on demand (for language switching) async function loadLanguage(lang) { if (resources[lang]) return; try { const response = await fetch(`/locales/${lang}.json`); if (response.ok) { resources[lang] = { translation: await response.json() }; i18next.addResourceBundle(lang, 'translation', resources[lang].translation); } } catch (error) { console.warn(`Failed to load ${lang} language file:`, error); } } // Initialize i18next async function initI18n() { try { await loadLanguageResources(); // Filter out null values from resources const validResources = {}; for (const [lang, data] of Object.entries(resources)) { if (data !== null) { validResources[lang] = data; } } console.log('Loaded languages:', Object.keys(validResources)); console.log('zh-CN resource:', validResources['zh-CN']); console.log('detail.home in resource:', validResources['zh-CN']?.translation?.detail?.home); // 检查是否有保存的语言偏好 const savedLang = localStorage.getItem('i18nextLng'); // 如果没有保存的语言偏好,默认使用英文 const defaultLang = savedLang && ['zh-CN', 'en', 'ja', 'ko', 'zh-TW', 'es', 'fr'].includes(savedLang) ? savedLang : 'en'; await i18next .use(i18nextBrowserLanguageDetector) .init({ lng: defaultLang, // 强制设置初始语言 fallbackLng: 'en', supportedLngs: ['zh-CN', 'en', 'ja', 'ko', 'zh-TW', 'es', 'fr'], resources: validResources, detection: { order: ['localStorage'], // 只使用 localStorage,不检测浏览器语言 caches: ['localStorage'], lookupLocalStorage: 'i18nextLng' }, interpolation: { escapeValue: false } }); console.log('i18next initialized, language:', i18next.language); console.log('Test translation:', i18next.t('detail.home')); // Set initial language in selector const langSwitcher = document.getElementById('langSwitcher'); langSwitcher.value = i18next.language; // Update page language updatePageLanguage(); // Language switch event langSwitcher.addEventListener('change', async (e) => { await loadLanguage(e.target.value); // load on demand i18next.changeLanguage(e.target.value).then(() => { updatePageLanguage(); localStorage.setItem('i18nextLng', e.target.value); }); }); } catch (error) { console.error('i18next init failed:', error); } } // Translation helper function t(key, options = {}) { return i18next.t(key, options); } // Update all translatable elements function updatePageLanguage() { // Update HTML lang attribute document.documentElement.lang = i18next.language; // Update elements with data-i18n attribute document.querySelectorAll('[data-i18n]').forEach(el => { const key = el.getAttribute('data-i18n'); el.textContent = t(key); }); } // Copy command function function copyCommand() { const command = document.getElementById('installCommand').textContent; const btn = document.getElementById('copyBtn'); navigator.clipboard.writeText(command).then(() => { btn.textContent = t('copied'); btn.classList.add('copied'); setTimeout(() => { btn.textContent = t('copy'); btn.classList.remove('copied'); }, 2000); }).catch(() => { // Fallback for non-HTTPS const textArea = document.createElement('textarea'); textArea.value = command; textArea.style.position = 'fixed'; textArea.style.left = '-9999px'; document.body.appendChild(textArea); textArea.select(); document.execCommand('copy'); document.body.removeChild(textArea); btn.textContent = t('copied'); btn.classList.add('copied'); setTimeout(() => { btn.textContent = t('copy'); btn.classList.remove('copied'); }, 2000); }); } // Initialize document.getElementById('copyBtn').addEventListener('click', copyCommand); initI18n(); // 异步加载相关 Skills async function loadRelatedSkills() { const owner = 'dicklesworthstone'; const skillName = 'csctf'; const currentLang = 'ja'; const listContainer = document.getElementById('relatedSkillsList'); const section = document.getElementById('relatedSkillsSection'); try { const response = await fetch(`/api/related-skills/${encodeURIComponent(owner)}/${encodeURIComponent(skillName)}?limit=6`); if (!response.ok) { throw new Error('Failed to load'); } const data = await response.json(); const relatedSkills = data.related_skills || []; if (relatedSkills.length === 0) { // 没有相关推荐时隐藏整个区域 section.style.display = 'none'; return; } // 渲染相关 Skills listContainer.innerHTML = relatedSkills.map(skill => { const desc = skill.description || ''; const truncatedDesc = desc.length > 60 ? desc.substring(0, 60) + '...' : desc; return ` <a href="${currentLang === 'en' ? '' : '/' + currentLang}/skill/${skill.owner}/${skill.repo}/${skill.skill_name}" class="related-card"> <div class="related-name">${escapeHtml(skill.skill_name)}</div> <div class="related-meta"> <span class="related-owner">${escapeHtml(skill.owner)}</span> <span class="related-installs">${skill.installs}</span> </div> <div class="related-desc">${escapeHtml(truncatedDesc)}</div> </a> `; }).join(''); } catch (error) { console.error('Failed to load related skills:', error); // 加载失败时显示提示或隐藏 listContainer.innerHTML = '<div class="related-empty">暂无相关推荐</div>'; } } // HTML 转义 function escapeHtml(text) { const div = document.createElement('div'); div.textContent = text; return div.innerHTML; } // 页面加载完成后异步加载相关 Skills if (document.readyState === 'loading') { document.addEventListener('DOMContentLoaded', loadRelatedSkills); } else { loadRelatedSkills(); } </script> </body> </html>