paper-slide-deck

安装量: 60
排名: #12444

安装

npx skills add https://github.com/luwill/research-skills --skill paper-slide-deck

Transform academic papers and content into professional slide deck images with automatic figure extraction.

Usage

/paper-slide-deck path/to/paper.pdf
/paper-slide-deck path/to/paper.pdf --style academic-paper
/paper-slide-deck path/to/content.md --style sketch-notes
/paper-slide-deck path/to/content.md --audience executives
/paper-slide-deck path/to/content.md --lang zh
/paper-slide-deck path/to/content.md --slides 10
/paper-slide-deck path/to/content.md --outline-only
/paper-slide-deck  # Then paste content

Script Directory

Important: All scripts are located in the scripts/ subdirectory of this skill.

Agent Execution Instructions:

  • Determine this SKILL.md file's directory path as SKILL_DIR

  • Script path = ${SKILL_DIR}/scripts/<script-name>.ts

  • Replace all ${SKILL_DIR} in this document with the actual path

Script Reference:

| scripts/generate-slides.py | Generate AI slides via Gemini API (Python)

| scripts/merge-to-pptx.ts | Merge slides into PowerPoint

| scripts/merge-to-pdf.ts | Merge slides into PDF

| scripts/detect-figures.ts | Auto-detect figures/tables in PDF

| scripts/extract-figure.ts | Extract figure from PDF page (uses PyMuPDF fallback)

| scripts/apply-template.ts | Apply figure container template

Options

| --style <name> | Visual style (see Style Gallery)

| --audience <type> | Target audience: beginners, intermediate, experts, executives, general

| --lang <code> | Output language (en, zh, ja, etc.)

| --slides <number> | Target slide count

| --outline-only | Generate outline only, skip image generation

| academic-paper | Clean professional, precise charts | Conference talks, thesis defense

| blueprint (Default) | Technical schematics, grid texture | Architecture, system design

| chalkboard | Black chalkboard, colorful chalk | Education, tutorials, classroom

| notion | SaaS dashboard, card-based layouts | Product demos, SaaS, B2B

| bold-editorial | Magazine cover, bold typography, dark | Product launches, keynotes

| corporate | Navy/gold, structured layouts | Investor decks, proposals

| dark-atmospheric | Cinematic dark mode, glowing accents | Entertainment, gaming

| editorial-infographic | Magazine explainers, flat illustrations | Tech explainers, research

| fantasy-animation | Ghibli/Disney style, hand-drawn | Educational, storytelling

| intuition-machine | Technical briefing, bilingual labels | Technical docs, academic

| minimal | Ultra-clean, maximum whitespace | Executive briefings, premium

| pixel-art | Retro 8-bit, chunky pixels | Gaming, developer talks

| scientific | Academic diagrams, precise labeling | Biology, chemistry, medical

| sketch-notes | Hand-drawn, warm & friendly | Educational, tutorials

| vector-illustration | Flat vector, retro & cute | Creative, children's content

| vintage | Aged-paper, historical styling | Historical, heritage, biography

| watercolor | Hand-painted textures, natural warmth | Lifestyle, wellness, travel

Auto Style Selection

| paper, thesis, defense, conference, ieee, acm, icml, neurips, cvpr, acl, aaai, iclr | academic-paper

| tutorial, learn, education, guide, intro, beginner | sketch-notes

| classroom, teaching, school, chalkboard, blackboard | chalkboard

| architecture, system, data, analysis, technical | blueprint

| creative, children, kids, cute, illustration | vector-illustration

| briefing, bilingual, infographic, concept | intuition-machine

| executive, minimal, clean, simple, elegant | minimal

| saas, product, dashboard, metrics, productivity | notion

| investor, quarterly, business, corporate, proposal | corporate

| launch, marketing, keynote, bold, impact, magazine | bold-editorial

| entertainment, music, gaming, creative, atmospheric | dark-atmospheric

| explainer, journalism, science communication | editorial-infographic

| story, fantasy, animation, magical, whimsical | fantasy-animation

| gaming, retro, pixel, developer, nostalgia | pixel-art

| biology, chemistry, medical, pathway, scientific | scientific

| history, heritage, vintage, expedition, historical | vintage

| lifestyle, wellness, travel, artistic, natural | watercolor

| Default | blueprint

Optional layout hints for individual slides. Specify in outline's // LAYOUT section.

Slide-Specific Layouts

| title-hero | Large centered title + subtitle | Cover slides, section breaks

| quote-callout | Featured quote with attribution | Testimonials, key insights

| key-stat | Single large number as focal point | Impact statistics, metrics

| split-screen | Half image, half text | Feature highlights, comparisons

| icon-grid | Grid of icons with labels | Features, capabilities, benefits

| two-columns | Content in balanced columns | Paired information, dual points

| three-columns | Content in three columns | Triple comparisons, categories

| image-caption | Full-bleed image + text overlay | Visual storytelling, emotional

| agenda | Numbered list with highlights | Session overview, roadmap

| bullet-list | Structured bullet points | Simple content, lists

Infographic-Derived Layouts

| linear-progression | Sequential flow left-to-right | Timelines, step-by-step

| binary-comparison | Side-by-side A vs B | Before/after, pros-cons

| comparison-matrix | Multi-factor grid | Feature comparisons

| hierarchical-layers | Pyramid or stacked levels | Priority, importance

| hub-spoke | Central node with radiating items | Concept maps, ecosystems

| bento-grid | Varied-size tiles | Overview, summary

| funnel | Narrowing stages | Conversion, filtering

| dashboard | Metrics with charts/numbers | KPIs, data display

| venn-diagram | Overlapping circles | Relationships, intersections

| circular-flow | Continuous cycle | Recurring processes

| winding-roadmap | Curved path with milestones | Journey, timeline

| tree-branching | Parent-child hierarchy | Org charts, taxonomies

| iceberg | Visible vs hidden layers | Surface vs depth

| bridge | Gap with connection | Problem-solution

Academic-Specific Layouts

| paper-title | Title, authors, affiliations, venue | Conference paper cover

| outline-agenda | Numbered section list with highlights | Talk structure overview

| methods-diagram | Central architecture/pipeline diagram | Methods, system design

| results-chart | Chart area + data annotations | Quantitative results

| equation-focus | Centered equation + variable definitions | Mathematical derivations

| qualitative-grid | 2x2 or 3x2 image comparison grid | Visual results, ablations

| references-list | Numbered citation list | Key references slide

| contributions | Numbered contribution points | Contributions summary

Usage: Add Layout: <name> in slide's // LAYOUT section to guide visual composition.

Design Philosophy

This deck is designed for reading and sharing, not live presentation:

  • Each slide must be self-explanatory without verbal commentary

  • Structure content for logical flow when scrolling

  • Include all necessary context within each slide

  • Optimize for social media sharing and offline reading

File Management

Output Directory

Each session creates an independent directory named by content slug:

slide-deck/{topic-slug}/
├── source-{slug}.{ext}    # Source files (text, images, etc.)
├── outline.md
├── outline-{style}.md     # Style variant outlines
├── prompts/
│   └── 01-slide-cover.md, 02-slide-{slug}.md, ...
├── 01-slide-cover.png, 02-slide-{slug}.png, ...
├── {topic-slug}.pptx
└── {topic-slug}.pdf

Slug Generation:

  • Extract main topic from content (2-4 words, kebab-case)

  • Example: "Introduction to Machine Learning" → intro-machine-learning

Conflict Resolution

If slide-deck/{topic-slug}/ already exists:

  • Append timestamp: {topic-slug}-YYYYMMDD-HHMMSS

  • Example: intro-ml exists → intro-ml-20260118-143052

Source Files

Copy all sources with naming source-{slug}.{ext}:

  • source-article.md (main text content)

  • source-diagram.png (image from conversation)

  • source-data.xlsx (additional file)

Multiple sources supported: text, images, files from conversation.

Workflow

Step 1: Analyze Content

  • Save source content (if pasted, save as source.md)

  • Follow references/analysis-framework.md for deep content analysis

  • Determine style (use --style or auto-select from signals)

  • Detect languages (source vs. user preference)

  • Plan slide count (--slides or dynamic)

  • For academic papers (PDF with figures): Run automatic figure detection:

npx -y bun ${SKILL_DIR}/scripts/detect-figures.ts --pdf source-paper.pdf --output figures.json

This outputs a JSON file with all detected figures/tables, their page numbers, and captions.

Step 2: Generate Outline Variants

  • Generate 3 style variant outlines based on content analysis

  • Follow references/outline-template.md for structure

  • Auto-populate IMAGE_SOURCE for academic papers:

Read figures.json from Step 1

  • Map figures to slides using rules in references/analysis-framework.md Section 8

  • Automatically add // IMAGE_SOURCE blocks to appropriate slides:

Architecture/pipeline figures → Methods slides (Source: extract)

  • Results tables → Quantitative results slides (Source: extract)

  • Comparison images → Qualitative results slides (Source: extract)

  • Conceptual/simple diagrams → Leave for AI generation (Source: generate or omit)

  • Save as outline-{style}.md for each variant

Step 3: User Confirmation

Single AskUserQuestion with all applicable options:

| Style variant | Always (3 options + custom)

| Language | Only if source ≠ user language

After selection:

  • Copy selected outline-{style}.md to outline.md

  • Regenerate in different language if requested

  • User may edit outline.md for fine-tuning

If --outline-only, stop here.

Step 4: Generate Prompts

  • Read references/base-prompt.md

  • Combine with style instructions from outline

  • Add slide-specific content

  • If Layout: specified in outline, include layout guidance in prompt:

Reference layout characteristics for image composition

  • Example: Layout: hub-spoke → "Central concept in middle with related items radiating outward"

  • Save to prompts/ directory

Step 5: Image Generation Method Selection

Before generating images, ask user to choose generation method:

Use AskUserQuestion with options:

| 1 | Gemini API (Recommended) | Official Google API via Python. Requires GOOGLE_API_KEY env var.

| 2 | Gemini Web (Browser-based) | ⚠️ Uses reverse-engineered web API. No API key needed but may break.

Based on selection:

Option 1: Gemini API (Python)

  • Verify API key: Check GOOGLE_API_KEY or GEMINI_API_KEY environment variable

  • Run generation script:

python ${SKILL_DIR}/scripts/generate-slides.py <slide-deck-dir> --model gemini-3-pro-image-preview

Script Features:

  • Auto-installs google-genai package if missing

  • Retry logic with exponential backoff (3 retries)

  • Skips already-generated slides (> 10KB)

  • Supports custom model via --model flag

  • Outputs to slides/ subdirectory

Troubleshooting:

  • If server disconnection errors occur, script auto-retries

  • For persistent failures, re-run the script (it skips completed slides)

  • Check API quota if many failures occur

Option 2: Gemini Web Skill

  • Consent Check: Read consent file at:

Windows: $APPDATA/baoyu-skills/gemini-web/consent.json

  • macOS: ~/Library/Application Support/baoyu-skills/gemini-web/consent.json

  • Linux: ~/.local/share/baoyu-skills/gemini-web/consent.json

  • If no consent or version mismatch, display disclaimer and ask:

⚠️ DISCLAIMER: This uses a reverse-engineered Gemini Web API (NOT official).
Risks: May break anytime, no support, possible account risk.
  • For each slide, run:
npx -y bun ${GEMINI_WEB_SKILL_DIR}/scripts/main.ts \
  --promptfiles prompts/01-slide-cover.md \
  --image 01-slide-cover.png \
  --sessionId slides-{topic-slug}-{timestamp}

Where GEMINI_WEB_SKILL_DIR = path to baoyu-danger-gemini-web skill directory.

  • Proxy support: If user is in restricted network, prepend:
HTTP_PROXY=http://127.0.0.1:7890 HTTPS_PROXY=http://127.0.0.1:7890

Step 5.5: Process IMAGE_SOURCE (Automatic Figure Extraction)

For academic presentations, IMAGE_SOURCE metadata was auto-populated in Step 2 based on figure detection from Step 1.

Automatic Execution:

  • Parse outline to identify slides with Source: extract

  • Create figures directory: mkdir -p figures

  • For each extract slide, automatically:

Read the Figure number, Page, and Caption from metadata

  • Run figure extraction script:
npx -y bun ${SKILL_DIR}/scripts/extract-figure.ts \
  --pdf source-paper.pdf \
  --page <page-number> \
  --output figures/figure-<N>.png
  • Run template application script:
npx -y bun ${SKILL_DIR}/scripts/apply-template.ts \
  --figure figures/figure-<N>.png \
  --title "<slide-headline>" \
  --caption "Figure <N>: <caption-text>" \
  --output <NN>-slide-<slug>.png
  • Report: "Extracted: Figure N → slide NN"

  • For slides with Source: generate (or no IMAGE_SOURCE):

Proceed to Step 6 for AI generation

Note: Source PDF must be saved as source-paper.pdf in output directory.

Troubleshooting:

  • If figure detection missed a figure: manually add // IMAGE_SOURCE block to outline

  • If wrong figure mapped: edit the Figure: and Page: values in outline

  • If extraction fails: check PDF page number (1-indexed)

PyMuPDF Fallback for Page Extraction: If extract-figure.ts fails with "Image or Canvas expected" error (common with complex PDFs), use PyMuPDF:

import fitz
doc = fitz.open("source-paper.pdf")
page = doc[page_num - 1]  # 0-indexed
mat = fitz.Matrix(3, 3)  # 3x scale for high resolution
pix = page.get_pixmap(matrix=mat)
pix.save(f"extracted/page-{page_num}.png")

Then apply template using apply-template.ts.

Step 6: Generate Images

  • Use selected method from Step 5

  • Skip slides already processed in Step 5.5 (those with Source: extract)

  • Generate session ID: slides-{topic-slug}-{timestamp}

  • Generate each remaining slide with same session ID

  • Report progress: "Generated X/N"

  • Auto-retry once on generation failure

Step 7: Merge to PPTX and PDF

npx -y bun ${SKILL_DIR}/scripts/merge-to-pptx.ts <slide-deck-dir>
npx -y bun ${SKILL_DIR}/scripts/merge-to-pdf.ts <slide-deck-dir>

Step 8: Output Summary

Slide Deck Complete!

Topic: [topic]
Style: [style name]
Location: [directory path]
Slides: N total

- 01-slide-cover.png ✓ Cover
- 02-slide-intro.png ✓ Content
- ...
- {NN}-slide-back-cover.png ✓ Back Cover

Outline: outline.md
PPTX: {topic-slug}.pptx
PDF: {topic-slug}.pdf

Slide Modification

See references/modification-guide.md for:

  • Edit single slide workflow

  • Add new slide (with renumbering)

  • Delete slide (with renumbering)

  • File naming conventions

Image Generation Dependencies

Requires:

  • GOOGLE_API_KEY or GEMINI_API_KEY environment variable

  • Python 3.8+ with pip

  • google-genai package (auto-installed by script)

Model: gemini-3-pro-image-preview (default)

Gemini Web Skill (Option 2)

Requires:

  • baoyu-danger-gemini-web skill installed at .claude/skills/baoyu-danger-gemini-web

  • Google Chrome browser with logged-in Google account

  • User consent for reverse-engineered API disclaimer

PDF Figure Extraction

Requires:

  • Primary: pdfjs-dist npm package (use legacy build for Node.js)

  • Fallback: pymupdf Python package (more reliable for complex PDFs)

  • canvas npm package for apply-template.ts

References

| references/analysis-framework.md | Deep content analysis for presentations

| references/outline-template.md | Outline structure and STYLE_INSTRUCTIONS format

| references/modification-guide.md | Edit, add, delete slide workflows

| references/content-rules.md | Content and style guidelines

| references/base-prompt.md | Base prompt for image generation

| references/figure-container-template.md | Visual specs for extracted figure containers

| references/styles/<style>.md | Full style specifications

Notes

Image Generation

  • Nano Banana Pro API: Recommended. Stable, reliable, requires API key

  • Gemini Web: No API key needed, but uses reverse-engineered API with account risk

  • Generation time: 10-30 seconds per slide

  • Auto-retry once on generation failure

  • Maintain style consistency via session ID

Content Guidelines

  • Use stylized alternatives for sensitive public figures

  • Both methods use the same underlying Gemini model for image generation

Extension Support

Custom styles and configurations via EXTEND.md.

Check paths (priority order):

  • .paper-skills/paper-slide-deck/EXTEND.md (project)

  • ~/.paper-skills/paper-slide-deck/EXTEND.md (user)

If found, load before Step 1. Extension content overrides defaults.

返回排行榜