video processor

Installs: 64
Rank: #11799

Install

npx skills add https://github.com/iamzhihuix/happy-claude-skills --skill 'Video Processor'
Video Processor
Instructions
This skill provides comprehensive video processing utilities including YouTube video download, audio extraction, format conversion, and audio transcription using yt-dlp, FFmpeg, and OpenAI's Whisper model.
Prerequisites
Required tools (must be installed in your environment):
yt-dlp
Video downloader for YouTube and thousands of other sites

Install via pip

pip install -U yt-dlp

Verify installation

yt-dlp --version
FFmpeg
Multimedia framework for video/audio processing

macOS

brew install ffmpeg

Ubuntu/Debian

apt-get install ffmpeg

Verify installation

ffmpeg -version
OpenAI Whisper
Speech-to-text transcription model

Install via pip

pip install -U openai-whisper

Verify installation

whisper --help

Python packages (included in script via PEP 723):
click (CLI framework)
ffmpeg-python (Python wrapper for FFmpeg)
yt-dlp (video downloader)
Workflow
Use the scripts/video_processor.py script for all video processing tasks. The script provides a simple CLI with the following commands:
0.
Download Video from YouTube or Other Platforms (NEW!)
Download videos from YouTube and thousands of other supported websites:

Download video

uv run .claude/skills/video-processor/scripts/video_processor.py download "https://youtube.com/watch?v=..." output.mp4

Download audio only (as MP3)

uv run .claude/skills/video-processor/scripts/video_processor.py download "https://youtube.com/watch?v=..." --audio-only

Show video info without downloading

uv run .claude/skills/video-processor/scripts/video_processor.py download "https://youtube.com/watch?v=..." --info

Download with subtitles

uv run .claude/skills/video-processor/scripts/video_processor.py download "https://youtube.com/watch?v=..." output.mp4 --subtitle
Options:
--audio-only
Download audio only (extracts to MP3)
--subtitle
Download and embed subtitles (supports en, zh-Hans, zh-Hant)
--info
Show video information without downloading
--format
Specify video format preference (default: best quality)
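The script's internals are not shown here, so the following is only a sketch of how the options above might map onto yt-dlp's own command-line flags. The helper name build_download_args and its parameters are hypothetical; the flags it emits (-x, --audio-format, --write-subs, --sub-langs, --embed-subs, --dump-json, -f, -o) are standard yt-dlp options.

```python
from typing import List, Optional


def build_download_args(
    url: str,
    output: Optional[str] = None,
    audio_only: bool = False,
    subtitle: bool = False,
    info: bool = False,
    fmt: Optional[str] = None,
) -> List[str]:
    """Build a yt-dlp argv list (hypothetical mapping, not the script's code)."""
    args = ["yt-dlp"]
    if info:
        # --dump-json prints metadata and does not download anything.
        return args + ["--dump-json", url]
    if audio_only:
        # -x extracts the audio track; yt-dlp uses FFmpeg to convert it to MP3.
        args += ["-x", "--audio-format", "mp3"]
    else:
        if fmt:
            args += ["-f", fmt]
        if subtitle:
            # Subtitles are only embedded into video containers.
            args += ["--write-subs", "--sub-langs", "en,zh-Hans,zh-Hant", "--embed-subs"]
    if output:
        args += ["-o", output]
    return args + [url]
```

The resulting list can be passed to subprocess.run(args, check=True), which avoids shell quoting problems with URLs that contain & or ?.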
1.
Extract Audio from Video
Extract the audio track from a video file:
uv run .claude/skills/video-processor/scripts/video_processor.py extract-audio input.mp4 output.wav
Options:
--format
Output audio format (default: wav). Supports: wav, mp3, aac, flac
Output is suitable for transcription or standalone audio use
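As a rough illustration of what extract-audio plausibly does under the hood, the sketch below builds an FFmpeg command for each supported format. The helper name and the format-to-encoder table are assumptions, not taken from the script; the encoder names themselves (pcm_s16le, libmp3lame, aac, flac) are standard FFmpeg encoders, though libmp3lame depends on how FFmpeg was built.

```python
from typing import List

# Assumed mapping from the supported output formats to FFmpeg audio encoders.
AUDIO_CODECS = {
    "wav": "pcm_s16le",   # uncompressed 16-bit PCM
    "mp3": "libmp3lame",
    "aac": "aac",
    "flac": "flac",
}


def build_extract_audio_args(src: str, dst: str, fmt: str = "wav") -> List[str]:
    """Build an FFmpeg argv list that copies only the audio track to dst."""
    if fmt not in AUDIO_CODECS:
        raise ValueError(f"unsupported audio format: {fmt}")
    # -y overwrites an existing output file; -vn drops the video stream.
    return ["ffmpeg", "-y", "-i", src, "-vn", "-c:a", AUDIO_CODECS[fmt], dst]
```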
2.
Convert Video to MP4
Convert any video file to MP4 format:
uv run .claude/skills/video-processor/scripts/video_processor.py to-mp4 input.avi output.mp4
Options:
--codec
Video codec (default: libx264). Common options: libx264, libx265, h264
--preset
Encoding speed/quality preset (default: medium). Options: ultrafast, fast, medium, slow, veryslow
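The two options above correspond to FFmpeg's -c:v and -preset flags. A minimal sketch of the equivalent FFmpeg invocation follows; the helper name is hypothetical, and encoding the audio as AAC is an assumption made here for MP4 compatibility rather than documented behavior of the script.

```python
from typing import List

# Presets listed above, ordered from fastest to slowest encode.
PRESETS = ("ultrafast", "fast", "medium", "slow", "veryslow")


def build_to_mp4_args(
    src: str, dst: str, codec: str = "libx264", preset: str = "medium"
) -> List[str]:
    """Build an FFmpeg argv list for an MP4 conversion."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset}")
    return [
        "ffmpeg", "-y", "-i", src,
        "-c:v", codec, "-preset", preset,
        "-c:a", "aac",  # assumption: AAC audio for broad MP4 player support
        dst,
    ]
```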
3.
Convert Video to WebM
Convert any video file to WebM format (web-optimized):
uv run .claude/skills/video-processor/scripts/video_processor.py to-webm input.mp4 output.webm
Options:
--codec
Video codec (default: libvpx-vp9). Options: libvpx, libvpx-vp9
WebM is optimized for web playback and streaming
4.
Transcribe Audio with Whisper
Transcribe audio or video files to text using OpenAI's Whisper model:

Transcribe video file (audio will be extracted automatically)

uv run .claude/skills/video-processor/scripts/video_processor.py transcribe input.mp4 transcript.txt

Transcribe audio file directly

uv run .claude/skills/video-processor/scripts/video_processor.py transcribe audio.wav transcript.txt
Options:
--model
Whisper model size (default: base). Options:
tiny
Fastest, lowest accuracy (~1GB RAM)
base
Fast, good accuracy (~1GB RAM)
[DEFAULT]
small
Balanced (~2GB RAM)
medium
High accuracy (~5GB RAM)
large
Best accuracy, slowest (~10GB RAM)
--language
Language code (default: auto-detect). Examples: en, es, fr, de, zh
--format
Output format (default: txt). Options: txt, srt, vtt, json
Transcription workflow:
If input is video, FFmpeg extracts audio to temporary WAV file
Whisper processes the audio file
Transcription is saved in requested format
Temporary files are cleaned up automatically
5.
Combined Workflow Example
Process a video end-to-end:

1. Extract audio for analysis

uv run .claude/skills/video-processor/scripts/video_processor.py extract-audio lecture.mp4 lecture.wav

2. Transcribe to SRT subtitles

uv run .claude/skills/video-processor/scripts/video_processor.py transcribe lecture.mp4 lecture.srt --format srt --model small

3. Convert to web format

uv run .claude/skills/video-processor/scripts/video_processor.py to-webm lecture.mp4 lecture.webm
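Step 2 above writes an SRT file. To make the format concrete, here is a small sketch of how a transcript segment becomes an SRT cue; the function names are illustrative and this is not the script's own code. SRT timestamps use the form HH:MM:SS,mmm with a comma before the milliseconds.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def srt_cue(index: int, start: float, end: float, text: str) -> str:
    """Render one numbered SRT cue from a segment's start, end, and text."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"


print(srt_cue(1, 0.0, 2.5, " Welcome to the lecture. "))
```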
Key Technical Details
FFmpeg and Whisper Integration:
FFmpeg doesn't transcribe audio itself - it prepares audio for external transcription
The workflow is: Extract audio (FFmpeg) → Transcribe (Whisper) → Optional: Re-integrate with video
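FFmpeg can pipe decoded audio directly to a process running Whisper, skipping the intermediate file (advanced use case). Whisper transcribes audio in 30-second windows, so this is near-real-time at best.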
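The extract-then-transcribe workflow can be sketched as two subprocess calls around a temporary WAV file. This is an illustration under stated assumptions, not the script's actual implementation: transcribe_media and its run parameter are hypothetical, while --model, --output_format, and --output_dir are options of the whisper command-line tool.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Callable, List, Sequence


def transcribe_media(
    src: str,
    out_dir: str,
    model: str = "base",
    fmt: str = "txt",
    run: Callable[[Sequence[str]], object] = lambda cmd: subprocess.run(cmd, check=True),
) -> List[List[str]]:
    """Extract audio with FFmpeg, then transcribe it with the whisper CLI.

    Returns the commands that were run, in order.
    """
    commands: List[List[str]] = []
    # The temporary directory, and the WAV inside it, is removed on exit
    # even if one of the commands raises.
    with tempfile.TemporaryDirectory() as tmp:
        # Keep the source stem: whisper names its output after the input file.
        wav = str(Path(tmp) / (Path(src).stem + ".wav"))
        steps = [
            ["ffmpeg", "-y", "-i", src, "-vn", "-ac", "1", "-ar", "16000", wav],
            ["whisper", wav, "--model", model,
             "--output_format", fmt, "--output_dir", out_dir],
        ]
        for cmd in steps:
            run(cmd)
            commands.append(cmd)
    return commands
```

Passing a different run callable (for example, one that only logs the command) makes it possible to dry-run the pipeline without FFmpeg or Whisper installed.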
Audio Format for Transcription:
Whisper works best with WAV or MP3 formats
Sample rate: 16kHz is optimal (script handles conversion automatically)
The script extracts audio with optimal settings for Whisper
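To check whether an existing WAV file already matches these settings, the standard-library wave module is enough. The sketch below also checks for a single channel, on the assumption that mono is what Whisper's loader converts audio to; the function name is illustrative.

```python
import wave


def is_whisper_ready(path: str) -> bool:
    """Return True if the WAV file at path is mono and sampled at 16 kHz."""
    with wave.open(path, "rb") as wav:
        return wav.getnchannels() == 1 and wav.getframerate() == 16000
```

Files that fail the check still work as input, since the audio is resampled during extraction; the check only tells you whether that conversion is a no-op.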
Output Formats:
txt
Plain text transcript
srt
SubRip subtitle format (includes timestamps)
vtt
WebVTT subtitle format (web standard)
json
Detailed JSON with word-level timestamps
Error Handling
The script includes comprehensive error handling:
Validates input files exist
Checks FFmpeg and Whisper are installed
Provides clear error messages for missing dependencies
Handles temporary file cleanup on errors
Performance Tips
Use tiny or base models for quick drafts
Use small or medium for production transcriptions
Use large only when maximum accuracy is required
For long videos, consider extracting audio first, then transcribe in segments
WebM conversion with VP9 takes longer but produces smaller files
Examples
Example 1: Quick Video to MP4 Conversion
User request:
I have an AVI file from my old camera. Can you convert it to MP4?
You would:
Use the to-mp4 command with default settings:
uv run .claude/skills/video-processor/scripts/video_processor.py to-mp4 old_video.avi output.mp4
Confirm the conversion completed successfully
Inform the user about the output file location
Example 2: Extract Audio and Transcribe
User request:
I recorded a lecture video and need a transcript. Can you extract the audio and transcribe it?
You would:
First extract the audio:
uv run .claude/skills/video-processor/scripts/video_processor.py extract-audio lecture.mp4 lecture.wav
Then transcribe using the base model (good balance of speed/accuracy):
uv run .claude/skills/video-processor/scripts/video_processor.py transcribe lecture.mp4 transcript.txt --model base
Share the transcript.txt file with the user
Example 3: Create Web-Optimized Video with Subtitles
User request:
I need to put this video on my website with subtitles. Can you help?
You would:
Convert to WebM for web optimization:
uv run .claude/skills/video-processor/scripts/video_processor.py to-webm presentation.mp4 presentation.webm
Generate SRT subtitle file:
uv run .claude/skills/video-processor/scripts/video_processor.py transcribe presentation.mp4 subtitles.srt --format srt --model small
Inform user they now have:
presentation.webm (web-optimized video)
subtitles.srt (subtitle file for embedding)
Example 4: High-Quality Transcription with Language Specification
User request:
I have a Spanish interview video that needs an accurate transcript for publication.
You would:
Use a larger model with language specified for best accuracy:
uv run .claude/skills/video-processor/scripts/video_processor.py transcribe interview.mp4 transcript.txt --model medium --language es
Optionally create SRT for review:
uv run .claude/skills/video-processor/scripts/video_processor.py transcribe interview.mp4 transcript.srt --format srt --model medium --language es
Review the transcript with the user and make any necessary corrections
Example 5: Batch Processing Multiple Videos
User request:
I have a folder of training videos that all need to be converted to WebM and transcribed.
You would:
List all video files in the directory:
ls training_videos/*.mp4
For each video file, run the conversion and transcription:

For each video: video1.mp4, video2.mp4, etc.

uv run .claude/skills/video-processor/scripts/video_processor.py to-webm training_videos/video1.mp4 output/video1.webm
uv run .claude/skills/video-processor/scripts/video_processor.py transcribe training_videos/video1.mp4 output/video1.txt --model base

Repeat for each file

Confirm all conversions and transcriptions completed
Provide summary of output files
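Rather than repeating the two commands by hand for every file, the loop can be scripted. The following sketch only builds the command lists; build_batch_commands is a hypothetical helper, and the script path and subcommands are the ones used throughout this document.

```python
from pathlib import Path
from typing import List

SCRIPT = ".claude/skills/video-processor/scripts/video_processor.py"


def build_batch_commands(src_dir: str, out_dir: str, model: str = "base") -> List[List[str]]:
    """Return a to-webm and a transcribe command for every .mp4 in src_dir."""
    commands: List[List[str]] = []
    for video in sorted(Path(src_dir).glob("*.mp4")):
        webm = Path(out_dir) / f"{video.stem}.webm"
        text = Path(out_dir) / f"{video.stem}.txt"
        commands.append(["uv", "run", SCRIPT, "to-webm", str(video), str(webm)])
        commands.append(
            ["uv", "run", SCRIPT, "transcribe", str(video), str(text), "--model", model]
        )
    return commands
```

Each returned command can then be executed with subprocess.run(cmd, check=True), so a failure on one file stops the batch instead of passing silently.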
Summary
The video-processor skill provides a unified interface for common video processing tasks:
Audio extraction
Extract audio tracks in various formats
Format conversion
Convert to MP4 (universal) or WebM (web-optimized)
Transcription
Speech-to-text with multiple output formats
Flexible
CLI arguments for model selection, language, and output formats
All operations are handled through a single, well-documented script with sensible defaults and comprehensive error handling.
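As a rough illustration of the dependency checks described under Error Handling, the sketch below reports which of the required command-line tools are missing from PATH. It is an assumption about how such a check could look, not the script's own code; missing_tools and its which parameter are hypothetical names.

```python
import shutil
from typing import Callable, Iterable, List, Optional

REQUIRED_TOOLS = ("yt-dlp", "ffmpeg", "whisper")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools that cannot be found on PATH, in the order given."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing dependencies: " + ", ".join(missing))
    else:
        print("All required tools are installed.")
```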