- Skill: Raw Video Processing
- Post-process raw screen recordings to improve pacing — remove silent segments, then speed up the result.
- Prerequisite
- FFmpeg and uv must be installed.
When to Use
The user has recorded a screencast and wants to clean it up before publishing. Typical issues in raw recordings:
Long pauses / dead air while thinking or waiting for loading
Keyboard typing sounds and other low-level background noise that should be treated as silence
Overall pacing feels slow and could benefit from a slight speed boost
Default Workflow
When the user provides a raw video file, run both scripts in sequence by default:
Step 1: Remove Silent Segments
    uv run --python 3.12 /path/to/skills/raw-video-processing/scripts/remove_silence.py <input.mp4> -t="-20dB" -d 0.5

This detects and cuts out silent portions (including keyboard sounds), producing <input>_nosilence.mp4.

Always pass these parameters (tuned for screen recordings with keyboard noise):
-t="-20dB": aggressive threshold that filters out keyboard typing and background noise (use the = syntax so that argparse does not treat the negative value as a flag)
-d 0.5: remove short silences too (0.5 s minimum)
-p 0.2: seconds of breathing room kept around speech boundaries (this is the default, so there is usually no need to pass it)

The script prints a detailed summary: number of silent segments found, total silence removed, and all kept segments with timestamps. Review this output to confirm the result looks reasonable.

Step 2: Speed Up the Video

    uv run --python 3.12 /path/to/skills/raw-video-processing/scripts/speed_video.py <input>_nosilence.mp4

This applies a speed multiplier to the silence-removed video, producing <input>_nosilence_1.2x.mp4.

Default parameters:
--speed 1.2: 1.2x playback speed (a subtle boost that doesn't feel rushed)

Script Options

remove_silence.py

| Flag | Default | Description |
| --- | --- | --- |
| -o, --output | <input>_nosilence.mp4 | Custom output path |
| -t, --threshold | -30dB | Silence threshold in dB (higher = more aggressive). Always use -20dB for screencasts, passed as -t="-20dB" to avoid argparse issues with negative values |
| -d, --duration | 0.8 | Minimum silence duration in seconds to remove. Use 0.5 for screencasts |
| -p, --padding | 0.2 | Padding in seconds kept around non-silent segments |
| --dry-run | off | Only print detected segments, don't export |

speed_video.py

| Flag | Default | Description |
| --- | --- | --- |
| -o, --output | <input>_<speed>x.mp4 | Custom output path |
| -s, --speed | 1.2 | Playback speed multiplier |

Custom Scenarios

Only remove silence: run just Step 1.
Only speed up: run just Step 2 directly on the input file.
Conservative cleanup: use -t="-30dB" -d 0.8 if the default is cutting too much speech.
Extra aggressive cleanup: use -t="-15dB" -d 0.3 on remove_silence.py and --speed 1.5 on speed_video.py for maximum compression.
Preview before committing: use --dry-run on remove_silence.py to see what would be cut without creating a file.
Custom output name: use -o on either script to control the output path.

Important Notes

Always run remove_silence before speed_video. Silence detection works on the original audio; speeding up first would alter the audio characteristics and make silence detection less accurate.
For long videos (>30 min), the silence removal step may take a few minutes, because it processes each segment individually.
Both scripts are designed to preserve video quality: remove_silence uses stream copy (no re-encoding), while speed_video re-encodes with FFmpeg defaults.
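The default two-step workflow described above could be scripted roughly as follows. This is a minimal sketch, not part of the skill itself. The helper names (silence_command, speed_command, nosilence_path, build_pipeline, run_pipeline) are hypothetical, SCRIPTS_DIR is the same placeholder path used in the commands above, and the output name is assumed to follow the <input>_nosilence.mp4 pattern this document describes.

```python
import subprocess
from pathlib import Path

# Assumption: placeholder location, mirroring the /path/to/skills/... path used above.
SCRIPTS_DIR = Path("/path/to/skills/raw-video-processing/scripts")
UV_PREFIX = ["uv", "run", "--python", "3.12"]


def silence_command(video: Path, threshold: str = "-20dB", duration: float = 0.5) -> list[str]:
    """Build the Step 1 command line with the screencast defaults."""
    return UV_PREFIX + [
        str(SCRIPTS_DIR / "remove_silence.py"),
        str(video),
        f"-t={threshold}",  # '=' form so argparse does not read -20dB as a flag
        "-d", str(duration),
    ]


def speed_command(video: Path, speed: float = 1.2) -> list[str]:
    """Build the Step 2 command line."""
    return UV_PREFIX + [
        str(SCRIPTS_DIR / "speed_video.py"),
        str(video),
        "--speed", str(speed),
    ]


def nosilence_path(video: Path) -> Path:
    """Assumed default Step 1 output name: <input>_nosilence.mp4 next to the input."""
    return video.with_name(f"{video.stem}_nosilence.mp4")


def build_pipeline(video: Path) -> list[list[str]]:
    """Return both commands in the required order: silence removal, then speed-up."""
    return [silence_command(video), speed_command(nosilence_path(video))]


def run_pipeline(video: Path) -> None:
    """Run both steps, stopping at the first failure."""
    for cmd in build_pipeline(video):
        subprocess.run(cmd, check=True)
```

Calling run_pipeline(Path("demo.mp4")) would run Step 1 on demo.mp4 and then Step 2 on demo_nosilence.mp4. Building the commands as argument lists rather than shell strings avoids quoting problems around the negative threshold value.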
raw-video-processing
Install
npx skills add https://github.com/zc277584121/marketing-skills --skill raw-video-processing