runpod

Installs: 47
Rank: #15871

Install

npx skills add https://github.com/digitalsamba/claude-code-video-toolkit --skill runpod

# RunPod Cloud GPU

Run open-source AI models on cloud GPUs via RunPod serverless. Pay-per-second, no minimums.

## Setup

1. Create account at https://runpod.io

2. Add your API key to `.env`:

        echo "RUNPOD_API_KEY=your_key_here" >> .env

3. Deploy any tool with `--setup`:

        python tools/image_edit.py --setup
        python tools/upscale.py --setup
        python tools/dewatermark.py --setup
        python tools/sadtalker.py --setup
        python tools/qwen3_tts.py --setup

Each `--setup` command:

- Creates a RunPod template from the Docker image
- Creates a serverless endpoint with an appropriate GPU
- Saves the endpoint ID to `.env` (e.g. `RUNPOD_QWEN_EDIT_ENDPOINT_ID`)

## Available Images

All images are public on GHCR; no authentication is needed.

| Tool | Docker Image | GPU | VRAM | Typical Cost |
| --- | --- | --- | --- | --- |
| image_edit | ghcr.io/conalmullan/video-toolkit-qwen-edit:latest | A6000/L40S | 48GB+ | ~$0.05-0.15/job |
| upscale | ghcr.io/conalmullan/video-toolkit-realesrgan:latest | RTX 3090/4090 | 24GB | ~$0.01-0.05/job |
| dewatermark | ghcr.io/conalmullan/video-toolkit-propainter:latest | RTX 3090/4090 | 24GB | ~$0.05-0.30/job |
| sadtalker | ghcr.io/conalmullan/video-toolkit-sadtalker:latest | RTX 4090 | 24GB | ~$0.05-0.15/job |
| qwen3_tts | ghcr.io/conalmullan/video-toolkit-qwen3-tts:latest | ADA 24GB | 24GB | ~$0.01-0.05/job |

Total monthly cost: rarely exceeds $10 even with heavy use.

## How It Works

All tools follow the same pattern:

Local CLI → Upload input to cloud storage → RunPod API → Poll for result → Download output

- File transfer: tools use Cloudflare R2 when configured (`R2_ACCOUNT_ID`, `R2_ACCESS_KEY_ID`, `R2_SECRET_ACCESS_KEY`, `R2_BUCKET_NAME`), falling back to free upload services (see the upload sketch below)
- RunPod API: tools call the `/run` endpoint, then poll `/status/{job_id}` until the job completes (see the polling sketch below)
- Cold vs warm start: the first request after idle spins up a worker (~30-90s); subsequent requests are fast (~5-15s)
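To make the file-transfer step concrete, here is a minimal sketch of uploading an input to Cloudflare R2 through its S3-compatible API and producing a time-limited download URL for the worker. It assumes `boto3` is installed and the `R2_*` variables above are set; the file name, bucket key, and presigned-URL handoff are illustrative, not necessarily the toolkit's exact mechanism.

```python
import os

import boto3

# R2 exposes an S3-compatible endpoint per account.
account_id = os.environ["R2_ACCOUNT_ID"]
s3 = boto3.client(
    "s3",
    endpoint_url=f"https://{account_id}.r2.cloudflarestorage.com",
    aws_access_key_id=os.environ["R2_ACCESS_KEY_ID"],
    aws_secret_access_key=os.environ["R2_SECRET_ACCESS_KEY"],
)
bucket = os.environ["R2_BUCKET_NAME"]

# Upload the local input, then hand the worker a time-limited download URL.
s3.upload_file("input.png", bucket, "jobs/input.png")
input_url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": bucket, "Key": "jobs/input.png"},
    ExpiresIn=3600,
)
```

And a minimal sketch of the `/run` plus `/status/{job_id}` flow against the RunPod serverless REST API, continuing from the upload sketch above. The endpoint variable and the `input` payload are illustrative; each tool defines its own payload fields.

```python
import os
import time

import requests

endpoint_id = os.environ["RUNPOD_UPSCALE_ENDPOINT_ID"]  # any deployed endpoint
base = f"https://api.runpod.ai/v2/{endpoint_id}"
headers = {"Authorization": f"Bearer {os.environ['RUNPOD_API_KEY']}"}

# Submit the job; the payload shape is tool-specific (illustrative here).
job = requests.post(
    f"{base}/run",
    headers=headers,
    json={"input": {"image_url": input_url}},
    timeout=30,
).json()

# Poll until the worker finishes; cold starts can take 30-90 seconds.
while True:
    status = requests.get(f"{base}/status/{job['id']}", headers=headers, timeout=30).json()
    if status["status"] in ("COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"):
        break
    time.sleep(3)

print(status.get("output", status))
```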

## Endpoint Management

Workers:

- `workersMin: 0` scales to zero when idle (no cost)
- `workersMax: 1` caps concurrent jobs (increase it for throughput)
- `idleTimeout: 5` is the number of seconds before a worker scales down

Across all endpoints, you share a total worker pool based on your RunPod plan. If you hit limits, reduce `workersMax` on endpoints you're not actively using.

### Checking Endpoint Status

Each tool stores its endpoint ID in `.env`:

| Tool | Env Var |
| --- | --- |
| image_edit | RUNPOD_QWEN_EDIT_ENDPOINT_ID |
| upscale | RUNPOD_UPSCALE_ENDPOINT_ID |
| dewatermark | RUNPOD_DEWATERMARK_ENDPOINT_ID |
| sadtalker | RUNPOD_SADTALKER_ENDPOINT_ID |
| qwen3_tts | RUNPOD_QWEN3_TTS_ENDPOINT_ID |
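As a quick way to see which of the endpoints above are deployed, here is a small sketch that reads those variables from `.env`. It assumes the `python-dotenv` package; the toolkit itself may load the file differently.

```python
from dotenv import dotenv_values

ENDPOINT_VARS = [
    "RUNPOD_QWEN_EDIT_ENDPOINT_ID",
    "RUNPOD_UPSCALE_ENDPOINT_ID",
    "RUNPOD_DEWATERMARK_ENDPOINT_ID",
    "RUNPOD_SADTALKER_ENDPOINT_ID",
    "RUNPOD_QWEN3_TTS_ENDPOINT_ID",
]

# Report which endpoints have been set up via --setup.
env = dotenv_values(".env")
for var in ENDPOINT_VARS:
    print(f"{var}: {env.get(var) or 'not deployed'}")
```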

### Disabling an Endpoint

To free worker slots without deleting an endpoint, set `workersMax=0` via the RunPod dashboard or GraphQL API.

## Troubleshooting

### Force Image Pull

When you push a new Docker image version, RunPod may still use the cached old one. To force a pull (see the digest lookup sketch after this section):

1. Update the template's imageName to use `@sha256:DIGEST` notation
2. Wait for the worker to restart
3. Revert to the `:latest` tag after confirming

### Cold Start Too Slow

- qwen3-tts: ~70s cold start, ~7s warm
- sadtalker: ~60s cold start, ~10s warm
- image_edit: ~90s cold start, ~15s warm

If cold starts are a problem, set `workersMin: 1` (costs money when idle).

### Job Fails with OOM

The model needs more VRAM than the GPU provides. Options:

- Use a larger GPU tier
- For dewatermark: reduce `--resize-ratio` (the default is 0.5 for safety)
- For image_edit: reduce `--steps`

### "No workers available"

You've hit your plan's concurrent worker limit. Either:

- Wait for a running job to finish
- Set `workersMax=0` on endpoints you're not using
- Upgrade your RunPod plan
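One way to find the digest referenced in the Force Image Pull steps is to inspect the pushed image with Docker's buildx tooling. A minimal sketch, using one of the toolkit images as an example:

```bash
# Print the manifest digest of the pushed image (no pull required).
docker buildx imagetools inspect ghcr.io/conalmullan/video-toolkit-realesrgan:latest

# Then set the template's imageName to, e.g.:
#   ghcr.io/conalmullan/video-toolkit-realesrgan@sha256:<digest from above>
```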

## Docker Images

All Dockerfiles live in `docker/runpod-*/`. Images use `runpod/pytorch` as the base to share layers across tools.

Building for RunPod (from an Apple Silicon Mac):

    docker buildx build --platform linux/amd64 -t ghcr.io/conalmullan/video-toolkit-<name>:latest docker/runpod-<name>/
    docker push ghcr.io/conalmullan/video-toolkit-<name>:latest

GHCR packages default to private; you must manually make them public for RunPod to pull them. Go to GitHub > Packages > Package Settings > Change Visibility.

## Cost Optimization

- Keep `workersMin: 0` on all endpoints (scale to zero)
- Only deploy endpoints you actively need
- Use `workersMax=0` to disable idle endpoints without deleting them
- Qwen3-TTS is significantly cheaper than ElevenLabs for voiceovers
- Check the RunPod dashboard for usage and billing
