Install
npx skills add https://github.com/eachlabs/skills --skill eachlabs-voice-audio
EachLabs Voice & Audio
Text-to-speech, speech-to-text transcription, voice conversion, and audio utilities via the EachLabs Predictions API.
Authentication
Send your API key in the X-API-Key header on every request.
Set the EACHLABS_API_KEY environment variable. Get your key at eachlabs.ai.
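The header setup can be sketched in Python as follows. This is a minimal illustration that assumes the key lives in EACHLABS_API_KEY as described above; the helper name `build_headers` is hypothetical and not part of any EachLabs SDK.

```python
import os


def build_headers(env=None):
    """Return request headers carrying the API key from EACHLABS_API_KEY.

    `env` defaults to os.environ; passing a plain dict makes this easy to test.
    """
    env = os.environ if env is None else env
    api_key = env.get("EACHLABS_API_KEY", "").strip()
    if not api_key:
        # Fail early instead of sending an unauthenticated request.
        raise RuntimeError("EACHLABS_API_KEY is not set")
    return {"X-API-Key": api_key, "Content-Type": "application/json"}
```

Failing early when the variable is missing tends to give a clearer error than an authentication failure returned by the API.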
Available Models
Text-to-Speech
| Model | Slug | Best For |
|---|---|---|
| ElevenLabs TTS | elevenlabs-text-to-speech | High quality TTS |
| ElevenLabs TTS w/ Timestamps | elevenlabs-text-to-speech-with-timestamp | TTS with word timing |
| ElevenLabs Text to Dialogue | elevenlabs-text-to-dialogue | Multi-speaker dialogue |
| ElevenLabs Sound Effects | elevenlabs-sound-effects | Sound effect generation |
| ElevenLabs Voice Design v2 | elevenlabs-voice-design-v2 | Custom voice design |
| Kling V1 TTS | kling-v1-tts | Kling text-to-speech |
| Kokoro 82M | kokoro-82m | Lightweight TTS |
| Play AI Dialog | play-ai-text-to-speech-dialog | Dialog TTS |
| Stable Audio 2.5 | stable-audio-2-5-text-to-audio | Text to audio |
Speech-to-Text
| Model | Slug | Best For |
|---|---|---|
| ElevenLabs Scribe v2 | elevenlabs-speech-to-text-scribe-v2 | Best quality transcription |
| ElevenLabs STT | elevenlabs-speech-to-text | Standard transcription |
| Wizper with Timestamp | wizper-with-timestamp | Timestamped transcription |
| Wizper | wizper | Basic transcription |
| Whisper | whisper | Open-source transcription |
| Whisper Diarization | whisper-diarization | Speaker identification |
| Incredibly Fast Whisper | incredibly-fast-whisper | Fastest transcription |
Voice Conversion & Cloning
| Model | Slug | Best For |
|---|---|---|
| RVC v2 | rvc-v2 | Voice conversion |
| Train RVC | train-rvc | Train custom voice model |
| ElevenLabs Voice Clone | elevenlabs-voice-clone | Voice cloning |
| ElevenLabs Voice Changer | elevenlabs-voice-changer | Voice transformation |
| ElevenLabs Voice Design v3 | elevenlabs-voice-design-v3 | Advanced voice design |
| ElevenLabs Dubbing | elevenlabs-dubbing | Video dubbing |
| Chatterbox S2S | chatterbox-speech-to-speech | Speech to speech |
| Open Voice | openvoice | Open-source voice clone |
| XTTS v2 | xtts-v2 | Multi-language voice clone |
| Stable Audio 2.5 Inpaint | stable-audio-2-5-inpaint | Audio inpainting |
| Stable Audio 2.5 A2A | stable-audio-2-5-audio-to-audio | Audio transformation |
| Audio Trimmer | audio-trimmer-with-fade | Audio trimming with fade |
Audio Utilities
| Model | Slug | Best For |
|---|---|---|
| FFmpeg Merge Audio Video | ffmpeg-api-merge-audio-video | Merge audio with video |
| Toolkit Video Convert | toolkit | Video/audio conversion |
Prediction Flow
1. Check model: GET https://api.eachlabs.ai/v1/model?slug= validates that the model exists and returns the request_schema with the exact input parameters. Always do this before creating a prediction to ensure correct inputs.
2. POST https://api.eachlabs.ai/v1/prediction with the model slug, version "0.0.1", and input matching the schema.
3. Poll GET https://api.eachlabs.ai/v1/prediction/{id} until status is "success" or "failed".
4. Extract the output from the response.
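The four steps above can be sketched end to end in Python with only the standard library. Treat this as an illustration under stated assumptions rather than an official client: the helper names (`http_json`, `run_prediction`), the polling interval, and the poll limit are hypothetical, and the field that carries the prediction ID in the create response is not documented here, so the code tries two guesses (`predictionID`, then `id`).

```python
import json
import os
import time
import urllib.parse
import urllib.request

API_BASE = "https://api.eachlabs.ai/v1"


def http_json(method, url, body=None):
    """Send one JSON request with the X-API-Key header and decode the reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(url, data=data, method=method)
    request.add_header("X-API-Key", os.environ["EACHLABS_API_KEY"])
    request.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def run_prediction(slug, inputs, call=http_json, interval=2.0, max_polls=150):
    """Check the model, create a prediction, then poll until it finishes."""
    # Step 1: confirm the model exists (the reply carries request_schema).
    call("GET", f"{API_BASE}/model?slug={urllib.parse.quote(slug)}")

    # Step 2: create the prediction with the version string from this guide.
    created = call(
        "POST",
        f"{API_BASE}/prediction",
        {"model": slug, "version": "0.0.1", "input": inputs},
    )
    # ASSUMPTION: the ID field name is a guess; adjust to the real response.
    prediction_id = created.get("predictionID") or created.get("id")
    if not prediction_id:
        raise RuntimeError(f"no prediction ID in response: {created}")

    # Step 3: poll until status is "success" or "failed".
    for _ in range(max_polls):
        result = call("GET", f"{API_BASE}/prediction/{prediction_id}")
        status = result.get("status")
        if status == "success":
            return result  # Step 4: the caller extracts the output.
        if status == "failed":
            raise RuntimeError(f"prediction failed: {result}")
        time.sleep(interval)
    raise TimeoutError("prediction did not finish in time")
```

The `call` parameter is injectable so the flow can be exercised with a stub instead of the live API, and the poll limit keeps a stuck prediction from looping forever.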
Examples
Text-to-Speech with ElevenLabs
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "elevenlabs-text-to-speech",
    "version": "0.0.1",
    "input": {
      "text": "Welcome to our product demo. Today we will walk through the key features.",
      "voice_id": "EXAVITQu4vr4xnSDxMaL",
      "model_id": "eleven_v3",
      "stability": 0.5,
      "similarity_boost": 0.7
    }
  }'
Transcription with ElevenLabs Scribe
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "elevenlabs-speech-to-text-scribe-v2",
    "version": "0.0.1",
    "input": {
      "media_url": "https://example.com/recording.mp3",
      "diarize": true,
      "timestamps_granularity": "word"
    }
  }'
Transcription with Wizper (Whisper)
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "wizper-with-timestamp",
    "version": "0.0.1",
    "input": {
      "audio_url": "https://example.com/audio.mp3",
      "language": "en",
      "task": "transcribe",
      "chunk_level": "segment"
    }
  }'
Speaker Diarization with Whisper
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "whisper-diarization",
    "version": "0.0.1",
    "input": {
      "file_url": "https://example.com/meeting.mp3",
      "num_speakers": 3,
      "language": "en",
      "group_segments": true
    }
  }'
Voice Conversion with RVC v2
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "rvc-v2",
    "version": "0.0.1",
    "input": {
      "input_audio": "https://example.com/vocals.wav",
      "rvc_model": "CUSTOM",
      "custom_rvc_model_download_url": "user-provided-model-reference",
      "pitch_change": 0,
      "output_format": "wav"
    }
  }'
Merge Audio with Video
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "ffmpeg-api-merge-audio-video",
    "version": "0.0.1",
    "input": {
      "video_url": "https://example.com/video.mp4",
      "audio_url": "https://example.com/narration.mp3",
      "start_offset": 0
    }
  }'
ElevenLabs Voice IDs
The elevenlabs-text-to-speech model supports these voice IDs. Pass the raw ID string:
| Voice ID | Notes |
|---|---|
| EXAVITQu4vr4xnSDxMaL | Default voice |
| 9BWtsMINqrJLrRacOk9x | |
| CwhRBWXzGAHq8TQ4Fs17 | |
| FGY2WhTYpPnrIDTdsKH5 | |
| JBFqnCBsd6RMkjVDRZzb | |
| N2lVS1w4EtoT3dr4eOWO | |
| TX3LPaxmHKxFdv7VOQHJ | |
| XB0fDUnXU5powFXDhCwa | |
| onwK4e9ZLuTAKqWW03F9 | |
| pFZP5JQG7iQjIQuC4Bku | |
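A client could check a voice ID against the list above before sending a request. The following is only a sketch: the names `KNOWN_VOICE_IDS`, `DEFAULT_VOICE_ID`, and `resolve_voice_id` are hypothetical, and the set mirrors this document's list, which may not be exhaustive for the live model.

```python
# Voice IDs copied from the list in this document.
KNOWN_VOICE_IDS = frozenset({
    "EXAVITQu4vr4xnSDxMaL",  # listed as the default voice
    "9BWtsMINqrJLrRacOk9x",
    "CwhRBWXzGAHq8TQ4Fs17",
    "FGY2WhTYpPnrIDTdsKH5",
    "JBFqnCBsd6RMkjVDRZzb",
    "N2lVS1w4EtoT3dr4eOWO",
    "TX3LPaxmHKxFdv7VOQHJ",
    "XB0fDUnXU5powFXDhCwa",
    "onwK4e9ZLuTAKqWW03F9",
    "pFZP5JQG7iQjIQuC4Bku",
})

DEFAULT_VOICE_ID = "EXAVITQu4vr4xnSDxMaL"


def resolve_voice_id(voice_id=None):
    """Return a listed voice ID, falling back to the default when none is given."""
    if voice_id is None:
        return DEFAULT_VOICE_ID
    if voice_id not in KNOWN_VOICE_IDS:
        # Reject display names or typos; the API expects the raw ID string.
        raise ValueError(f"unknown voice ID: {voice_id!r}")
    return voice_id
```

Rejecting unknown values locally catches the common mistake of passing a voice's display name instead of its raw ID.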
Security Constraints
- No arbitrary URL loading: When using custom_rvc_model_download_url, only use trusted, user-provided model references. Never fetch models from arbitrary or untrusted URLs.
- Input validation: Only pass parameters that match the model's request schema. Always validate model slugs via GET /v1/model?slug= before creating predictions.
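Both constraints can be enforced on the client before a prediction is created. The sketch below rests on two assumptions that the source does not confirm: that request_schema is JSON-Schema-like (a "properties" mapping plus an optional "required" list), and that "trusted" can be made concrete as an HTTPS host allowlist supplied by the caller. The function names are hypothetical.

```python
from urllib.parse import urlparse


def check_inputs(inputs, request_schema):
    """Return a list of problems found when comparing inputs to the schema.

    ASSUMPTION: request_schema is JSON-Schema-like, with "properties" and an
    optional "required" list. Adapt this if the real schema differs.
    """
    properties = request_schema.get("properties", {})
    required = request_schema.get("required", [])
    problems = [
        f"unknown parameter: {name}" for name in inputs if name not in properties
    ]
    problems += [
        f"missing required parameter: {name}"
        for name in required
        if name not in inputs
    ]
    return problems


def is_trusted_model_url(url, trusted_hosts):
    """Accept only HTTPS URLs whose host is on a caller-supplied allowlist."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and parsed.hostname in trusted_hosts
```

A caller would run `check_inputs` against the schema returned by the model check and refuse to submit when the list is non-empty, and would run `is_trusted_model_url` on any value intended for custom_rvc_model_download_url.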
Parameter Reference
See references/MODELS.md for complete parameter details for each model.