mmx-cli

Installs: 4.6K
Rank: #1083

Install

npx skills add https://github.com/minimax-ai/cli --skill mmx-cli

MiniMax CLI: Agent Skill Guide

Use mmx to generate text, images, video, speech, music, and perform web search via the MiniMax AI platform.

Prerequisites

Install

npm install -g mmx-cli

Auth (OAuth persists to ~/.mmx/credentials.json, API key persists to ~/.mmx/config.json)

mmx auth login --api-key sk-xxxxx

Verify active auth source

mmx auth status

Or pass per-call

mmx text chat --api-key sk-xxxxx --message "Hello"

Region is auto-detected. Override with --region global or --region cn.

Agent Flags

Always use these flags in non-interactive (agent/CI) contexts:

Flag | Purpose
--non-interactive | Fail fast on missing args instead of prompting
--quiet | Suppress spinners/progress; stdout is pure data
--output json | Machine-readable JSON output
--async | Return task ID immediately (video generation)
--dry-run | Preview the API request without executing
--yes | Skip confirmation prompts

Commands

text chat

Chat completion. Default model: MiniMax-M2.7.

mmx text chat --message <text> [flags]

Flag | Type | Description
--message | string, required, repeatable | Message text. Prefix with role: to set role (e.g. "system:You are helpful", "user:Hello")
--messages-file | string | JSON file with messages array. Use - for stdin
--system | string | System prompt
--model | string | Model ID (default: MiniMax-M2.7)
--max-tokens | number | Max tokens (default: 4096)
--temperature | number | Sampling temperature (0.0, 1.0]
--top-p | number | Nucleus sampling threshold
--stream | boolean | Stream tokens (default: on in TTY)
--tool | string, repeatable | Tool definition JSON or file path

Single message

mmx text chat --message "user:What is MiniMax?" --output json --quiet
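A minimal sketch of how the flags from the Agent Flags table might be bundled so that every scripted call is non-interactive and machine-readable. The function name mmx_agent is hypothetical and not part of mmx; it also assumes the flags are accepted after the subcommand arguments, as in the example above.

```shell
# mmx_agent: hypothetical helper (not part of mmx) that appends the
# agent-safe flags from the Agent Flags table to any mmx command.
mmx_agent() {
  mmx "$@" --non-interactive --quiet --output json
}

# Usage (requires mmx on PATH and valid auth):
#   mmx_agent text chat --message "user:What is MiniMax?"
```

Keeping the flags in one place makes it harder for an agent to forget --non-interactive and hang on a prompt.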

Multi-turn

mmx text chat \
  --system "You are a coding assistant." \
  --message "user:Write fizzbuzz in Python" \
  --output json

From file

cat conversation.json | mmx text chat --messages-file - --output json

stdout: response text (text mode) or full response object (json mode).

image generate

Generate images. Model: image-01.

mmx image generate --prompt <text> [flags]

Flag | Type | Description
--prompt | string, required | Image description
--aspect-ratio | string | e.g. 16:9, 1:1
--n | number | Number of images (default: 1)
--subject-ref | string | Subject reference: type=character,image=path-or-url
--out-dir | string | Download images to directory
--out-prefix | string | Filename prefix (default: image)

mmx image generate --prompt "A cat in a spacesuit" --output json --quiet

stdout: image URLs (one per line in quiet mode)

mmx image generate --prompt "Logo" --n 3 --out-dir ./gen/ --quiet

stdout: saved file paths (one per line)
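Because --out-dir with --quiet is documented to print one saved path per line, a script can consume the paths line by line. The sketch below shows one way to do that; save_images is a hypothetical helper name, and the blank-line filtering is a defensive assumption rather than documented behaviour.

```shell
# save_images: hypothetical helper that generates N images into a directory
# and prints each saved path with a prefix, skipping blank lines.
# Args: $1 = prompt, $2 = number of images, $3 = output directory
save_images() {
  img_prompt=$1
  img_count=$2
  img_dir=$3
  mmx image generate --prompt "$img_prompt" --n "$img_count" \
    --out-dir "$img_dir" --quiet |
    while IFS= read -r img_path; do
      if [ -n "$img_path" ]; then
        printf 'saved: %s\n' "$img_path"
      fi
    done
}

# Usage (requires mmx on PATH and valid auth):
#   save_images "Logo" 3 ./gen/
```

The loop body is the place to add per-file work such as resizing or uploading.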

video generate

Generate video. Default model: MiniMax-Hailuo-2.3. This is an async task; by default it polls until completion.

mmx video generate --prompt <text> [flags]

Flag | Type | Description
--prompt | string, required | Video description
--model | string | MiniMax-Hailuo-2.3 (default) or MiniMax-Hailuo-2.3-Fast
--first-frame | string | First frame image
--callback-url | string | Webhook URL for completion
--download | string | Save video to specific file
--async | boolean | Return task ID immediately
--no-wait | boolean | Same as --async
--poll-interval | number | Polling interval (default: 5)

Non-blocking: get task ID

mmx video generate --prompt "A robot." --async --quiet

stdout:

Blocking: wait and get file path

mmx video generate --prompt "Ocean waves." --download ocean.mp4 --quiet

stdout: ocean.mp4

video task get

Query status of a video generation task.

mmx video task get --task-id <id> [--output json]

video download

Download a completed video by task ID.

mmx video download --file-id <id> [--out <path>]

speech synthesize

Text-to-speech. Default model: speech-2.8-hd. Max 10k chars.

mmx speech synthesize --text <text> [flags]

Flag | Type | Description
--text | string | Text to synthesize
--text-file | string | Read text from file. Use - for stdin
--model | string | speech-2.8-hd (default), speech-2.6, speech-02
--voice | string | Voice ID (default: English_expressive_narrator)
--speed | number | Speed multiplier
--volume | number | Volume level
--pitch | number | Pitch adjustment
--format | string | Audio format (default: mp3)
--sample-rate | number | Sample rate (default: 32000)
--bitrate | number | Bitrate (default: 128000)
--channels | number | Audio channels (default: 1)
--language | string | Language boost
--subtitles | boolean | Download and save subtitles as .srt file (alongside --out audio file). API must support subtitles for the selected model.
--pronunciation | string, repeatable | Custom pronunciation
--sound-effect | string | Add sound effect
--out | string | Save audio to file
--stream | boolean | Stream raw audio to stdout

mmx speech synthesize --text "Hello world" --out hello.mp3 --quiet

stdout: hello.mp3

mmx speech synthesize --text "Hello" --subtitles --out hello.mp3

saves hello.mp3 + hello.srt (SRT subtitle file)
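Since speech synthesize is documented with a 10k-character maximum, a script may want to reject oversized input before making the API call. This is a rough sketch under two assumptions: that "10k" means 10000, and that the shell's ${#var} length is close enough to the character count the API uses. synth_checked and MAX_SPEECH_CHARS are hypothetical names.

```shell
# Assumed interpretation of the documented "Max 10k chars" limit.
MAX_SPEECH_CHARS=10000

# synth_checked: hypothetical guard around `mmx speech synthesize`.
# Note: ${#var} may count bytes instead of characters in some shells,
# so multi-byte text can be over-counted.
# Args: $1 = text to synthesize, $2 = output file
synth_checked() {
  speech_text=$1
  speech_out=$2
  if [ "${#speech_text}" -gt "$MAX_SPEECH_CHARS" ]; then
    echo "text is ${#speech_text} chars; limit is $MAX_SPEECH_CHARS" >&2
    return 2  # same number as the CLI's documented usage-error code
  fi
  mmx speech synthesize --text "$speech_text" --out "$speech_out" --quiet
}

# Usage (requires mmx on PATH and valid auth):
#   synth_checked "Hello world" hello.mp3
```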

echo "Breaking news." | mmx speech synthesize --text-file - --out news.mp3

music generate

Generate music. Responds well to rich, structured descriptions. Model: music-2.6-free (unlimited for API key users, RPM = 3).

mmx music generate --prompt <text> [--lyrics <text>] [flags]

Flag | Type | Description
--prompt | string | Music style description (can be detailed)
--lyrics | string | Song lyrics with structure tags. Required unless --instrumental or --lyrics-optimizer is used.
--lyrics-file | string | Read lyrics from file. Use - for stdin
--lyrics-optimizer | boolean | Auto-generate lyrics from prompt. Cannot be used with --lyrics or --instrumental.
--instrumental | boolean | Generate instrumental music (no vocals). Cannot be used with --lyrics.
--vocals | string | Vocal style, e.g. "warm male baritone", "bright female soprano", "duet with harmonies"
--genre | string | Music genre, e.g. folk, pop, jazz
--mood | string | Mood or emotion, e.g. warm, melancholic, uplifting
--instruments | string | Instruments to feature, e.g. "acoustic guitar, piano"
--tempo | string | Tempo description, e.g. fast, slow, moderate
--bpm | number | Exact tempo in beats per minute
--key | string | Musical key, e.g. C major, A minor, G sharp
--avoid | string | Elements to avoid in the generated music
--use-case | string | Use case context, e.g. "background music for video", "theme song"
--structure | string | Song structure, e.g. "verse-chorus-verse-bridge-chorus"
--references | string | Reference tracks or artists, e.g. "similar to Ed Sheeran"
--extra | string | Additional fine-grained requirements
--aigc-watermark | boolean | Embed AI-generated content watermark
--format | string | Audio format (default: mp3)
--sample-rate | number | Sample rate (default: 44100)
--bitrate | number | Bitrate (default: 256000)
--out | string | Save audio to file
--stream | boolean | Stream raw audio to stdout

At least one of --prompt or --lyrics is required.

With lyrics

mmx music generate --prompt "Upbeat pop" --lyrics "La la la..." --out song.mp3 --quiet

Auto-generate lyrics from prompt

mmx music generate --prompt "Upbeat pop about summer" --lyrics-optimizer --out summer.mp3 --quiet

Instrumental

mmx music generate --prompt "Cinematic orchestral, building tension" --instrumental --out bgm.mp3 --quiet
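The three modes shown above (explicit lyrics, --lyrics-optimizer, --instrumental) are mutually exclusive according to the flag table. A pre-flight check along the following lines could catch a bad combination before the request is sent. check_music_mode is a hypothetical helper, and it treats lyrics loaded via --lyrics-file the same as --lyrics, which is an assumption.

```shell
# check_music_mode: hypothetical pre-flight check for the documented
# --lyrics / --instrumental / --lyrics-optimizer rules.
# Args: $1 = lyrics text (may be empty)
#       $2 = instrumental (yes|no)
#       $3 = lyrics-optimizer (yes|no)
check_music_mode() {
  mode_lyrics=$1
  mode_instrumental=$2
  mode_optimizer=$3
  if [ "$mode_optimizer" = yes ] &&
     { [ -n "$mode_lyrics" ] || [ "$mode_instrumental" = yes ]; }; then
    echo "--lyrics-optimizer cannot be combined with --lyrics or --instrumental"
    return 1
  fi
  if [ "$mode_instrumental" = yes ] && [ -n "$mode_lyrics" ]; then
    echo "--instrumental cannot be combined with --lyrics"
    return 1
  fi
  if [ -z "$mode_lyrics" ] && [ "$mode_instrumental" != yes ] &&
     [ "$mode_optimizer" != yes ]; then
    echo "--lyrics is required unless --instrumental or --lyrics-optimizer is set"
    return 1
  fi
  echo "ok"
}

check_music_mode "La la la..." no no   # prints: ok
```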

Detailed prompt with vocal characteristics

mmx music generate --prompt "Warm morning folk" \
  --vocals "male and female duet, harmonies in chorus" \
  --instruments "acoustic guitar, piano" \
  --bpm 95 \
  --lyrics-file song.txt \
  --out duet.mp3

music cover

Generate a cover version of a song based on reference audio. Model: music-cover-free (unlimited for API key users, RPM = 3).

mmx music cover --prompt <text> (--audio <url> | --audio-file <path>) [flags]

Flag | Type | Description
--prompt | string, required | Target cover style, e.g. "Indie folk, acoustic guitar, warm male vocal"
--audio | string | URL of reference audio (mp3, wav, flac, etc.; 6s to 6min, max 50MB)
--audio-file | string | Local reference audio file (auto base64-encoded)
--lyrics | string | Cover lyrics. If omitted, extracted from reference audio via ASR.
--lyrics-file | string | Read lyrics from file. Use - for stdin
--seed | number | Random seed 0–1000000 for reproducible results
--format | string | Audio format: mp3, wav, pcm (default: mp3)
--sample-rate | number | Sample rate (default: 44100)
--bitrate | number | Bitrate (default: 256000)
--channel | number | Channels: 1 (mono) or 2 (stereo, default)
--out | string | Save audio to file
--stream | boolean | Stream raw audio to stdout

Cover from URL

mmx music cover --prompt "Indie folk, acoustic guitar, warm male vocal" \
  --audio https://filecdn.minimax.chat/public/d20eda57-2e36-45bf-9e12-82d9f2e69a86.mp3 --out cover.mp3 --quiet

Cover from local file with custom lyrics

mmx music cover --prompt "Jazz, piano, slow" \
  --audio-file original.mp3 --lyrics-file lyrics.txt --out jazz_cover.mp3 --quiet

Reproducible result with seed

mmx music cover --prompt "Pop, upbeat" \
  --audio https://filecdn.minimax.chat/public/d20eda57-2e36-45bf-9e12-82d9f2e69a86.mp3 \
  --seed 42 --out cover.mp3
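Both music models are listed with RPM = 3, so a batch script probably needs to space its calls. The sketch below waits 20 seconds between requests (60 s / 3), which is one conservative reading of that limit. music_batch and MUSIC_CALL_GAP are hypothetical names, and whether the limit is counted per model or per account is not stated in this guide.

```shell
# Seconds to wait between calls; 60 s / 3 requests = 20 s.
# MUSIC_CALL_GAP is an illustrative variable, not an mmx setting.
MUSIC_CALL_GAP=${MUSIC_CALL_GAP:-20}

# music_batch: hypothetical helper that generates one instrumental track
# per prompt argument, writing track_1.mp3, track_2.mp3, and so on.
music_batch() {
  batch_index=0
  for batch_prompt in "$@"; do
    # Sleep before every call except the first.
    if [ "$batch_index" -gt 0 ]; then
      sleep "$MUSIC_CALL_GAP"
    fi
    batch_index=$((batch_index + 1))
    mmx music generate --prompt "$batch_prompt" --instrumental \
      --out "track_${batch_index}.mp3" --quiet
  done
}

# Usage (requires mmx on PATH and valid auth):
#   music_batch "Lo-fi study beats" "Cinematic orchestral"
```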

vision describe

Image understanding via VLM. Provide either --image or --file-id, not both.

mmx vision describe (--image <path-or-url> | --file-id <id>) [flags]

Flag | Type | Description
--image | string | Local path or URL (auto base64-encoded)
--file-id | string | Pre-uploaded file ID (skips base64)
--prompt | string | Question about the image (default: "Describe the image.")

mmx vision describe --image photo.jpg --prompt "What breed?" --output json

stdout: description text (text mode) or full response (json mode).

search query

Web search via MiniMax.

mmx search query --q <query>

Flag | Type | Description
--q | string, required | Search query

mmx search query --q "MiniMax AI" --output json --quiet

quota show

Display Token Plan usage and remaining quotas.

mmx quota show [--output json]

Tool Schema Export

Export all commands as Anthropic/OpenAI-compatible JSON tool schemas:

All tool-worthy commands (excludes auth/config/update)

mmx config export-schema

Single command

mmx config export-schema --command "video generate"

Use this to dynamically register mmx commands as tools in your agent framework.

Exit Codes

Code | Meaning
0 | Success
1 | General error
2 | Usage error (bad flags, missing args)
3 | Authentication error
4 | Quota exceeded
5 | Timeout
10 | Content filter triggered

Piping Patterns

stdout is always clean data — safe to pipe

mmx text chat --message "Hi" --output json | jq '.content'

stderr has progress/spinners — discard if needed

mmx video generate --prompt "Waves" 2>/dev/null
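With stdout reserved for data and stderr for progress, the exit code is the remaining signal a script can branch on. The following sketch maps the codes from the Exit Codes table to their listed meanings and reports failures on stderr. mmx_exit_meaning and run_mmx are hypothetical helper names, not mmx commands.

```shell
# mmx_exit_meaning: maps an mmx exit code to the meaning listed in the
# Exit Codes table. Codes not in the table are reported as unknown.
mmx_exit_meaning() {
  case "$1" in
    0)  echo "Success" ;;
    1)  echo "General error" ;;
    2)  echo "Usage error" ;;
    3)  echo "Authentication error" ;;
    4)  echo "Quota exceeded" ;;
    5)  echo "Timeout" ;;
    10) echo "Content filter triggered" ;;
    *)  echo "Unknown exit code $1" ;;
  esac
}

# run_mmx: hypothetical wrapper that runs mmx, leaves stdout untouched,
# and prints the failure class on stderr before returning the same code.
run_mmx() {
  mmx_code=0
  mmx "$@" || mmx_code=$?
  if [ "$mmx_code" -ne 0 ]; then
    echo "mmx failed: $(mmx_exit_meaning "$mmx_code")" >&2
  fi
  return "$mmx_code"
}

mmx_exit_meaning 4   # prints: Quota exceeded
```

A caller can then decide per code what to do, for example re-running mmx auth login on 3 or stopping a batch on 4.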

Chain: generate image → describe it

URL=$(mmx image generate --prompt "A sunset" --quiet)
mmx vision describe --image "$URL" --quiet

Async video workflow

TASK=$(mmx video generate --prompt "A robot" --async --quiet | jq -r '.taskId')
mmx video task get --task-id "$TASK" --output json
mmx video download --task-id "$TASK" --out robot.mp4

Configuration Precedence

CLI flags → environment variables → ~/.mmx/config.json → defaults.
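To make the precedence order concrete, the sketch below picks the first non-empty value from a list ordered as flag, environment variable, config file, default. It only illustrates the documented order and is not how mmx is implemented. resolve_setting is a hypothetical name, and "auto" stands in for the auto-detected region default.

```shell
# resolve_setting: prints the first non-empty argument, mirroring the
# documented order: CLI flag, environment variable, config file, default.
# Args: $1 = flag value, $2 = env value, $3 = config value, $4 = default
resolve_setting() {
  for candidate in "$@"; do
    if [ -n "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# No --region flag given, MINIMAX_REGION=cn, config file says "global":
resolve_setting "" "cn" "global" "auto"   # prints: cn
```

In this example the environment variable wins over the config file because no flag was passed.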

Persistent config

mmx config set --key region --value cn
mmx config show

Environment

export MINIMAX_API_KEY=sk-xxxxx
export MINIMAX_REGION=cn

Default Model Configuration

Set per-modality defaults so you don't need --model every time:

Set defaults

mmx config set --key default-text-model --value MiniMax-M2.7-highspeed
mmx config set --key default-speech-model --value speech-2.8-hd
mmx config set --key default-video-model --value MiniMax-Hailuo-2.3
mmx config set --key default-music-model --value music-2.6

Use without --model

mmx text chat --message "Hello"
mmx speech synthesize --text "Hello" --out hello.mp3
mmx video generate --prompt "Ocean waves"
mmx music generate --prompt "Upbeat pop" --instrumental

--model still overrides per-call

mmx text chat --model MiniMax-M2.7 --message "Hello"

Resolution priority: --model flag > config default > hardcoded fallback.
