User wants to generate an AI image from a text description
User says "generate image", "draw", "create picture", "配图" (add an illustration)
User says "生成图片" (generate an image), "画一张" (draw one), "AI图" (AI image)
User needs a cover image, illustration, or concept art
When NOT to Use
User wants to create audio content (use /podcast, /speech)
User wants to create a video (use /explainer)
User wants to edit an existing image (not supported)
User wants to extract content from a URL (use /content-parser)
Purpose
Generate AI images using the ListenHub CLI. Supports text prompts with optional reference images (local files or URLs), multiple resolutions, and aspect ratios. Images are saved as local files.
Hard Constraints
Always check CLI auth following shared/cli-authentication.md
Follow shared/cli-patterns.md for command execution and error handling
Always read config following shared/config-pattern.md before any interaction
Output saved to .listenhub/image-gen/YYYY-MM-DD-{jobId}/, never ~/Downloads/
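The output path convention above can be sketched as a small shell helper. This is a minimal sketch, not the skill's actual implementation: the function name image_output_dir is hypothetical, the job ID is assumed to be taken from the CLI's JSON output, and the date is assumed to be the local calendar date.

```shell
# Hypothetical helper: build the output directory for one image job.
# Assumption: the job id is passed in by the caller (for example, read
# from the CLI's JSON output); the date defaults to today's local date.
image_output_dir() {
  job_id="$1"
  # An optional second argument pins the date, which is handy in tests.
  day="${2:-$(date +%Y-%m-%d)}"
  printf '.listenhub/image-gen/%s-%s/\n' "$day" "$job_id"
}
```

A caller would typically create the directory before saving the file, for example with mkdir -p "$(image_output_dir "$job_id")".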
Step -1: CLI Auth Check
Follow shared/cli-authentication.md § Auth Check. If the CLI is not installed or not logged in, auto-install and auto-login; never ask the user to run commands manually.
Always use English keywords (models trained on English)
Show optimized prompt before submitting
API Reference
CLI authentication: shared/cli-authentication.md
CLI execution patterns: shared/cli-patterns.md
Config pattern: shared/config-pattern.md
Output mode: shared/output-mode.md
Composability
Invokes: nothing (direct CLI call)
Invoked by: platform skills for cover images (Phase 2)
Example
User: "Generate an image: cyberpunk city at night"
Agent workflow:
Prompt is short → offer enrichment → user declines
Ask model → "pro"
Ask resolution → "2K"
Ask ratio → "16:9"
No references
    listenhub image create \
      --prompt "cyberpunk city at night" \
      --model "gemini-3-pro-image-preview" \
      --lang en \
      --aspect-ratio 16:9 \
      --size 2K \
      --json
Parse CLI JSON output per outputMode (see shared/output-mode.md).
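In the example, the user's choice "pro" ends up as the model ID gemini-3-pro-image-preview on the command line. One way to make that translation explicit is a small lookup function. This is only a sketch: resolve_model is a hypothetical name, and "pro" is the only mapping the example confirms, so any other choice is rejected rather than guessed.

```shell
# Hypothetical helper: translate the user's model choice into a --model value.
# Only the "pro" mapping is taken from the example; other choices are
# treated as unknown instead of being guessed.
resolve_model() {
  case "$1" in
    pro) printf '%s\n' "gemini-3-pro-image-preview" ;;
    *)
      printf 'unknown model choice: %s\n' "$1" >&2
      return 1
      ;;
  esac
}
```

The result can be passed straight through, for example --model "$(resolve_model pro)".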
Example 2 — With Reference Images
User: "Generate an image in this style" (provides local files and a URL)
Agent workflow:
Ask prompt → "a serene mountain lake at dawn"
Ask model → "pro"
Ask resolution → "2K"
Ask ratio → "16:9"
References → /path/to/style-reference.png, https://example.com/photo.jpg
    listenhub image create \
      --prompt "a serene mountain lake at dawn" \
      --model "gemini-3-pro-image-preview" \
      --lang en \
      --aspect-ratio 16:9 \
      --size 2K \
      --reference /path/to/style-reference.png \
      --reference https://example.com/photo.jpg \
      --json
Parse CLI JSON output per outputMode (see shared/output-mode.md).
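Since each reference is passed with its own --reference flag, a wrapper can build the flag list from however many references the user supplied. The sketch below assumes POSIX sh and uses only the flags shown in the example; run_image_create and the LISTENHUB_BIN variable are hypothetical names, and the model, language, ratio, and size are hard-coded to the example's values for brevity.

```shell
# Hypothetical wrapper: run "listenhub image create" with zero or more
# reference images. Usage: run_image_create PROMPT [REFERENCE...]
# LISTENHUB_BIN is an assumed override so the command can be swapped in tests.
run_image_create() {
  prompt="$1"
  shift
  ref_count=$#
  # Append a "--reference <ref>" pair for each reference. The loop's word
  # list is expanded once up front, so growing "$@" inside it is safe.
  for ref in "$@"; do
    set -- "$@" --reference "$ref"
  done
  # Drop the original bare references, keeping only the flag pairs.
  shift "$ref_count"
  "${LISTENHUB_BIN:-listenhub}" image create \
    --prompt "$prompt" \
    --model "gemini-3-pro-image-preview" \
    --lang en \
    --aspect-ratio 16:9 \
    --size 2K \
    "$@" \
    --json
}
```

With no references the flag list is simply empty, so the same wrapper covers both examples. Quoting "$@" keeps paths and URLs containing spaces intact as single arguments.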