- 🎨 VideoAgent Image Studio
- Use when: the user asks to generate, draw, create, or make any kind of image, photo, illustration, icon, logo, or artwork.
- Generate images with 8 state-of-the-art AI models. This skill automatically picks the best model for the job and handles all the complexity — including Midjourney's async polling — so you can focus on the conversation.
- Quick Reference
| User Intent | Model | Speed |
|---|---|---|
| Artistic, cinematic, painterly | `midjourney` | ~15s |
| Photorealistic, portrait, product | `flux-pro` | ~8s |
| General purpose, balanced | `flux-dev` | ~10s |
| Quick draft, fast iteration | `flux-schnell` | ~2s |
| Image with text, logo, poster | `ideogram` | ~10s |
| Vector art, icon, flat design | `recraft` | ~8s |
| Anime, stylized illustration | `sdxl` | ~5s |
| Gemini-powered, consistent style | `nano-banana` | ~12s |
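As a rough illustration of the table above, the lookup can be sketched as a keyword match. The keyword lists and the `pickModel` name are assumptions made for this sketch; only the model IDs and the `flux-dev` default come from this document, and the skill's real selection logic is not shown here.

```javascript
// Hypothetical keyword rules. Only the model IDs come from the Quick Reference table.
const MODEL_RULES = [
  { model: "ideogram", keywords: ["text", "logo", "poster", "typography"] },
  { model: "recraft", keywords: ["vector", "icon", "flat"] },
  { model: "sdxl", keywords: ["anime", "stylized"] },
  { model: "flux-schnell", keywords: ["draft", "quick", "fast"] },
  { model: "flux-pro", keywords: ["photo", "portrait", "product"] },
  { model: "midjourney", keywords: ["artistic", "cinematic", "painterly"] },
  { model: "nano-banana", keywords: ["consistent", "gemini"] },
];

// Returns the first model whose keywords appear in the request text,
// falling back to flux-dev (the documented default for --model).
function pickModel(request) {
  const text = request.toLowerCase();
  for (const rule of MODEL_RULES) {
    if (rule.keywords.some((word) => text.includes(word))) {
      return rule.model;
    }
  }
  return "flux-dev";
}
```

Rules are checked in order, so a request that mentions both a poster and a product resolves to `ideogram`; a model the user names explicitly should take precedence over any heuristic like this. Plain substring matching is also loose (for example, `texture` contains `text`), so treat it as a starting point only.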
- How to Generate an Image
- Step 1 — Enhance the prompt
- Before calling the script, expand the user's prompt with style, lighting, and quality descriptors appropriate for the chosen model.
- Midjourney: Add `cinematic lighting`, `ultra detailed`, `--v 7`, `--style raw`
- Flux: Add `masterpiece`, `highly detailed`, `sharp focus`, `professional photography`
- Ideogram: Be explicit about text content, font style, and layout
- Recraft: Specify `vector illustration`, `flat design`, `icon style`
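The per-model hints above could be applied mechanically along the following lines. This is a minimal sketch: the `enhancePrompt` helper and its lookup table are hypothetical, and treating every `flux-*` model as one family is an assumption. The descriptor strings themselves are the ones listed above.

```javascript
// Descriptors and flags copied from the list above; the table layout is illustrative.
const ENHANCERS = new Map([
  ["midjourney", {
    descriptors: ["cinematic lighting", "ultra detailed"],
    flags: ["--v 7", "--style raw"],
  }],
  ["flux", {
    descriptors: ["masterpiece", "highly detailed", "sharp focus", "professional photography"],
    flags: [],
  }],
  ["recraft", {
    descriptors: ["vector illustration", "flat design", "icon style"],
    flags: [],
  }],
]);

function enhancePrompt(model, prompt) {
  // Assumption: flux-pro, flux-dev and flux-schnell share the Flux descriptors.
  const family = model.startsWith("flux") ? "flux" : model;
  const rule = ENHANCERS.get(family);
  // No rule (e.g. ideogram): wording about text and layout is left to the caller.
  if (!rule) return prompt.trim();

  const lower = prompt.toLowerCase();
  // Skip descriptors the user already wrote, to avoid duplicates.
  const extra = rule.descriptors.filter((d) => !lower.includes(d));
  const text = [prompt.trim(), ...extra].join(", ");
  // Midjourney flags go after the descriptive text, separated by spaces.
  return [text, ...rule.flags].join(" ");
}
```

Ideogram is deliberately left without a rule here, because its guidance (text content, font style, layout) depends on what the user wants written in the image.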
Step 2 — Run the script
```bash
node {baseDir}/tools/generate.js \
  --model <model_id> \
  --prompt "<prompt>" \
  --aspect-ratio <ratio>
```

All parameters:

| Parameter | Default | Description |
|---|---|---|
| `--model` | `flux-dev` | Model ID from the table above |
| `--prompt` | (required) | The image generation prompt |
| `--aspect-ratio` | `1:1` | `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `3:2`, `21:9` |
| `--num-images` | `1` | Number of images (1–4; Midjourney always returns 4) |
| `--negative-prompt` | (none) | Things to avoid (not supported by Midjourney) |
| `--seed` | (none) | Seed for reproducibility |

Step 3 — Return the result

The script always waits and returns the final image URL(s). No polling required.

```json
{
  "success": true,
  "model": "flux-pro",
  "imageUrl": "https://...",
  "images": ["https://..."]
}
```

Send the `imageUrl` to the user.

Midjourney Actions

After generating a 4-image grid with Midjourney, offer the user these options:
```bash
# Upscale image #2 (subtle, preserves details)
node {baseDir}/tools/generate.js \
  --model midjourney \
  --action upscale \
  --index 2 \
  --job-id <job_id>

# Create a strong variation of image #3
node {baseDir}/tools/generate.js \
  --model midjourney \
  --action variation \
  --index 3 \
  --job-id <job_id> \
  --variation-type 1

# Regenerate with same prompt
node {baseDir}/tools/generate.js \
  --model midjourney \
  --action reroll \
  --job-id <job_id>
```

Upscale types: 0 = Subtle (default, best for photos), 1 = Creative (best for illustrations)

Variation types: 0 = Subtle (default), 1 = Strong (dramatic changes)

Example Conversations

User: "Draw a snow leopard on a snowy mountain with cinematic lighting"
```bash
# Choose midjourney for artistic quality
node {baseDir}/tools/generate.js \
  --model midjourney \
  --prompt "a majestic snow leopard on a snowy mountain peak, cinematic lighting, dramatic atmosphere, ultra detailed --ar 16:9 --v 7" \
  --aspect-ratio 16:9
```

🎨 Done! Which one to upscale? (U1-U4) Or create a variant? (V1-V4)

User: "Use Flux to generate a perfume product poster, white background"

```bash
# Choose flux-pro for photorealistic product shots
node {baseDir}/tools/generate.js \
  --model flux-pro \
  --prompt "a luxury perfume bottle on a clean white background, professional product photography, soft shadows, 8k, highly detailed" \
  --aspect-ratio 3:4
```

User: "Show me a quick draft"

```bash
# flux-schnell for instant previews
node {baseDir}/tools/generate.js \
  --model flux-schnell \
  --prompt "..." \
  --aspect-ratio 1:1
```

User: "Make me an App icon, flat style, blue theme"

```bash
# recraft for vector/icon style
node {baseDir}/tools/generate.js \
  --model recraft \
  --prompt "a minimal flat design app icon, blue color scheme, simple geometric shapes, vector style, white background"
```
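For callers that assemble the command programmatically, the parameter rules from Step 2 and the result shape from Step 3 might be wrapped as below. This is a sketch under stated assumptions: `buildGenerateArgs` and `parseResult` are hypothetical helper names, the validation only mirrors the parameter table, and the script's failure output is not documented here, so the sketch checks `success` and nothing more.

```javascript
// Allowed values copied from the --aspect-ratio row of the parameter table.
const ASPECT_RATIOS = ["1:1", "16:9", "9:16", "4:3", "3:4", "3:2", "21:9"];

// Builds the argument list for tools/generate.js (hypothetical helper).
function buildGenerateArgs({
  model = "flux-dev",
  prompt,
  aspectRatio = "1:1",
  numImages,
  negativePrompt,
  seed,
} = {}) {
  if (!prompt) throw new Error("--prompt is required");
  if (!ASPECT_RATIOS.includes(aspectRatio)) {
    throw new Error(`unsupported aspect ratio: ${aspectRatio}`);
  }
  const args = ["--model", model, "--prompt", prompt, "--aspect-ratio", aspectRatio];
  if (numImages !== undefined) {
    if (!Number.isInteger(numImages) || numImages < 1 || numImages > 4) {
      throw new Error("--num-images must be an integer from 1 to 4");
    }
    args.push("--num-images", String(numImages));
  }
  if (negativePrompt !== undefined) {
    if (model === "midjourney") {
      throw new Error("--negative-prompt is not supported by Midjourney");
    }
    args.push("--negative-prompt", negativePrompt);
  }
  if (seed !== undefined) args.push("--seed", String(seed));
  return args;
}

// Parses the script's JSON output into the URLs to show the user.
function parseResult(stdout) {
  const result = JSON.parse(stdout);
  if (!result.success) throw new Error("image generation did not succeed");
  // Fall back to the single imageUrl if the images array is missing or empty.
  const images =
    Array.isArray(result.images) && result.images.length > 0
      ? result.images
      : [result.imageUrl];
  return { imageUrl: result.imageUrl, images };
}

// Possible wiring (not executed here), using Node's child_process module:
//   const { execFileSync } = require("node:child_process");
//   const stdout = execFileSync(
//     "node", [`${baseDir}/tools/generate.js`, ...args], { encoding: "utf8" });
//   const { imageUrl } = parseResult(stdout);
```

Passing arguments as an array to `execFileSync` avoids shell quoting problems with prompts that contain quotes or `--` flags.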
- Setup
- Zero API keys needed!
- All requests go through a hosted proxy that handles authentication server-side.
- The skill works out of the box — just install and use.
- Advanced: Custom proxy or token
- If you want to use your own proxy or a persistent token, set these environment variables:
```json
{
  "skills": {
    "entries": {
      "videoagent-image-studio": {
        "enabled": true,
        "env": {
          "IMAGE_STUDIO_PROXY_URL": "https://your-proxy.vercel.app",
          "IMAGE_STUDIO_TOKEN": "your_token_here"
        }
      }
    }
  }
}
```
| Variable | Required | Description |
|---|---|---|
| `IMAGE_STUDIO_PROXY_URL` | No | Custom proxy base URL (default: `https://image-gen-proxy.vercel.app`) |
| `IMAGE_STUDIO_TOKEN` | No | Persistent token (auto-obtained if not set, 100 free uses per token) |
- To deploy your own proxy, see the videoagent-audio-studio proxy as a reference implementation. You'll need `FAL_KEY` and `LEGNEXT_KEY` as Vercel environment variables.
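The two optional variables can be read with their documented fallbacks roughly as follows. This is only a sketch: `resolveProxyConfig` is a hypothetical helper, and how `generate.js` actually reads these variables is not shown in this document. The default URL is the one stated in the table above.

```javascript
// Default taken from the IMAGE_STUDIO_PROXY_URL row of the table above.
const DEFAULT_PROXY_URL = "https://image-gen-proxy.vercel.app";

// Resolves proxy settings from an environment object (hypothetical helper).
function resolveProxyConfig(env = process.env) {
  // Use the custom proxy if set; strip trailing slashes so paths join cleanly.
  const proxyUrl = (env.IMAGE_STUDIO_PROXY_URL || DEFAULT_PROXY_URL).replace(/\/+$/, "");
  // The URL constructor throws a TypeError on malformed input.
  new URL(proxyUrl);
  // null means no persistent token was supplied; per the table, one is
  // then obtained automatically.
  const token = env.IMAGE_STUDIO_TOKEN || null;
  return { proxyUrl, token };
}
```

Validating the URL up front surfaces a misconfigured `IMAGE_STUDIO_PROXY_URL` before any request is attempted.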
- Changelog
- v2.0.0
  - Simplified async: The script now blocks until Midjourney completes. No more `--async` / `--poll` flags needed in SKILL.md instructions.
  - Unified output format: All models return the same `{ success, imageUrl, images }` shape.
  - Reference images for Nano Banana: Pass `--reference-images "url1,url2"` for character/style consistency across generations.
- v1.3.0
  - Added non-blocking async mode for Midjourney (`--async` + `--poll`).
- v1.2.0
  - Midjourney turbo mode enabled by default (~10-20s).
- v1.1.0
  - Switched Midjourney provider from TTAPI to Legnext.ai for better stability.
- v1.0.0
  - Initial release with Midjourney, Flux, SDXL, Nano Banana, Ideogram, Recraft.