A specialized skill for AI Agents to leverage "Reasoning-Driven" image generation.
Based on the advanced prompting architecture of Google's Gemini 3 (Nano Banana Pro), this skill moves beyond keyword stuffing to structured, logic-based creative briefs.
Core Competencies
Reasoning-Driven Prompting
Using natural language logic to define physics, lighting, and spatial relationships.
Before calling the script, the Agent MUST rewrite the user's prompt into a logic-driven Reasoning Brief:
NO KEYWORD SOUP
Remove quality tags such as "8k, masterpiece, ultra-detailed" and write full, descriptive sentences instead.
PHYSICAL CONSISTENCY
Describe how elements interact (e.g., "The light from the crystal shards casts caustic patterns across the obsidian floor").
TEXT PRECISION
If the user wants text, define it precisely: featuring a sign that says "STORE NAME" in a weathered serif font.
OPTICAL DIRECTIVES
Specify lens behavior: Shallow Depth of Field (f/1.8), Macro Lens, Anamorphic Flare.
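The mechanical part of the first rule can be sketched in shell. The helper below is hypothetical (it is not part of this skill's scripts) and its tag list is an assumption. It only drops comma-separated quality tags from a raw user prompt; turning what remains into full sentences is still the Agent's job.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the skill: drop "keyword soup" tags from a
# comma-separated raw prompt. The tag list below is an assumption.
strip_keyword_soup() {
  local raw="$1" out="" seg trimmed lower
  local IFS=','
  local -a parts
  # Split the raw prompt on commas only; spaces inside a segment are preserved.
  read -r -a parts <<< "$raw"
  for seg in "${parts[@]}"; do
    # Trim leading and trailing whitespace from the segment.
    trimmed="${seg#"${seg%%[![:space:]]*}"}"
    trimmed="${trimmed%"${trimmed##*[![:space:]]}"}"
    # Compare case-insensitively against the assumed tag list.
    lower="$(printf '%s' "$trimmed" | tr '[:upper:]' '[:lower:]')"
    case "$lower" in
      ""|8k|masterpiece|ultra-detailed|"trending on artstation") continue ;;
    esac
    out="${out:+$out, }$trimmed"
  done
  printf '%s\n' "$out"
}
```

For example, `strip_keyword_soup "a glass chess piece on an obsidian table, 8k, Masterpiece, ultra-detailed"` prints `a glass chess piece on an obsidian table`.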
🚀 Protocol: Using Nano-Banana
Step 1: Define the Creative Logic
Provide the agent with a subject and a specific scenario.
Step 2: Invoke the Script
The generate-nano-art.sh script translates the logic into a structured Gemini 3-style prompt.
Generating a reasoning-driven image:
bash scripts/generate-nano-art.sh \
  --subject "a glass chess piece" \
  --action "shattering into liquid shards" \
  --context "on an obsidian table" \
  --style "macro photography"
⚠️ Constraints & Guardrails
No Keyword Soup: MANDATORY - Do not use "trending on artstation, masterpiece, 8k". Use natural language descriptions.
Physics Logic: Ensure the prompt describes physically possible lighting and reflection interactions.
Full Sentences: The model parses relationships; use "light reflecting off the water" instead of "water, reflection".
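Two of these guardrails can be approximated with a pre-flight check before the script is invoked. The function below is a hypothetical sketch, not part of the skill. The banned-tag list is an assumption, and "ends with terminal punctuation" is only a crude proxy for "uses full sentences". The physics guardrail cannot be checked mechanically and is left to the Agent.

```shell
#!/usr/bin/env bash
# Hypothetical guardrail check; the banned-tag list and the sentence
# heuristic are assumptions, not rules enforced by the skill itself.
check_prompt() {
  local prompt="$1"
  local banned='8k|masterpiece|ultra-detailed|trending on artstation'

  # Reject prompts that contain a banned tag as a whole word or phrase
  # (case-insensitive).
  if printf '%s\n' "$prompt" |
     grep -qiE "(^|[^[:alnum:]])(${banned})([^[:alnum:]]|\$)"; then
    echo "keyword soup detected" >&2
    return 1
  fi

  # Crude proxy for "full sentences": require terminal punctuation.
  case "$prompt" in
    *[.!?]) ;;
    *) echo "prompt does not end in a full sentence" >&2; return 1 ;;
  esac
  return 0
}
```

With this sketch, `check_prompt "Light reflects off the water at dusk."` succeeds, while `check_prompt "water, reflection"` fails because the prompt is a bare fragment list.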
⚙️ Implementation Details
This skill applies a "Logic Wrapper" around the core/media/generate-image.sh primitive, converting fragmented inputs into a coherent, reasoning-ready narrative prompt.
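What such a wrapper might look like is sketched below. This is not the actual generate-nano-art.sh: the function name `build_prompt` and the sentence template are assumptions, and only the four flags shown in the invocation example are handled. The hand-off to core/media/generate-image.sh is left out because its interface is not described here.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a "Logic Wrapper"; not the real generate-nano-art.sh.
# It turns fragmented flag values into one narrative prompt.
build_prompt() {
  local subject="" action="" context="" style="" key
  while [ "$#" -gt 0 ]; do
    key="$1"
    case "$key" in
      --subject|--action|--context|--style) ;;
      *) echo "unknown option: $key" >&2; return 2 ;;
    esac
    # Every flag needs a value; bail out instead of looping forever.
    [ "$#" -ge 2 ] || { echo "missing value for $key" >&2; return 2; }
    case "$key" in
      --subject) subject="$2" ;;
      --action)  action="$2" ;;
      --context) context="$2" ;;
      --style)   style="$2" ;;
    esac
    shift 2
  done

  if [ -z "$subject" ]; then
    echo "--subject is required" >&2
    return 2
  fi

  # Assumed template: one sentence for the scene, one for the style.
  local prompt="The image shows ${subject}"
  if [ -n "$action" ];  then prompt="${prompt} ${action}";  fi
  if [ -n "$context" ]; then prompt="${prompt} ${context}"; fi
  prompt="${prompt}."
  if [ -n "$style" ];   then prompt="${prompt} Render it as ${style}."; fi
  printf '%s\n' "$prompt"
}
```

Called with `--subject "a glass chess piece" --action "shattering into liquid shards" --context "on an obsidian table" --style "macro photography"`, the sketch prints: The image shows a glass chess piece shattering into liquid shards on an obsidian table. Render it as macro photography.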