When to Use User wants to create a podcast episode on any topic User provides a URL or text and wants it turned into a podcast discussion User asks for a "debate", "dialogue", or "discussion" format User says "podcast", "播客", or "录一期节目" When NOT to Use User wants text-to-speech reading (use /speech ) User wants an explainer video with visuals (use /explainer ) User wants to generate an image (use /image-gen ) User only wants to extract content from a URL without generating audio (use /content-parser ) Purpose Generate podcast episodes with 1-2 AI speakers discussing a topic. Supports quick overviews, deep analysis, and debate formats. Input can be a topic description, URL(s), or text. Output is a full audio episode with transcript. Hard Constraints No shell scripts. Construct curl commands from the API reference files listed in Resources Always read shared/authentication.md for API key and headers Follow shared/common-patterns.md for polling, errors, and interaction patterns Never hardcode speaker IDs — always fetch from the speakers API Never fabricate API endpoints or parameters Always read config following shared/config-pattern.md before any interaction Always follow shared/speaker-selection.md for speaker selection (text table + free-text input) Never save files to ~/Downloads/ — use .listenhub/podcast/ from config Step -1: API Key Check Follow shared/config-pattern.md § API Key Check. If the key is missing, stop immediately. Step 0: Config Setup Follow shared/config-pattern.md Step 0. If file doesn't exist — ask location, then create immediately: mkdir -p ".listenhub/podcast" echo '{"outputDir":".listenhub","outputMode":"inline","language":null,"defaultMode":null,"defaultMethod":null,"defaultSpeakers":{}}'
".listenhub/podcast/config.json" CONFIG_PATH = ".listenhub/podcast/config.json"
(or $HOME/.listenhub/podcast/config.json for global)
- Then run
- Setup Flow
- below.
- If file exists
- — read config, display summary, and confirm:
- 当前配置 (podcast):
- 输出方式:{inline / download / both}
- 语言偏好:{zh / en / 未设置}
- 默认模式:{quick / deep / debate / 未设置}
- 默认生成方式:{one-step / two-step / 未设置}
- 默认主播:{speakerName(s) / 未设置}
- Ask: "使用已保存的配置?" →
- 确认,直接继续
- /
- 重新配置
- Setup Flow (first run or reconfigure)
- Ask these questions in order, then save all answers to config at once:
- outputMode
- Follow shared/output-mode.md § Setup Flow Question. Language (optional): "默认语言?" "中文 (zh)" "English (en)" "每次手动选择" → keep null Mode (optional): "默认播客模式?" "Quick — 简短概述" "Deep — 深度分析" "Debate — 辩论对话" "每次手动选择" → keep null Method (optional): "默认生成方式?" "一步生成(推荐)" → defaultMethod: "one-step" "两步生成(先预览文本)" → defaultMethod: "two-step" "每次手动选择" → keep null After collecting answers, save immediately:
Follow shared/output-mode.md § Save to Config
NEW_CONFIG
- $(
- echo
- "
- $CONFIG
- "
- |
- jq
- --arg
- m
- "
- $OUTPUT_MODE
- "
- '. + {"outputMode": $m}'
- )
- echo
- "
- $NEW_CONFIG
- "
- >
- "
- $CONFIG_PATH
- "
- CONFIG
- =
- $(
- cat
- "
- $CONFIG_PATH
- "
- )
- Note:
- defaultSpeakers
- are saved after generation (see After Successful Generation section).
- Interaction Flow
- Step 1: Topic / Content Source
- Free text input. Ask the user:
- What topic or content would you like to turn into a podcast?
- Accept: topic description, URL, or pasted text.
- Step 2: Mode
- If
- config.defaultMode
- is set, pre-fill and show in summary — skip this question.
- Otherwise ask:
- Question: "What podcast generation mode?"
- Options:
- - "Quick" — Short, concise overview (~5 min)
- - "Deep" — Thorough analysis with more detail (~10-15 min)
- - "Debate" — Two speakers with opposing views (requires 2 speakers)
- Step 3: Language
- If
- config.language
- is set, pre-fill and show in summary — skip this question.
- Otherwise ask:
- Question: "What language?"
- Options:
- - "Chinese (zh)" — Content in Mandarin Chinese
- - "English (en)" — Content in English
- Step 4: Speaker Count
- Question: "How many speakers?"
- Options:
- - "1 speaker (solo)" — Monologue style
- - "2 speakers (dialogue)" — Conversation style
- Note: Debate mode automatically sets 2 speakers.
- Step 5: Speaker Selection
- Follow
- shared/speaker-selection.md
- for the full selection flow, including:
- Default from
- config.defaultSpeakers.{language}
- (skip step if set)
- Text table + free-text input
- Input matching and re-prompt on no match
- For 2-speaker mode (dialogue/debate): run selection twice (or until both are chosen).
- Step 6: Reference Materials (optional)
- Question: "Any reference materials to include?"
- Options:
- - "Yes, URL(s)" — Provide URLs to analyze
- - "Yes, text" — Paste reference text
- - "No references" — Generate from topic alone
- Step 7: Generation Method
- If
- config.defaultMethod
- is set, pre-fill and show in summary — skip this question.
- Otherwise ask:
- Question: "How would you like to generate?"
- Options:
- - "One step (recommended)" — Generate text + audio together, faster
- - "Two steps (review first)" — Generate text, review/edit, then generate audio
- Step 8: Confirm & Generate
- Summarize all choices:
- Ready to generate podcast:
- Topic:
- Mode:
- Language:
- Speakers:
- References:
- Method:
- Proceed?
- Wait for explicit confirmation before calling any API.
- Workflow
- One-Step Generation
- Submit (foreground)
- :
- POST /podcast/episodes
- with collected parameters → extract
- episodeId
- Tell the user the task is submitted
- Poll (background)
-
- Run the following
- exact
- bash command with
- run_in_background: true
- and
- timeout: 600000
- . Do NOT use python3, awk, or any other JSON parser — use
- jq
- as shown:
- EPISODE_ID
- =
- "
" - for
- i
- in
- $(
- seq
- 1
- 30
- )
- ;
- do
- RESULT
- =
- $(
- curl
- -sS
- "https://api.marswave.ai/openapi/v1/podcast/episodes/
- $EPISODE_ID
- "
- \
- -H
- "Authorization: Bearer
- $LISTENHUB_API_KEY
- "
- 2
- >
- /dev/null
- )
- STATUS
- =
- $(
- echo
- "
- $RESULT
- "
- |
- tr
- -d
- '\000-\037\177'
- |
- jq
- -r
- '.data.processStatus // "pending"'
- )
- case
- "
- $STATUS
- "
- in
- success
- |
- completed
- )
- echo
- "
- $RESULT
- "
- ;
- exit
- 0
- ;
- ;
- failed
- |
- error
- )
- echo
- "FAILED:
- $RESULT
- "
- >
- &2
- ;
- exit
- 1
- ;
- ;
- *
- )
- sleep
- 10
- ;
- ;
- esac
- done
- echo
- "TIMEOUT"
- >
- &2
- ;
- exit
- 2
- When notified of completion,
- Step 6: Present result
- Read
- OUTPUT_MODE
- from config. Follow
- shared/output-mode.md
- for behavior.
- inline
- or
- both
-
- Display
- audioUrl
- as a clickable link.
- Present:
- 播客已生成!
- 在线收听:{audioUrl}
- 字幕:{subtitlesUrl}(如有)
- 时长:{audioDuration / 1000}s
- 消耗积分:{credits}
- download
- or
- both
-
- Also download the file.
- DATE
- =
- $(
- date
- +%Y-%m-%d
- )
- JOB_DIR
- =
- ".listenhub/podcast/
- ${DATE}
- -{episodeId}"
- mkdir
- -p
- "
- $JOB_DIR
- "
- curl
- -sS
- -o
- "
- ${JOB_DIR}
- /{episodeId}.mp3"
- "{audioUrl}"
- Present the download path in addition to the above summary.
- Offer to show transcript or provide download URL on request
- Two-Step Generation
- Step 1 — Submit text (foreground)
- :
- POST /podcast/episodes/text-content
- → extract
- episodeId
- Poll text (background)
-
- Use the exact
- jq
- -based polling loop above (substitute endpoint
- podcast/episodes/text-content/{episodeId}
- if needed), with
- run_in_background: true
- and
- timeout: 600000
- When notified,
- save draft to config output dir
- :
- Create
- .listenhub/podcast/YYYY-MM-DD-{episodeId}/
- Write
- {episodeId}-draft.md
- (human-readable:
- {speakerName}:
- per line)
- Write
- {episodeId}-draft.json
- (raw
- scripts
- array)
- Present the draft location and content preview
- STOP
-
- Present the draft and wait for explicit user approval
- Step 2 — Submit audio (foreground, after approval)
- :
- No changes:
- POST /podcast/episodes/{episodeId}/audio
- with
- {}
- With edits:
- POST /podcast/episodes/{episodeId}/audio
- with modified
- {scripts: [...]}
- Poll audio (background)
-
- Same exact
- jq
- -based loop,
- run_in_background: true
- ,
- timeout: 600000
- When notified,
- download audio to same folder
- :
- curl -sS -o .listenhub/podcast/{dir}/{episodeId}.mp3
- Present final result (same format as one-step, folder now has draft + final files)
- After Successful Generation
- Update config with the choices made this session:
- NEW_CONFIG
- =
- $(
- echo
- "
- $CONFIG
- "
- |
- jq
- \
- --arg
- lang
- "{language}"
- \
- --arg
- mode
- "{mode}"
- \
- --arg
- method
- "{one-step/two-step}"
- \
- --argjson
- speakers
- '{"{language}": ["{speakerId}"]}'
- \
- '. + {"language": $lang, "defaultMode": $mode, "defaultMethod": $method, "defaultSpeakers": (.defaultSpeakers + $speakers)}'
- )
- echo
- "
- $NEW_CONFIG
- "
- >
- "
- $CONFIG_PATH
- "
- API Reference
- Speaker list:
- shared/api-speakers.md
- Speaker selection guide:
- shared/speaker-selection.md
- Episode creation:
- shared/api-podcast.md
- Polling:
- shared/common-patterns.md
- § Async Polling
- Config pattern:
- shared/config-pattern.md
- Composability
- Invokes
-
- speakers API (for speaker selection)
- Invoked by
-
- content-planner (Phase 3)
- Example
- User
- "Make a podcast about the latest AI developments" Agent workflow : Detect: podcast request, topic = "latest AI developments" Ask mode → user picks "Deep" Ask language → "English" Ask speakers → 1 speaker Fetch speakers list, user picks "cozy-man-english" No references One-step generation curl -sS -X POST "https://api.marswave.ai/openapi/v1/podcast/episodes" \ -H "Authorization: Bearer $LISTENHUB_API_KEY " \ -H "Content-Type: application/json" \ -d '{ "sources": [{"type": "text", "content": "The latest AI developments"}], "speakers": [{"speakerId": "cozy-man-english"}], "language": "en", "mode": "deep" }' Poll until complete, then present the result with title and listen link.