Voice Mode

The user wants to have a voice conversation. They are not looking at the screen. They are listening to you speak and replying verbally. Treat this like a phone call.

Voice mode is a session. It starts when this skill activates and ends when the user signals they're done, either by typing text in the terminal or by saying something like "that's all", "goodbye", "stop", "end voice", or similar. When the conversation ends, say goodbye and stop using voice commands. Resume normal text interaction.

Activation

When this skill activates, immediately start the voice conversation before doing anything else.

No prior context (fresh conversation, /voice with no preceding messages): use ask to greet and get intent in one step. E.g.
agent-voice ask -m "Hey, what are we working on?"

Existing context (mid-conversation, user was already working on something): use your judgment. You might say a status update and continue, or ask a clarifying question, whichever fits the flow.

Setup

If agent-voice fails with "command not found", install it and retry:
npm install -g agent-voice

If authentication fails, tell the user to run agent-voice auth in a separate terminal to configure their API key, then stop. Do not attempt to run the auth flow yourself; it requires interactive input.

Commands

Say: inform the user

Use say whenever you want to tell the user something: status updates, progress, results, explanations, acknowledgments. This is one-way: the user hears you but does not respond.
agent-voice say -m "I'm setting up the project now."

Ask: get input from the user

Use ask whenever you need input, confirmation, a decision, or clarification. The user hears your question, then speaks their answer. The transcribed response is printed to stdout, so just read the command output directly.

Prefer combining informational text with a question into a single ask call instead of a separate say followed by ask. This reduces latency and feels more natural.
Instead of:
agent-voice say -m "I've finished the database schema."
agent-voice ask -m "Should I move on to the API routes?"
Do:
agent-voice ask -m "I've finished the database schema. Should I move on to the API routes?"
Options:
--timeout
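The combined-call pattern from the Ask section could be wrapped in a small helper. This is a minimal sketch and not part of agent-voice itself: report_and_ask and the VOICE_CLI variable are hypothetical names introduced here for illustration. The only CLI behavior it relies on is what this document states, namely that ask accepts -m and prints the transcribed reply to stdout.

```shell
#!/bin/sh
# Hypothetical helper, not part of agent-voice.
# VOICE_CLI is an assumed override so the sketch can be exercised
# with a stand-in command instead of a real microphone session.
VOICE_CLI="${VOICE_CLI:-agent-voice}"

# Join a status update ($1) and a question ($2) into one message and
# send it with a single ask call. Whatever the CLI prints on stdout
# (the transcribed reply) is passed through to the caller.
report_and_ask() {
  "$VOICE_CLI" ask -m "$1 $2"
}
```

A caller would capture the reply with command substitution, for example reply=$(report_and_ask "I've finished the database schema." "Should I move on to the API routes?"), and then branch on the contents of reply.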
Greet and get intent
agent-voice ask -m "Hey, what are we working on?"
Combine status + question — no separate ack needed
agent-voice ask -m "Got it. I've looked at the codebase and there are two approaches. Do you want a simple REST API or a GraphQL layer?"
... do work ...
Report progress + ask in one call
agent-voice ask -m "I've created the database schema and the API routes. Want me to move on to the frontend?"
... more work ...
Finish up
agent-voice ask -m "All done. I've committed everything to a new branch called feat/settings-page. Anything else?"
User says "no, that's all"
agent-voice say -m "Alright, talk to you later."
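The last two steps of this flow depend on recognizing that the user is done. As a rough sketch of the session-ending rule described at the top of this document, the check below tests a transcribed reply against the example phrases. The function name is_end_of_session is hypothetical, and the phrase list is an assumption drawn from the examples given here; since the document also allows "similar" phrases, a fixed list like this will miss some endings and should be treated as a starting point.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if a transcribed reply looks like
# the user ending the voice session, 1 otherwise.
# The phrase list is an assumption based on this document's examples.
is_end_of_session() {
  # Lowercase the reply so matching is case-insensitive.
  reply=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$reply" in
    # "stop" only counts when it is the whole reply, so that a
    # sentence such as "don't stop" does not end the session.
    *"that's all"*|*goodbye*|*"end voice"*|stop|stop.) return 0 ;;
    *) return 1 ;;
  esac
}
```

After each ask call, the reply can be passed to this check; when it returns 0, send one final say with a goodbye and return to normal text interaction.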