iOS Machine Learning Router
You MUST use this skill for ANY on-device machine learning or speech-to-text work.
When to Use
Use this router when:
- Converting PyTorch/TensorFlow models to CoreML
- Deploying ML models on-device
- Compressing models (quantization, palettization, pruning)
- Working with large language models (LLMs)
- Implementing KV-cache for transformers
- Using MLTensor for model stitching
- Building speech-to-text features
- Transcribing audio (live or recorded)
Routing Logic
CoreML Work
Implementation patterns → /skill coreml
- Model conversion workflow
- MLTensor for model stitching
- Stateful models with KV-cache
- Multi-function models (adapters/LoRA)
- Async prediction patterns
- Compute unit selection
API reference → /skill coreml-ref
- CoreML Tools Python API
- MLModel lifecycle
- MLTensor operations
- MLComputeDevice availability
- State management APIs
- Performance reports
Diagnostics → /skill coreml-diag
- Model won't load
- Slow inference
- Memory issues
- Compression accuracy loss
- Compute unit problems
Speech Work
Implementation patterns → /skill speech
- SpeechAnalyzer setup (iOS 26+)
- SpeechTranscriber configuration
- Live transcription
- File transcription
- Volatile vs finalized results
- Model asset management
Decision Tree
User asks about on-device ML or speech
├─ Machine learning?
│  ├─ Implementing/converting? → coreml
│  ├─ Need API reference? → coreml-ref
│  └─ Debugging issues? → coreml-diag
└─ Speech-to-text?
   └─ Any speech work → speech
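The decision tree above can be sketched as a small routing function. This is only an illustrative sketch: the `Domain` and `Intent` enums and the `route` function are hypothetical names introduced here, not part of any skill API, and falling back to `coreml` when no intent is given is an assumption rather than something the tree states.

```python
from enum import Enum
from typing import Optional


class Domain(Enum):
    """Top-level branch of the tree (hypothetical names)."""
    MACHINE_LEARNING = "machine-learning"
    SPEECH = "speech-to-text"


class Intent(Enum):
    """Second-level branch, only meaningful for machine learning."""
    IMPLEMENT = "implementing/converting"
    REFERENCE = "api-reference"
    DEBUG = "debugging"


def route(domain: Domain, intent: Optional[Intent] = None) -> str:
    """Return the skill name that the decision tree selects."""
    # Any speech work goes to the single speech skill, whatever the intent.
    if domain is Domain.SPEECH:
        return "speech"
    # Machine learning splits three ways by intent.
    if intent is Intent.REFERENCE:
        return "coreml-ref"
    if intent is Intent.DEBUG:
        return "coreml-diag"
    # Assumption: implementation work, or an unspecified intent, maps to coreml.
    return "coreml"
```

For instance, `route(Domain.MACHINE_LEARNING, Intent.DEBUG)` selects `coreml-diag`, while `route(Domain.SPEECH)` selects `speech` without consulting the intent at all.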
Critical Patterns
coreml:
- Model conversion (PyTorch → CoreML)
- Compression (palettization, quantization, pruning)
- Stateful KV-cache for LLMs
- Multi-function models for adapters
- MLTensor for pipeline stitching
- Async concurrent prediction
coreml-diag:
- Load failures and caching
- Inference performance issues
- Memory pressure from models
- Accuracy degradation from compression
speech:
- SpeechAnalyzer + SpeechTranscriber setup
- AssetInventory model management
- Live transcription with volatile results
- Audio format conversion
Example Invocations
User: "How do I convert a PyTorch model to CoreML?" → Invoke: /skill coreml
User: "Compress my model to fit on iPhone" → Invoke: /skill coreml
User: "Implement KV-cache for my language model" → Invoke: /skill coreml
User: "Model loads slowly on first launch" → Invoke: /skill coreml-diag
User: "My compressed model has bad accuracy" → Invoke: /skill coreml-diag
User: "Add live transcription to my app" → Invoke: /skill speech
User: "Transcribe audio files with SpeechAnalyzer" → Invoke: /skill speech
User: "What's MLTensor and how do I use it?" → Invoke: /skill coreml-ref