From azure-agent-skills
Provides expert guidance for Azure AI Speech development: STT/TTS APIs, custom voice/avatars, Voice Live, batch transcription, and containerized services. Includes troubleshooting, best practices, limits, security, and deployment.
Slash command
/azure-agent-skills:azure-speech
This skill provides expert guidance for Azure AI Speech. Covers troubleshooting, best practices, decision making, limits & quotas, security, configuration, integrations & coding patterns, and deployment. It combines local quick-reference content with remote documentation fetching capabilities.
IMPORTANT for Agent: Use the Category Index below to locate relevant sections. For categories with line ranges (e.g., `L35-L120`), use `read_file` with the specified lines. For categories with file links (e.g., `[security.md](security.md)`), use `read_file` on the linked reference file.
IMPORTANT for Agent: If `metadata.generated_at` is more than 3 months old, suggest the user pull the latest version from the repository. If the `mcp_microsoftdocs` tools are not available, suggest the user install them: Installation Guide
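
The staleness rule above can be sketched as a small check. This is a minimal illustration, not part of the skill itself: it assumes `metadata.generated_at` is an ISO 8601 date or timestamp (the listing does not state the format) and approximates "3 months" as 90 days. The function name `is_stale` is hypothetical.

```python
from datetime import datetime, timezone


def is_stale(generated_at: str, now: datetime, max_age_days: int = 90) -> bool:
    """Return True if generated_at is more than max_age_days before now.

    Assumption: generated_at is ISO 8601 (e.g. "2025-01-15" or
    "2025-01-15T08:30:00+00:00"); naive values are treated as UTC.
    Assumption: "3 months" is approximated as 90 days.
    """
    generated = datetime.fromisoformat(generated_at)
    if generated.tzinfo is None:
        generated = generated.replace(tzinfo=timezone.utc)
    return (now - generated).days > max_age_days


# Usage: pass a timezone-aware "now" so the subtraction is well defined.
reference_now = datetime(2025, 6, 1, tzinfo=timezone.utc)
stale = is_stale("2025-01-15", reference_now)
fresh = is_stale("2025-05-01", reference_now)
```

If the check returns `True`, the agent would suggest pulling the latest version of the skill from the repository.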
This skill requires network access to fetch documentation content:
- `mcp_microsoftdocs:microsoft_docs_fetch` with query string `from=learn-agent-skill`. Returns Markdown.
- `fetch_webpage` with query string `from=learn-agent-skill&accept=text/markdown`. Returns Markdown.

Category Index

| Category | Lines | Description |
|---|---|---|
| Troubleshooting | L36-L44 | Diagnosing and resolving Azure AI Speech issues: session/ID lookup, Foundry integration errors, SDK CRL/compatibility problems, container deployment failures, and common SDK runtime bugs. |
| Best Practices | L45-L61 | Best practices for collecting and labeling audio/video, training custom voices/avatars, tuning recognition (phrases/keywords), optimizing latency/memory, and handling Voice Live agent behavior. |
| Decision Making | L62-L79 | Guides for choosing Speech/Embedded/Voice Live options, planning large-scale use, checking availability, and migrating between Speech APIs, models, and personal/custom voice features. |
| Limits & Quotas | L80-L88 | Speech service limits, quotas, and throttling, plus lifecycle, training, deployment, and usage constraints for custom/professional voice and short-audio speech-to-text APIs. |
| Security | L89-L102 | Securing Azure AI Speech: auth (Entra, RBAC), network isolation (VNet, Private Link, sovereign clouds), encryption/BYOK, BYOS storage, and consent/ID flows for personal and professional voice. |
| Configuration | L103-L136 | Configuring Azure AI Speech behavior: audio I/O, logging, storage, SSML, pronunciation, batch TTS/STT, avatars, personal/pro voices, and Voice Live/SDK/CLI connection and telemetry settings. |
| Integrations & Coding Patterns | L137-L167 | Patterns and APIs for integrating Azure Speech/Voice Live with apps, agents, telephony, REST/WebSockets, SSML, avatars, transcription, translation, and text-to-speech workflows. |
| Deployment | L168-L179 | Deploying and scaling Azure AI Speech: Docker/Kubernetes containers, on-prem STT/TTS, custom speech models/endpoints, language ID, and batch/long-form synthesis workflows. |
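
The line ranges in the index above are meant to be passed to `read_file`. As a rough sketch of what resolving a range involves, the helpers below parse a value such as `L36-L44` and return the matching lines from a text. They assume the ranges are 1-indexed and inclusive, which the listing does not state explicitly. `parse_line_range` and `slice_lines` are hypothetical names, not part of any tool API.

```python
import re


def parse_line_range(spec: str) -> tuple[int, int]:
    """Parse a range like "L36-L44" into (start, end).

    Assumption: line numbers are 1-indexed and the range is inclusive.
    """
    match = re.fullmatch(r"L(\d+)-L(\d+)", spec.strip())
    if match is None:
        raise ValueError(f"not a line range: {spec!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if start < 1 or end < start:
        raise ValueError(f"invalid line range: {spec!r}")
    return start, end


def slice_lines(text: str, spec: str) -> list[str]:
    """Return the lines of text covered by the given range."""
    start, end = parse_line_range(spec)
    return text.splitlines()[start - 1:end]


# Usage with a tiny stand-in for the skill file.
sample = "alpha\nbeta\ngamma\ndelta"
selected = slice_lines(sample, "L2-L3")
```

Validating the range before slicing keeps a malformed index entry from silently returning the wrong section.
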
Troubleshooting

| Topic | URL |
|---|---|
| Retrieve Speech to text session and transcription IDs for support | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-get-speech-session-id |
| Resolve common Azure Speech in Foundry issues | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/known-issues |
| Resolve Azure AI Speech SDK CRL compatibility issues | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/migrate-to-sdk-1-48-2 |
| Troubleshoot Azure Speech containers deployment issues | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-faq |
| Diagnose and fix common Azure Speech SDK issues | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/troubleshooting |
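
The fetch instructions earlier ask for a `from=learn-agent-skill` query string (plus `accept=text/markdown` for `fetch_webpage`) on documentation URLs like the ones in this table. A minimal sketch of building such a URL with the Python standard library follows. The helper name `with_skill_params` is hypothetical, and how each tool actually consumes the URL is outside what this listing specifies.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_skill_params(url: str, accept_markdown: bool = False) -> str:
    """Return url with the skill's query parameters added.

    Existing query parameters are preserved; "from" (and optionally
    "accept") are set to the values named in the fetch instructions.
    """
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query["from"] = "learn-agent-skill"
    if accept_markdown:
        query["accept"] = "text/markdown"
    # safe="/" keeps "text/markdown" readable instead of "text%2Fmarkdown".
    return urlunsplit(parts._replace(query=urlencode(query, safe="/")))


# Usage with one of the URLs listed in the table.
doc_url = "https://learn.microsoft.com/en-us/azure/ai-services/speech-service/troubleshooting"
mcp_url = with_skill_params(doc_url)
web_url = with_skill_params(doc_url, accept_markdown=True)
```

Parsing and re-encoding the query, rather than concatenating strings, avoids a malformed URL when the link already carries parameters.
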
Limits & Quotas

| Topic | URL |
|---|---|
| Text to speech FAQs including limits and behavior | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/faq-tts |
| Manage custom speech model and endpoint lifecycle | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-custom-speech-model-and-endpoint-lifecycle |
| Deploy professional voice models to custom endpoints | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/professional-voice-deploy-endpoint |
| Train professional voice models and understand duration | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/professional-voice-train-voice |
| Apply Azure Speech quotas, limits, and throttling guidance | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-services-quotas-and-limits |
npx claudepluginhub microsoftdocs/agent-skills --plugin azure-agent-skills