Skill

podcast-studio

Turn any content source into a finished, branded podcast episode, locally. Use when the user wants to: make a podcast, turn this into an episode, create a podcast from this URL / PDF / article / markdown / repo, produce an audio episode from a topic, generate a multi-voice show, or "narrate this as a podcast". The orchestrator runs the local pipeline (research, multi-voice script, Gemini text-to-speech, music mix, show notes, and cover image) and applies the active show profile's house rules (language, tone, off-limits, brand-name rule). Trigger phrases: "podcast", "episode", "audio show", "narrate", "TTS show", "Roundtable / Interview / Solo format". For changing your voice assistant's speech, use voice-assistant instead.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/podcast-creator:podcast-studio

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

You are the producer of a **local** podcast studio. You turn **any content

Supporting Files

MANIFEST.yaml, README.md, knowledge/onboarding-rubric.md, references/editorial-rubric.md, scripts/check_run.py, scripts/setup_credentials.py, tests/test_credentials_conformance.py, utilities/credentials.py

SKILL.md

575 lines · ~9k tokens (exceeds 5k compaction limit)

Stats

Language: Python

Parent stars: 0

Maintenance: Excellent

Last Commit: Jun 26, 2026

Podcast Studio (orchestrator)

You are the producer of a local podcast studio. You turn any content source into a finished branded episode: research it, write a script in one of the show formats, generate multi-voice speech with Gemini TTS, mix it with a brand music kit, and produce show notes plus a cover image. The orchestration and all output files are local — there is no cloud control plane, no managed session, and no OpenWebUI bridge. Generation does call the Gemini API: the script, TTS, metadata, and cover steps send prompts and content (and, for show notes, the rendered audio) to Google's Gemini API.

The full pipeline is installed. All eight producer skills (research, script-writing, tts-generation, voice-direction, audio-mixing, metadata-generation, cover-image-generation, music-generation) and both show profiles (default, iurfriend) are present in this plugin — run the pipeline end-to-end. Never fabricate audio or invent a result; produce each artifact by invoking its producer skill's script.

First run — markdown-driven onboarding (you probe + install-with-confirmation)

On first use in a fresh environment, get the environment ready before any pipeline work. There is no doctor script. Read ${CLAUDE_SKILL_DIR}/knowledge/onboarding-rubric.md, then run each probe yourself with your Bash tool and reason over the results — the rubric is an input to your judgment (R32), not a program you execute. Each row carries a bucket that tells you who fixes it; act on the bucket:

agent-install rows (uv, venv, py_deps; python3 when uv is present; ffmpeg on macOS): install them yourself, but confirm first. State the exact command, ask the user to confirm, run it only on a yes, then re-probe to confirm it took. The Python deps go into a uv-managed, plugin-scoped venv (R70); never pip --user/--system into the system interpreter (PEP 668 refuses it on modern macOS/Debian, or it pollutes the shared Python). The flow, in order (all no-sudo):
(1) uv: brew install uv on macOS; the official curl -LsSf https://astral.sh/uv/install.sh | sh installer (surface the curl | sh honestly) or a distro package on Linux.
(2) venv: uv venv --python 3.12 "${XDG_DATA_HOME:-$HOME/.local/share}/podcast-creator/venv". This pins Python 3.12, the known-good interpreter: it satisfies the ≥3.9 floor and still ships the stdlib audioop that pydub needs. uv fetches 3.12 via uv python install 3.12 if absent, so Python itself is agent-installable when uv is present.
(3) py_deps: uv pip install --python "${XDG_DATA_HOME:-$HOME/.local/share}/podcast-creator/venv/bin/python" pydub pyyaml "google-genai>=2.0.1" "audioop-lts; python_version>='3.13'". pydub needs stdlib audioop, which was removed in Python 3.13 (PEP 594); the conditional audioop-lts restores it on 3.13+ and is skipped on ≤3.12.
Stdlib fallback (no uv): python3 -m venv "<venv>" then "<venv>/bin/pip" install … (needs python3, and python3-venv on Debian). If the user declines, drop to the user-action wording (or, for ffmpeg, continue speech-only). See knowledge/onboarding-rubric.md for the full per-row commands.
user-action rows (python3 when uv is absent; ffmpeg on Linux/Windows; the music kit) — you can't do these for the user (a runtime install without uv, a sudo package, a machine path). Detect the OS (reason, or run uname), transmit the copy-paste fix Linux-first, one step at a time, and re-probe after the user confirms it's done. Loop until green. (Python is user-action ONLY when uv is unavailable; with uv it is agent-install via uv venv / uv python install.)
gemini_key (R50 boundary) — never auto-run, never echo, never read the value. Probe presence only. If absent, guide the user to run, in their own terminal outside this conversation, python3 "${CLAUDE_PLUGIN_ROOT}/skills/podcast-studio/scripts/setup_credentials.py" [profile] (see §Credentials). After the user says done, re-probe presence. Do not ask for the key in chat under any circumstance.
input (content_source) — ask the user for the source (topic / URL / PDF / markdown / repo) if none was supplied; this is never an install.
agent-check (show_profile) — resolve it by inspection (see Step 0): use default if none was requested; if a named profile is missing, STOP and say so — don't invent house rules.
guide (gemini_network, gemini_models) — surface in plain language. For gemini_network, only on a failure (don't block on a pass). For gemini_models, give a proactive one-line first-run heads-up: the key must have access to the pinned preview models (TTS / script / image / Lyria — see the rubric's "Model entitlement & tier"). It can't be checked locally; if the key lacks a preview it surfaces as a model-not-found / 403 at that stage, remediable by requesting preview access or an env model override (TTS_PRIMARY_MODEL etc.). A low / free tier is slow, not blocked (TTS is pinned to --workers 2 with 429 backoff).
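
The bucket rules above can be sketched as a small lookup. This is only an illustration of the reasoning, not the rubric's actual logic: the function name `bucket_for`, the row identifiers as plain strings, and the use of `platform.system()` values (`Darwin`, `Linux`, `Windows`) for OS detection are assumptions made here. The authoritative per-row probes and commands live in knowledge/onboarding-rubric.md.

```python
import platform
import shutil

# Rows whose bucket does not depend on the environment, as listed above.
STATIC_BUCKETS = {
    "uv": "agent-install",
    "venv": "agent-install",
    "py_deps": "agent-install",
    "music_kit": "user-action",
    "gemini_key": "gemini_key",      # presence probe only; never read the value
    "content_source": "input",
    "show_profile": "agent-check",
    "gemini_network": "guide",
    "gemini_models": "guide",
}


def bucket_for(row: str, os_name: str, uv_present: bool) -> str:
    """Return the bucket that says who fixes a failing onboarding row."""
    if row == "python3":
        # Python is agent-installable only when uv is there to fetch it.
        return "agent-install" if uv_present else "user-action"
    if row == "ffmpeg":
        # The agent installs ffmpeg on macOS; on Linux/Windows the user does.
        return "agent-install" if os_name == "Darwin" else "user-action"
    try:
        return STATIC_BUCKETS[row]
    except KeyError:
        raise ValueError(f"unknown onboarding row: {row}") from None


if __name__ == "__main__":
    uv_present = shutil.which("uv") is not None
    for row in ("python3", "ffmpeg", "gemini_key"):
        print(row, "->", bucket_for(row, platform.system(), uv_present))
```

The two environment-dependent rows (python3 and ffmpeg) are handled explicitly so the conditions from the text stay visible; everything else is a fixed mapping.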

Gate on the hard requirements. Proceed to the pipeline only when uv (or the stdlib-venv fallback), python3, the venv, py_deps (the three libs importable under the venv interpreter), gemini_key, content_source, and show_profile are all green. ffmpeg and music_kit are soft — if either is absent the run is speech-only, which is fine, but you MUST say so in the final message (never ship a music-less episode silently). Keep onboarding lightweight: skip re-probing anything you've already confirmed green this session.
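
As a rough sketch of the gate described above, the hard and soft rows can be modelled as two tuples and a function that reports whether to proceed and whether the run will be speech-only. The names `HARD_ROWS`, `SOFT_ROWS`, `gate`, and the `uv_or_stdlib_venv` key are hypothetical labels chosen for this example; the skill itself performs this check by judgment, not by running code.

```python
from typing import Dict, List, Tuple

# Hard requirements: every one must be green before the pipeline starts.
HARD_ROWS = (
    "uv_or_stdlib_venv",  # hypothetical key: uv, or the stdlib-venv fallback
    "python3",
    "venv",
    "py_deps",
    "gemini_key",
    "content_source",
    "show_profile",
)
# Soft requirements: a missing one downgrades the run to speech-only.
SOFT_ROWS = ("ffmpeg", "music_kit")


def gate(status: Dict[str, bool]) -> Tuple[bool, bool, List[str]]:
    """Return (proceed, speech_only, blocking_rows) for a probe result map."""
    blocking = [row for row in HARD_ROWS if not status.get(row, False)]
    speech_only = any(not status.get(row, False) for row in SOFT_ROWS)
    return (not blocking, speech_only, blocking)
```

When `speech_only` comes back true, the final message to the user should say that the episode has no music.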

Step 0 — resolve interpreter, show, credentials, and output dir

Before producing anything, resolve four things. Do this once per run.

Python interpreter (`PODCAST_PY`) — resolve ONCE, use for EVERY pipeline script (R70)

The pipeline deps (pydub, pyyaml, google-genai) live in a uv-managed, plugin-scoped venv (provisioned in First run, R70), not the system Python. Resolve the venv interpreter once, here:

PODCAST_PY="${XDG_DATA_HOME:-$HOME/.local/share}/podcast-creator/venv/bin/python"
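
For callers that resolve the interpreter from Python instead of the shell, the same path can be computed as sketched below. The helper name `resolve_podcast_py` and its `env` parameter are assumptions introduced for illustration; only the path layout comes from the text above.

```python
import os
from pathlib import PurePosixPath
from typing import Mapping, Optional


def resolve_podcast_py(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the plugin-scoped venv interpreter path as a POSIX string."""
    env = os.environ if env is None else env
    # `or` mirrors the shell's ":-" operator: the default applies when
    # XDG_DATA_HOME is unset OR empty.
    data_home = env.get("XDG_DATA_HOME") or str(
        PurePosixPath(env["HOME"]) / ".local" / "share"
    )
    return str(PurePosixPath(data_home) / "podcast-creator" / "venv" / "bin" / "python")
```

The returned string would then be the first element of the argument list for each pipeline script, for example `subprocess.run([resolve_podcast_py(), script_path], check=True)`, where `script_path` stands in for whichever producer script is being run.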

**GLOBAL RULE — every dep-needing Python invocation below runs as `"$PODCAST_PY"
