From hyperframes
Turns a music track into a beat-synced HyperFrames video by analyzing audio once, laying out frames, and building each as a composition. Supports optional user media cut on the beat grid.
How this skill is triggered — by the user, by Claude, or both
Slash command
/hyperframes:music-to-videoThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Use this skill to turn a **music track** into a beat-synced HyperFrames video. You analyze the track once, lay out the frames, fill in a per-frame plan, and build each frame as a composition. The input is a music track plus optional user images or videos — there is **no narration and no website capture**. Typography and templates are the floor (a complete video needs zero assets); any media the...
references/frame-skeleton.mdreferences/montage.mdreferences/motion-primitive-catalog.mdreferences/motion-primitives/3d-card-flip/index.htmlreferences/motion-primitives/assets/gsap.min.jsreferences/motion-primitives/bg-flow-field/index.htmlreferences/motion-primitives/binary-decrypt/index.htmlreferences/motion-primitives/blur-resolve/index.htmlreferences/motion-primitives/braam-punch/index.htmlreferences/motion-primitives/chromatic-split/index.htmlreferences/motion-primitives/chrome-sweep/index.htmlreferences/motion-primitives/counting-punch/index.htmlreferences/motion-primitives/crash-zoom-in/index.htmlreferences/motion-primitives/datamosh-smear/index.htmlreferences/motion-primitives/directional-fill/index.htmlreferences/motion-primitives/dolly-zoom/index.htmlreferences/motion-primitives/electric-arc/index.htmlreferences/motion-primitives/flash-cut/index.htmlreferences/motion-primitives/gooey-metaball/index.htmlreferences/motion-primitives/hard-cut/index.htmlUse this skill to turn a music track into a beat-synced HyperFrames video. You analyze the track once, lay out the frames, fill in a per-frame plan, and build each frame as a composition. The input is a music track plus optional user images or videos — there is no narration and no website capture. Typography and templates are the floor (a complete video needs zero assets); any media the user supplies is cut in on the same beat grid.
You are the orchestrator. Work in videos/<project>/. Run the steps in order and pass each Gate before moving on. Two steps need the user: Step 3 (plan approval) and Step 6 (render approval). Do every step yourself except Step 4, where you dispatch one sub-agent per frame. Keep design and motion rules out of this file — they live in references/ and the frame-worker sub-agent.
SKILL_DIR = this skill directory. PROJECT_DIR = videos/<project-name>/.
Workflow: Step 0 setup → hyperframes.json + assets/bgm.mp3; Step 1 analyze → audiomap.json; Step 2 skeleton → STORYBOARD.md (frames, groups TBD); Step 3 plan → complete STORYBOARD.md + frame.md; Step 4 build → compositions/frames/NN-*.html; Step 5 assemble → index.html; Step 6 render → renders/video.mp4.
analyze-beatgrid.py is the only beat analyzer — never re-measure beats with another tool or by ear. Its energy / density / rolls / onsets / silences are always reliable. Its bpm and beats_sec are reliable only when the music is genuinely rhythmic; on calm music the grid is a metronome the tracker imposed, so pace by phrases and energy instead and never hard-cut to it. Deciding which case you're in is each frame's pacing (Step 2).compositions/frames/NN-<frame_id>.html, built by one frame-worker. A frame can subdivide into groups (each a template or a motion-primitives combo). Extra density goes inside a group, so frame count tracks distinct treatments, not beats — a fast track does not blow up the number of sub-agents.Goal: Establish the music source, create the HyperFrames project, and note any user-supplied media.
The music is the spine — establish one track before anything else. This skill is tuned for fast, high-energy BGM: a strong beat grid drives the cuts (calm tracks work, but pace by phrase rather than beat). If the user gave you audio — a music file, or a video to pull the audio from — use it. If not, generate one: choose the mood from the user's description (e.g. "driving synthwave", "trap beat", "upbeat corporate") and produce a track via /hyperframes-media (references/bgm.md — HeyGen retrieval when credentialed, else local Lyria / MusicGen; ElevenLabs or another generator also works). Before generating, run npx hyperframes auth status and relay its output verbatim (don't paraphrase or rewrite it) — it shows whether BGM comes from HeyGen or local MusicGen and, if not signed in, how to sign in. If not signed in, STOP and wait for the user to choose — sign in, or continue offline with local MusicGen — before generating the track; don't write keys into a per-repo .env. (In autonomous mode, note the status and continue offline.) See /hyperframes-media → Preflight for the canonical guidance. Either way the track lands at assets/bgm.mp3. Stage any user-supplied images or videos so frames can weave them in on the beat grid; otherwise typography carries the whole video.
Initialize only if hyperframes.json is missing. Name <project> from the brief in kebab-case, such as midnight-drive-loop — never a timestamp.
npx hyperframes init "videos/<project>" --non-interactive --skip-skills --example=blank
mkdir -p "$PROJECT_DIR/assets" "$PROJECT_DIR/renders"
cp "<user-music>" "$PROJECT_DIR/assets/bgm.mp3" # extract from a video first if needed
# only if the user gave you images/videos:
node <SKILL_DIR>/scripts/stage-assets.mjs --from <dir> --hyperframes "$PROJECT_DIR" --into public
The brand (font + palette) is chosen at Step 3, not here. Don't pick a genre or a track type up front — assets are just an optional ingredient, and the genre emerges from the per-frame choices.
Gate: hyperframes.json + assets/bgm.mp3 exist; aspect / length / fps and (if any) the asset inventory are noted.
Goal: Produce the one canonical timing analysis the whole video is built on.
analyze-beatgrid.py is the only beat analyzer — never re-measure beats with another tool or by ear. It reads the track once and writes audiomap.json: energy phases (level / density / feel), onsets + onset_rate, rolls, silences, hard_stops, key_moments, phrases, tempo / grid, and audio.duration_sec. It's deterministic — the same file always gives the same map. Most fields are reliable on any music; bpm and beats_sec are reliable only when the music is genuinely rhythmic, and judging that is the call you make at Step 2.
Prerequisites: Python 3 with librosa, numpy, and soundfile available. If import fails, install them into the active Python environment before running the analyzer:
python3 -m pip install librosa numpy soundfile
python3 <SKILL_DIR>/scripts/analyze-beatgrid.py "$PROJECT_DIR/assets/bgm.mp3" \
-o "$PROJECT_DIR/audiomap.json" --print
Gate: audiomap.json exists; audio.duration_sec is known.
Goal: Read the music and lay out the frames — the skeleton of STORYBOARD.md.
Read references/frame-skeleton.md. Turn audiomap.json into the skeleton of STORYBOARD.md yourself — there is no intermediate JSON. Cut the track into frames at real musical changes (hard_stops, SURGE / DROP key_moments, the edges of a roll, a stretch with no onsets, a big energy jump), snapping every boundary to an audiomap anchor. For each frame set span_sec, pacing (the verdict from Step 1's trust call — beat_cut when the grid is real, phrase_flow when it's a metronome imposed on calm music), mood, and a one-line feel (the plain music situation Step 3 matches a template against). Only classify and lay out here: leave every frame's ### Groups as TBD (Step 3) and the frontmatter style blank — no templates, copy, color, or fonts. Expect ~1–6 frames.
Gate: frames tile the track (first at 0, last at duration_s); each carries span_sec + pacing + mood + feel; every ### Groups is TBD; no content anywhere.
Goal: Turn the skeleton into an approved, complete STORYBOARD.md.
Read references/planning.md, storyboard-format.md, template-catalog.md, motion-primitive-catalog.md, and montage.md (only if the user supplied assets). Editing the same file in place, do two things:
../hyperframes-creative/frame-presets/ using the table in ../hyperframes-creative/references/design-spec.md (match the track's mood; only its fonts and colors matter — templates own composition). Copy it into frame.md unmodified and fill the frontmatter style (font + a ≤4–6 swatch palette) from it.pacing. Write the copy. You own WHAT (template / primitives + content + anchors); the frame-worker owns HOW — never write millisecond tweens into the storyboard.node <SKILL_DIR>/scripts/validate-plan.mjs --storyboard "$PROJECT_DIR/STORYBOARD.md" \
--audiomap "$PROJECT_DIR/audiomap.json" --templates <SKILL_DIR>/references/templates
Fix every ✗ (hard errors: duration mismatch, frames not tiling the track, a missing src); warnings are best-effort. Then show the user a frame-by-frame summary and iterate until they approve.
Gate: frame.md is a verbatim preset copy; validate-plan.mjs exits 0; the user approved the plan.
Goal: Build every frame as a self-contained composition file.
Create compositions/frames/. Read sub-agents/frame-worker.md and ../hyperframes-core/references/subagent-dispatch.md. Dispatch one frame-worker per frame, in parallel where possible (otherwise in waves). Each worker gets exactly one frame and this context:
PROJECT_DIR: <abs path>
frame_id: <NN-frame_id> # = the frame file stem, e.g. 02-f2; the composition id
Your block: the `## Frame N — <frame_id>` block in PROJECT_DIR/STORYBOARD.md
audiomap: PROJECT_DIR/audiomap.json
frame.md: PROJECT_DIR/frame.md
Materials: for each group, <SKILL_DIR>/references/templates/<id>/index.html (templates) and
<SKILL_DIR>/references/motion-primitives/<id>/ (free); staged assets/ (asset groups)
Contracts: ../hyperframes-core/references/sub-compositions.md + determinism-rules.md
Canvas: <w>×<h> Pacing: <beat_cut|phrase_flow>
Write to: PROJECT_DIR/compositions/frames/<frame_id>.html
The worker forks the cited materials, converts every anchor to frame-local seconds (local_t = track_t − span_sec[0]), gates its groups with 0ms cuts, and writes one seek-safe frame file. The worker never runs the hyperframes CLI — those commands operate on the assembled project, which doesn't exist yet, so they'd report on the wrong files. The worker just writes to the contract and stops; you verify after assembly (Step 6). As each worker returns, you can confirm its file landed on disk.
Gate: every frame has its compositions/frames/NN-*.html on disk.
Goal: Wire the built frames + BGM into the playable index.html.
assemble-index.mjs is deterministic — no subagent, no judgment. It references each frame file at its cumulative data-start, mounts assets/bgm.mp3 on track 11, and hard-cuts frame → frame (frames tile the track with no gaps, so there is no transition injector).
node <SKILL_DIR>/scripts/assemble-index.mjs --storyboard "$PROJECT_DIR/STORYBOARD.md" \
--hyperframes "$PROJECT_DIR" --audiomap "$PROJECT_DIR/audiomap.json"
Fix any ✗ it reports — a missing or blank frame file means that worker wrote a partial file; re-dispatch it (Step 4) and re-assemble.
Gate: index.html exists; total duration == audiomap.audio.duration_sec.
Goal: Verify the assembled video, get user approval, and render the final MP4.
Run the CLI on the assembled project — that's the correct unit (the per-frame workers couldn't run it). lint checks structure, validate runs headless Chrome (catching JS errors and missing assets), inspect snapshots frames.
( cd "$PROJECT_DIR" && npx hyperframes lint . && npx hyperframes validate . && npx hyperframes inspect . )
Inspect at t=0, each frame start, the strongest DROP / SURGE, every hard_stops[].t, and the final frame. On failure, make the cheapest safe fix yourself: edit the offending compositions/frames/NN-*.html. Never change duration or audio timing to hide a sync issue. Once the gates pass, pause for user review, then render only on approval:
( cd "$PROJECT_DIR" && npx hyperframes render . --skill=music-to-video -q draft -o renders/video.mp4 --fps 30 )
Gate: lint / validate / inspect passed; the user approved; renders/video.mp4 exists with audio, duration == audiomap.audio.duration_sec. The final reply states the MP4 path and duration.
| You have | Continue from |
|---|---|
assets/bgm.mp3 only | Step 1 |
audiomap.json | Step 2 |
STORYBOARD.md (skeleton) | Step 3 |
STORYBOARD.md (complete) | Step 4 |
| all frame files | Step 5 |
index.html | Step 6 |
Formats: landscape 1920x1080 by default; portrait 1080x1920; square 1080x1080. Set the canvas once in the storyboard frontmatter (canvas: { w, h, fps }).
Scripts under scripts/: analyze-beatgrid.py (the one analyzer), validate-plan.mjs (plan check), assemble-index.mjs (index assembly), stage-assets.mjs (stage user media), lib/storyboard.mjs (vendored parser). Everything else is the hyperframes CLI.
| Read | When |
|---|---|
references/frame-skeleton.md | Step 2: read the music, lay out the frames, set pacing |
references/planning.md · storyboard-format.md | Step 3: pick the brand, fill each frame, write the plan |
references/template-catalog.md | Step 3: pick a template per group |
references/motion-primitive-catalog.md | Step 3/4: L0 recipes for free-compose |
references/montage.md | Step 3/4: asset treatments (beat-cut / ken-burns) |
sub-agents/frame-worker.md | Step 4: dispatch + build one frame |
../hyperframes-core/references/subagent-dispatch.md | Step 4: dispatch sub-agents safely |
../hyperframes-creative/references/design-spec.md | Step 3: pick the preset (the brand) |
music-to-video/
SKILL.md
references/ frame-skeleton.md · planning.md · storyboard-format.md
template-catalog.md · motion-primitive-catalog.md · montage.md
templates/<id>/ { index.html (+ assets/ · program.json) } ← L1 catalog impls
motion-primitives/<id>/ { index.html } (+ ../assets/gsap.min.js shared by recipes) ← L0 catalog impls
scripts/ analyze-beatgrid.py · assemble-index.mjs · validate-plan.mjs · stage-assets.mjs · lib/storyboard.mjs
sub-agents/ frame-worker.md ← the one subagent (one per frame)
npx claudepluginhub heygen-com/hyperframes --plugin hyperframesEncodes a specific aesthetic direction (angelcore / cloud-trance / hyperpop) for music videos and short-form edits, providing mood, color, light, and beat-synced editing grammar. Use when building a music video, fancam, reel, or moodboard where visual coherence and intentional feel matter.
Routes video and animation requests to specialized HyperFrames workflows. Entry point for authoring, animating, and rendering video from HTML compositions.
Orchestrates AI video production workflow: gathers specs interactively, generates scripts/storyboards, Gemini TTS voiceovers, Lyria music, Veo 3.1 clips or image animations, assembles with FFmpeg.