Programmatic Video Synthesis: Orchestrating Claude Code and Hyperframe for Agentic Post-Production Workflows
The traditional video editing paradigm—characterized by manual timeline manipulation in GUI-based applications like Adobe Premiere Pro or DaVinci Resolve—is undergoing a fundamental shift. We are moving toward a model of programmatic video synthesis, where the video is treated not as a collection of binary frames, but as a dynamic, code-driven DOM (Document Object Model) structure. By leveraging Claude Code (an agentic coding interface) in conjunction with the Hyperframe framework, we can automate the entire post-production pipeline—from raw footage cleanup to complex motion graphics orchestration—using nothing but HTML, CSS, and JavaScript.
The Hyperframe Framework: Video as Code
At the core of this workflow is Hyperframe, a framework designed to allow AI agents to generate video content by manipulating web technologies. Unlike traditional rendering engines, Hyperframe treats the video canvas as a browser viewport. The input is a set of instructions (prompts) processed by a coding agent, and the output is a structured HTML/CSS/JS codebase that represents the video's visual and temporal state.
The power of Hyperframe lies in its skill-based architecture. Rather than a monolithic application, Hyperframe operates through modular "skills" that can be installed into a local development environment or a specific project repository. These skills extend the capabilities of coding agents like Claude Code or Codex, providing them with the specialized context required to write video-centric code.
Key components of the Hyperframe skill ecosystem include:
- Hyperframe Core: The engine that writes the HTML/CSS/JS that constitutes the video.
- Hyperframe CLI: A command-line interface used to run the project, host a local preview server (typically on localhost:3000), and export the final composition to an MP4 container.
- Hyperframe Media: A specialized skill for asset preprocessing, including audio transcription, subtitle generation, and image optimization.
- Hyperframe Website-to-Video: A pipeline for converting existing web DOM structures into Hyperframe video assets.
The R2E Pipeline: Automating the "Last Take" Logic
One of the most labor-intensive parts of video editing is the "rough cut": the removal of filler words, long pauses, and, most critically, "restarts" (repeated attempts at the same line). To solve this, I developed the R2E (Raw to Edit) skill.
The R2E workflow utilizes Claude Code to execute a highly specific logic-driven cleanup. The process follows a strict algorithmic rule: The Last Take Rule. When the agent detects a phrase or sentence that is repeated (a "restart"), it identifies the most recent iteration as the "correct" version and marks all preceding attempts for deletion.
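To make the Last Take Rule concrete, here is a minimal sketch of one way it could be implemented. It is not the R2E skill's actual code. It assumes the transcript has already been split into timed segments, and it treats a segment as an abandoned attempt when a nearby later segment opens with the same few words. The function names, the `prefixLen` and `lookahead` parameters, and the segment shape are all hypothetical.

```javascript
// Normalize a segment's text and return its first n words as one string.
function openingWords(text, n) {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9\s']/g, "") // drop punctuation so "to," matches "to"
    .split(/\s+/)
    .filter(Boolean)
    .slice(0, n)
    .join(" ");
}

// Hypothetical Last Take Rule: a segment is superseded (cut) when one of the
// next `lookahead` segments opens with the same `prefixLen` words.
// Segments shorter than `prefixLen` words are never matched, which avoids
// cutting short interjections such as "Okay." by accident.
function applyLastTakeRule(segments, { prefixLen = 3, lookahead = 3 } = {}) {
  return segments.map((seg, i) => {
    const opening = openingWords(seg.text, prefixLen);
    const hasFullPrefix = opening.split(" ").filter(Boolean).length === prefixLen;
    const later = segments.slice(i + 1, i + 1 + lookahead);
    const superseded =
      hasFullPrefix &&
      later.some((next) => openingWords(next.text, prefixLen) === opening);
    return { ...seg, action: superseded ? "cut" : "keep" };
  });
}

// Example: the first segment is an abandoned attempt at the second one.
const takes = applyLastTakeRule([
  { text: "So today we're going to, uh", start: 0.0, end: 2.1 },
  { text: "So today we're going to look at Hyperframe.", start: 2.8, end: 5.6 },
  { text: "It renders video from HTML.", start: 6.0, end: 8.2 },
]);
console.log(takes.map((t) => t.action)); // [ 'cut', 'keep', 'keep' ]
```

A prefix match is a deliberately crude stand-in for the language-level judgment an agent would apply; it mainly shows where the "latest attempt wins" decision sits in the data flow.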
The technical execution follows this sequence:
- Transcription & Alignment: The agent uses the Hyperframe Media skill to generate word-level timestamps for the raw audio.
- Pattern Recognition: Claude Code parses the transcript to identify linguistic redundancies and pauses.
- JSON Plan Generation: The agent generates a cut_plan.json file. This file acts as the source of truth, mapping specific timecodes to "keep" or "cut" instructions.
- Automated Re-encoding: The agent executes the cuts, resulting in a raw_clean.mp4 that is structurally optimized for the next stage of the pipeline.
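The sequence above hinges on the cut plan, so a small sketch of how such a plan might be consumed may help. The schema of cut_plan.json is not given here; the shape below (a `segments` array of `start`, `end`, and `action`) is an assumption, as are the helper names `keepRanges` and `totalDuration`. The sketch merges back-to-back "keep" segments into continuous ranges, which is the list a re-encoding step would need.

```javascript
// Hypothetical cut plan; the real cut_plan.json schema may differ.
const cutPlan = {
  segments: [
    { start: 0.0, end: 2.1, action: "cut" },
    { start: 2.1, end: 5.6, action: "keep" },
    { start: 5.6, end: 8.2, action: "keep" },
    { start: 8.2, end: 9.0, action: "cut" },
    { start: 9.0, end: 12.5, action: "keep" },
  ],
};

// Collapse the plan into the continuous time ranges that survive the edit.
// Keep segments separated by at most `epsilon` seconds are merged, so the
// final video is not split at points where nothing is removed.
function keepRanges(plan, epsilon = 0.001) {
  const kept = plan.segments
    .filter((seg) => seg.action === "keep")
    .sort((a, b) => a.start - b.start);

  const ranges = [];
  for (const seg of kept) {
    const last = ranges[ranges.length - 1];
    if (last && seg.start - last.end <= epsilon) {
      last.end = Math.max(last.end, seg.end); // extend the current range
    } else {
      ranges.push({ start: seg.start, end: seg.end }); // start a new range
    }
  }
  return ranges;
}

// Total running time of the cleaned video, in seconds.
function totalDuration(ranges) {
  return ranges.reduce((sum, r) => sum + (r.end - r.start), 0);
}

const ranges = keepRanges(cutPlan);
console.log(ranges);
console.log(totalDuration(ranges).toFixed(1));
```

Keeping the plan as data, separate from the code that executes it, means the plan can be reviewed or hand-edited before any footage is re-encoded.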
Agentic Storyboarding and Design System Integration
Once the raw footage is cleaned, the next challenge is the "assembly"—adding overlays, motion graphics, and transitions. This is achieved through Agentic Storyboarding.
Instead of manually placing text layers, I prompt Claude Code to generate a one-page HTML storyboard. This storyboard is a structured document that breaks the video into discrete scenes, each defined by:
- Temporal Bounds: Start and end timestamps.
- Visual Elements: Definitions for floating cards, subtitles, and image overlays.
- CSS Styling: Application of a predefined Design System.
By providing Claude Code with a reference to a local design system folder (containing CSS variables, typography, and component templates), the agent can ensure that every generated element—whether it's a "Riverside-style" talking head layout or a complex 3D shader—is visually consistent with the brand identity.
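As an illustration of what a storyboard might look like as data, the sketch below renders a list of scenes into a one-page HTML fragment. Everything specific in it is an assumption rather than Hyperframe's actual format: the scene shape, the `data-start` and `data-end` attribute names, and the `ds-card` and `ds-subtitle` class names, which stand in for components defined in the design system folder.

```javascript
// Escape text before placing it inside HTML content or attribute values.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render one scene: temporal bounds become data attributes, and each visual
// element becomes a div whose class names a design-system component.
function renderScene(scene) {
  const elements = scene.elements
    .map(
      (el) =>
        `    <div class="${escapeHtml(el.component)}">${escapeHtml(el.text)}</div>`
    )
    .join("\n");
  return [
    `  <section class="scene" id="${escapeHtml(scene.id)}" data-start="${scene.start}" data-end="${scene.end}">`,
    elements,
    "  </section>",
  ].join("\n");
}

// Render the whole storyboard as a single HTML fragment.
function renderStoryboard(scenes) {
  return ['<main class="storyboard">', ...scenes.map(renderScene), "</main>"].join("\n");
}

// Hypothetical two-scene storyboard.
const scenes = [
  { id: "intro", start: 0, end: 4.5, elements: [{ component: "ds-card", text: "Video as code" }] },
  { id: "demo", start: 4.5, end: 12, elements: [{ component: "ds-subtitle", text: "HTML & CSS <in> motion" }] },
];
console.log(renderStoryboard(scenes));
```

Because the elements only reference component class names, the visual styling stays in the design system's CSS and the storyboard remains a plain description of what appears and when.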
The agent can also handle complex layout transitions. For example, a prompt can instruct the agent to transition from a full-screen "talking head" view to a "split-screen" view where a video window collapses into a bottom-right corner, making room for centered motion graphics.
Advanced Motion Graphics via GSAP and Shaders
For high-fidelity animations, the Hyperframe workflow integrates the GreenSock Animation Platform (GSAP). Because the video is essentially a web application, Claude Code can write complex GSAP timelines that manipulate CSS properties (like opacity, transform: scale(), and clip-path) in sync with the audio timestamps.
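A rough sketch of the audio-sync idea: the function below turns word-level timestamps into a list of tween descriptions, each with a target selector, the properties to animate, and an absolute start time in seconds. The `#word-N` selectors, the `buildCaptionTweens` name, and the fade duration are assumptions made for illustration. The function itself has no GSAP dependency; the trailing comment shows how its output could be fed to a GSAP timeline using `timeline.to(targets, vars, position)`.

```javascript
// Build tween descriptions that reveal each word when it is spoken and hide
// it again when it ends. Times are in seconds, relative to the video start.
function buildCaptionTweens(words, fadeDuration = 0.2) {
  return words.flatMap((word, i) => {
    const target = `#word-${i}`; // hypothetical element id per word
    return [
      // Fade and scale the word in at the moment it is spoken.
      { target, vars: { opacity: 1, scale: 1, duration: fadeDuration }, position: word.start },
      // Fade it back out when the word ends.
      { target, vars: { opacity: 0, duration: fadeDuration }, position: word.end },
    ];
  });
}

// Hypothetical word-level timestamps from the transcription step.
const words = [
  { text: "Video", start: 0.5, end: 0.9 },
  { text: "as", start: 0.9, end: 1.0 },
  { text: "code", start: 1.0, end: 1.6 },
];
const tweens = buildCaptionTweens(words);
console.log(tweens.length);

// In the browser, with GSAP loaded, the descriptions map onto a timeline:
//   const tl = gsap.timeline({ paused: true });
//   for (const t of tweens) tl.to(t.target, t.vars, t.position);
```

Passing a number as the position argument places each tween at an absolute time on the timeline, which is what keeps the animation locked to the audio rather than to the previous tween.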
Furthermore, the framework supports 3D Shader integration. By leveraging WebGL/GLSL within the Hyperframe canvas, the agent can inject 3D elements and complex particle effects directly into the video composition. This allows for a level of visual complexity that would be incredibly time-consuming to keyframe manually in a traditional NLE (Non-Linear Editor).
The Browser-Based Previewer and Inspection Loop
The final stage of the pipeline is the Hyperframe Previewer. By running the Hyperframe CLI, the agent spins up a local web server. This allows for a real-time, interactive preview of the video in a browser.
This browser-based approach introduces a powerful Inspection Loop. Because the video is rendered in a standard browser environment, you can use the Browser Developer Tools to inspect the DOM. If a specific animation or text overlay is misaligned, you can:
- Inspect the element in the DOM.
- Identify the specific CSS or JS instruction causing the issue.
- Prompt Claude Code to "refine" or "fix" that specific element.
- Watch the previewer reflect the change as soon as the agent updates the codebase.
This closed-loop system, which moves from raw footage to a polished, animated MP4 via agentic code generation, points toward automated, scalable, and highly customizable video production.