Descriptions:
All About AI walks through building a landscape-to-vertical video converter from scratch using Claude Code (Opus 4.5), YOLO face detection, MediaPipe for lip-movement tracking, and FFmpeg for the final crop and encode. The creator uses Claude Code’s plan mode to scaffold the entire application before execution, letting the model coordinate multiple specialized Python libraries and generate FFmpeg commands dynamically based on detected face coordinates.
The build follows a real debugging cycle: the first output suffered from rapid flickering as the model switched between two detected speakers. The creator describes the fix in natural language to Claude Code โ enforce higher-confidence speaker selection and hold the crop until probability shifts โ and the second render is noticeably smoother. A Linus Tech Tips interview clip serves as the test source, with speaking-face detection confirmed across 1,900 frames.
The second phase of the project extends the workflow to longer input videos, adding clip selection logic so only the most relevant segments are extracted before conversion. This makes the pipeline suitable for repurposing long-form content into short vertical clips for TikTok or Instagram Reels. Developers and content creators will find it a practical reference for combining computer vision libraries with AI-orchestrated FFmpeg automation under a Claude Code agentic workflow.
๐บ Source: All About AI ยท Published December 05, 2025
๐ท๏ธ Format: Hands On Build







