Descriptions:
Veteran AI presents a complete pipeline for generating long-form, multi-scene AI video from a single reference image, combining SVI 2.0 Pro with the Smooth Mix model and Gemini 1.5 Pro for automated prompt generation. The approach goes beyond simple video extension—it produces narrative sequences with distinct scene changes, camera cuts, and environmental transitions rather than looping a single motion.
The key model swap here is from the standard Wan 2.2 Image-to-Video model to Smooth Mix, which is natively accelerated (no separate LoRA needed) and addresses two problems seen with the original setup: color shifting across extended clips and failure to complete key actions within the allotted frames. The tradeoff is reduced character consistency compared to vanilla Wan 2.2, which the host demonstrates with direct examples.
The Gemini-based prompt engineering system is a standout feature. The host provides a bilingual (English and Chinese) system instruction template that, when loaded into Gemini 1.5 Pro, takes a reference image and outputs a structured plan for five shots: a character/narrative analysis, per-shot motion and focus notes, and the final clean prompts ready to paste into the ComfyUI workflow. The first prompt drives base video generation; the remaining four feed into a loop node for sequential extension. A “Motion Latent Count” setting of 1 vs. 2 controls how much continuity is maintained between scenes. The full workflow is hosted on RunningHub for immediate testing.
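The split between the base prompt and the loop prompts can be sketched in a few lines of Python. This is only an illustration of the routing described above, not the host's workflow code: `ShotPlan` and `split_shot_prompts` are hypothetical names, and the sketch assumes the five final prompts have already been copied out of Gemini's response as plain strings.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ShotPlan:
    """Hypothetical container for the five final prompts."""
    base_prompt: str              # drives the base video generation
    extension_prompts: List[str]  # consumed one per iteration by the loop node


def split_shot_prompts(prompts: List[str], expected: int = 5) -> ShotPlan:
    """Route the first prompt to base generation and the rest to the loop.

    Assumes `prompts` holds the final clean prompts in shot order.
    """
    # Drop empty entries and surrounding whitespace left over from copy-paste.
    cleaned = [p.strip() for p in prompts if p.strip()]
    if len(cleaned) != expected:
        raise ValueError(f"expected {expected} shot prompts, got {len(cleaned)}")
    return ShotPlan(base_prompt=cleaned[0], extension_prompts=cleaned[1:])


if __name__ == "__main__":
    plan = split_shot_prompts([
        "Shot 1: placeholder prompt",
        "Shot 2: placeholder prompt",
        "Shot 3: placeholder prompt",
        "Shot 4: placeholder prompt",
        "Shot 5: placeholder prompt",
    ])
    print(plan.base_prompt)
    print(len(plan.extension_prompts))
```

Validating the count up front is a design choice in this sketch: a missing or blank prompt is reported before any generation starts rather than partway through the extension loop.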
📺 Source: Veteran AI · Published January 06, 2026
🏷️ Format: Hands On Build






