Description:
This tutorial from Veteran AI covers Bernini, an open-source reference-based video generation and editing model released by ByteDance, built on the WAN 2.2 architecture. Unlike traditional text-to-video tools, Bernini’s core strength is controllable video editing — taking a source video and a reference image and applying transformations such as character replacement, background swapping, and character addition with natural interaction.
The video walks through seven distinct ComfyUI workflows in detail. Setup requires two custom node packages: the Bernini nodes from RunningHub (which supply the key BerniniConditioning node) and Kijai’s WanVideoWrapper extension. The model is available in FP8 and FP16 versions, with an optional LightX2V acceleration LoRA. The presenter covers a two-stage sampling pipeline (a high-noise renderer followed by a low-noise renderer), recommends LCM as the sampling algorithm, and provides specific working settings: 480×832 resolution, 145 frames (following the 8n+1 rule, here with n = 18), and 6 total steps split across both renderers.
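The frame-count rule and the step split can be checked with a few lines of Python before queuing a render. This is only a sketch: the helper names are hypothetical and not part of ComfyUI or the Bernini nodes, and the even 3 + 3 split of the 6 steps is an assumption, since the description gives only the total.

```python
from typing import Optional, Tuple


def is_valid_frame_count(frames: int) -> bool:
    """Return True when frames has the form 8n + 1 for an integer n >= 0."""
    return frames >= 1 and (frames - 1) % 8 == 0


def nearest_valid_frame_count(frames: int) -> int:
    """Round a requested frame count to the closest 8n + 1 value (ties round up)."""
    n = max(0, (frames - 1 + 4) // 8)
    return 8 * n + 1


def split_steps(total: int, high_noise: Optional[int] = None) -> Tuple[int, int]:
    """Split total steps into (high-noise, low-noise) counts.

    Assumption: when no explicit high-noise count is given, the steps are
    divided evenly, with any odd step going to the high-noise stage.
    """
    if total < 2:
        raise ValueError("need at least one step per stage")
    if high_noise is None:
        high_noise = (total + 1) // 2
    if not 1 <= high_noise < total:
        raise ValueError("high_noise must leave at least one low-noise step")
    return high_noise, total - high_noise


if __name__ == "__main__":
    print(is_valid_frame_count(145))       # 145 = 8 * 18 + 1
    print(nearest_valid_frame_count(150))  # closest value of the form 8n + 1
    print(split_steps(6))                  # assumed even split
```

If the tutorial's workflow uses a different split between the two renderers, pass the high-noise count explicitly, for example `split_steps(6, high_noise=2)`.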
A notable section covers Bernini’s prompt structure, which requires three components: a task-specific prefix (documented per workflow type), an editing goal using action words like replace, add, or transform, and a clear description of the intended final visual output. Test results across subject replacement, character addition, and background replacement tasks are shown with commentary on consistency, motion fidelity, and edge cases. The workflow is also available to run on RunningHub for those without local GPU resources.
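The three-part prompt structure can be illustrated as a small helper that assembles the pieces in order. This is a sketch under stated assumptions: the function names are hypothetical, joining the parts with single spaces is a guess at the format, and the prefix shown is a placeholder, because the real task-specific prefixes are documented per workflow and are not reproduced in this description.

```python
# Action words named in the description; the list is illustrative, not exhaustive.
EDIT_ACTIONS = ("replace", "add", "transform")


def has_action_word(goal: str) -> bool:
    """Check whether the editing goal contains one of the listed action words."""
    words = (w.strip(".,;:!?").lower() for w in goal.split())
    return any(w in EDIT_ACTIONS for w in words)


def build_prompt(task_prefix: str, editing_goal: str, final_description: str) -> str:
    """Join the three components in order: prefix, editing goal, final output.

    Assumption: components are separated by a single space.
    """
    parts = [task_prefix.strip(), editing_goal.strip(), final_description.strip()]
    if not all(parts):
        raise ValueError("all three prompt components are required")
    return " ".join(parts)


if __name__ == "__main__":
    prompt = build_prompt(
        "<TASK_PREFIX>",  # placeholder: copy the real prefix from the workflow notes
        "Replace the person in the video with the character from the reference image.",
        "The character walks through a sunlit street, matching the original motion.",
    )
    print(prompt)
```

Calling `has_action_word` on the editing goal before submitting is a cheap way to catch a goal that only describes the scene and never states what to change.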
📺 Source: Veteran AI · Published June 10, 2026
🏷️ Format: Tutorial Demo







