Infinite AI Avatars from Audio! 🤯 Long Cat Video Avatar Full Guide|Auto-Loop Extension

Tutorials · 6 months ago

Descriptions:

Long Cat Video Avatar is an audio-driven AI avatar model accessible through Kijai’s Wan Video extension for ComfyUI. It generates realistic, lip-synced talking-head videos from a reference image and an audio clip. This guide from Veteran AI presents three progressively more capable workflows and addresses the mixed reception the model received in early community reviews, showing that natural character motion, including gestures and expressions, is achievable with the right parameter tuning.

The technical walkthrough covers the model's windowing scheme, audio preparation, and sampler settings. The model uses a sliding-window architecture in which each generation window spans 93 frames (rather than the standard 81) to create a 13-frame overlap for seamless stitching during extension. The tutorial covers vocal separation from background music using a track separation node, the specialized Long Cat scheduler with a shift value of 12, and the key parameter decisions: 480×832 resolution, a CFG of 1.0, and 8 sampling steps (reduced from Kijai’s default of 12 for speed without meaningful quality loss). It also explains an important audio stride quirk: setting FPS to 32 in the node outputs video at 16 FPS because the Audio Stride is 2. Both BF16 and FP8 model variants are covered for different VRAM budgets.
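The frame and FPS figures above lend themselves to a quick back-of-the-envelope check. The following is a minimal Python sketch, not part of the guide's workflow: the function names are hypothetical, and it assumes the 13 overlapping frames are counted only once in the stitched result, so every window after the first contributes 93 − 13 = 80 new frames.

```python
import math

# Values taken from the guide's description.
WINDOW_FRAMES = 93   # frames generated per window
OVERLAP_FRAMES = 13  # frames shared by consecutive windows
NODE_FPS = 32        # FPS value entered in the node
AUDIO_STRIDE = 2     # audio stride


def output_fps(node_fps: int = NODE_FPS, audio_stride: int = AUDIO_STRIDE) -> float:
    """Effective video frame rate: the node FPS divided by the audio stride."""
    return node_fps / audio_stride


def total_frames(num_windows: int) -> int:
    """Stitched frame count, assuming overlap frames are counted once (assumption)."""
    if num_windows < 1:
        return 0
    new_per_window = WINDOW_FRAMES - OVERLAP_FRAMES
    return WINDOW_FRAMES + (num_windows - 1) * new_per_window


def windows_needed(audio_seconds: float) -> int:
    """Smallest number of windows whose stitched length covers the audio clip."""
    target = math.ceil(audio_seconds * output_fps())
    if target <= WINDOW_FRAMES:
        return 1
    new_per_window = WINDOW_FRAMES - OVERLAP_FRAMES
    return 1 + math.ceil((target - WINDOW_FRAMES) / new_per_window)


if __name__ == "__main__":
    print(output_fps())        # effective FPS of the rendered video
    print(windows_needed(30))  # windows required for a 30-second audio clip
    print(total_frames(windows_needed(30)))
```

Under these assumptions, a single window holds a little under six seconds of video at the effective frame rate, which is why longer audio clips depend on the extension step.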

The second workflow automates the entire loop extension process, handling frame overlap calculations and stitching without manual intervention. The third removes automatic camera zoom to keep full-body framing stable across long generations. The complete workflows are hosted on RunningHub, and the guide is especially valuable for creators who want to move past the basic Kijai template toward production-ready infinite-loop avatar video generation.
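To illustrate the kind of bookkeeping the automated workflow takes over, here is a minimal Python sketch of overlap-aware stitching. It is an assumption about the general technique, not the nodes used in the actual workflow: the `stitch_windows` helper is hypothetical, frames are represented as plain list items, and the overlapping frames are simply dropped from each later window rather than blended.

```python
from typing import List, Sequence, TypeVar

Frame = TypeVar("Frame")


def stitch_windows(windows: Sequence[Sequence[Frame]], overlap: int = 13) -> List[Frame]:
    """Concatenate windows, skipping the first `overlap` frames of every window
    after the first, since those frames repeat the tail of the previous window."""
    stitched: List[Frame] = []
    for index, window in enumerate(windows):
        if index == 0:
            stitched.extend(window)
            continue
        if len(window) <= overlap:
            raise ValueError("each later window must be longer than the overlap")
        stitched.extend(window[overlap:])
    return stitched


if __name__ == "__main__":
    # Two 93-frame windows; integers stand in for frames. The second window
    # starts 80 frames in, so its first 13 frames duplicate the first window's tail.
    first = list(range(0, 93))
    second = list(range(80, 173))
    video = stitch_windows([first, second])
    print(len(video))
```

Doing this by hand for every extension is the manual step the second workflow removes.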

📺 Source: Veteran AI · Published December 29, 2025
🏷️ Format: Tutorial Demo

Tags

ComfyUI, Hugging Face, Kijai, RunningHub
