Description:
Veteran AI demonstrates a fully viable method for running LTX Video 2 on low-VRAM hardware, using GGUF quantization to dramatically reduce memory requirements without a meaningful drop in generation quality. The workflow is built around the ComfyUI_GGUF extension by city96—which must be updated to its latest version to correctly load LTX Two models—and Kijai’s Q4_K_M distilled model, a 4-bit K-means quantized variant that runs in just 8 sampling steps.
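The memory savings from 4-bit quantization can be sanity-checked with simple arithmetic. The sketch below is illustrative only: the parameter count is a hypothetical placeholder, not the actual LTX Video 2 size, and Q4_K_M's effective bits-per-weight is an approximation (K-quant formats store scales alongside the 4-bit values, so the average lands around 4.5 bits).

```python
def weight_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB for a given quantization level."""
    return params_billions * 1e9 * bits_per_weight / 8 / 2**30

params = 13.0  # hypothetical parameter count in billions, for illustration
fp16 = weight_gib(params, 16)
q4_k_m = weight_gib(params, 4.5)  # Q4_K_M averages roughly 4.5 bits/weight
print(f"FP16: {fp16:.1f} GiB vs Q4_K_M: {q4_k_m:.1f} GiB")
```

Whatever the real parameter count, the ratio holds: Q4_K_M weights occupy roughly 28% of their FP16 footprint, which is what makes low-VRAM cards viable.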
Key configuration choices include setting CFG to 1.0 (required for distilled models), selecting the LCM scheduler, and crucially eliminating the two-stage generate-then-upscale approach used in the official workflow. Instead, this optimized pipeline generates directly at 1280×720 with 121 frames in a single pass, avoiding the VRAM spike that the two-stage method creates. Text encoding uses a Gemma 3 GGUF model loaded via Kijai’s dual encoder loader with the distilled embedding connector. On an RTX 4090, the full run completes in 59.76 seconds.
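A back-of-the-envelope calculation shows why resolution and frame count dominate activation memory. This sketch estimates only the raw fp16 RGB frame stack (actual usage adds latents, attention buffers, and VAE overhead, which this deliberately ignores); the 960×540/81-frame fallback comes from the tutorial's low-VRAM advice.

```python
def frames_gib(width: int, height: int, frames: int,
               channels: int = 3, bytes_per_value: int = 2) -> float:
    """Memory for a raw fp16 RGB frame stack, in GiB."""
    return width * height * frames * channels * bytes_per_value / 2**30

full = frames_gib(1280, 720, 121)  # single-pass target resolution
low = frames_gib(960, 540, 81)     # reduced fallback for smaller GPUs
print(f"1280x720 x121: {full:.2f} GiB, 960x540 x81: {low:.2f} GiB")
```

The fallback cuts the frame-stack footprint to roughly 38% of the full-resolution run, before counting any model-side savings from dropping to Q2.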
The video also resolves a recurring audio static problem: simply disconnecting the audio latent input entirely eliminates the noise. For GPUs with even less memory, switching to Q2 quantization or reducing resolution to 960×540 at 81 frames further lowers VRAM usage with manageable quality trade-offs. The tutorial notes that GGUF models without embedded metadata will fail to load, a common pitfall when sourcing community-provided quantized models. All workflows are hosted on RunningHub.
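The missing-metadata pitfall can be caught before loading a model into ComfyUI. As a minimal sketch per the published GGUF spec, the header opens with the magic `GGUF`, a uint32 version, a uint64 tensor count, and a uint64 metadata key/value count (all little-endian); a file whose KV count is zero carries no embedded metadata. The function name here is illustrative, not part of any library.

```python
import struct

def gguf_metadata_count(header: bytes) -> int:
    """Parse a GGUF file header and return its metadata key/value count.

    Layout per the GGUF spec: 4-byte magic b'GGUF', uint32 version,
    uint64 tensor count, uint64 metadata KV count, little-endian.
    """
    magic, version, n_tensors, n_kv = struct.unpack_from("<4sIQQ", header)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    return n_kv

# Synthetic header with zero metadata entries -- the failure mode the
# tutorial warns about when sourcing community quantizations:
bad = struct.pack("<4sIQQ", b"GGUF", 3, 100, 0)
print(gguf_metadata_count(bad))  # 0 -> a metadata-dependent loader fails
```

In practice you would read the first 24 bytes of the `.gguf` file and flag anything reporting zero metadata entries before trying it in the workflow.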
📺 Source: Veteran AI · Published January 14, 2026
🏷️ Format: Tutorial Demo
