Description:
Veteran AI demonstrates a fully viable method for running LTX Video 2 on low-VRAM hardware, using GGUF quantization to dramatically reduce memory requirements without a meaningful drop in generation quality. The workflow is built around the ComfyUI_GGUF extension by city96—which must be updated to its latest version to correctly load LTX Two models—and Kijai’s Q4_K_M distilled model, a 4-bit K-means quantized variant that runs in just 8 sampling steps.
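The memory savings from 4-bit quantization can be sanity-checked with simple arithmetic. The sketch below is illustrative only: the parameter count is a hypothetical placeholder, not the actual LTX Video 2 size, and Q4_K_M's effective bits-per-weight is an approximation (K-quant formats store scales alongside the 4-bit values, so the average lands around 4.5 bits).

```python
def weight_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB for a given quantization level."""
    return params_billions * 1e9 * bits_per_weight / 8 / 2**30

params = 13.0  # hypothetical parameter count in billions, for illustration
fp16 = weight_gib(params, 16)
q4_k_m = weight_gib(params, 4.5)  # Q4_K_M averages roughly 4.5 bits/weight
print(f"FP16: {fp16:.1f} GiB vs Q4_K_M: {q4_k_m:.1f} GiB")
```

Whatever the real parameter count, the ratio holds: Q4_K_M weights occupy roughly 28% of their FP16 footprint, which is what makes low-VRAM cards viable.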
Key configuration choices include setting CFG to 1.0 (required for distilled models), selecting the LCM scheduler, and crucially eliminating the two-stage generate-then-upscale approach used in the official workflow. Instead, this optimized pipeline generates directly at 1280×720 with 121 frames in a single pass, avoiding the VRAM spike that the two-stage method creates. Text encoding uses a Gemma 3 GGUF model loaded via Kijai’s dual encoder loader with the distilled embedding connector. On an RTX 4090, the full run completes in 59.76 seconds.
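A back-of-the-envelope calculation shows why resolution and frame count dominate activation memory. This sketch estimates only the raw fp16 RGB frame stack (actual usage adds latents, attention buffers, and VAE overhead, which this deliberately ignores); the 960×540/81-frame fallback comes from the tutorial's low-VRAM advice.

```python
def frames_gib(width: int, height: int, frames: int,
               channels: int = 3, bytes_per_value: int = 2) -> float:
    """Memory for a raw fp16 RGB frame stack, in GiB."""
    return width * height * frames * channels * bytes_per_value / 2**30

full = frames_gib(1280, 720, 121)  # single-pass target resolution
low = frames_gib(960, 540, 81)     # reduced fallback for smaller GPUs
print(f"1280x720 x121: {full:.2f} GiB, 960x540 x81: {low:.2f} GiB")
```

The fallback cuts the frame-stack footprint to roughly 38% of the full-resolution run, before counting any model-side savings from dropping to Q2.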
The video also resolves a recurring audio static problem: simply disconnecting the audio latent input entirely eliminates the noise. For GPUs with even less memory, switching to Q2 quantization or reducing resolution to 960×540 at 81 frames further lowers VRAM usage with manageable quality trade-offs. The tutorial notes that GGUF models without embedded metadata will fail to load, a common pitfall when sourcing community-provided quantized models. All workflows are hosted on RunningHub.
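The missing-metadata pitfall can be caught before loading a model into ComfyUI. As a minimal sketch per the published GGUF spec, the header opens with the magic `GGUF`, a uint32 version, a uint64 tensor count, and a uint64 metadata key/value count (all little-endian); a file whose KV count is zero carries no embedded metadata. The function name here is illustrative, not part of any library.

```python
import struct

def gguf_metadata_count(header: bytes) -> int:
    """Parse a GGUF file header and return its metadata key/value count.

    Layout per the GGUF spec: 4-byte magic b'GGUF', uint32 version,
    uint64 tensor count, uint64 metadata KV count, little-endian.
    """
    magic, version, n_tensors, n_kv = struct.unpack_from("<4sIQQ", header)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    return n_kv

# Synthetic header with zero metadata entries -- the failure mode the
# tutorial warns about when sourcing community quantizations:
bad = struct.pack("<4sIQQ", b"GGUF", 3, 100, 0)
print(gguf_metadata_count(bad))  # 0 -> a metadata-dependent loader fails
```

In practice you would read the first 24 bytes of the `.gguf` file and flag anything reporting zero metadata entries before trying it in the workflow.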
📺 Source: Veteran AI · Published January 14, 2026
🏷️ Format: Tutorial Demo
