
New Open-Source Models Now in ComfyUI: VOID, BiRefNet & Gemma 4

ComfyUI has integrated three new open-source models into its workflow platform: VOID for video inpainting, BiRefNet for precise background removal, and Gemma 4 for text generation. All three are accessible through the ComfyUI model library.


What's new

ComfyUI now supports three open-source models that expand its capabilities across video, image, and text workflows:

  • VOID (Video Object Inpainting and Deletion) – Removes objects from video footage using AI inpainting. The model analyzes motion and fills in the removed area across frames, handling camera movement and occlusion.
  • BiRefNet (Bilateral Reference Network) – A background removal model designed for high-accuracy segmentation. It preserves fine details like hair and transparent edges better than older matting approaches.
  • Gemma 4 – Google's latest small language model, optimized for on-device text generation. It can generate prompts, captions, or metadata within ComfyUI workflows without calling external APIs.

Each model can be dropped into an existing node graph.

How it fits your workflow

For VFX artists and editors, VOID offers a node-based alternative to rotoscoping or manual paint-out work in tools like After Effects or Nuke. Instead of masking frame by frame, you can remove unwanted objects (boom mics, rigging, passersby) directly inside ComfyUI and pass the cleaned footage downstream. It works best on shots with moderate motion; complex parallax or fast action may still need manual cleanup.

Compositors and motion designers will find BiRefNet useful for keying talent or isolating elements shot on practical locations. Compared to Roto Brush or Runway's background removal, BiRefNet runs locally and integrates with other ComfyUI nodes for color grading, relighting, or generative fills. It's particularly strong on fine detail such as hair, fur, and semi-transparent fabrics, where chroma keying falls short.
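To see why soft edges matter for hair and sheer fabric, here is a minimal sketch of standard alpha compositing in plain Python. It is not BiRefNet's or ComfyUI's code. The function names, pixel values, and the 0.5 threshold are illustrative assumptions. The idea is that a matte gives each pixel an opacity between 0 and 1, and a hard cutout throws away the partial values that make wispy edges blend naturally.

```python
def composite(fg, bg, alpha):
    """Blend one RGB foreground pixel over a background pixel.

    alpha is the matte value for this pixel: 0.0 = fully background,
    1.0 = fully foreground, anything between = partially transparent.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return tuple(round(alpha * f + (1.0 - alpha) * b) for f, b in zip(fg, bg))


def binarize(alpha, threshold=0.5):
    """Collapse a soft matte value to a hard 0/1 cutout (hypothetical threshold)."""
    return 1.0 if alpha >= threshold else 0.0


# Illustrative values: a thin strand of hair covering 30% of a pixel over blue sky.
hair = (200, 160, 120)
sky = (40, 90, 220)

soft = composite(hair, sky, 0.3)            # keeps a blend of hair and sky
hard = composite(hair, sky, binarize(0.3))  # strand disappears into the sky

print("soft matte:", soft)
print("hard cutout:", hard)
```

With the soft matte the strand still tints the pixel. With the hard cutout it vanishes entirely, which is the kind of edge loss that shows up as a "helmet hair" look in composites.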

Prompt engineers and asset managers can use Gemma 4 to automate metadata tagging, generate alt-text for renders, or build dynamic prompt variations inside the same canvas where image and video generation happens. Because it runs locally, there's no token cost or API latency. It's not a replacement for GPT-4 or Claude for complex reasoning, but it's fast enough for batch captioning or simple text transforms.

What it costs / how to try it

All three models are open-source and free to use. Download them through the ComfyUI model library or install via the ComfyUI Manager. Check the ComfyUI blog for example workflows and node configurations.

Read the original announcement on ComfyUI ↗
