This model does not currently work with native ComfyUI INT8, and used a sub-optimal FP8 base, which makes it perform worse and ended up doing a triple quantization.

I recommend using either a proper double quant from silver https://huggingface.co/silveroxides/ideogram4-dequant-and-int8-quant/ or these comfy int8 quants https://huggingface.co/Comfy-Org/Ideogram-4/


This is an avoidable triple quantization due to me being bad.

The FP8 weights were cast to FP32 with the FP8 scales, then downcast to BF16 before being converted to INT8.

For use in ComfyUI with https://github.com/BobJohnson24/ComfyUI-INT8-Fast

Speed is 1.78x faster(2.03s/it) than FP8(3.62s/it) on my 3090, without compile.

~2x faster with torch compile.

After further inspection, it appears there may be quality issues with torch compiling this model.

Quick comparison:

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support