egeres | 2 months ago

Incredibly fast. On my 5090 with CUDA 13 (and the latest diffusers, xformers, transformers, etc.), 9 sampling steps, and the "Tongyi-MAI/Z-Image-Turbo" model, I get:

- 1.5s to generate an image at 512x512

- 3.5s to generate an image at 1024x1024

- 26.s to generate an image at 2048x2048

It uses almost all of the 32GB of VRAM and nearly all of the GPU. I'm using the script from the HF post: https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
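For a rough sense of how these timings scale with resolution, here is a small sketch that turns the two clearly stated figures into seconds per megapixel. The 2048x2048 entry is left out because its value is garbled in the post, and the helper names are only illustrative.

```python
# Timings quoted in the comment (seconds per image, 9 sampling steps).
# The 2048x2048 figure is omitted: it reads "26.s" in the post, so the
# exact value is unknown and is not guessed here.
timings = {512: 1.5, 1024: 3.5}


def pixels(side: int) -> int:
    """Pixel count of a square image with the given side length."""
    return side * side


def seconds_per_megapixel(side: int, seconds: float) -> float:
    """Generation time normalized by image size in megapixels."""
    return seconds / (pixels(side) / 1e6)


for side, seconds in timings.items():
    rate = seconds_per_megapixel(side, seconds)
    print(f"{side}x{side}: {rate:.2f} s per megapixel")

# Going from 512 to 1024 quadruples the pixel count.
pixel_ratio = pixels(1024) / pixels(512)
time_ratio = timings[1024] / timings[512]
print(f"pixels x{pixel_ratio:.1f}, time x{time_ratio:.2f}")
```

Four times the pixels costs only about 2.3 times the time here, which might point to fixed per-image overhead or an underutilized GPU at 512x512, though that is a guess rather than something measured.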

SV_BubbleTime|2 months ago

Weird, even at 2048 I don’t think it should be using all your 32GB VRAM.

egeres|2 months ago

It stays around 26GB at 512x512. I still haven't profiled the execution or looked much into the details of the architecture, but I would assume it trades memory for speed by creating caches for each inference step.
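One way to check that assumption would be to compare what PyTorch actually allocated with what its caching allocator reserved, since tools like nvidia-smi show the reserved figure. A minimal sketch, assuming a CUDA build of PyTorch; `peak_memory_report` and the `run` callable are hypothetical names, not part of the HF script:

```python
def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / 2**30


def peak_memory_report(run, device: int = 0) -> dict:
    """Call `run()` once and report peak CUDA memory on `device`.

    `run` is a hypothetical zero-argument callable that performs one
    image generation; it stands in for whatever the script does.
    """
    import torch  # imported lazily so to_gib works without a GPU

    torch.cuda.reset_peak_memory_stats(device)
    run()
    torch.cuda.synchronize(device)  # wait for queued GPU work to finish
    return {
        # Memory held by live tensors at the peak.
        "peak_allocated_gib": to_gib(torch.cuda.max_memory_allocated(device)),
        # Memory held by the caching allocator, including free cached blocks.
        "peak_reserved_gib": to_gib(torch.cuda.max_memory_reserved(device)),
    }
```

If the reserved number is much larger than the allocated one, a good part of the 26GB may be allocator cache rather than per-step caches in the model. This only covers memory that PyTorch itself manages.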