grandmczeb | 11 months ago

Your home setup is much less efficient than production inference in a data center. An open-source implementation of SDXL-Lightning runs at 12 images per second on a TPU v5e-8, which draws ~2 kW at full load. That's about 170 J per image, or roughly 1/400th of the phone charge.

https://cloud.google.com/blog/products/compute/accelerating-...

https://arxiv.org/pdf/2502.01671
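For anyone who wants to check the arithmetic, here is a rough sketch in Python. It uses only the two figures quoted above (~2 kW at full load, 12 images per second). The function and variable names are made up for illustration, and the phone battery size is not an input: it is simply what the 1/400 ratio would imply.

```python
def joules_per_image(power_watts: float, images_per_second: float) -> float:
    """Energy per image = power draw divided by throughput (W / (img/s) = J/img)."""
    return power_watts / images_per_second


# Figures quoted in the comment above (approximate).
TPU_POWER_W = 2000.0    # ~2 kW at full load
THROUGHPUT_IPS = 12.0   # 12 images per second

energy_j = joules_per_image(TPU_POWER_W, THROUGHPUT_IPS)

# Battery size implied by the "1/400th of a phone charge" ratio.
# This is derived from the ratio, not a measured battery capacity.
implied_phone_charge_j = energy_j * 400
implied_phone_charge_wh = implied_phone_charge_j / 3600  # 1 Wh = 3600 J

print(f"{energy_j:.0f} J per image")
print(f"implied phone charge: {implied_phone_charge_wh:.1f} Wh")
```

The per-image figure comes out just under 170 J, and the 1/400 ratio corresponds to a battery of roughly 18.5 Wh, so the comparison is order-of-magnitude rather than exact.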

bayindirh|11 months ago

These models do not appear from thin air. Add in the training cost in terms of power. Yes, it's capex and not opex, but it's not free by any means.

Plus, not all these models run on optimized TPUs; most run on NVIDIA cards. None of them are that efficient.

Otherwise I could argue that running these models is essentially free, since my camera can do face recognition and tracking at 30 fps without a noticeable power draw, using a dedicated, purpose-built DSP for that stuff.

grandmczeb|11 months ago

GPU efficiency numbers in a real production environment are similar.

Saigonautica|11 months ago

Oh, that's way better! I guess the comparison only holds as approximately true with home setups -- thanks for the references.