With an RTX 2080 Ti, performance is much worse: I get around 1 it/s, while with lstein/stable-diffusion I get ~10 it/s.
There may be plenty of low-hanging fruit for optimization, though, so it could reach higher performance. The number of people who have spent time on optimization is obviously much lower for this TensorFlow port (one person) than for the PyTorch implementation (too many to count manually).
Edit: it's unclear to me if it's actually running on the GPU at all.
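One quick way to check this (a minimal sketch using TensorFlow's standard device APIs, not specific to this port):

```python
import tensorflow as tf

# List the GPUs TensorFlow can see; an empty list means it has
# silently fallen back to running on the CPU.
gpus = tf.config.list_physical_devices('GPU')
print("Visible GPUs:", gpus)

# Optionally log where each op is placed, to confirm that kernels
# are actually executed on the GPU during inference.
tf.debugging.set_log_device_placement(True)
```

If the GPU list is empty, the usual culprit is a CPU-only TensorFlow install or a CUDA/cuDNN version mismatch, either of which would explain the ~10x slowdown.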
capableweb|3 years ago