(no title)
komuher | 4 years ago
34s on the Ultra (64-core GPU) vs 11s on an RTX 3090.
So about 3 times slower, but drawing ~4 times less power (350W for the 3090 vs 70W for the Ultra; the 70W figure is for the 48-core GPU, so about 90W on 64 cores).
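A rough back-of-the-envelope check of those numbers, as a minimal sketch. It assumes each chip draws its quoted power for the whole run, which is a simplification, and the 90W figure for the 64-core GPU is only the estimate above, not a measurement. The function and variable names are made up for illustration.

```python
def energy_joules(seconds: float, watts: float) -> float:
    """Energy for one run, assuming constant power draw (a simplification)."""
    return seconds * watts

# Figures quoted in the comment; 90 W is an estimate for the 64-core GPU.
ultra_time_s, ultra_power_w = 34, 90
rtx_time_s, rtx_power_w = 11, 350

slowdown = ultra_time_s / rtx_time_s        # how much slower the Ultra is
power_ratio = rtx_power_w / ultra_power_w   # how much more power the 3090 draws

ultra_energy = energy_joules(ultra_time_s, ultra_power_w)
rtx_energy = energy_joules(rtx_time_s, rtx_power_w)
energy_ratio = rtx_energy / ultra_energy    # energy per run, 3090 relative to Ultra

print(f"slowdown:     {slowdown:.2f}x")
print(f"power ratio:  {power_ratio:.2f}x")
print(f"energy/run:   Ultra {ultra_energy} J, 3090 {rtx_energy} J ({energy_ratio:.2f}x)")
```

Under those assumptions the energy per run is much closer than the power ratio suggests, because the Ultra runs about three times longer.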
sirwhinesalot|4 years ago
komuher|4 years ago
I agree the results are good, but we also need to remember that Samsung 8nm (the RTX 3090's node) is 4 years old, so 2 generations behind TSMC 5nm. The 4090 will probably be on a 5nm node in a few months and will presumably have 3x the teraflops (probably 3x at TF32 precision, but it could be FP32, using 1.5x to 2x(?) more power).
(Also, I'm not sure about the optimization: it was written by Apple employees, and Apple likes to drop open source support and focus on proprietary software.)