textlapse | 14 days ago
GPUs are still not practically Turing-complete, in the sense that there are strict restrictions on loops, goto, I/O, and waiting (there are a bunch of band-aids that let it pretend it's not a functional programming model).
So I'm not sure retrofitting a Ferrari to cosplay as an Amazon delivery van is useful, other than as a tech showcase.
Good tech showcase though :)
zozbot234 | 14 days ago
textlapse | 13 days ago
I understand that newer GPUs have clever partitioning/pipelining, such that block A takes branch A while block B takes branch B, with sync/barrier essentially relying on some smart 'oracle' to schedule these in a way that still fits the SIMT model.
It still doesn't feel Turing-complete to me. Is there an NVIDIA doc you can point me to?
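To make the divergence point concrete, here is a toy sketch in plain Python (not CUDA, and not a model of any real NVIDIA scheduler) of the usual textbook picture of SIMT: lanes run in lockstep, a divergent branch is issued once per side under a per-lane mask, and a data-dependent loop keeps going until every lane has exited. The function name `simt_collatz_steps` and the Collatz workload are assumptions chosen purely for illustration.

```python
def simt_collatz_steps(values):
    """Simulate one warp running a data-dependent loop in lockstep.

    Each lane counts the Collatz steps needed to reach 1. Inputs must be
    positive integers. This is a hypothetical toy model of masked
    execution, not a description of actual GPU hardware.
    """
    lanes = list(values)
    steps = [0] * len(lanes)
    active = [v != 1 for v in lanes]  # per-lane execution mask
    passes = 0  # lockstep loop iterations issued by the whole warp

    while any(active):  # the warp loops until every lane has exited
        passes += 1
        # Divergent branch: compute both masks first, then issue each
        # side of the branch only for the lanes that selected it.
        even_mask = [a and v % 2 == 0 for a, v in zip(active, lanes)]
        odd_mask = [a and v % 2 == 1 for a, v in zip(active, lanes)]
        lanes = [v // 2 if m else v for v, m in zip(lanes, even_mask)]
        lanes = [3 * v + 1 if m else v for v, m in zip(lanes, odd_mask)]
        # Only lanes that were active in this pass count a step.
        steps = [s + 1 if a else s for s, a in zip(steps, active)]
        # Lanes that reached 1 are masked off and idle from here on.
        active = [a and v != 1 for a, v in zip(active, lanes)]

    return steps, passes


if __name__ == "__main__":
    print(simt_collatz_steps([1, 2, 3, 6, 7]))
```

In this model each lane can run its own number of loop iterations, but the warp as a whole pays for the slowest lane: with inputs `[1, 2, 3, 6, 7]` the lane starting at 2 finishes after one step and then sits idle until the lane starting at 7 is done. That cost is a throughput concern and is separate from the question of whether the loop can be expressed at all.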