saiojd | 1 year ago

What exactly does __syncthreads() do here? I'm new to CUDA; I could follow the overall idea of the FlashAttention paper, but not the details.


cavisne|1 year ago

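It's a block-level barrier: every thread in the block waits there until all threads in that block have reached the same point, and shared-memory writes made before the barrier are visible to the whole block after it. It's worth reading a CUDA primer for more details on blocks/warps.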

The threads rely on each other to fill the SRAM (shared memory) with all the needed data, so if you didn't wait, a thread could read a slot that another thread hasn't written yet and get missing or stale values.
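
Here's a minimal sketch of that pattern, not the FlashAttention kernel itself. The names (reverse_tile, reverse_tiles_on_gpu, TILE) are made up for illustration, the input length is assumed to be a nonzero multiple of TILE, and error checking is left out. Each thread loads one element into shared memory, then reads an element that a different thread loaded, which is only safe after the barrier:

```cuda
#include <cuda_runtime.h>
#include <vector>

constexpr int TILE = 256;  // threads per block (illustrative choice)

// Each block reverses its own TILE-sized chunk of `data` in place.
__global__ void reverse_tile(int *data) {
    __shared__ int tile[TILE];      // on-chip shared memory (the "SRAM")
    int t = threadIdx.x;
    int base = blockIdx.x * TILE;

    tile[t] = data[base + t];       // each thread fills exactly one slot
    __syncthreads();                // wait until every slot of the tile is filled

    // This reads a slot written by ANOTHER thread. Without the barrier
    // above, that thread might not have stored its value yet.
    data[base + t] = tile[TILE - 1 - t];
}

// Hypothetical host helper: assumes in.size() is a nonzero multiple of TILE.
// Error checking is omitted to keep the sketch short.
std::vector<int> reverse_tiles_on_gpu(const std::vector<int> &in) {
    std::vector<int> out(in.size());
    size_t bytes = in.size() * sizeof(int);
    int blocks = static_cast<int>(in.size() / TILE);

    int *d = nullptr;
    cudaMalloc(&d, bytes);
    cudaMemcpy(d, in.data(), bytes, cudaMemcpyHostToDevice);
    reverse_tile<<<blocks, TILE>>>(d);
    // This blocking copy on the default stream runs after the kernel finishes.
    cudaMemcpy(out.data(), d, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d);
    return out;
}
```

If you delete the __syncthreads() line, the kernel has a data race: it may still appear to work on some runs, but the result is no longer guaranteed. Note that the barrier only covers threads within one block; it does not synchronize different blocks with each other.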

xrd|1 year ago

Any CUDA primer you recommend in particular? I had this same question.