top | item 40940425

(no title)

ex3ndr | 1 year ago

I am wondering why flash attention is about 5x slower with variable masking than without it. The lack of good masking support nearly cancels out the optimizations.

chillee|1 year ago

Where are you seeing these benchmarks?