top | item 39196523

two_in_one | 2 years ago

> now supports FlashAttention-2, yielding around 2x speedups

> torch.compile improvements

so far, 2.1 didn't work well with MoE GPT, at least in my implementation, due to dynamism in the data flow. will check how 2.2 does
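A minimal sketch (not the commenter's code, and independent of any real gating network) of why MoE routing is "dynamic" for a graph compiler: top-k gating assigns each token to an expert based on the token's values, so the number of tokens each expert receives varies from batch to batch. The downstream tensor shapes are therefore data-dependent, which static graph capture such as torch.compile's handles poorly, triggering graph breaks or recompilation.

```python
# Hypothetical illustration of data-dependent MoE routing.
# The "gate" here is a deterministic stand-in for a learned router.

NUM_EXPERTS = 4

def route(tokens):
    """Group token indices by the expert a (fake) gate picks for them."""
    buckets = {e: [] for e in range(NUM_EXPERTS)}
    for i, tok in enumerate(tokens):
        expert = sum(map(ord, tok)) % NUM_EXPERTS  # stand-in for a gate
        buckets[expert].append(i)
    return buckets

batch_a = route(["the", "cat", "sat"])
batch_b = route(["a", "much", "longer", "sentence", "here"])

# Per-expert bucket sizes differ between batches: these are the
# data-dependent shapes that a static compiled graph cannot assume.
print([len(v) for v in batch_a.values()])
print([len(v) for v in batch_b.values()])
```

In a real MoE layer the buckets would be expert sub-batches of hidden states; because their sizes change per batch, a compiler either recompiles per shape or falls back to eager execution for that region.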
