top | item 35858991

(no title)

aljungberg | 2 years ago

To an extent, but memory bandwidth soon becomes a bottleneck there too. The hidden state and the KV cache are large so it becomes a matter of how fast you can move data in and out of your L2 cache. If you don’t have a unified memory pool it gets even worse.

discuss

order

No comments yet.