(no title)
physicsguy | 1 month ago
They're optimising for different things really.
Intel/Nvidia have the resources to (a) optimise across a wide range of hardware in their libraries (b) often use less well documented things (c) don't have to make their source code publicly accessible.
Take MKL for example - it's a great library, but implementing dynamic dispatch for all the different processor types is why it gets such good performance across x86-64 machines, it's not running the same code on each processor. No academic team can really compete with that.
shihab|1 month ago