(no title)
onalark | 10 years ago
One opportunity I do find in this article is real-world performance tests and determining the importance of constants in performance tuning. Prakop et al. did some really interesting work in 1999 on cache-oblivious algorithms, with applications to kernels such as the FFT and matrix multiplication. The work is theoretically extremely interesting, but in practice, hand-written or machine-generated algorithms continue to dominate. The most famous example of this is probably Kazushige Goto, whose hand-optimized GotoBLAS (now under open source license as OpenBLAS) still provides some of the fastest linear algebra kernels in the world.
If you're interested in learning more about how the differences between the two approaches shake out in linear algebra, I recommend "Is Cache-Oblivious DGEMM Viable?" by Gunnels et al.
No comments yet.