homerowilson | 4 years ago

I used OpenBLAS on my cheap last-generation AMD Ryzen 7 4700U laptop like so:

git clone https://github.com/xianyi/OpenBLAS && cd OpenBLAS && make PREFIX=/opt/openblas install && curl https://jott.live/code/blas_test.cc | sed -n "/<code>/,/code>/p" | tail -n +2 | head -n -1 > blas_test.cc

Inspect the blas_test.cc file, and then...

g++ -I/opt/openblas/include/ blas_test.cc -lopenblas -std=c++11 -O3 -L/opt/openblas/lib/ -o blas_test && ./blas_test 512 512 512 100 100

and got a peak of about 192 gflops, averaging closer to 180. So yeah, the M1 is > 6x faster in this simple single-precision matrix test.
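For anyone wanting to sanity-check the gflops figures in this thread, here is a minimal sketch of the usual arithmetic. It assumes the first three arguments (512 512 512) are the matrix dimensions M, N, K and that the benchmark counts the conventional 2*M*N*K floating-point operations per matrix multiply; I have not confirmed that blas_test.cc does exactly this. The helper names `gemm_gflops` and `gemm_seconds` are hypothetical and not part of the test file or OpenBLAS.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not taken from blas_test.cc.
// Assumes the conventional GEMM cost model: each of the M*N outputs needs
// K multiplies and K adds, so one multiply costs 2*M*N*K flops.
double gemm_gflops(std::int64_t m, std::int64_t n, std::int64_t k,
                   std::int64_t iters, double seconds) {
  assert(seconds > 0.0);
  const double flops = 2.0 * static_cast<double>(m) * static_cast<double>(n) *
                       static_cast<double>(k) * static_cast<double>(iters);
  return flops / seconds / 1e9;  // flops per second, scaled to giga
}

// Inverse view: wall time in seconds for a single GEMM at a given rate.
double gemm_seconds(std::int64_t m, std::int64_t n, std::int64_t k,
                    double gflops) {
  assert(gflops > 0.0);
  const double flops = 2.0 * static_cast<double>(m) * static_cast<double>(n) *
                       static_cast<double>(k);
  return flops / (gflops * 1e9);
}
```

Under these assumptions, `gemm_seconds(512, 512, 512, 192.0)` comes out to roughly 1.4 ms per multiply, so a run that short is sensitive to timer resolution and thread start-up, which may be part of why the peak and average differ.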

matja|4 years ago

541 gflops here, following those steps. Well done Apple for making a laptop CPU over 2x faster than a 250W server CPU released this year :)

kitestramuort|4 years ago

With my Ryzen 7 5800U laptop I get around 530 gflops, with a peak of 596 if I compile the test against MKL with

g++ -I/opt/intel/mkl/include/ blas_test.cc -lmkl_intel_lp64 -lmkl_gnu_thread -lmkl_core -lgomp -std=c++11 -O3 -march=native -L/opt/intel/mkl/lib/intel64 -o blas_test_mkl