almostdigital | 2 years ago

Just whatever you get by default with pip install numpy. Changing the benchmark to run a 1024x1024x1024 matmul instead of a 128x128x128 one does speed up NumPy significantly, though:

    Python           119.189 GFLOPS
    Naive:             6.275 GFLOPS            0.05x faster than Python
    Vectorized:       22.259 GFLOPS            0.19x faster than Python
    Parallelized:     50.258 GFLOPS            0.42x faster than Python
    Tiled:            59.692 GFLOPS            0.50x faster than Python
    Unrolled:         62.165 GFLOPS            0.52x faster than Python
    Accumulated:     565.240 GFLOPS            4.74x faster than Python
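
The last column is easier to read as a plain ratio: each variant's GFLOPS divided by the Python figure, so anything under 1.00x is slower than the baseline. A quick check of that arithmetic, with the numbers copied from the table above and assuming the Python row is the NumPy baseline (the names `results` and `relative_speed` are only for illustration, not part of the benchmark):

```python
# GFLOPS figures copied from the benchmark output above.
python_gflops = 119.189
results = {
    "Naive": 6.275,
    "Vectorized": 22.259,
    "Parallelized": 50.258,
    "Tiled": 59.692,
    "Unrolled": 62.165,
    "Accumulated": 565.240,
}


def relative_speed(gflops, baseline):
    """Throughput of a variant as a multiple of the baseline throughput."""
    return gflops / baseline


for name, gflops in results.items():
    ratio = relative_speed(gflops, python_gflops)
    # A ratio below 1 means the variant is slower than the baseline.
    verdict = "faster" if ratio > 1 else "slower"
    print(f"{name + ':':<14} {ratio:5.2f}x of baseline ({verdict})")
```

Only the Accumulated variant comes out above 1x at this problem size.
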
np.__config__:

    Build Dependencies:
      blas:
        detection method: pkgconfig
        found: true
        include directory: /opt/arm64-builds/include
        lib directory: /opt/arm64-builds/lib
        name: openblas64
        openblas configuration: USE_64BITINT=1 DYNAMIC_ARCH=1 DYNAMIC_OLDER= NO_CBLAS=
          NO_LAPACK= NO_LAPACKE= NO_AFFINITY=1 USE_OPENMP= SANDYBRIDGE MAX_THREADS=3
        pc file directory: /usr/local/lib/pkgconfig
        version: 0.3.23.dev
      lapack:
        detection method: internal
        found: true
        include directory: unknown
        lib directory: unknown
        name: dep4364960240
        openblas configuration: unknown
        pc file directory: unknown
        version: 1.26.1
    Compilers:
      c:
        commands: cc
        linker: ld64
        name: clang
        version: 14.0.0
      c++:
        commands: c++
        linker: ld64
        name: clang
        version: 14.0.0
      cython:
        commands: cython
        linker: cython
        name: cython
        version: 3.0.3
    Machine Information:
      build:
        cpu: aarch64
        endian: little
        family: aarch64
        system: darwin
      host:
        cpu: aarch64
        endian: little
        family: aarch64
        system: darwin
    Python Information:
      path: /private/var/folders/76/zy5ktkns50v6gt5g8r0sf6sc0000gn/T/cibw-run-27utctq_/cp310-macosx_arm64/build/venv/bin/python
      version: '3.10'
    SIMD Extensions:
      baseline:
      - NEON
      - NEON_FP16
      - NEON_VFPV4
      - ASIMD
      found:
      - ASIMDHP
      not found:
      - ASIMDFHM
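
For anyone who wants to compare on their own machine, here is a rough sketch of how the NumPy side could be timed. It is not the benchmark's actual code: the helper name `matmul_gflops`, the float64 dtype, the best-of-5 timing, and the usual 2*n^3 FLOP count for an n x n x n matmul are all assumptions.

```python
import time

import numpy as np


def matmul_gflops(n, repeats=5, dtype=np.float64):
    """Best-of-`repeats` GFLOPS for an (n x n) @ (n x n) matmul (hypothetical helper)."""
    rng = np.random.default_rng(0)
    a = rng.random((n, n)).astype(dtype)
    b = rng.random((n, n)).astype(dtype)

    a @ b  # warm-up run so BLAS thread start-up is not timed

    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        a @ b
        best = min(best, time.perf_counter() - start)

    flops = 2 * n**3  # one multiply and one add per inner-product term
    return flops / best / 1e9


if __name__ == "__main__":
    for n in (128, 1024):
        print(f"{n}x{n}x{n}: {matmul_gflops(n):.3f} GFLOPS")
    # Prints the build information (BLAS/LAPACK, compilers, SIMD).
    np.show_config()
```

The small case is dominated by call overhead, so the 128 figure tends to be noisy; taking the best of several runs helps a little.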

elashri | 2 years ago

If you are looking for improved performance, you will always go with NumPy + vectorization. That's what matters. So I don't see what the argument is here. Am I missing something?