top | item 39297943

chaisan | 2 years ago

Nice. Where is the main part of the 4.4x boost on ARM coming from?

ashvardanian|2 years ago

It’s mostly coming from using the Arm NEON intrinsics, not much magic. While working on the library, I was shocked to see how under-vectorized LibC is on Arm. A lot of improvement potential beyond strings.
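To make the NEON point concrete, here is a rough sketch of the kind of loop involved: a byte search that checks 16 bytes per iteration on AArch64 and falls back to plain scalar code elsewhere. The `find_byte` name and signature are hypothetical, made up for illustration; this is not the library's actual API or implementation.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper (not from any real library): returns the index of the
   first occurrence of `needle` in `data`, or `len` if it is absent. */
static size_t find_byte(const uint8_t *data, size_t len, uint8_t needle) {
    size_t i = 0;
#if defined(__aarch64__) && defined(__ARM_NEON)
    /* Broadcast the needle into all 16 lanes once, outside the loop. */
    const uint8x16_t pattern = vdupq_n_u8(needle);
    for (; i + 16 <= len; i += 16) {
        uint8x16_t chunk = vld1q_u8(data + i);    /* load 16 bytes */
        uint8x16_t eq = vceqq_u8(chunk, pattern); /* 0xFF in matching lanes */
        if (vmaxvq_u8(eq) != 0)
            break; /* a match is in this block; the scalar loop locates it */
    }
#endif
    /* Scalar path: pins down the match inside the flagged block, handles the
       tail shorter than 16 bytes, and is the whole loop on non-Arm targets. */
    for (; i < len; ++i)
        if (data[i] == needle)
            return i;
    return len;
}
```

The vector loop only answers "is there a match in this block?" and leaves the exact position to the scalar loop, which keeps the sketch short; a tuned implementation would likely extract the lane index directly.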

Amazon, Microsoft, Nvidia, Ampere, Apple, Qualcomm, and all the other Arm-based CPU vendors should really consider investing more into the ecosystem. The hardware is very capable, they shouldn’t be losing against x86 in so many benchmarks…

menaerus|2 years ago

I'd say that SIMD knowledge, and even more so knowledge of CPU internals, is not that common, and utmost performance is, I think, not among the highest-priority goals in libc/libc++/libstdc++. The ones who need it will implement it themselves. The ones who don't need it won't even notice.

The implementation and maintenance effort is several times larger than for the usual "good enough" scalar implementation.