(no title)
janwas | 7 months ago
Can you afford to write and maintain a codepath per ISA (knowing that more keep coming, including RVV, LASX and HVX), to squeeze out the last X%? Is there no higher-impact use of developer time? If so, great.
If not, what's the alternative - scalar code? I'd think decent portable SIMD code is still better than nothing, and nothing (scalar) is all we have for new ISAs which have not yet been hand-optimized. So it seems we should anyway have a generic SIMD path, in addition to any hand-optimized specializations.
BTW, Highway indeed provides decent emulations of LD2..4, and at least 2-table lookups. Note that some Arm uarchs are anyway slow with 3 and 4.
anonymoushn|7 months ago
If we were already depending on highway or eve, I would think it's great to ship the generic SIMD version instead of the SSE version, which probably compiles down to the same thing on the relevant targets. This way, if future maintainers need to make changes and don't want to deal with the several implementations I have left behind, the presence of the generic implementation would allow them to delete them rather than making the same changes a bunch of times.
janwas|7 months ago