tiehuis | 7 years ago

Getting fairly close to GMP's speed is certainly not out of reach if you are willing to optimize the low-level loops in assembly. I think parity in the simple cases is fairly straightforward to reach, but once you need to tune your algorithm thresholds (the operand sizes at which you switch from one algorithm to another), it takes much more testing to find the optimal values [1].
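To make the threshold point concrete, here is a rough Python sketch of what tuning one crossover looks like. It is not how GMP or my implementation does it: the 32-bit limb size, the function names (`multiply`, `best_threshold`) and the candidate values are all assumptions made for illustration, and the Karatsuba step works on un-normalized coefficient lists to keep the code short.

```python
import random
import time

LIMB_BITS = 32  # assumed limb width, chosen only for this sketch
LIMB_MASK = (1 << LIMB_BITS) - 1


def to_limbs(n):
    """Split a non-negative int into little-endian limbs."""
    limbs = []
    while n:
        limbs.append(n & LIMB_MASK)
        n >>= LIMB_BITS
    return limbs or [0]


def schoolbook(a, b):
    """Quadratic product of two coefficient lists (no carry propagation)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def karatsuba(a, b, threshold):
    """Karatsuba on equal-length lists; falls back to schoolbook at or below threshold."""
    n = len(a)
    if n <= max(threshold, 1):
        return schoolbook(a, b)
    h = n // 2
    a0, a1 = a[:h], a[h:]
    b0, b1 = b[:h], b[h:]
    pad = [0] * (len(a1) - h)  # the high half may be one limb longer
    sa = [x + y for x, y in zip(a0 + pad, a1)]
    sb = [x + y for x, y in zip(b0 + pad, b1)]
    z0 = karatsuba(a0, b0, threshold)
    z2 = karatsuba(a1, b1, threshold)
    z1 = karatsuba(sa, sb, threshold)
    out = [0] * (2 * n - 1)
    for i, c in enumerate(z0):
        out[i] += c
        out[i + h] -= c  # middle term is z1 - z0 - z2
    for i, c in enumerate(z2):
        out[i + 2 * h] += c
        out[i + h] -= c
    for i, c in enumerate(z1):
        out[i + h] += c
    return out


def multiply(x, y, threshold):
    """Multiply two non-negative ints using the threshold-switched routine."""
    a, b = to_limbs(x), to_limbs(y)
    n = max(len(a), len(b))
    a += [0] * (n - len(a))
    b += [0] * (n - len(b))
    coeffs = karatsuba(a, b, threshold)
    # Recombine coefficients; the shifts absorb the deferred carries.
    return sum(c << (LIMB_BITS * i) for i, c in enumerate(coeffs))


def best_threshold(candidates, limbs=256, repeats=3):
    """Time each candidate threshold on one operand size and return the fastest."""
    rng = random.Random(0)
    x = rng.getrandbits(limbs * LIMB_BITS)
    y = rng.getrandbits(limbs * LIMB_BITS)
    timings = {}
    for t in candidates:
        start = time.perf_counter()
        for _ in range(repeats):
            multiply(x, y, t)
        timings[t] = time.perf_counter() - start
    return min(timings, key=timings.get), timings
```

Calling something like `best_threshold([4, 8, 16, 32, 64])` gives one answer for one operand size on one machine. The winner moves with the CPU, the compiler and the other algorithms in the chain, which is why finding good values takes so much testing.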

It is also easy to overlook how well optimized GMP is across a wide range of less common architectures and chips. I wouldn't be surprised if my particular implementation lost a bit of ground on other architectures such as ARM (that would be a good thing to test).

[1] https://gmplib.org/devel/thres/
