Note that soon after I posted this, Someone noticed[1,2] that it is possible avoid a whole lot of work, but I am still glad I realized how much performance one can gain by dropping GMP in favor of `__int128`.
It seems like for numbers on this side of 1024 bits, writing purpose-built routines using `__int128` or SSE/AVX instructions rather than general functionality is likely to pay off.
[+] [-] nanis|4 years ago|reply
It seems like for numbers on this side of 1024 bits, writing purpose-built routines using `__int128` or SSE/AVX instructions rather than general functionality is likely to pay off.
[1]: https://news.ycombinator.com/item?id=10915461
[2]: https://www.nu42.com/2016/01/excellent-numbers-explicit-solu...
[+] [-] tgsovlerkhgsel|4 years ago|reply
- GMP has limits (compare e.g. https://gmplib.org/list-archives/gmp-discuss/2004-April/0011...). 2^31 limbs of 64 bit each means you cannot process numbers larger than ~2^37 bits (2^34 bytes = 16 GiB) with the standard mpz functions.
- Multiplication of large number is its own science, with different algorithms optimal at different sizes. (https://gmplib.org/manual/Multiplication-Algorithms)
[+] [-] olliej|4 years ago|reply
My guess is the rest of the overhead is things like gmp needing to allocate heap storage as it doesn't get the aid of having fixed size storage.
[+] [-] madars|4 years ago|reply
[+] [-] unknown|4 years ago|reply
[deleted]
[+] [-] anthk|4 years ago|reply
[+] [-] borramakot|4 years ago|reply