It's a fallacy to say that this can beat hand-optimized C++. If you wrote the equivalent code in hand-optimized C++, in 99% of cases you would get at least a small speedup, if not a larger one.
It's as if you were comparing single-threaded C++ with multithreaded Burst C# code.