ssokolow | 2 years ago
They tried doing a comparison between ReadySet compiled normally and ReadySet with bounds checking removed so thoroughly that they needed to use a patched toolchain to achieve it and found the difference to be within the noise threshold.
Their conclusion was:
> At the end of the day, it seems like at least for this kind of large-scale, complex application, the cost of pervasive runtime bounds checking is negligible. It’s tough to say precisely why this is, but my intuition is that CPU branch prediction is simply good enough in practice that the cost of the extra couple of instructions and a branch effectively ends up being zero - and compilers like LLVM are good enough at local optimizations to optimize most bounds checks away entirely. Not to mention, it’s likely that quite a few (if not the majority) of the bounds checks we removed are actually necessary, in that they’re validating some kind of user input or other edge conditions where we want to panic on an out of bounds access.
vlovich123 | 2 years ago
So the remaining bounds checks are off the hot path, where they don't matter. But in the hot path, where you should be bounded by HW limits, they can be a significant slowdown. Branch prediction helps, but it's not free.
So for most people, bounds checking only matters when they're doing something in the hot path, and even then only when that hot path is running into HW limits. If the hot path is some complicated CPU computation, bounds checking should be but a blip unless you do something stupid like check the bounds too frequently.
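A minimal sketch of the "don't check too frequently" point: indexing a slice inside a loop emits a bounds check per iteration, but a single assert up front can give LLVM enough information to prove the per-element checks redundant and elide them. The function names and data here are illustrative, not from ReadySet; whether the compiler actually removes the checks depends on the optimizer, so benchmark rather than assume.

```rust
fn sum_checked(data: &[u64], n: usize) -> u64 {
    // Each `data[i]` carries a bounds check: panics if i >= data.len().
    let mut total = 0;
    for i in 0..n {
        total += data[i];
    }
    total
}

fn sum_hoisted(data: &[u64], n: usize) -> u64 {
    // One check hoisted out of the loop. Inside the loop, LLVM can often
    // prove i < n <= data.len() and drop the per-iteration branch.
    assert!(n <= data.len());
    let mut total = 0;
    for i in 0..n {
        total += data[i];
    }
    total
}

fn main() {
    let v: Vec<u64> = (0..1000).collect();
    // Both variants compute the same result; only the generated
    // branch structure differs.
    assert_eq!(sum_checked(&v, 1000), sum_hoisted(&v, 1000));
    println!("{}", sum_hoisted(&v, 1000));
}
```

Idiomatic Rust sidesteps the question entirely with iterators (`data.iter().take(n).sum()`), which carry no per-element bounds checks in the first place.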
So the general advice not to worry too much about bounds checks when writing Rust is directionally correct for the vast majority of people. But recognize that it's wrong in places, and that's hard to notice, because the cost is such a small thing hidden in codegen, with no easy flag to test the impact.
ssokolow | 2 years ago
I still think the "because it's not in the fast path" part of "Most software will not see a bottleneck because of bounds checking because it's not in the fast path" is a bit too much of a blanket statement, and could detract from the admonition to benchmark very carefully before optimizing. But otherwise, I agree.