(no title)
eloy | 5 years ago
> We have decades of odd random kernel oopses that could never be explained and were likely due to bad memory. And if it causes a kernel oops, I can guarantee that there are several orders of magnitude more cases where it just caused a bit-flip that just never ended up being so critical.
It might be false, but I think it's a reasonable assumption.
IgorPartola|5 years ago
simias|5 years ago
I can understand Linus's frustration from that point of view: without ECC RAM when you get some super weird crash report where some pointer got corrupted for no apparent reason you can't be sure if it's was just a random bitflip or if it's actually hiding a bigger problem.
chalst|5 years ago
> A large-scale study based on Google's very large number of servers was presented at the SIGMETRICS/Performance ’09 conference.[6] The actual error rate found was several orders of magnitude higher than the previous small-scale or laboratory studies, with between 25,000 (2.5 × 10−11 error/bit·h) and 70,000 (7.0 × 10−11 error/bit·h, or 1 bit error per gigabyte of RAM per 1.8 hours) errors per billion device hours per megabit. More than 8% of DIMM memory modules were affected by errors per year
reader_mode|5 years ago