thesnider | 10 years ago

I wouldn't be so sure of that -- none of Google's clusters use ECC, for instance.

miah_ | 10 years ago

Really?

"This paper studied the incidence and characteristics of DRAM errors in a large fleet of commodity servers. Our study is based on data collected over more than 2 years and covers DIMMs of multiple vendors, generations, technologies, and capacities. All DIMMs were equipped with error correcting logic (ECC) to correct at least single bit errors"

from conclusion 1.

"The conclusion we draw is that error correcting codes are crucial for reducing the large number of memory errors to a manageable number of uncorrectable errors. In fact, we found that platforms with more powerful error codes (chipkill versus SECDED) were able to reduce uncorrectable error rates by a factor of 4–10 over the less powerful codes."

DRAM Errors in the Wild: A Large-Scale Field Study: http://research.google.com/pubs/pub35162.html