FYI: rasdaemon (https://github.com/mchehab/rasdaemon), which replaced mcelog, is the daemon that watches for these ECC errors and other kernel-reported "Reliability, Availability and Serviceability" (RAS) errors.
rasdaemon also attempts to report which physical DIMM / slot triggered the ECC error.
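If you just want a quick look at the per-DIMM counters without the daemon, the kernel's EDAC driver exposes them in sysfs. Below is a rough Python sketch that reads them directly. This is not how rasdaemon works internally (it listens to kernel trace events); it only shows the raw counters. It assumes a kernel and EDAC driver that create `mc*/dimm*` directories under /sys/devices/system/edac/mc (some drivers use a different layout), and `dimm_error_counts` is just a name made up for this example.

```python
from pathlib import Path

# Standard EDAC sysfs location; the directory is absent if no EDAC driver is loaded.
EDAC_ROOT = Path("/sys/devices/system/edac/mc")


def read_text(path):
    """Return the stripped file contents, or None if the file cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def dimm_error_counts(root=EDAC_ROOT):
    """Return (controller, dimm, label, corrected, uncorrected) tuples.

    Hypothetical helper for illustration; assumes the mc*/dimm* sysfs layout.
    """
    rows = []
    for dimm in sorted(Path(root).glob("mc*/dimm*")):
        ce = read_text(dimm / "dimm_ce_count")
        ue = read_text(dimm / "dimm_ue_count")
        if ce is None or ue is None:
            continue  # not a DIMM entry with counters, skip it
        label = read_text(dimm / "dimm_label") or "(unlabelled)"
        rows.append((dimm.parent.name, dimm.name, label, int(ce), int(ue)))
    return rows


if __name__ == "__main__":
    for mc, dimm, label, ce, ue in dimm_error_counts():
        print(f"{mc}/{dimm} [{label}]: corrected={ce} uncorrected={ue}")
```

The label is only as good as what the firmware or a labels database provides, so it may not match the silkscreen on the board.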