mteigers | 2 years ago

Ditto, my managed service at Google threw close to 3k 500s per second when I was still there. The causes were anything from cosmic rays and faulty hardware flipping bits to hard drive failures.

We did, however, aggregate and group similar 500s and those did get looked at, but no way could we have looked at all errors.
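To illustrate the kind of grouping meant here, a minimal sketch follows. It is not the actual tooling described above: the `fingerprint` and `group_errors` names and the sample messages are hypothetical, and it assumes that replacing numbers and hex values with placeholders is enough to make similar errors share a key.

```python
import re
from collections import Counter


def fingerprint(message):
    """Collapse variable parts of an error message so similar errors share a key."""
    # Replace hex literals first, so their digits are not caught by the next rule.
    message = re.sub(r"0x[0-9a-fA-F]+", "<hex>", message)
    # Replace any remaining runs of digits (ids, offsets, counts).
    message = re.sub(r"\d+", "<n>", message)
    return message


def group_errors(messages):
    """Count how many error messages fall under each fingerprint."""
    return Counter(fingerprint(m) for m in messages)


if __name__ == "__main__":
    # Hypothetical sample messages, for illustration only.
    sample = [
        "disk 12 read failed",
        "disk 7 read failed",
        "checksum mismatch at 0xdeadbeef",
    ]
    for key, count in group_errors(sample).most_common():
        print(count, key)
```

With counts per group, the largest or fastest-growing groups can be reviewed without reading every individual error.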

The other thing is that, with resilient infrastructure, who cares about an occasional 500? Back off and retry. No harm done.
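For readers unfamiliar with "back off and retry", here is a rough sketch of one common form, exponential backoff with full jitter. The `ServerError` exception, the `retry_with_backoff` helper, and its default parameters are assumptions made for illustration, not anything from the service described above.

```python
import random
import time


class ServerError(Exception):
    """Hypothetical exception standing in for an HTTP 5xx response."""


def retry_with_backoff(call, max_attempts=5, base_delay=0.1, max_delay=5.0,
                       sleep=time.sleep, rng=random.random):
    """Run call(), retrying on ServerError with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except ServerError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # The upper bound doubles on each attempt, capped at max_delay.
            cap = min(max_delay, base_delay * (2 ** attempt))
            # Full jitter: wait a random fraction of the bound so that many
            # clients do not all retry at the same instant.
            sleep(rng() * cap)
```

This only does no harm when the request is safe to repeat (idempotent), and the jitter matters at scale: without it, a burst of failures turns into a synchronized burst of retries.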

samus|2 years ago

User experience might indeed not be affected by these errors, but errors of a less stochastic nature will affect it. The former obscure the visibility of the latter, and that's probably the point of TA.

MichaelZuo|2 years ago

It's likely some fraction of those errors would have had some tangible impact on user experience.

willcipriano|2 years ago

3k? What percentage of requests is that?