bjacokes | 4 years ago

Great question, although I'm not sure there's a concrete answer to it other than "it depends". You can think of that metric as representing the number of undo log entries that haven't been purged (garbage collected) yet, so as it goes up, performance tends to get worse.

If you're seeing spikes in RollbackSegmentHistoryListLength that coincide with dips in DB performance, you've probably identified the culprit. In the scenario described in our post, that metric would have grown monotonically for the duration of the long-lived ETL query – probably a more overt problem than what you're describing with short spikes to 100,000.
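As a rough illustration of the difference between sustained growth and short spikes, here is a minimal Python sketch. It assumes you have already exported the metric as a plain list of evenly spaced samples; the function names and the `min_run` cutoff are hypothetical and would need tuning for a real workload.

```python
from typing import Sequence


def longest_growth_run(samples: Sequence[float]) -> int:
    """Length of the longest run of strictly increasing consecutive samples."""
    if not samples:
        return 0
    best = current = 1
    for prev, cur in zip(samples, samples[1:]):
        # Extend the run while the metric keeps climbing, otherwise restart.
        current = current + 1 if cur > prev else 1
        best = max(best, current)
    return best


def looks_like_long_transaction(samples: Sequence[float], min_run: int = 60) -> bool:
    """Heuristic only: flag a series that climbs for at least min_run samples.

    min_run is an assumed cutoff, e.g. 60 one-minute samples is about an hour.
    """
    return longest_growth_run(samples) >= min_run
```

A series that rises and falls within a few samples yields a short run, while a long-lived query that blocks purging would show up as one long uninterrupted climb.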

shepardrtc|4 years ago

A number of our 100k spikes spanned about a day, and a cluster of them seems to coincide with serious performance issues we have encountered. We "solved" the problem by increasing the instance size, but I'm starting to see spikes that get larger and larger, so I suspect we will run into this issue again. But now I have something to report on and watch out for. Thank you!
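One way to watch for spikes that "get larger and larger" is to extract the peak of each spike and check whether the peaks are trending up. This is only a sketch under assumptions: the metric is available as a list of samples, a spike means any stretch at or above a chosen threshold, and both function names are made up for illustration.

```python
from typing import List, Sequence


def spike_peaks(samples: Sequence[float], threshold: float) -> List[float]:
    """Return the maximum value of each contiguous stretch at or above threshold."""
    peaks: List[float] = []
    current_peak = None
    for value in samples:
        if value >= threshold:
            # Inside a spike: track its highest point so far.
            current_peak = value if current_peak is None else max(current_peak, value)
        elif current_peak is not None:
            # The spike just ended: record its peak.
            peaks.append(current_peak)
            current_peak = None
    if current_peak is not None:
        # The series ended while still inside a spike.
        peaks.append(current_peak)
    return peaks


def peaks_are_growing(peaks: Sequence[float]) -> bool:
    """True if there are at least two peaks and each is higher than the last."""
    return len(peaks) >= 2 and all(b > a for a, b in zip(peaks, peaks[1:]))
```

With a threshold of 100,000, for example, `peaks_are_growing(spike_peaks(samples, 100_000))` would be a simple signal to report on before the next performance dip.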