JB_Dev | 2 years ago

I’ve done something similar in one of our production services. There was a problem with extremely long GC pauses during Gen2 garbage collection (.NET uses a multigenerational GC design). Pauses could be many seconds long or more than a minute in extreme cases.

We found the underlying issue that caused memory pressure in the Gen2 region, but fixing it meant changing some very fundamental aspects of the service and would have required significant refactoring. Since this was a legacy service (.NET Framework) that we were already rewriting to run on modern .NET (5+), we decided not to fix the root cause in the old service.

Instead, we adjusted the GC to just never do the expensive Gen2 collections (via GCLatencyMode) and moved the service to higher-memory VMs. It would still hit OOM every 3 days or so, so we set the instances to auto-restart once a day.
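
For anyone wondering what that kind of change looks like, here is a minimal sketch, not the code we actually shipped. It assumes the SustainedLowLatency mode, which asks the runtime to avoid blocking Gen2 collections (it can still run one under real memory pressure). The GcTuning class and the PreferLowLatency method are hypothetical names used only for illustration.

```csharp
using System;
using System.Runtime;

static class GcTuning
{
    // Hypothetical helper: call once at startup, before the service takes traffic.
    public static void PreferLowLatency()
    {
        // Assumption: SustainedLowLatency is the mode in use here.
        // It asks the GC to avoid blocking Gen2 collections; the runtime
        // may still perform one if the system is under memory pressure.
        GCSettings.LatencyMode = GCLatencyMode.SustainedLowLatency;
    }

    public static void Main()
    {
        PreferLowLatency();

        // Report the mode that is actually in effect, plus the Gen2
        // collection count, which is useful to log while tuning.
        Console.WriteLine("Latency mode: " + GCSettings.LatencyMode);
        Console.WriteLine("Gen2 collections so far: " + GC.CollectionCount(2));
    }
}
```

The trade-off is the one described above: with full collections suppressed, the heap keeps growing, so this only works if you pair it with more memory and a scheduled restart.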

A year later we deployed the replacement for the legacy service, and the problem went away.
