(no title)
stlava | 2 years ago
- Connection pooling / pipelining and circuit breaking are a must at scale. The clients are a lot better than they used to be, but it's important that developers understand the behavior of the client library they are using. Someone suggested using Envoy as a sidecar proxy. I personally wouldn't after our experience using it with Redis, but it's an easy option.
- Avoid changing the cluster topology if the CPU load is over 40%. This is primarily to leave headroom in case of unplanned failures during the change.
- If something goes wrong, shed load on the application side as quickly as possible, because Redis won't recover while it's being hammered. You'll need to either have feature flags or be able to scale down your application.
- Having replicas won't protect you from data loss, so don't treat Redis as a source of truth. Also, don't rely on consistency in clustered mode.
- Remember Redis executes commands on a single thread, so an 8xl isn't going to be super useful with all those unused cores.
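To make the circuit-breaking and load-shedding points concrete, here is a minimal sketch of an application-side breaker in Python. It is not tied to any particular Redis client: `CircuitBreaker`, `CircuitOpenError`, `max_failures` and `reset_after` are hypothetical names and defaults picked for illustration, and real client libraries may ship their own retry or breaker hooks that behave differently.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is shed without reaching the backend."""


class CircuitBreaker:
    # max_failures and reset_after are illustrative defaults, not recommendations.
    def __init__(self, max_failures=5, reset_after=10.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Still cooling down: fail fast instead of hammering the backend.
                raise CircuitOpenError("shedding load; backend marked unhealthy")
            # Cool-down elapsed: fall through and let one trial call probe the backend.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many consecutive failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Usage would look something like `breaker.call(client.get, "key")`, where `client` is whatever Redis client object you already have. On `CircuitOpenError` the application falls back to a default or skips the cached path, which is the "shed load application side" behavior described above.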
Things we have alarms on by default:
- Engine utilization
- Anomalies in replication lag
- Network throughput (relative to the throughput of the underlying EC2 instance)
- Bytes used for cache
- Swap usage (this is the oh shit alarm)
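As a rough sketch of how that alarm list could be evaluated against one metrics sample, assuming you already collect the numbers somewhere: every metric key and every threshold below is a hypothetical placeholder, not a value from our setup, and the replication-lag check is a simple mean-plus-k-sigma rule standing in for whatever anomaly detection your monitoring system provides.

```python
from statistics import mean, pstdev

# All thresholds are illustrative assumptions; tune them for your own workload.
LIMITS = {
    "engine_cpu_pct": 80.0,    # engine (main thread) utilization
    "lag_sigma": 3.0,          # std devs above recent mean that count as anomalous
    "network_fraction": 0.8,   # share of the instance's network baseline
    "memory_fraction": 0.9,    # share of max memory used for cache
    "swap_bytes": 0,           # any swap at all fires the alarm
}


def check_alarms(sample, lag_history, limits=LIMITS):
    """Return the names of the alarms that fire for one metrics sample."""
    fired = []
    if sample["engine_cpu_pct"] > limits["engine_cpu_pct"]:
        fired.append("engine_utilization")
    if len(lag_history) >= 2:
        # Flag lag that sits well above its own recent behavior.
        threshold = mean(lag_history) + limits["lag_sigma"] * pstdev(lag_history)
        if sample["replication_lag_s"] > threshold:
            fired.append("replication_lag_anomaly")
    net = sample["network_bytes_per_s"] / sample["instance_network_bytes_per_s"]
    if net > limits["network_fraction"]:
        fired.append("network_throughput")
    mem = sample["bytes_used_for_cache"] / sample["max_memory_bytes"]
    if mem > limits["memory_fraction"]:
        fired.append("bytes_used_for_cache")
    if sample["swap_bytes"] > limits["swap_bytes"]:
        fired.append("swap_usage")
    return fired
```

The network check is expressed as a fraction of the instance's baseline rather than an absolute number, matching the "relative to the underlying EC2 instance" point, so the same limit carries over when the node type changes.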
rmbyrro|2 years ago