(no title)
billisonline | 4 years ago
May I ask how you dealt with this? Were you able to explain it to Amazon support and get some of these charges forgiven? Also, how would you recommend monitoring for this type of issue with Lambda?
Btw, this reminds me a lot of one of my own early career screw-ups, where I had a batch job uploading images that was set up with unlimited retries. It failed halfway through, and the unlimited retries caused it to upload the same three images 100,000 times each. We emailed Cloudinary, the image CDN we were using, and they graciously forgave the costs we had incurred for my mistake.
calmlynarczyk|4 years ago
AWS support caught it before we did, so they did something on their end to throttle the Lambda invocations. We asked for billing forgiveness from them; last I heard that negotiation was still ongoing over a year after it occurred.
Part of the problem was we had temporarily disabled our billing alarms at the time for some reason, which caused our team to miss this spike. We've enabled alerts on both billing and Lambda invocation counts to see if either go outside of normal thresholds. It still doesn't hard-stop this from occurring again, but we at least get proactively notified about it before it gets as bad as it did. I don't think we've ever found a solution to cut off resource usage if something like this is detected.
genewitch|4 years ago
BackBlast|4 years ago
Just to give you nightmares. There's been DDoS in the news lately, I'm surprised nobody has yet leveraged those bot nets to bankrupt orgs they don't like who use cloud autoscaling services.
I don't know how you monitor it, part of the issue is the sheer complexity. How do you know what to monitor? The billing page is probably the place to start - but it is too slow for many of these events.
I guess you could start with the common problems. Keep watchdogs on the number of lambdas being evoked, or any resource you spin up or that has autoscaling utilization. Egress bandwidth is definitely another I'd watch.
Dunno, just seems to me you'd need to watch every metric and report any spikes to someone who can eyeball the system.
For me? I limit my exposure to AWS as much as I reasonably can. The possibilities combined with the known nightmare scenarios, with a "recourse" that isn't always effective doesn't make for good sleep at night.
aynsof|4 years ago
AWS Shield Advanced actually offers DDoS cost protection to mitigate this specific risk: https://aws.amazon.com/shield/features/
rileymat2|4 years ago
That’s interesting because I seems like it would happen, but what is in it for the attacker, whrn under threat they can implement caps?
jamesfinlayson|4 years ago
jamesfinlayson|4 years ago