The last time I tried this (4 years ago?), there were significant delays getting messages to some people. We were all on AT&T plans with work phones but some took 8-10 hours to get the alert.
Those can be good; I'm not sure how reliable they are, and of course you've got to send the message to the right carrier.
I'm volunteering for a non-profit right now on a project that sends SMS messages through Twilio. I think their current cost is $0.0075 for a US-to-US message, and interfacing with Twilio is easy, if not a joy: their API is sane and the online documentation is excellent. This approach gives you the whole 160-character budget to describe the problem.
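A minimal sketch of this approach, assuming the Twilio REST Messages endpoint and hypothetical credentials and phone numbers (the helper that trims the alert to the 160-character single-segment budget is the part worth stealing):

```python
# Sketch: send a monitoring alert as SMS via Twilio's REST API.
# account_sid, auth_token, and the phone numbers are placeholders.
import base64
import urllib.parse
import urllib.request

def make_alert_body(host, problem, limit=160):
    """Build the SMS body, truncating to one 160-character segment."""
    body = f"ALERT {host}: {problem}"
    return body if len(body) <= limit else body[: limit - 3] + "..."

def send_sms(account_sid, auth_token, from_num, to_num, body):
    # Twilio's Messages resource: POST form data with HTTP basic auth.
    url = f"https://api.twilio.com/2010-04-01/Accounts/{account_sid}/Messages.json"
    data = urllib.parse.urlencode(
        {"From": from_num, "To": to_num, "Body": body}
    ).encode()
    req = urllib.request.Request(url, data=data)
    creds = base64.b64encode(f"{account_sid}:{auth_token}".encode()).decode()
    req.add_header("Authorization", f"Basic {creds}")
    return urllib.request.urlopen(req)
```

In practice you'd likely use Twilio's official helper library instead of raw HTTP; the sketch just shows how little is involved.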
At https://t1mr.com we don't use SMS for notification; it's too unreliable. Beyond delivery issues, there are benefits if the monitor knows whether someone has really seen the notification: we know when somebody answers a call, but we can't know whether they have actually seen the SMS.
SMS probably won't wake you up or get your attention if you're busy, though. If your phone gets push email, then email and SMS are more or less the same: you get a push-notification buzz and something on the home screen.
We use PagerDuty and couldn't be happier with the product.
Ops in a modern startup (based on my experience at Crittercism) is about 1/3 automation (deploys, backups, cronjobs, etc.), 1/3 monitoring, and 1/3 vendor/product eval (hosting, plus various consultants for things like database tuning).
The hardest part about monitoring isn't making the tool go off, it's (1) knowing when something is broken and (2) knowing who needs to be alerted when that thing is not working. "Tell the whole team" breeds an attitude of "this is someone else's problem", and also prevents real work/progress from happening during incident response. You have to get away from the "all hands on deck" during an incident once your company gets beyond about 3-4 engineers or your feature velocity is going to get destroyed.
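The "who needs to be alerted" routing described above can start as something as small as a lookup table, so alerts go to a service's owner instead of the whole team (the services and addresses here are hypothetical):

```python
# Sketch: route an alert to a service owner rather than "all hands".
# OWNERS and FALLBACK are made-up example data.
OWNERS = {
    "api": "alice@example.com",
    "billing": "bob@example.com",
}
FALLBACK = "oncall@example.com"  # catch-all for unowned services

def route_alert(service):
    """Return the address responsible for a service, or the on-call fallback."""
    return OWNERS.get(service, FALLBACK)
```

Even this trivial table forces the useful conversation: every alertable thing must have an owner, or it falls through to the on-call catch-all.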
Also, as your company gets larger, you'll find that managing the communication around the incident is just as important as fixing the problem. Customers HATE being left in the dark, so it's important to figure out who needs to know things are broken (internally and externally) and how that's communicated.
Heroku did an excellent writeup on this topic recently: https://blog.heroku.com/archives/2014/5/9/incident-response-... -- even if you don't adopt the full system outlined there, at least ensure you're thinking about it, especially the communication part.
OpsGenie has a free tier as well. They recently added a unique feature that routes calls to a support number to the engineers on duty (callers can record a message, which is attached to the incident). Outside of the butt-ugly UI (which they assured me they're working to modernize), OpsGenie beats PagerDuty in many respects: it's cheaper, though not as popular, and they had an Android app and push notifications way before PagerDuty did.
It's a nice hack, but it's better to use an external service for this. We run https://t1mr.com
One problem is that you need separate infrastructure to host your monitor. You also need to monitor the monitoring service itself, or it can quietly fail until your real site fails without warning. We run a separate instance that only monitors the public instance of t1mr.
Attention to detail matters, and you really want to be focusing on your product. We also quietly handle other stuff, like calling multiple people until someone actually answers the phone, or checking whether your SSL certificates are about to expire.
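The certificate-expiry check mentioned above can be sketched with Python's standard library (this is an illustration, not t1mr's implementation; `check_host` makes a live TLS connection, while `days_until_expiry` works on the dict `getpeercert()` returns):

```python
# Sketch: warn when a host's TLS certificate is close to expiry.
import datetime
import socket
import ssl

def days_until_expiry(cert, now=None):
    """Days until the certificate's notAfter date.

    `cert` is a dict shaped like ssl.SSLSocket.getpeercert() output,
    e.g. {"notAfter": "Jun  1 12:00:00 2030 GMT"}.
    """
    expires = datetime.datetime.strptime(cert["notAfter"], "%b %d %H:%M:%S %Y %Z")
    now = now or datetime.datetime.utcnow()
    return (expires - now).days

def check_host(host, port=443, warn_days=14):
    """Fetch the live certificate and return (days_left, needs_warning)."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            days = days_until_expiry(tls.getpeercert())
    return days, days < warn_days
```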
If all you need is to monitor a single endpoint, just sign up for a Pingdom free account. Very reliable monitoring and 20 SMS notifications per month (no cap on email notifications): https://www.pingdom.com/free/
And if you need to monitor more than one system, go for Pingdom's "Starter" plan for only £6.99/month: https://www.pingdom.com/pricing/
IMO that's fairly cheap and avoids yet another system to maintain.
saluki | 12 years ago
https://ifttt.com/p/saluki
You get 100 FREE SMS alerts from IFTTT.com each month.
If you run up against the 100 SMS limit you can set it up for iMessage instead of SMS.
jewel | 12 years ago
If you don't want to set up nagios, you can create a quick monitoring solution with cron along these lines:
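The original snippet didn't survive, but a check along those lines might look like the following sketch (the URL and alert address are placeholders, and it assumes a local `sendmail` binary as the comment below notes):

```python
#!/usr/bin/env python3
# Sketch: minimal cron-driven uptime check with email alerting.
# A crontab entry for it might look like:
#   */5 * * * * /usr/local/bin/check_site.py
import subprocess
import urllib.error
import urllib.request

URL = "https://example.com/health"   # hypothetical health endpoint
ALERT_TO = "oncall@example.com"      # hypothetical alert address

def site_is_up(url, timeout=10):
    """True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

def send_alert(to_addr, subject, body):
    """Hand the message to the local MTA via sendmail."""
    msg = f"To: {to_addr}\nSubject: {subject}\n\n{body}\n"
    subprocess.run(["/usr/sbin/sendmail", "-t"], input=msg.encode(), check=True)

def main():
    if not site_is_up(URL):
        send_alert(ALERT_TO, f"DOWN: {URL}", "Health check failed.")
```

A real version would also want some flap suppression so a transient blip doesn't page you every five minutes.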
This assumes you have working email delivery on the machine doing the checking.
leeoniya | 12 years ago
A more complete list is here: https://discussions.apple.com/thread/5913116
It's free, but with Sprint, for example, it prepends "Subject: " to the front of the text. Not great, but I guess it might be an OK compromise for "free".
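Using those carrier gateways from a script is just sending email to the right domain. A sketch, with the gateway domains as commonly listed for US carriers (verify them against the list linked above, since carriers change these):

```python
# Sketch: build a carrier email-to-SMS gateway address from a phone number.
# Gateway domains are as commonly published; treat them as unverified.
GATEWAYS = {
    "att": "txt.att.net",
    "tmobile": "tmomail.net",
    "verizon": "vtext.com",
    "sprint": "messaging.sprintpcs.com",
}

def gateway_address(number, carrier):
    """Return the email address the carrier relays to the phone as SMS."""
    digits = "".join(ch for ch in number if ch.isdigit())
    return f"{digits}@{GATEWAYS[carrier]}"
```

The catch, as noted above, is that you have to know each recipient's carrier, and the gateways make no delivery guarantees.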
kbar13 | 12 years ago
[0] http://www.pagerduty.com/
matnewton85 | 12 years ago
Love my Zapier.
umutm | 12 years ago
P.S. I'm a developer at Uptime Robot and Everyone Panic is a very handy integration, great job.