My young child has Type 1 diabetes. I need to know when the server that receives glucose readings from the CGM on his body goes down. Knowing when his blood glucose is low AND when there are no new blood glucose readings is a huge help in preventing emergencies and death. I’m going to try this service. Thank you for sharing!
Thank you for your comment! Wow, that sounds like a very important use case. Please feel free to reach out to me at admin@varonova.ca if you run into anything while getting set up; once you sign up I'll double-check that everything is configured correctly (any unauthenticated endpoint that returns a 200 response will work; /health is best).
Built YourServerIsDown.com as a side project that we needed for our startup... anyone else have the issue of not finding out quickly enough if your server went down?
For our app it's super important: if our server goes down, users can still download the app but get stuck at the sign-in flow. There are subscription services out there that do more in-depth monitoring, but this is all we needed.
I listed an alternative solution below for those wanting to build or customize their own. Ours just gets the job done, is quick to set up, and lets you avoid the monthly Twilio/SMS fees.
Other alternative we received as feedback for those interested: "If any one wants an AWS Native way and assuming it has ALB you can target elb metric 503 via Cloudwatch Alarm and create an output to an SNS topic that goes to Slack, or use AWS chatbot/q, or set number as destination for sms via sns"
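For anyone who wants to try that AWS-native route, here is a rough sketch of the CloudWatch alarm settings it describes. It only builds the parameters; the alarm name, load balancer dimension, SNS topic ARN, and thresholds are all placeholder assumptions, and you would still need to wire the SNS topic to Slack or SMS yourself.

```python
def build_alb_503_alarm(load_balancer_dim: str, sns_topic_arn: str) -> dict:
    """Build keyword arguments for CloudWatch's PutMetricAlarm call.

    Every concrete value here (name, period, threshold) is an illustrative
    assumption, not a recommendation from the thread.
    """
    return {
        "AlarmName": "alb-503-spike",  # hypothetical name
        "Namespace": "AWS/ApplicationELB",
        "MetricName": "HTTPCode_ELB_503_Count",
        "Dimensions": [{"Name": "LoadBalancer", "Value": load_balancer_dim}],
        "Statistic": "Sum",
        "Period": 60,             # evaluate one-minute buckets
        "EvaluationPeriods": 1,
        "Threshold": 1,           # alarm on the first 503 in a minute
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "notBreaching",  # no traffic is not an outage
        "AlarmActions": [sns_topic_arn],
    }


alarm = build_alb_503_alarm(
    "app/my-alb/0123456789abcdef",                      # placeholder
    "arn:aws:sns:us-east-1:123456789012:server-down",   # placeholder
)
# With boto3 installed and credentials configured, this could be applied as:
#   boto3.client("cloudwatch").put_metric_alarm(**alarm)
```

Note that this metric is only recorded when the load balancer itself is up and receiving requests, so it covers a different failure mode than an external ping.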
I suppose you're right, unless your server goes down more than 50 times. I saw it as credits that expire in a year, would be a bit scary to offer monitoring in perpetuity for $5 if they didn't expire.
Consider changing your landing page to reflect the price (“Only $5 Annually!”). The reason I asked was because the way it is now makes it look like the service is being offered for free, which made me think it was a phishing scheme.
Not quite - just thought of what I would actually pay for it. Looked at others and saw most done for you solutions were monthly subscriptions ~$9. Couldn't see myself being excited by that pricing model + another dashboard to manage, so I made it $5 for the year and built the most simple solution that's easy to set up and reliable (while not losing money).
Almost all monitoring services I found target enterprise, and the ones that don't are self-hosted. This solution is for the small teams/indie devs that just need to know when their server's down. I might raise the price though; I'm thinking the low price might work against me for quality perception. What do you think?
Hi there thanks for purchasing! Currently we have it set to just one endpoint per account (MVP Launch) but let me add support for more today. How many endpoints are you looking to monitor?
This tool wouldn’t be useful for most (if not all) enterprise services I’ve worked for. For enterprise, you want fully featured synthetics services such as Thousand Eyes, plus an internal monitoring and alerting system.
Also you typically don’t want to expose your health endpoint to the outside world. It’s a security risk.
It's aimed at indie devs/startups shipping ideas quick. Built it for ourselves while we were starting an app under the aws free tier which occasionally went down when usage spiked. Notified us to fix it quickly before losing users that could download the app but not create an account. It can be set up in 30 seconds without needing to code anything, so mainly for coders that want a quick and easy solution.
So not aiming for enterprise on this one, made the pricing quite accessible and with minimal features.
For the health endpoint as long as it only returns a 200 status code (without disclosing info like tokens or resource info/server configurations) then the risk is very minimal.
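To make that concrete, here is a minimal sketch of a health endpoint that answers 200 and nothing else: no body, no version strings, no config. It is written as a plain WSGI app using only the standard library; the name health_app and the /health path are just illustrative choices, and your framework will have its own way of doing the same thing.

```python
def health_app(environ, start_response):
    """Answer GET /health with an empty 200; everything else gets a 404.

    The response deliberately carries no body, so there is nothing
    (tokens, hostnames, config) for an outside caller to read.
    """
    is_health = (
        environ.get("REQUEST_METHOD") == "GET"
        and environ.get("PATH_INFO") == "/health"
    )
    status = "200 OK" if is_health else "404 Not Found"
    start_response(status, [("Content-Length", "0")])
    return [b""]


# To try it locally (port 8080 is an arbitrary choice):
#   from wsgiref.simple_server import make_server
#   make_server("", 8080, health_app).serve_forever()
```

Keep in mind that the web server in front of the app may still add its own headers (for example a Server header with a version), so check what actually goes over the wire.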
Different use cases, ours is for people who prefer simplicity and just want a text to know if their server goes down. We charge $5 for the year, this service charges $108 (But is aimed at larger enterprises and provides more analytics like monitoring logs/custom status pages). I'm sure their service is great for their enterprise customers, but for smaller teams I feel a text when your server goes down solves the key pain point.
With that being said, I find that these kinds of notifications produce more false positives than correct detections of downtime. That ends up costing more time checking/double-checking.
On the other hand, if you are running a service with no users and you have downtime... did you really have downtime?
If you run a service and you have downtime and no one reports it, did you have downtime?
I don't even check for my services. If something goes down, I'll find out via email from one or more of my customers. It happens very rarely.
If a tree falls in the forest and no one is around to hear it, does it make a sound?
You bring up a good point. I think it's less of a problem for more established companies that don't face unexpected outages too often. When we were starting out with our mobile app, however, this wasn't the case, and each outage meant lost downloads, which were critical for getting early feedback. I see it as a bigger pain point for early founders/small teams whose servers could see a lot of volatility.
So far we haven't encountered any false positives (been using it for around 6 months) but perhaps with the wrong endpoint that could be a problem. I'll keep an eye out for that.
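One common way to cut down on false positives, for anyone rolling their own, is to alert only after several failed checks in a row rather than on the first one. The sketch below is an illustration of that idea and not a description of how this service works; the class name and the threshold of 3 are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DowntimeDetector:
    """Turn a stream of pass/fail checks into 'down' / 'recovered' events."""

    failures_to_alert: int = 3      # assumed threshold; tune to taste
    consecutive_failures: int = 0
    alerted: bool = False

    def record(self, check_passed: bool) -> Optional[str]:
        if check_passed:
            was_down = self.alerted
            self.consecutive_failures = 0
            self.alerted = False
            return "recovered" if was_down else None
        self.consecutive_failures += 1
        # Alert once when the threshold is reached, then stay quiet
        # until the endpoint has recovered.
        if self.consecutive_failures >= self.failures_to_alert and not self.alerted:
            self.alerted = True
            return "down"
        return None


detector = DowntimeDetector()
checks = [False, True, False, False, False, False, True]
events = [detector.record(ok) for ok in checks]
# A single blip produces no alert; three failures in a row produce one.
```

The trade-off is detection delay: with one check per minute and a threshold of 3, the text arrives roughly two minutes later than it would with a threshold of 1.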
Correct. It requires an unauthenticated endpoint that returns a 200 response. Usually this is the /health endpoint, but as long as we can send a ping it works.
OK, how does it actually work? I get that you'll check for 500 errors by hitting multiple endpoints every x units of time. But the number of endpoints you must check also keeps going up for your service. Today you start with 10 endpoints; 6 months down the line you need to check 10,000 endpoints every x units of time. How do you manage scaling this?
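If you want to test your own endpoint against that rule before signing up, a check along these lines should behave similarly. This is a guess at the logic using only the Python standard library, not the service's actual code; the function name, the 10-second timeout, and the injectable opener (there to make it testable without a network) are assumptions.

```python
import http.client
import urllib.request


def is_up(url: str, timeout: float = 10.0, opener=urllib.request.urlopen) -> bool:
    """Return True only if a GET to `url` answers with HTTP 200."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # urllib raises HTTPError for 4xx/5xx responses; it and URLError are
        # OSError subclasses, as are timeouts and refused connections, so
        # all of those count as "down".
        return False


# Example (hypothetical URL):
#   is_up("https://example.com/health")
```

An endpoint behind a login will typically answer 401 or 403, or redirect to a sign-in page, which is why an unauthenticated path is required.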
Right, we ping the servers every minute. Since we charge a one-time fee the credits expire after a year, but the service is scalable. To answer your question I'll give you some more context:
The architecture uses scalable AWS serverless components (Lambda, SQS, DynamoDB) and is well-suited to handle a large increase in monitored endpoints. The primary scaling mechanism is the automatic concurrency scaling of the Lambda functions processing messages from SQS queues. Should we scale to 10,000 endpoints we do expect some bottlenecks that would require optimizing, e.g. increasing Lambda timeouts/memory, but we'll cross that bridge when we get to it.
For the actual SMS sending, our numbers can send up to 100 texts/second.
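As a back-of-the-envelope check on that 10,000-endpoint scenario, the number of checks in flight at once is roughly endpoints × average check duration ÷ check interval. The sketch below just does that arithmetic; the half-second average and the 30-second worst case are made-up inputs, and it assumes one endpoint per Lambda invocation, which the thread does not confirm.

```python
import math


def required_concurrency(
    endpoints: int,
    avg_check_seconds: float,
    interval_seconds: float = 60.0,
) -> int:
    """Estimate how many checks run at the same time on average.

    Assumes checks are spread evenly over the interval and that each
    invocation handles a single endpoint (both are assumptions).
    """
    return math.ceil(endpoints * avg_check_seconds / interval_seconds)


typical = required_concurrency(10_000, avg_check_seconds=0.5)   # healthy targets
worst = required_concurrency(10_000, avg_check_seconds=30.0)    # every check times out
```

With these inputs the typical case needs about 84 concurrent invocations and the all-timeouts case about 5,000, which suggests that timeout length matters far more for scaling than endpoint count alone.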
So if one hosts one's site on AWS, then your system probably isn't going to work, eh? :)
If AWS goes down, your site and mine both go down together. This was basically why PagerDuty got out to an early win -- they never used AWS when everyone else did.
Where you host is underrated. When I started building on-call for Rootly the first thing we did was build a multi-cloud setup (AWS and GCP) for honestly pretty overkill reliability. Don’t regret it one bit.
Thank you for the detailed responses. So I understand that you have a Lambda function that fires a request to fetch a website URL from DynamoDB. Since Lambdas require a memory limit and a timeout, how much memory is each function using, and what is the timeout for a request (30s?)? Also, does each Lambda function handle a single URL, or are we doing asyncio/aiohttp stuff with a whole bunch of URLs in one go?
There are services like Textbelt that leave the trigger mechanisms all up to you and your local tools:
https://textbelt.com/
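For the do-it-yourself route, sending the text is a single HTTP POST. The sketch below only builds the request, using the phone, message, and key form fields from Textbelt's public examples; the function name and the sample values are placeholders, and you would pair it with whatever check or cron job triggers the alert.

```python
import urllib.parse
import urllib.request

TEXTBELT_URL = "https://textbelt.com/text"


def build_textbelt_request(phone: str, message: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a form-encoded POST for Textbelt."""
    body = urllib.parse.urlencode(
        {"phone": phone, "message": message, "key": api_key}
    ).encode("utf-8")
    return urllib.request.Request(TEXTBELT_URL, data=body, method="POST")


request = build_textbelt_request(
    "5555555555",                       # placeholder number
    "example.com/health is down",       # placeholder message
    "YOUR_API_KEY",                     # placeholder key
)
# To actually send it:
#   urllib.request.urlopen(request, timeout=10)
```

The response is JSON; check it for a success indication before assuming the text went out.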