False alerts and dashboard 503 errors
Incident Report for Dead Man's Snitch
Resolved
All systems are green. Heroku has resolved its upstream issue, and we've migrated all processing back to Heroku.

Our processing system should have recovered more quickly than it did, and we're investigating a possible fix for that.
Posted Sep 27, 2021 - 08:55 EDT
Update
Routing issues with the API and Dashboard appear to be mostly resolved. We are migrating some check-in processing back to Heroku and will continue to monitor the situation.
Posted Sep 27, 2021 - 08:10 EDT
Monitoring
Our check-in workers have caught up on all pending check-ins and alerts should be accurate going forward.

Our main goal has been to get alerting and check-in processing back online. Heroku continues to experience issues with dynos and routing requests. We've worked around the dyno issues by temporarily moving check-in processing to hosts on EC2.

We are monitoring check-in processing and Heroku's status and will post an update once we consider the issue fully resolved.
Posted Sep 27, 2021 - 05:57 EDT
Update
We've worked around the issues with check-in processing and are currently working through the backlog of pending check-ins in the queue.

Our check-in receiver does not appear to have been impacted by the outage; only the workers that process the check-ins were affected.
Posted Sep 27, 2021 - 05:44 EDT
Identified
We've temporarily disabled alerting as we investigate a way to work around the upstream issues.
Posted Sep 27, 2021 - 04:48 EDT
Investigating
We're currently investigating an issue affecting check-in processing and dashboard availability. We believe it is related to two major issues affecting our hosting provider (Heroku).

https://status.heroku.com/incidents/2361
https://status.heroku.com/incidents/2362
Posted Sep 27, 2021 - 04:04 EDT
This incident affected: Snitch Check-in Processing, Management Portal, and API.