We've traced the root cause to a timeout error during check-in processing that wasn't handled correctly and put the process into a bad state. We're working on a fix and should have it deployed shortly.
Check-in processing stopped at 11:47 UTC, but we weren't made aware of the issue until 11:57 UTC. In reviewing our metrics and alerting, we've identified a better metric to alert on and will be working it into an update to our internal monitoring and alerting systems.
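For illustration only, here is a minimal sketch of the kind of failure described above: a processing loop that catches a timeout per item, so one slow check-in doesn't leave the whole worker stuck. This is not our production code. The names `drain`, `pending`, and `process_check_in` are hypothetical and exist only for this example.

```python
import queue


def drain(pending, process_check_in):
    """Process every queued check-in; return (processed, timed_out) lists.

    Hypothetical sketch: `pending` is a queue.Queue of check-in ids and
    `process_check_in` is any callable that may raise TimeoutError.
    """
    processed, timed_out = [], []
    while True:
        try:
            check_in = pending.get_nowait()
        except queue.Empty:
            # Nothing left to process.
            break
        try:
            process_check_in(check_in)
            processed.append(check_in)
        except TimeoutError:
            # Record the failure and keep going, so a single timeout
            # cannot leave the loop in a bad state.
            timed_out.append(check_in)
    return processed, timed_out
```

In a sketch like this, the timed-out items are returned to the caller, which can retry them or raise an alert instead of silently stalling.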
Posted Jan 04, 2021 - 10:54 EST
Update
We are continuing to monitor for any further issues.
Posted Jan 04, 2021 - 07:28 EST
Monitoring
We've restarted the affected service and confirmed that it's processing correctly. It has now caught up on the backlog of pending check-ins.
We're continuing to investigate the root cause.
Posted Jan 04, 2021 - 07:28 EST
Investigating
We're investigating an issue with check-in processing starting around 11:45 UTC.
Posted Jan 04, 2021 - 07:14 EST
This incident affected: Snitch Check-in Processing.