Sunday, July 24, 2022

Alerts & alerting again


Dan Ravenstone published a refresher on alerting that's worth a quick listen or skim.  Alerts are supposed to help us improve a service by letting us detect issues affecting our customers sooner, and they should also help us diagnose the issue quickly so that it can be mitigated, alleviating customer pain.  In practice, most alerts are false positives, paging us out of bed for a problem that does not exist.  Other alerts carry no information about what is causing the issue, just a vague alarm that something is wrong.  And, of course, missing alerts are the most frequent reason we learn that our service is not working from our customers instead of from our alerting system.
