At the same time, we had a page that was updated by software that monitored multiple application logs, looking for certain key phrases. Often, when an outage or problem was occurring, we would be notified by at least one of the monitoring methods, and many times by both.
The generalized lesson (even beyond software) is that it's good to monitor multiple key indicators that are independently generated. It increases your chances of catching problems early.
In fact, we found that some messages, if they occurred in a log file, were like Spiderman's "Spidey sense!" They indicated that a problem was about to occur in the next 5-10 minutes and allowed us to take preventive measures.
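To make the key-phrase idea concrete, here is a minimal sketch of a log scanner in Python. It is not the software we actually ran. The phrases, the log names, and the `scan_lines` helper are all hypothetical, chosen only to illustrate the technique.

```python
import re
from typing import Iterable, Iterator, NamedTuple

# Hypothetical early-warning phrases; the real ones depend on your applications.
KEY_PHRASES = [
    "connection pool exhausted",
    "queue depth exceeded",
    "timeout waiting for",
]


class Alert(NamedTuple):
    source: str  # which log the line came from
    phrase: str  # the key phrase that matched (lowercased)
    line: str    # the full log line


def scan_lines(source: str, lines: Iterable[str],
               phrases: Iterable[str] = KEY_PHRASES) -> Iterator[Alert]:
    """Yield an Alert for every line that contains one of the key phrases."""
    # One case-insensitive pattern that matches any of the phrases literally.
    pattern = re.compile("|".join(re.escape(p) for p in phrases), re.IGNORECASE)
    for line in lines:
        match = pattern.search(line)
        if match:
            yield Alert(source, match.group(0).lower(), line.rstrip("\n"))


# Usage: scan several (made-up) logs and collect alerts from all of them.
sample_logs = {
    "billing.log": ["INFO started", "WARN Connection pool exhausted on db1"],
    "orders.log": ["INFO ok"],
}
alerts = [a for name, lines in sample_logs.items() for a in scan_lines(name, lines)]
for alert in alerts:
    print(f"[{alert.source}] matched '{alert.phrase}': {alert.line}")
```

In practice you would feed `scan_lines` from open file handles rather than in-memory lists, and send the alerts to a status page or pager instead of printing them.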
© 2024 Praveen Puri