If you work in IT for any length of time you will have heard someone shouting about an event ID or an alert going off again or that there’s an incident in progress, but what are these and should you really care about them?
The short answer to this question is yes, you should pay attention to them. But, I hear you ask, what is an alert, an event or even an incident?
At any time, your device, whether it’s a mobile, laptop, desktop or network device, will be generating logs of some sort. From these logs, events, alerts and incidents can be generated.
An event occurs when something in the environment changes in a way that is not normal behaviour: for example, a computer has gone offline, disk space is about to run out or a process workflow has failed. A web server that stops processing requests, taking the service down and impacting the business, is another example.
An alert is a notification that has been configured to respond to a predetermined event, or series of events, that has occurred and is not considered normal behaviour. The alert then follows a predefined course of action, which usually means notifying someone responsible about the issue.
An example of this is a server running out of disk space: the alert sees that the low disk space event has gone to red (or failed, or bad) and sends an email or text message to someone on call.
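The disk space example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 10% threshold, the function names and the `notify` callback (standing in for whatever sends the email or text message) are hypothetical and not taken from any particular monitoring product.

```python
import shutil
from typing import Callable, Optional

# Hypothetical threshold: treat less than 10% free space as abnormal.
LOW_DISK_THRESHOLD = 0.10


def check_disk_event(free_bytes: int, total_bytes: int,
                     threshold: float = LOW_DISK_THRESHOLD) -> Optional[str]:
    """Return a description of the low disk space event, or None if all is normal."""
    free_ratio = free_bytes / total_bytes
    if free_ratio < threshold:
        return f"low disk space: {free_ratio:.0%} free"
    return None


def alert_on_disk(path: str, notify: Callable[[str], None],
                  usage_fn=shutil.disk_usage) -> bool:
    """Check one path and call notify() if the low disk space event has occurred.

    usage_fn defaults to shutil.disk_usage and can be swapped out for testing.
    """
    usage = usage_fn(path)
    event = check_disk_event(usage.free, usage.total)
    if event is not None:
        # The alert: a predefined action taken in response to the event.
        notify(f"ALERT on {path}: {event}")
        return True
    return False
```

Calling `alert_on_disk("/", print)` would print an alert only when the disk is nearly full. In practice `notify` would be replaced by whatever paging, email or text message integration is in use.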
An incident occurs when the logs produce an event, the event causes an alert to fire, and the alert leads to an incident being raised because service, or the Confidentiality, Integrity and/or Availability (CIA) of the business, has been impacted.
An example of this is a data breach: the logs may show an increased transfer of data, the event for high bandwidth output is triggered, and that fires an alert to the people on call, who investigate the notification, see that there is indeed an issue and raise an incident to deal with the problem at hand.
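The chain from logs to events to alerts to incidents in the data breach example might look something like the sketch below. The record types, the 500 MB per minute baseline and the host names are all assumptions made up for illustration. The `confirmed` flag stands in for the human investigation step.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical baseline: outbound transfer above this many MB per minute is abnormal.
BANDWIDTH_LIMIT_MB = 500


@dataclass
class LogRecord:
    host: str
    mb_out_per_min: float


@dataclass
class Event:
    host: str
    description: str


@dataclass
class Incident:
    host: str
    summary: str


def detect_events(logs: List[LogRecord],
                  limit: float = BANDWIDTH_LIMIT_MB) -> List[Event]:
    """Logs -> events: keep only records that deviate from normal behaviour."""
    return [Event(r.host, f"high outbound bandwidth: {r.mb_out_per_min:g} MB/min")
            for r in logs if r.mb_out_per_min > limit]


def build_alerts(events: List[Event]) -> List[str]:
    """Events -> alerts: the notification text sent to whoever is on call."""
    return [f"ALERT {e.host}: {e.description}" for e in events]


def raise_incident(event: Event, confirmed: bool) -> Optional[Incident]:
    """Alert -> incident: only raised once a person confirms there is real impact."""
    if not confirmed:
        return None
    return Incident(event.host, f"suspected data breach ({event.description})")


logs = [LogRecord("web01", 40), LogRecord("db01", 900)]
events = detect_events(logs)      # only db01 exceeds the baseline
alerts = build_alerts(events)     # one notification for the on-call person
incident = raise_incident(events[0], confirmed=True)
```

Keeping each stage as its own function mirrors the definitions above: most log lines never become events, and not every alert becomes an incident.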
Putting them all together
When you put all of the above together, they can flow smoothly. However, the hardest piece of the puzzle is identifying what you want to log and what you want to alert on. After all, you don’t want to suffer alert fatigue and eventually ignore all the alerts that come through.
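One common way to reduce alert fatigue is to stop the same alert from being sent repeatedly within a short window. The sketch below is a minimal illustration of that idea, not a recommendation of specific values: the class name, the 15 minute default cooldown and the alert keys are all hypothetical.

```python
from typing import Dict


class AlertSuppressor:
    """Suppress repeats of the same alert within a cooldown window."""

    def __init__(self, cooldown_seconds: float = 900.0):
        self.cooldown_seconds = cooldown_seconds
        # Maps an alert key (e.g. "db01:low-disk") to the time it was last sent.
        self._last_sent: Dict[str, float] = {}

    def should_send(self, alert_key: str, now: float) -> bool:
        """Return True if this alert should be sent at time `now` (in seconds)."""
        last = self._last_sent.get(alert_key)
        if last is not None and now - last < self.cooldown_seconds:
            return False  # same alert fired recently, so stay quiet
        self._last_sent[alert_key] = now
        return True
```

With a 60 second cooldown, an alert sent at time 0 would be suppressed at 30 seconds and allowed again at 60 seconds, while a different alert key is unaffected. Passing `now` in explicitly, for example from `time.time()`, keeps the behaviour easy to check.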
Having an effective process in place which monitors and alerts on core services is a start. A good process ensures that you are alerted on loss of service and that you have alerts in place before incidents occur.