Thursday, September 3, 2009

Criteria for Triggering an IT Escalation – some examples

"Escalation" is often mentioned when dealing with Incident and Problem Management processes. The ITIL Incident Management process talks about Hierarchical and Functional escalations. Hierarchical escalations are where higher management attention and additional resources are needed. Functional escalations are done when incidents are assigned to next-level support personnel with greater expertise to work on them. Criteria for triggering such escalations are very much organisation-dependent. The majority of escalations will be initiated from Incident Management and there are a number of triggers, both time-based and event-based.

Incident Management handles many incidents each month and only a very small percentage will require escalation. It is important that Incident Management adhere diligently to the triggers. Failure to do so may result in Customer anxiety and dissatisfaction. On the other hand, "false alarms" or "crying wolf" may lead to increased costs due to the additional attention and resources required to manage the Customer.

Hence the criteria should be well defined, documented and made known to the Service Desk and other IT Support functions. These criteria could be embedded within support tools to help the support staff. The tool could also be used to control the process flow within agreed timescales for those escalations that are triggered by prolonged service outages.
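As an illustration of how a time-based trigger might be embedded in a support tool, here is a minimal Python sketch. The function name, the four-hour target and the 75% warning point are all assumptions made for the example; real values would come from the agreed service levels.

```python
from datetime import datetime, timedelta

# Hypothetical values for illustration only; use the agreed service levels.
SLA_TARGET = timedelta(hours=4)   # assumed agreed restoration time
WARNING_FRACTION = 0.75           # assumed point at which a breach is "threatened"


def outage_escalation_status(started_at, now,
                             sla_target=SLA_TARGET,
                             warning_fraction=WARNING_FRACTION):
    """Classify an ongoing outage against a time-based escalation trigger."""
    elapsed = now - started_at
    if elapsed >= sla_target:
        return "escalate: service level exceeded"
    if elapsed >= sla_target * warning_fraction:
        return "escalate: service level at risk"
    return "no escalation"


if __name__ == "__main__":
    start = datetime(2009, 9, 3, 9, 0)
    print(outage_escalation_status(start, datetime(2009, 9, 3, 12, 30)))
```

The warning band reflects the idea that an outage which threatens to exceed the service level is already a reason to escalate, rather than waiting for the breach itself.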

Examples of reasons to initiate an escalation could include the following:

  • A prolonged service outage that exceeds or threatens to exceed the Service Level Requirements or timeframe, leading to high customer anxiety or complaints.
  • Frequently recurring or multiple related High Priority incidents where Priority is related to business impact and urgency. In situations like this, the Customer's confidence in the Service Provider would have been greatly impacted, not to mention the impact to the Customer's business. Hence, an escalation is called for to bring about management attention and also expertise to find the root cause and prevent future incidents.
  • Management of a Major Incident (part of the Major Incident Procedure). Typically, a Major Incident Procedure would include activities related to escalation or crisis management.
  • Management of a major Problem, especially where the impact to business is high and the assigned problem management team is taking too long to isolate the cause of the incidents. This could in turn cause other issues, such as high customer anxiety, recurring incidents and loss of the Customer’s confidence in the Service Provider.
  • Data loss or risk of potential data loss. Any loss of data has a significant impact on the Customer. For example, a disk storage system has malfunctioned, leading to a service outage. The customer's last backup was done yesterday and there is potential data loss if the right solution is not found. In situations like this, an escalation may be called for to ensure the right steps are taken to repair the disk storage system and recover the data or ensure no data is lost.
  • Risk of actual or potential damage to customer or provider's reputation
  • Safety issue identified or reported by Customer
  • Risk of breach of, or non-compliance with, regulations, e.g. industrial health or safety
  • Customer's crisis situation, or customer's anxiety is high and the customer requests escalation
  • Common sense
The last one is a good one, as not all situations can be foreseen! So service staff need to exercise judgement and common sense when it comes to triggering an escalation, even with a documented checklist or tool. When in doubt, check with a more experienced colleague or, better still, with the immediate supervisor.
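For those who want to build such a checklist into a tool, the examples above could be sketched roughly as follows in Python. All field names and the recurrence threshold of three are hypothetical, and the `staff_judgement` flag is there so that the tool never overrides common sense.

```python
from dataclasses import dataclass

# Assumed threshold for "frequently recurring"; each organisation sets its own.
RECURRENCE_THRESHOLD = 3


@dataclass
class Incident:
    """Hypothetical incident record; the field names are illustrative only."""
    service_level_at_risk: bool = False
    related_high_priority_count: int = 0
    is_major_incident: bool = False
    major_problem_stalled: bool = False
    data_loss_risk: bool = False
    reputation_risk: bool = False
    safety_issue: bool = False
    regulatory_risk: bool = False
    customer_requested: bool = False
    staff_judgement: bool = False  # the "common sense" override


def escalation_reasons(incident):
    """Return the list of triggered reasons; an empty list means no escalation."""
    checks = [
        (incident.service_level_at_risk, "prolonged outage against service levels"),
        (incident.related_high_priority_count >= RECURRENCE_THRESHOLD,
         "recurring or related High Priority incidents"),
        (incident.is_major_incident, "Major Incident"),
        (incident.major_problem_stalled, "major Problem not yet isolated"),
        (incident.data_loss_risk, "actual or potential data loss"),
        (incident.reputation_risk, "risk to reputation"),
        (incident.safety_issue, "safety issue"),
        (incident.regulatory_risk, "regulatory breach or non-compliance"),
        (incident.customer_requested, "customer requested escalation"),
        (incident.staff_judgement, "staff judgement"),
    ]
    return [reason for triggered, reason in checks if triggered]


if __name__ == "__main__":
    incident = Incident(data_loss_risk=True, staff_judgement=True)
    print(escalation_reasons(incident))
```

Returning the reasons rather than a simple yes or no lets the person receiving the escalation see why it was raised.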
