A Service-Level Agreement (SLA) defines the level of service you expect from a service provider, vendor, or supplier, and specifies the metrics and KPIs by which that service is measured. It also specifies the corrective actions or penalties that apply if the agreed service level is not achieved. You will rarely find an IT-related vendor contract without a clear SLA.
Can you give me an example for more clarity?
A telecom company’s SLA, for example, may promise network availability of 99.999 percent (about 5.26 minutes of downtime per year) and allow the customer to reduce their payment by a specified percentage if that target is not met.
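The downtime figure follows directly from the availability percentage. Here is a minimal sketch of the arithmetic, assuming a 365-day year; the function name and the list of sample targets are illustrative and not part of any real SLA.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; assumes a non-leap year


def allowed_downtime_minutes(availability_percent: float,
                             period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Return the downtime budget implied by an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return period_minutes * (1 - availability_percent / 100)


# Illustrative targets only; each extra "nine" cuts the budget by a factor of ten.
for target in (99.0, 99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% availability -> {minutes:.2f} minutes of downtime per year")
```

Passing a different period_minutes (for example 30 * 24 * 60) gives the budget for a monthly measurement window, which is how many SLAs are actually assessed.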
Is it really essential to have an SLA?
As mentioned earlier, SLAs are a critical part of an IT vendor contract, since the SLA gathers all of the services and their agreed-upon targets into a single document. Stating the metrics, each party’s responsibilities, and what happens when the service falls short of those metrics makes requirements and expectations clear to both sides. Legal counsel should also review the agreement so that its terms are not open to misinterpretation and both parties are well protected. SLAs should always be aligned with technology and business objectives; misalignment can hurt pricing, quality of service (QoS), and customer experience.
Who provides the SLA?
Standard SLAs are usually provided by the supplier or service provider, often in several tiers that reflect different levels of service at different prices. Negotiation can start from there, but the customer and their legal counsel should always review and modify the standard SLA so that it does not favor the supplier at the customer’s expense. During the RFP (Request for Proposal) process, the customer should state their expected service levels, because these affect the supplier’s offering and pricing, and even its decision to respond. For instance, if you demand 90% availability of a system and the supplier is unable to meet this requirement, it might propose a different, more robust solution.
What are key components of an SLA?
The SLA should include two main components: services and management. The services component covers the type of service provided and the conditions under which it is available, such as the time frame for each level of service (prime time and non-prime time may have different service levels), the responsibilities of each party, escalation procedures, and cost/service tradeoffs. The management component covers definitions of measurement standards and methods, reporting processes, contents, and frequency, a dispute resolution process, and an indemnification clause protecting the customer from third-party litigation resulting from service level breaches. Because service requirements and vendor capabilities change, there should also be a mechanism for keeping the agreement up to date.
What is an indemnification clause?
An indemnification clause is very important: in it, the service provider agrees to indemnify the customer for any warranty breaches. Indemnification simply means that the service provider must pay the customer for any third-party litigation costs resulting from its breach of the warranties. If you use a standard SLA provided by the service provider, this provision will most likely be missing. Therefore, always work with your legal counsel to draft a simple provision that adds it, although the service provider may want to negotiate this point further.
How can I verify service levels?
Service providers typically publish statistics on a web portal so that customers can easily check whether SLA targets are being met. The measurement processes and methodologies are usually defined by the vendor, who should ensure that they can actually support the SLA. It’s highly recommended that the client and the vendor work together during SLA negotiation to avoid any misunderstanding about these processes and support methodologies, including reporting. For critical services, investing in tools that capture performance data yourself becomes a necessity rather than a luxury.
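If you do capture your own performance data, comparing it against the vendor’s reported figures can be straightforward. The sketch below assumes you record outages as start and end timestamps; the function name, the sample outages, and the 99.9% target are all hypothetical, and a real SLA may define exclusions (such as planned maintenance) that this ignores.

```python
from datetime import datetime


def measured_availability(outages, period_start, period_end):
    """Percentage of the period not covered by recorded outages.

    `outages` is a list of (start, end) datetimes. Overlapping outages are
    merged so the same minute is never counted twice.
    """
    period_seconds = (period_end - period_start).total_seconds()
    if period_seconds <= 0:
        raise ValueError("period_end must be after period_start")

    # Keep only the part of each outage that falls inside the period.
    clipped = sorted(
        (max(start, period_start), min(end, period_end))
        for start, end in outages
        if end > period_start and start < period_end
    )

    down_seconds = 0.0
    merged_start = merged_end = None
    for start, end in clipped:
        if merged_end is None or start > merged_end:
            # Gap before this outage: close the previous merged window.
            if merged_end is not None:
                down_seconds += (merged_end - merged_start).total_seconds()
            merged_start, merged_end = start, end
        else:
            # Overlaps the current window: extend it.
            merged_end = max(merged_end, end)
    if merged_end is not None:
        down_seconds += (merged_end - merged_start).total_seconds()

    return 100 * (1 - down_seconds / period_seconds)


# Hypothetical customer-side records for June 2024 (30 days).
period_start = datetime(2024, 6, 1)
period_end = datetime(2024, 7, 1)
outages = [
    (datetime(2024, 6, 3, 10, 0), datetime(2024, 6, 3, 10, 30)),
    (datetime(2024, 6, 3, 10, 20), datetime(2024, 6, 3, 10, 50)),  # overlaps the first
    (datetime(2024, 6, 20, 2, 0), datetime(2024, 6, 20, 2, 10)),
]
SLA_TARGET_PERCENT = 99.9  # assumed target, not taken from any real contract

availability = measured_availability(outages, period_start, period_end)
sla_met = availability >= SLA_TARGET_PERCENT
print(f"measured availability: {availability:.3f}% (target met: {sla_met})")
```

In this made-up month the three records add up to 60 minutes of downtime, which is more than a 99.9% target allows over 30 days, so the check reports the target as missed.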
What kind of metrics should be monitored?
There are different types of SLA metrics, and the ones you need depend mainly on the services being provided. To agree on the right metrics, examine your service and decide what your priorities are. An overly complex monitoring scheme won’t be effective, since no one will have enough time to properly analyze the data.
Metrics worth monitoring might include:
Availability: Uptime is the percentage of time an instance is up, running, and ready for use, while service availability is the percentage of service requests that return with an expected response.
Response Time: The response time (latency) of any cloud resource is the amount of time it takes for a response to return to the client. You want response time to be as low as possible, since it directly impacts the user experience.
Defect rates: The number or percentage of errors in deliverables. Failures such as incomplete backups and restores, coding errors, and so on can be included in this category.
Technical quality: Technical quality can be measured with commercial analysis tools that examine factors such as program size and coding defects.
Security: Any network security breach can be extremely costly. Tracking controllable security measures such as antivirus updates and patching is therefore key to showing, in the event of an incident, that preventive measures were taken.
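Several of the metrics above reduce to simple arithmetic over logged data. The following is a rough sketch, assuming each request is logged with a success flag and a latency; the Request class, the function names, and the sample numbers are invented for illustration, and the percentile uses the nearest-rank method, which is only one of several common definitions.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    ok: bool           # True if the response was the expected one
    latency_ms: float  # time until the response reached the client


def service_availability(requests):
    """Percentage of requests that returned the expected response."""
    if not requests:
        raise ValueError("no requests to measure")
    return 100 * sum(r.ok for r in requests) / len(requests)


def latency_percentile(requests, pct):
    """Nearest-rank percentile of latency, e.g. pct=95 for p95."""
    if not requests:
        raise ValueError("no requests to measure")
    ordered = sorted(r.latency_ms for r in requests)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def defect_rate(defective, total):
    """Percentage of deliverables (backups, releases, ...) with errors."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * defective / total


# Hypothetical log: nine fast, successful requests and one slow failure.
latencies = [120, 80, 95, 110, 300, 90, 105, 100, 2500, 85]
log = [Request(ok=(ms < 2000), latency_ms=ms) for ms in latencies]

print(f"service availability: {service_availability(log):.1f}%")
print(f"median latency: {latency_percentile(log, 50)} ms")
print(f"p95 latency: {latency_percentile(log, 95)} ms")
print(f"backup defect rate: {defect_rate(2, 40):.1f}%")
```

Note how the single slow request barely moves the median but dominates the 95th percentile. Whether an SLA states its response time target as an average, a median, or a high percentile changes what the provider is really committing to.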
What should I consider when selecting metrics & measurements for my SLA?
Your goal should always be to incorporate best practices and requirements that achieve the desired service performance while avoiding additional costs.
Metrics have one common goal: motivating the appropriate behavior in both parties. Each party will try to optimize its actions to meet the performance objectives. Focus first on the targets and behavior you want to achieve, then agree on the metrics based on that. Put yourself in the other party’s position and ask: How would you optimize your performance? Does that optimization help achieve the desired results?
1- Metrics must reflect factors within the service provider’s control. A two-sided SLA, which also measures the client’s performance on mutually agreed actions, is an effective way to keep both parties focused on the intended results.
2- Choose measurements that are easily collected. Automation is key! Spending time and effort collecting metrics manually is wasteful.
3- Less is more. Despite the temptation to control as many factors as possible, avoid choosing an excessive number of metrics that produce a huge amount of data that no one will have time to analyze.
4- Set a proper baseline. Defining the metrics is only the first step. The targets must also be reasonable and attainable. If you don’t have a solid record of historical measurement data, be prepared to adjust the targets at a future date, or on an ongoing basis, through the predefined process specified in the SLA.
5- Define with care. A provider may interpret SLA definitions in whatever way ensures they are met. For example, some providers hit their Response Time target 100% of the time simply by sending an automated reply to every incident report. That’s why customers should always agree on clear, precise definitions that reflect the intent of the service level.
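To see how much the definition in point 5 matters, here is a small sketch that scores the same incidents under three definitions. Everything in it is hypothetical: the Incident fields, the 60-minute response target, the 480-minute resolution target, and the sample data.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    auto_reply_min: float   # minutes until the automated acknowledgement
    human_reply_min: float  # minutes until the first reply from a person
    resolved_min: float     # minutes until the incident was resolved


def compliance(incidents, field, target_min):
    """Percentage of incidents whose `field` value is within the target."""
    if not incidents:
        raise ValueError("no incidents to measure")
    met = sum(getattr(i, field) <= target_min for i in incidents)
    return 100 * met / len(incidents)


# Made-up incident history; every ticket gets an auto-reply within a minute.
incidents = [
    Incident(auto_reply_min=1, human_reply_min=20, resolved_min=120),
    Incident(auto_reply_min=1, human_reply_min=90, resolved_min=600),
    Incident(auto_reply_min=1, human_reply_min=45, resolved_min=240),
    Incident(auto_reply_min=1, human_reply_min=300, resolved_min=1440),
]

RESPONSE_TARGET_MIN = 60     # assumed response time target
RESOLUTION_TARGET_MIN = 480  # assumed resolution time target

loose = compliance(incidents, "auto_reply_min", RESPONSE_TARGET_MIN)
strict = compliance(incidents, "human_reply_min", RESPONSE_TARGET_MIN)
resolved = compliance(incidents, "resolved_min", RESOLUTION_TARGET_MIN)

print(f"response time, any reply counts:   {loose:.0f}%")
print(f"response time, human reply only:   {strict:.0f}%")
print(f"resolution time within target:     {resolved:.0f}%")
```

With this data the loose definition reports perfect compliance while the other two report half of the incidents missing their target. That gap is the reason to spell out in the SLA which event stops the clock.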