How to calculate simple and composite service SLA availability
What does 99.99% mean?
Service providers must provide some form of guarantee to their customers in terms of the reliability of their service, especially when those customers could lose a lot of money if the provider's service went down.
For this reason, service providers commonly publish their reliability commitments in a Service Level Agreement (SLA). This is a form of contract in which the provider promises to keep the service up and specifies the maximum amount of time the service may be down per day, week, month, or year.
The most common numbers you will see look like 99%, 99.9%, 99.99%, and 99.999%.
These numbers promise that the service will be available for the specified percentage of the time. In other words, a 100% promise means the service will be available all the time during the day, week, month, and year.
Now that we know what the percent means, how do we calculate the actual minutes the service will be down?
Let's start with the simplest one: 99%. How many minutes per day can the service be down and still meet the SLA?
A: Convert day duration to seconds
1 day = 86400 seconds
B: Get downtime percent
99% up = 100 - 99 = 1% downtime allowed
C: Calculate the downtime in seconds
1% of 86400 seconds = 864 seconds downtime allowed
D: Convert to mins and seconds
864 seconds = 14.4 minutes = 14 minutes and 24 seconds (0.4 minutes = 0.4 * 60 = 24 seconds)
ANSWER: 99% Uptime = 14 min 24 seconds downtime per day
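The four steps above can be sketched in Python. This is a minimal illustration, not a standard library API: the function name `allowed_downtime_seconds` is hypothetical, and the SLA is passed as a string so that `Fraction` can parse values like "99.95" exactly instead of picking up floating-point noise.

```python
from fractions import Fraction

SECONDS_PER_DAY = 24 * 60 * 60  # step A: 1 day = 86400 seconds


def allowed_downtime_seconds(sla_percent: str, period_seconds: int) -> Fraction:
    """Return the downtime budget in seconds (hypothetical helper)."""
    # Step B: uptime percent -> allowed downtime as a fraction of the period.
    downtime_fraction = (100 - Fraction(sla_percent)) / 100
    # Step C: apply that fraction to the length of the period.
    return downtime_fraction * period_seconds


budget = allowed_downtime_seconds("99", SECONDS_PER_DAY)
# Step D: split the budget into whole minutes and leftover seconds.
minutes, seconds = divmod(budget, 60)
print(f"{float(budget)} s = {int(minutes)} min {int(seconds)} s")
```

For a 99% SLA this gives 864 seconds, or 14 minutes and 24 seconds, matching the manual calculation.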
Let's calculate the downtime allowed per week (7 days) on a 99% SLA:
7 days = 7 * 86400 seconds = 604,800 seconds
99% up = 100 - 99 = 1% downtime allowed
1% of 604,800 = 6048 seconds
6048 seconds to minutes = 6048 / 60 = 100.8 mins
100.8 mins to hours = 100.8 / 60 = 1.68 hours
0.68 hours = 0.68 * 60 = 40.8 minutes
0.8 minutes = 0.8 * 60 = 48 seconds
99% Uptime = 1 hour 40 minutes and 48 seconds downtime per week
Notice that we could easily convert from 1-day downtime to 7 days by multiplying our minutes by 7 as follows:
1 day downtime in minutes: 14.4 minutes
7 day downtime = 14.4 * 7 = 100.8 mins = 1 hr 40 min 48 sec
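The conversion from a raw number of seconds into hours, minutes, and seconds is easy to get wrong by hand, so here is a small sketch of it. The helper `format_duration` is a hypothetical name introduced for illustration; it rounds to whole seconds, which is an assumption that suits budgets of a minute or more.

```python
def format_duration(total_seconds: float) -> str:
    """Render a duration as e.g. '1h 40m 48s' (hypothetical helper)."""
    whole = round(total_seconds)  # assumption: whole-second precision is enough
    hours, remainder = divmod(whole, 3600)
    minutes, seconds = divmod(remainder, 60)
    parts = []
    if hours:
        parts.append(f"{hours}h")
    if hours or minutes:
        parts.append(f"{minutes}m")
    parts.append(f"{seconds}s")
    return " ".join(parts)


daily_downtime_minutes = 14.4  # 99% SLA, from the daily calculation
weekly_downtime_seconds = daily_downtime_minutes * 7 * 60
print(format_duration(weekly_downtime_seconds))
```

Scaling the daily figure by 7 and formatting it reproduces the weekly result of 1 hour, 40 minutes, and 48 seconds.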
Now that you know how to calculate downtime duration from the percentage given, let's compare several of them:
|SLA||Downtime per day||Downtime per month||Downtime per year|
|99%||14m 24s||7h 14m 41s||3d 14h 56m 18s|
|99.9%||1m 26s||43m 28s||8h 41m 38s|
|99.95%||43s||21m 44s||4h 20m 49s|
|99.99%||8.6s||4m 21s||52m 9.8s|
|99.995%||4.3s||2m 10s||26m 4.9s|
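As a quick cross-check of the per-day figures in this comparison, the sketch below recomputes them for each SLA level. The function name `daily_downtime_seconds` is hypothetical. Only the daily values are recomputed, because the monthly and yearly figures depend on how long a month or year is assumed to be, which the table does not state.

```python
from fractions import Fraction

SECONDS_PER_DAY = 86_400


def daily_downtime_seconds(sla_percent: str) -> float:
    """Allowed downtime per day in seconds (hypothetical helper)."""
    # Fraction parses the decimal string exactly, avoiding float rounding.
    return float((100 - Fraction(sla_percent)) / 100 * SECONDS_PER_DAY)


for sla in ["99", "99.9", "99.95", "99.99", "99.995"]:
    print(f"{sla}% -> {daily_downtime_seconds(sla):.2f} s per day")
```

Each extra nine cuts the daily budget by a factor of ten: 864 s at 99%, 86.4 s at 99.9%, and 8.64 s at 99.99%.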
An important thing to remember about SLAs is that service providers can have different SLAs depending on the region, use of availability zones, and the type of action taken.
Let's take Azure Cosmos DB read availability as an example:
99.99%: Reading from a single region without availability zone
99.995%: Reading from a single region with availability zone
You can find more information in Microsoft's documentation on Cosmos DB SLAs: https://learn.microsoft.com/en-us/azure/cosmos-db/high-availability#slas
We also need to realize that keeping downtime this short requires the service provider to have a lot of redundant services and mechanisms in place. To the consumer, this can translate to higher fees for those SLAs.
If you want to see an overview of the SLAs and services provided on Azure, take a look at Microsoft's official SLA agreement.
If two services run in series, meaning ServiceA depends on ServiceB, then we calculate the SLA as the product of the two.
Series SLA = ServiceA availability * ServiceB availability
ServiceA: 99.9%
ServiceB: 99.95%
Composite SLA: (99.9 * 99.95) / 100 = 99.85005%, or about 99.85%
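The series rule can be sketched as a short function. The name `series_availability` is hypothetical, and the calculation assumes the services fail independently of each other; it accepts any number of services, since the same product rule extends to longer chains.

```python
from math import prod


def series_availability(*availabilities_percent: float) -> float:
    """Composite availability (%) of services that all must be up."""
    # Convert each percentage to a probability, multiply, convert back.
    return prod(a / 100 for a in availabilities_percent) * 100


composite = series_availability(99.9, 99.95)
print(f"{composite:.2f}%")
```

Note that the composite figure is lower than either individual SLA: every dependency added in series reduces overall availability.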
If two services are running in parallel, the SLA calculations change to the following:
Parallel SLA = 100% - (ServiceA unavailability * ServiceB unavailability)
ServiceA: 99.9%
ServiceB: 99.95%
Composite SLA = 100% - ((100% - 99.9%) * (100% - 99.95%))
= 100% - (0.1% * 0.05%)
Convert to decimal, divide by 100:
= 1 - (0.001 * 0.0005)
= 1 - 0.0000005
= 0.9999995
Convert to percent, multiply by 100:
= 99.99995%
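The parallel rule can be sketched the same way. The name `parallel_availability` is hypothetical, and the formula only holds under the assumption that the services fail independently; correlated failures (for example, both replicas in the same data center) would make the real figure lower.

```python
from math import prod


def parallel_availability(*availabilities_percent: float) -> float:
    """Composite availability (%) when any one service being up is enough."""
    # The system is down only when every service is down at the same time.
    all_down = prod(1 - a / 100 for a in availabilities_percent)
    return (1 - all_down) * 100


composite = parallel_availability(99.9, 99.95)
print(f"{composite:.5f}%")
```

Unlike the series case, the composite here is higher than either individual SLA, which is why redundancy is the usual route to more nines.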
Always consult the SLA agreement for the service from the provider. A single service can have different availability agreements based on things like tiers selected, redundancies opted in, and more.