How to calculate simple and composite service SLA availability

What does 99.99% mean?

Service providers must provide some form of guarantee to their customers in terms of the reliability of their service, especially when those customers could lose a lot of money if the provider's service went down.

For this reason, service providers commonly publish reliability information for their consumers. It appears in the Service Level Agreement (SLA), a form of contract in which the provider promises to keep the service up and specifies the maximum amount of time the service may be down per day, week, month, or year.

The most common numbers you will see will look like the ones below:

99%, 99.95%, 99.99%, 99.995%, 99.999% and also 100%.

These numbers promise that the service will be available for the specified percentage of the time. For example, a 100% promise means the service will be available all the time, throughout the day, week, month, and year.

Now that we know what the percentage means, how do we calculate the actual minutes the service can be down?

Let's start with the simplest one: 99%. How many minutes in a day can the service be down and still meet the SLA?

A: Convert day duration to seconds
1 day = 86400 seconds

B: Get downtime percent
99% up = 100 - 99 = 1% downtime allowed

C: Calculate the downtime in seconds
1% of 86400 seconds = 864 seconds downtime allowed

D: Convert to mins and seconds
864 seconds = 14.4 minutes = 14 minutes and 24 seconds.

(0.4 minutes = 0.4 * 60 = 24 seconds)

ANSWER:
99% Uptime = 14 min 24 seconds downtime per day
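Steps A through D can be sketched in a few lines of Python. This is a minimal illustration, not a standard tool: the function name allowed_downtime_seconds and the constant SECONDS_PER_DAY are made up for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # step A: 1 day = 86400 seconds


def allowed_downtime_seconds(uptime_percent, period_seconds=SECONDS_PER_DAY):
    """Return the downtime (in seconds) an uptime percentage allows over a period."""
    downtime_percent = 100 - uptime_percent         # step B: 99% up -> 1% down
    return period_seconds * downtime_percent / 100  # step C: 1% of 86400 = 864


seconds = allowed_downtime_seconds(99)
minutes, remainder = divmod(seconds, 60)            # step D: split into min and sec
print(f"{seconds:.0f} seconds = {int(minutes)} min {remainder:.0f} sec")
```

Passing a different period_seconds value (for example 7 * SECONDS_PER_DAY) gives the allowance for a week instead of a day.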

Let's calculate the downtime allowed per week (7 days) under a 99% SLA:

7 days = 7 * 86400 seconds = 604,800 seconds
99% up = 100 - 99 = 1% downtime allowed

1% of 604,800 = 6048 seconds

6048 seconds to minutes = 6048/60 = 100.8 mins
100.8 mins to hours = 100.8 /60 = 1.68 hours
0.68 hours = 0.68 * 60 = 40.8 minutes
0.8 minutes = 0.8 * 60 = 48 seconds

99% Uptime = 1 hour 40 minutes and 48 seconds downtime per week

Notice that we could easily convert from 1-day downtime to 7 days by multiplying our minutes by 7 as follows:

1 day downtime in minutes: 14.4 minutes
7 day downtime = 14.4 * 7 = 100.8 mins = 1 hr 40 min 48 sec
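The conversion from minutes to hours, minutes, and seconds is easy to get wrong by hand, so here is a small sketch of it. The helper format_duration is a hypothetical name for this example, and it rounds to whole seconds, which is an assumption that suits the 99% figures used here.

```python
def format_duration(total_seconds):
    """Break a number of seconds into hours, minutes, and seconds."""
    whole = round(total_seconds)            # assumption: whole seconds are enough
    hours, rest = divmod(whole, 3600)       # 3600 seconds per hour
    minutes, seconds = divmod(rest, 60)     # 60 seconds per minute
    return f"{hours} hr {minutes} min {seconds} sec"


daily_minutes = 14.4                        # 99% uptime, daily allowance
weekly_minutes = daily_minutes * 7          # scale one day up to seven days
print(format_duration(weekly_minutes * 60))  # prints "1 hr 40 min 48 sec"
```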

Now that you know how to calculate downtime duration from the percentage given, let's compare several of them:

Uptime %  | Daily    | Monthly     | Yearly
99%       | 14m 24s  | 7h 14m 41s  | 3d 14h 56m 18s
99.9%     | 1m 26s   | 43m 28s     | 8h 41m 38s
99.95%    | 43s      | 21m 44s     | 4h 20m 49s
99.99%    | 8.6s     | 4m 21s      | 52m 9.8s
99.995%   | 4.3s     | 2m 10s      | 26m 4.9s
99.999%   | 0.86s    | 26s         | 5m 13s

An important thing to remember about SLAs is that service providers can have different SLAs depending on the region, use of availability zones, and the type of action taken.

Let's take reading data from Azure Cosmos DB as an example:

  • 99.99%: Reading from a single region without availability zone

  • 99.995%: Reading from a single region with availability zone

You can find more information in Microsoft's documentation on Cosmos DB SLAs: https://learn.microsoft.com/en-us/azure/cosmos-db/high-availability#slas

We also need to realize that keeping downtime this short requires the service provider to put a lot of redundant services and mechanisms in place. For the consumer, this can translate into higher fees for those SLAs.

If you want to see an overview of the SLAs and services provided on Azure, take a look at Microsoft's official SLA documentation.

Composite SLA

In Series

If two services run in series, meaning ServiceA depends on ServiceB, we calculate the composite SLA as the product of the two availabilities.

Series SLA = ServiceA availability * ServiceB availability

ServiceA: 99.9%
ServiceB: 99.95%

Composite SLA: (99.9 * 99.95)/100 = 99.85005%, or about 99.85%
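The series rule extends naturally to longer dependency chains. The sketch below assumes every service in the chain must be up for the whole to work; series_availability is a name invented for this example.

```python
from math import prod


def series_availability(*percentages):
    """Composite availability (in percent) of services that all depend on each other."""
    # Convert each percentage to a fraction, multiply, convert back to percent.
    return 100 * prod(p / 100 for p in percentages)


composite = series_availability(99.9, 99.95)
print(f"{composite:.2f}%")
```

Note that the composite figure is always lower than the weakest service in the chain, so every added dependency reduces the availability you can promise.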

Parallel

If two services are running in parallel, the SLA calculations change to the following:

Parallel SLA = 100% - (ServiceA unavailability * ServiceB unavailability)

ServiceA: 99.9%
ServiceB: 99.95%
Composite SLA = 100% - ((100% - 99.9%) * (100% - 99.95%))
= 100% - (0.1% * 0.05%)

Convert to decimal, divide by 100
= 1 - (0.001 * 0.0005) = 1 - 0.0000005 = 0.9999995

Convert to percent, multiply by 100
= 99.99995%
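The same calculation can be sketched in Python. It assumes the services fail independently of each other and that any one of them can serve the request on its own; parallel_availability is a hypothetical name for this example.

```python
from math import prod


def parallel_availability(*percentages):
    """Composite availability (in percent) of redundant services running in parallel."""
    # The whole is down only when every service is down at the same time,
    # so multiply the individual unavailabilities (as fractions).
    combined_unavailability = prod(1 - p / 100 for p in percentages)
    return 100 * (1 - combined_unavailability)


composite = parallel_availability(99.9, 99.95)
print(f"{composite:.5f}%")
```

Unlike the series case, the result here is higher than either service alone, which is why redundancy is the usual way to reach the higher SLA tiers.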

Always consult the provider's SLA for the service you use. A single service can have different availability commitments depending on things like the tier selected, the redundancy options enabled, and more.