Selecting High Availability Technologies for Cloud Computing

Overview

The National Institute of Standards and Technology (NIST), part of the US Department of Commerce, recently published Special Publication 800-144, Guidelines on Security and Privacy in Public Cloud Computing, to clearly express concerns about using public cloud computing service providers. I covered a number of these concerns in my article Security and Privacy in Public Cloud Computing.

My intention in examining these concerns is to map the theory and recommendations that were expressed in SP 800-144, including some supporting reference documents, to a practical IT approach. Last week, we focused on defining availability requirements for cloud computing (Defining Availability Requirements for Cloud Computing); in today’s article, we’ll explore the Availability Technologies recommendations.

Availability Technologies for the Cloud

SP 800-144 breaks availability out into two pillars. First, it recommends you understand and define your availability requirements. Second, it recommends you select technologies that meet or exceed those requirements. My previous article showed you how to define availability, how to distinguish between recovery and availability scenarios, and how to determine your availability needs.

Remember that availability, in the context of outsourced business services, is the share of time the service provider guarantees that your data and services are accessible. This is typically documented as a percentage of time per year. For example, 99.999% (or five nines) uptime means you will be unable to access resources for no more than about five minutes per year.
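
The arithmetic behind a figure like five nines is easy to check for yourself. Here is a minimal Python sketch that converts an availability percentage into the maximum downtime it permits per year. The function name and the 365-day year are my assumptions for illustration; a provider may measure over a different period, such as a month.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a 365-day year: 525,600 minutes


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the maximum downtime per year, in minutes, that a given
    availability percentage permits."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Compare a few common availability targets side by side.
    for target in (99.0, 99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, five nines works out to roughly 5.3 minutes per year, while three nines (99.9%) allows about 8.8 hours.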

Note that cloud service providers differ in their definition and measurement of availability. Some providers claim no downtime if a single internet client can access at least one service, while others require that multiple countries or internet service providers can access all services. Ensure that you review the provider’s definition, as you may need to change your requirement to match the way they measure it.
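
To see how much the definition matters, consider the following Python sketch, which scores the same monitoring history under a lenient definition and a strict one. Everything in it is hypothetical: the region and service names, the probe data, and the function names are assumptions made for illustration, not any provider's actual measurement method.

```python
from typing import Callable, Dict, List

# Hypothetical probe result for one measurement interval:
# region -> service -> whether the service was reachable from that region.
Probe = Dict[str, Dict[str, bool]]


def up_lenient(probe: Probe) -> bool:
    """Lenient definition: up if any one client can reach at least one service."""
    return any(ok for services in probe.values() for ok in services.values())


def up_strict(probe: Probe) -> bool:
    """Strict definition: up only if every region can reach every service."""
    return all(ok for services in probe.values() for ok in services.values())


def availability_percent(probes: List[Probe], is_up: Callable[[Probe], bool]) -> float:
    """Percentage of measurement intervals counted as 'up' under a definition."""
    if not probes:
        raise ValueError("need at least one probe interval")
    return 100 * sum(1 for p in probes if is_up(p)) / len(probes)


if __name__ == "__main__":
    healthy = {"us": {"web": True, "storage": True},
               "eu": {"web": True, "storage": True}}
    # In this interval, clients in the EU region are cut off entirely.
    partial = {"us": {"web": True, "storage": True},
               "eu": {"web": False, "storage": False}}
    history = [healthy, healthy, healthy, partial]

    print("lenient:", availability_percent(history, up_lenient))
    print("strict: ", availability_percent(history, up_strict))
```

With this made-up history, the lenient definition reports 100% availability and the strict one reports 75%, even though both describe exactly the same events.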

You might assume, as most of us do, that great backups, mirrored data, hot spare systems, clustered servers, and redundant drive arrays are features of every cloud provider and solution. But those are individual technologies. Each by itself is fantastic, but none by itself provides availability.

What Availability Technologies Should I Avoid?

SP 800-144 specifically recommends that you avoid relying on weak availability technologies. There are a few attributes that usually define a weak availability technology, including:

  • New. The newest availability technology is the least proven. That’s one reason redundant disks are still a common availability technology – they’re not cutting edge but they’ve been time-proven.
  • Proprietary. When a company won’t tell you how an availability technology works, run away. The less you know about the standards and operation of the technology, the less confidence you can place in the cloud provider’s availability claims.
  • Confidential. Some providers state that they have redundant sites, network connections, defenses, etc., but they will not disclose the details for fear of giving attackers an edge. That’s bogus. Attackers will find the information anyway, and confident providers know it. A lack of transparency signals that the provider has little confidence in its own availability.

What Availability Technologies Should I Seek Out?

Now that you know the warning signs to avoid, what technologies should you look for? That’s actually a bit easier to answer. The attributes of solid availability offerings are pretty straightforward.

You should look for technologies that come from someone you trust, are based on accepted standards, and are cost-effective solutions to your availability requirements.

Trusted Vendors

When I think of highly available internet connectivity (as an example), my first thought is of multiple independent access paths to my services and data. Direct connections to providers are necessary, but they’re only as reliable as the internet service provider. If I cannot trust AT&T or Verizon or any other provider, I should not consider their service highly available. The same can be true for any category of vendor – hardware, software, network connectivity, etc.

The vendor trust criteria I use here are pretty simple. If the provider has been in business for several years, has a good industry reputation, and provides documentation on how it delivers high availability services, I’m happy.

My peers always have vendor preferences, and you probably do too. Your cloud service provider almost certainly has a preference. If you find a service provider that uses your preferred vendor, great! If not, consider being open to a change after reviewing these criteria.

Accepted Standards and Practices

Confidence in a high availability technology usually comes from the adoption of documented standards, such as using a redundant array of inexpensive disks (RAID) to provide data availability during storage failures. RAID has become a de facto standard over time and is widely accepted as an availability technology. Alongside RAID comes the operational practice of monitoring for disk failure and replacing a failed drive immediately to minimize the potential impact of multiple drive failures.

Not every solution has to be based on an IEEE or RFC document. Many of the best high availability technologies are based on accepted practices or standards that have simply evolved as industry best practices. For example, RAID is typically implemented in hardware drive arrays, which is generally considered the more efficient approach, but it can also succeed as a software solution. While there may be no absolute best technology, you should always seek those that are common in the IT industry.

Cost-Effective Availability

Let’s face facts: you don’t have an infinite budget. And some of these highly available technologies cost big money. The costs run across the board: purchasing hardware and software, hiring technology experts, planning and implementing the solution, and operating the solution at the high availability level you demand. Many of these technologies also depend on other technologies that add an entire layer of complexity, such as highly available clustered systems that require their own dedicated independent fiber network.

In fact, that’s the primary reason many companies are looking at cloud computing solutions. Partnering with a cloud service provider gives you access to high availability products and solutions that you may not otherwise be able to purchase and operate.

When I compare costs, I try to reach a key calculation: how long will it take before it costs less for me to deploy the solution in-house? There’s always a point at which ownership costs less than outsourcing, but in most high availability scenarios, that point lies beyond the projected lifetime of the service or data. If it will take six years to make an in-house cluster cost-effective but I don’t think the service will be in place for more than three, the cloud solution is a no-brainer.
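
That comparison can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the in-house option has a one-time upfront cost plus a flat monthly operating cost, the cloud option is a flat monthly fee, and every dollar figure and function name below is hypothetical. Real comparisons would also account for things like staffing changes, hardware refresh cycles, and price changes over time.

```python
import math
from typing import Optional


def break_even_months(upfront_cost: float,
                      in_house_monthly: float,
                      cloud_monthly: float) -> Optional[int]:
    """Return the first whole month in which cumulative in-house cost is no
    greater than cumulative cloud cost, or None if that never happens in
    this simple model."""
    monthly_saving = cloud_monthly - in_house_monthly
    if monthly_saving <= 0:
        return None  # in-house costs at least as much every month
    return math.ceil(upfront_cost / monthly_saving)


def cheaper_option(upfront_cost: float, in_house_monthly: float,
                   cloud_monthly: float, lifetime_months: int) -> str:
    """Pick the cheaper option over the projected lifetime of the service.
    A tie at the end of the lifetime is counted in favor of in-house."""
    months = break_even_months(upfront_cost, in_house_monthly, cloud_monthly)
    if months is not None and months <= lifetime_months:
        return "in-house"
    return "cloud"


if __name__ == "__main__":
    # Hypothetical figures: $180,000 upfront and $2,500/month in-house,
    # versus $5,000/month for the cloud service.
    months = break_even_months(180_000, 2_500, 5_000)
    print(f"break-even after {months} months")
    print("3-year service:", cheaper_option(180_000, 2_500, 5_000, 36))
```

With these made-up numbers the break-even point is 72 months, or six years, so a service expected to last only three years is cheaper in the cloud.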

Summary

High availability is one of the most important benefits your company can get from cloud computing. Failover and accessibility options that you only dream of today can become a cost-effective monthly payment. You need to beware of the warning signs that indicate a less-than-trustworthy solution. Avoid proprietary and closed offerings, and demand documentation that explains exactly how the cloud provider delivers on its commitment to high availability.