InstaMed Blog

Guest Bloggers: Bill Marvin and Chris Seib, Co-Founders of InstaMed

Last night, I turned on iTunes Match for the first time and streamed music from iCloud while making dinner. Using the cloud to play music worked great, but it made me wonder: what would happen if the cloud went down and my music was unavailable? For five minutes, or for five hours? I'd be annoyed and inconvenienced, forgetting all about my recent delight and the old way I used to do things. The bottom line is, since only my MP3 music data would be affected, it wouldn't be a big deal. But today, consumers and businesses are transitioning all kinds of data to the cloud, from MP3s and pictures to mission-critical data. And the cloud is not just being used for data storage and retrieval; it's being used to support business functions like CRM, accounting, processing and cash flow. These functions, especially cash flow, are mission-critical to any business.

What if an error caused your cloud-based system to go down for an hour?  Maybe that’s an inconvenience, maybe that’s some lost revenue, or some extra labor costs.  But what if it went down for a few days?  In most cases, this type of event would impact your business in a material way.  While moving to the cloud greatly enhances the way we use data and conduct business, it also presents new risks to consumers and businesses.

The following post, written by Chris Seib, my co-founder and CTO, discusses the best practices every business should follow to ensure "True Availability" when leveraging the cloud or cloud-based vendor solutions.

– Bill Marvin, President & CEO, InstaMed

Achieving True Availability through Best Practices

In the wake of the data center outages caused by a recent storm in Virginia, along with other data center failures, it's important to recognize what went wrong and what best practices could have prevented long-term disruptions, so your business can ensure its processes and functions have business continuity and true availability. It's easy to underestimate the cost of a critical vendor being down until it's too late. Worse yet, many vendors talk about reliability but take shortcuts to save costs, which can have a very significant impact on your business. Here are some best practices and tips you can use in discussions with your current or potential vendor partners.

Local High Availability & Fault Tolerance

Most downtime is caused by hardware failures rather than natural disasters. In fact, between two and four percent of data center-grade hard drives can be expected to fail each year, nearly four times the rate manufacturers claim. A private cloud data center must be architected at every layer with this in mind to minimize disruption from these events. This is often referred to as High Availability or Fault Tolerance.
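To put those failure rates in perspective, here is a rough back-of-the-envelope calculation in Python. The fleet size of 500 drives is a hypothetical assumption for illustration; the point is that at data center scale, drive failures are a routine, near-certain event that the architecture has to absorb.

```python
# Back-of-the-envelope estimate of annual drive failures at the 2-4% failure
# rates cited above. The fleet size of 500 drives is a hypothetical example.
fleet_size = 500

for annual_failure_rate in (0.02, 0.04):
    expected_failures = fleet_size * annual_failure_rate
    # Probability that at least one drive fails during the year,
    # treating drive failures as independent events.
    p_at_least_one = 1 - (1 - annual_failure_rate) ** fleet_size
    print(f"AFR {annual_failure_rate:.0%}: expect ~{expected_failures:.0f} failed drives/year, "
          f"P(at least one failure) = {p_at_least_one:.6f}")
```

At either failure rate, a fleet of this size will see multiple failed drives every year, which is why the design has to assume failures rather than hope to avoid them.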

Power & Cooling Best Practices

A private cloud data center should have complete power redundancy.  In most cases, this means having two separate, high-priority feeds from the local power company, battery and generator backups, and high-end electrical equipment available to ensure seamless switching between these sources. Many vendors have a simple, low-end Uninterruptible Power Supply (UPS), which may only supply minutes of backup. It’s crucial to have multiple generators with fuel supply contracts so a data center can run indefinitely.

In private cloud data centers, it is critical to have adequate cooling. Cooling systems must be completely redundant, with high fault tolerance. Many data centers only have a single air conditioning unit, which is often insufficient when there is a heat wave.

  • Tip: Ask your vendors: When were your backup power supplies last tested? If the power company has a complete blackout, how much downtime should we expect, and how long can you keep services running? Ask them to prove it by sharing test results and allowing you to tour their data center facilities.

Hardware Best Practices

When it comes to data center hardware, the rule of thumb is to always have one more than you need “active” (IT people call this N+1). If you need a firewall, you should have two firewalls, and they need to be configured for zero-downtime failover.

Furthermore, there should be no single point of failure; every component must have redundancy. Storage area networks should have redundant drives, hot spares and multiple controllers. All layers of the system must be included, and individual components should have high availability in order to avoid downtime.

Many vendors claim to have redundancy, yet their systems still contain single points of failure that can be exposed and cause extended downtime.

Having multiple pieces of hardware is all well and good, but what truly matters is that these pieces are interchangeable with no customer impact, otherwise known as immediate failover. Many vendors claim to have standby servers or equipment, but it can take hours or days for that equipment to come online.
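As a rough illustration of why N+1 with zero-downtime failover matters, the sketch below compares the availability of a single component to a redundant pair. The 99.9% per-component availability figure is a hypothetical assumption, and failures are treated as independent, but it shows the shape of the math: with seamless failover, the service is only down when every redundant component fails at once.

```python
# Simple availability comparison: one component vs. an N+1 redundant pair
# with seamless failover. The 99.9% per-component availability figure is a
# hypothetical assumption, and failures are treated as independent.
single_availability = 0.999
hours_per_year = 24 * 365

# With a redundant pair and zero-downtime failover, the service is only
# unavailable when both components are down at the same time.
pair_availability = 1 - (1 - single_availability) ** 2

print(f"Single component: {single_availability:.4%}, "
      f"~{(1 - single_availability) * hours_per_year:.1f} hours down per year")
print(f"Redundant pair:   {pair_availability:.4%}, "
      f"~{(1 - pair_availability) * hours_per_year:.2f} hours down per year")
```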

  • Tip: Ask your vendors to prove that they have complete redundancy of all components and that they regularly test the failover.

Monitoring

At a private cloud data center, it's important to have proactive monitoring and alerting in place, with adequately trained professional IT staff who are familiar with the applications and services. This helps ensure that any issue or degradation is identified early and resolved quickly, before there is any customer impact. Issues will happen; hard drives fail and network problems are common, but in almost all cases there are early warning signs.
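To make "proactive monitoring and alerting" a little more concrete, here is a minimal sketch of a health-check loop in Python. The service names, endpoint URLs, thresholds and alert() function are hypothetical placeholders, and a production data center would use a dedicated monitoring platform, but the principle is the same: watch for degradation (slow responses, error codes) and page staff before customers ever notice.

```python
import time
import urllib.request

# Hypothetical endpoints and thresholds, for illustration only.
SERVICES = {
    "payments-api": "https://example.internal/health",
    "reporting":    "https://example.internal/reports/health",
}
LATENCY_WARN_SECONDS = 2.0    # alert on degradation, not just hard failure
CHECK_INTERVAL_SECONDS = 60

def alert(message: str) -> None:
    # Stand-in for paging the on-call staff (email, SMS, on-call tool, etc.).
    print(f"ALERT: {message}")

def check(name: str, url: str) -> None:
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            elapsed = time.monotonic() - start
            if response.status != 200:
                alert(f"{name} returned HTTP {response.status}")
            elif elapsed > LATENCY_WARN_SECONDS:
                # Early warning sign: the service is up but degrading.
                alert(f"{name} is slow ({elapsed:.1f}s)")
    except Exception as exc:
        alert(f"{name} is unreachable: {exc}")

if __name__ == "__main__":
    while True:
        for name, url in SERVICES.items():
            check(name, url)
        time.sleep(CHECK_INTERVAL_SECONDS)
```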

  • Tip: Ask your vendors to describe their data center monitoring and alerting procedures.

– Chris Seib, Co-Founder & CTO, InstaMed

Click here to read True Availability: Part 2, featuring more best practices for disaster recovery, business continuity and security.

 


The views expressed within posted comments do not necessarily reflect the views or opinions of InstaMed.

  1. As a non-IT-trained person in a pediatric medical environment responsible for disaster recovery and security policies and, with strong contracted IT support, co-managing on-site computers/network, I am currently reading everything I can on disaster recovery, security and cloud computing. I find this article helpful. We are watching cloud computing evolution very closely so we can determine when it is "mature" enough to trust to handle PHI. Keep info coming!

  2. Mary, it's good to hear about your diligent approach. Many vendors extol and sell the 'magic and buzz' of the cloud, when in fact many of them just leverage a third party and do not even do their own due diligence on their downstream suppliers. The truth is, this is not magic; the same basic fundamentals apply and the same failure points (disk, power, change management, network layers) exist. The key is to ask questions, keep pressing, ask for independent third-party audits, and understand what service levels and DR (disaster recovery) capabilities you are buying now. If and when the service has an issue, you'll be asking questions then… and no one likes surprises or buyer's remorse. Good luck, and keep an eye out for our next post!

  3. Another key feature for cloud storage is transportability, the ability to move your entire storage environment to a new cloud vendor should it become necessary due to outage issues, poor service or cost increases. For our clinic, we use a cloud provider that has software to keep data on multiple local computers synchronized with the data stored in the cloud. This allows us to write data to local storage on multiple local computers and have it all synchronized to the cloud in background tasks. If the cloud provider has an outage, all our files are still available to us through our local network. If we decide to change cloud vendors, all our files are available for immediate transition to the new vendor. Since two of the local backup computers are offsite from the clinic, we could recover from a cloud outage and/or local outage at the clinic with minimal disruption or data loss. The low cost of data storage devices available in the market today, combined with the synchronization ability of some cloud providers, makes it easy and cost-effective to establish a massively redundant data environment.
