5 Best Practices For Data Center Disaster Planning
Fire is the biggest threat to your data center operations—here's your action plan for minimizing downtime and choosing the fire suppression systems that may matter most
By Charles Strobel
There's no time like now to review disaster planning for your data center operations—and what kind of fire suppression systems are right for you.
Whether it's a new or existing installation for a small server room or a hyper-scale data center complex, there are some important steps you can take to maximize business continuity this year. And while there are many causes for downtime, addressing the gravest danger you face is a smart place to start.
Just ask Delta Air Lines. A fire at its Atlanta data center in 2016 resulted in a power outage that led to 2,000 grounded flights and $150 million in lost revenue. Or pension provider UniSuper. Last year, a fire at a data center in Australia knocked services offline for 400,000 customers for roughly 24 hours, preventing them from accessing accounts that contribute to the $57.5 billion the firm has under management.
So what can you do to minimize the threat from fire and other causes of downtime? While this is by no means a comprehensive list, the following five considerations should help organize your disaster planning.
5 Best Practices
#1 Analyze This: What Are the Risks, and How Will You Recover?
The first step in any disaster planning effort is a full risk assessment on the state of disaster preparedness. What's the business impact of an outage? What's the probability your center will experience one—and what will be the causes?
Nearly one-third of all data centers had an outage in the past year, up from 25% in 2017. And while fire is the third biggest cause of downtime, it has by far the biggest impact. On average, it takes 25 hours for data centers to recover from a fire, versus 7.8 hours for heating, ventilation, and air conditioning (HVAC) issues and 6.5 hours for power outages. With the cost of downtime now averaging $1 million or more, that starts to add up fast.
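To see how quickly those recovery times add up, here is a minimal Python sketch that turns them into a rough cost per outage. The recovery hours come from the figures above. The hourly cost is a hypothetical placeholder, not a figure from this article, so swap in your own number from a business impact analysis.

```python
# Average recovery times in hours, as quoted in the article.
RECOVERY_HOURS = {"fire": 25.0, "hvac": 7.8, "power": 6.5}


def outage_cost(cause, cost_per_hour):
    """Rough cost of one outage: average recovery hours times an hourly cost."""
    return RECOVERY_HOURS[cause] * cost_per_hour


def relative_recovery(cause, baseline):
    """How many times longer one cause takes to recover from than another."""
    return RECOVERY_HOURS[cause] / RECOVERY_HOURS[baseline]


if __name__ == "__main__":
    # ASSUMPTION: illustrative hourly cost only; replace with your own estimate.
    hypothetical_cost_per_hour = 100_000

    # List causes from slowest to fastest recovery.
    for cause in sorted(RECOVERY_HOURS, key=RECOVERY_HOURS.get, reverse=True):
        cost = outage_cost(cause, hypothetical_cost_per_hour)
        print(f"{cause:>5}: {RECOVERY_HOURS[cause]:>5.1f} h, about ${cost:,.0f} per outage")

    ratio = relative_recovery("fire", "power")
    print(f"Fire recovery takes about {ratio:.1f}x as long as a power outage")
```

Whatever hourly figure you plug in, the ratio between causes stays the same: a fire takes nearly four times as long to recover from as a power outage.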
What will you do in the event of a fire or other disaster? Which employees will be put in charge of restoring specific systems and technologies? In addition to fire, any viable disaster plan must account for the probability and impact of earthquakes, extreme weather, cyberattack, faulty equipment and human error. Then determine and document what it will take to recover in each scenario.
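One simple way to document the assessment is a ranked risk register. The Python sketch below scores each threat by probability times impact, here expressed as expected downtime hours per year. The `Risk` class, the scoring rule, and every probability are assumptions made for illustration. Only the fire and power recovery times echo figures quoted in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    annual_probability: float  # chance of at least one event per year, 0 to 1
    recovery_hours: float      # expected time to restore service
    owner: str                 # who is in charge of recovery

    @property
    def expected_downtime_hours(self):
        """Probability times impact: expected downtime hours per year."""
        return self.annual_probability * self.recovery_hours


def rank_risks(risks):
    """Return risks ordered from highest to lowest expected downtime."""
    return sorted(risks, key=lambda r: r.expected_downtime_hours, reverse=True)


# ASSUMPTION: all probabilities, the cyberattack recovery time, and the owners
# are hypothetical placeholders, not data from the article.
EXAMPLE_RISKS = [
    Risk("fire", 0.02, 25.0, "facilities lead"),
    Risk("power outage", 0.30, 6.5, "electrical lead"),
    Risk("cyberattack", 0.10, 12.0, "security lead"),
]

if __name__ == "__main__":
    for risk in rank_risks(EXAMPLE_RISKS):
        print(f"{risk.name:<13} {risk.expected_downtime_hours:5.2f} h/yr  owner: {risk.owner}")
```

A ranking like this is only a starting point. A rare event with a long recovery, such as a fire, can land low on the list and still deserve a documented plan and a named owner.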
#2 Snuff, Don't Spray: Consider Inert Gas Fire Suppression
The best solutions for fire protection will always vary by use case, and even from site to site.
Most fire suppression systems come in four basic types: aqueous-based, wet- and dry-chemical, and inert gas. All have their place. But in many data center environments, it's worth your while to investigate an automatic, inert gas fire suppression system made with Underwriters Laboratories (UL)- and FM-certified components.
Generally speaking, an inert gas fire suppression agent is made up of 100% nitrogen or 100% argon, or some combination of both. These inert gases are chemically non-reactive, making them exceptionally safe and effective for putting out fires.
In most cases, "total flooding" systems discharge the inert gas suppression agent in sufficient concentrations to suffocate data center fires by displacing oxygen in the protected zone. Oxygen drops to a level too low to sustain a fire but still safe for people who may be evacuating. Within seconds, flames are put out, and there's no damage to your data or equipment. You're up and running again fast.
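The oxygen displacement idea comes down to simple dilution arithmetic, sketched below in Python. The sketch assumes normal air is about 20.9% oxygen by volume and that the room is well mixed and vents the mixture freely as agent is added. The 12.5% oxygen target is an illustrative value only, not a design figure from this article, and the function names are hypothetical.

```python
import math

AMBIENT_O2_PERCENT = 20.9  # approximate oxygen share of normal air, by volume


def agent_concentration(target_o2_percent):
    """Inert gas concentration (% by volume) that dilutes air to the target oxygen level."""
    if not 0 < target_o2_percent < AMBIENT_O2_PERCENT:
        raise ValueError("target must be between 0 and ambient oxygen")
    return 100.0 * (1.0 - target_o2_percent / AMBIENT_O2_PERCENT)


def residual_oxygen(agent_percent):
    """Oxygen level (% by volume) left once the agent reaches the given concentration."""
    if not 0 <= agent_percent < 100:
        raise ValueError("agent concentration must be in [0, 100)")
    return AMBIENT_O2_PERCENT * (1.0 - agent_percent / 100.0)


def flooding_factor(agent_percent):
    """Agent volume needed per unit of room volume.

    ASSUMPTION: well-mixed room that vents the air and agent mixture freely
    while gas is added, with agent and room at the same temperature and pressure.
    """
    if not 0 <= agent_percent < 100:
        raise ValueError("agent concentration must be in [0, 100)")
    return math.log(100.0 / (100.0 - agent_percent))


if __name__ == "__main__":
    target = 12.5  # ASSUMPTION: illustrative oxygen target, not a design value
    conc = agent_concentration(target)
    print(f"Agent concentration: {conc:.1f}% by volume")
    print(f"Residual oxygen:     {residual_oxygen(conc):.1f}%")
    print(f"Flooding factor:     {flooding_factor(conc):.2f} volumes of agent per room volume")
```

Actual design concentrations, safe exposure limits, and agent quantities depend on the agent, the hazard, and the applicable standard, so treat these numbers as a way to understand the principle rather than as sizing guidance.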
#3 Be Redundant: Back Up Your (Fire) Systems
Hosting all your data in one center: Bad idea. Data should always be hosted in multiple locations simultaneously. And backing up your fire suppression can be a good idea as well.
For instance, 60% of data center fires begin in electrical cabinets. In addition to room- and building-based fixed systems, consider installing automatic fire suppression systems within the electrical cabinet itself, close to wires and circuitry where a fire can start.
The most advanced electrical cabinet fire suppression systems use pressurized, linear pneumatic tubing technology that bursts the moment it detects fires within an electrical cabinet. The sudden tube depressurization actuates the system and instantly floods the entire cabinet area with extinguishing agent, reaching where room protection systems cannot.
Fire is quickly snuffed out just moments after it begins, minimizing damage. And room-based systems don't have to be actuated more than is absolutely necessary, helping to accelerate incident recovery times and ensure business continuity.
#4 Be Quiet About It: Reduce Sound, Save Data
The last thing you want is for an inert gas fire suppression system to put out the fire but destroy your data. In extremely rare instances, depending on the disk drive type and construction, room acoustics and other factors, the sudden release of extinguishing agent can damage sensitive equipment if sound levels reach 110 dB.
Thankfully, this kind of damage is fully avoidable. Today, many of the best inert gas fire suppression systems offer silent nozzle options that dampen the sound of system actuation to protect delicate servers and hard drives against sound damage.
#5 Rinse & Repeat: Test and Validate Your Plan
Once you've got a plan put together, validate it. Run fire drills with employees. Simulate a disaster to test recovery systems.
Don't start a fire or anything. That's messed up. Instead, Enterprise Storage Forum suggests disaster recovery testing can involve tabletop tests where recovery procedures are discussed and evaluated without physically taking the actions described in your documented plan. You can also conduct hands-on technical tests where participants are tasked with restoring an offline system in order to gauge readiness. Then schedule regular audits to repeat all of this and determine whether any new or modified precautions or plans need to be put in place.
Expect the Unexpected
As in warfare, so with disaster: The plan is the first thing to go when it comes to an actual event.
But still, a plan is better than no plan at all. With enough preparedness, there's always hope that staff will know what they need to do, and can improvise as needed, based on real-world conditions. The good news: Having the right fire suppression systems in place may be half the battle or more.