Wake Radiology Reduces Risk with Tier II+ Data Center
In the fall of 2005, 58-member Wake Radiology found itself feeling vulnerable. Having achieved the holy grail of the electronic practice by banishing paper and film, the practice understood that it had much to lose if it lost the network over which it leveraged subspecialized reads across three area hospitals and 11 standalone imaging centers in North Carolina’s Triangle area: Raleigh, Durham, Chapel Hill, and Research Triangle Park.
Even the telephones ran over the network via voice over IP (VoIP). So Wake elected to tighten up its IT practices by building a Tier II+ data center.
“People need to understand the cost of downtime, of not having a system,” said Ronald B. Mitchell, MSc, CIO of Wake Radiology. “An organization like ours runs on electrons. It’s like blood.”

“Any interruption in effect takes the business down, and not every practice is that way,” Mitchell continued. “Some are still passing paper and some are still printing film. Each individual practice needs to decide for itself the cost of systems and network failure, really go through different scenarios and try to figure out what the cost is for them, and then decide what they want to spend to avoid that cost.”

There was no precipitating event that convinced the practice to make its move; in fact, the equipment and existing network were fairly reliable. Nor did Mitchell give the partners a hard and fast dollar figure for a network failure. “A lot of the risk was assessed in soft dollars; relationships with patients and referring physicians mean a lot to us, and patient care is paramount,” he explained. “All of those would be affected if there was an extensive network outage.”

Considering that downtime could be extensive and the resulting cost quite high, Wake Radiology gave the go-ahead to build a data center where a parking lot formerly lay. The practice broke ground in the spring of 2006, and in January 2007 it began moving into the new 1,140-sq-ft data center housed in an addition to its Raleigh-based administration building. Mitchell estimates the center has enough capacity to meet the needs of the practice for 15 years, and he is actively seeking tenants to generate revenue through the practice’s IT arm.

What Is a Tier II+ Data Center?

Mitchell, who joined the practice in October 2005, believes the previous data center did not even meet the criteria of a Tier I facility. “The original location was two converted offices; they did not have a raised floor, and the air conditioning was unreliable,” Mitchell explained. “I did put a monitor in the room, and if my alarm went off in the middle of the night (and it did), I had about half an hour to get down there and do something about it before the machines shut down due to overheating. I was pretty happy to not have to do that anymore.”

Wake hired a Chicago-based consulting company, Forsythe, to provide guidance in building the center. A Tier II data center has a single path for power and cooling distribution, with redundant components, and is considered slightly less susceptible to disruptions from planned and unplanned activity than a Tier I facility. Other features include a raised floor, an uninterruptible power supply (UPS), and engine generators. Maintenance of the single path and other parts of the site infrastructure requires a processing shutdown, yielding 99.741% availability. The classification system includes four tiers, and a white paper from The Uptime Institute, Santa Fe, NM, provides further details on the IT industry standard classifications for data center tier performance.

The data center Wake built is actually closer to a Tier III, as it features dual power and cooling paths. “The first thing to remember is that those classifications really represent gradations,” Mitchell explained. “We really are more than a Tier II in that we do have redundancy for most of our systems: in power, network, and air conditioning. And in the fire alarm system we have two different VESDA (very early smoke detection apparatus) systems. So it’s extensive redundancy, and it has proven to be very useful and has kept the data center up.”
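The availability percentage quoted above for a Tier II design is easier to weigh once it is converted into an annual downtime budget. The short sketch below does only that conversion; it is an illustration, not part of Wake’s own analysis, and the percentages for the other tiers are left to the Uptime Institute paper.

```python
# Back-of-the-envelope conversion of an availability percentage into
# expected downtime per year. Only the Tier II figure (99.741%) appears
# in the article; figures for the other tiers are published in the
# Uptime Institute white paper and are not reproduced here.

HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def annual_downtime_hours(availability_pct: float) -> float:
    """Hours per year a site at this availability is expected to be down."""
    return (1 - availability_pct / 100) * HOURS_PER_YEAR


if __name__ == "__main__":
    tier_ii = 99.741
    print(f"Tier II ({tier_ii}% available): "
          f"about {annual_downtime_hours(tier_ii):.1f} hours of downtime per year")
    # Prints roughly 22.7 hours -- nearly a full day of outage annually.
```

That works out to roughly 22.7 hours a year, which helps explain why Wake pushed its design past the Tier II baseline.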
“We have not had downtime…and we have had failures, but they are failures we can predict and failures for which redundant components pick up and continue,” Mitchell said. “So it’s working for us.”

In North Carolina, power failures are common because of frequent thunderstorms, but the UPS systems pick up the load, and the generator takes over after that. Mitchell also had a failure in an in-cabinet power distribution unit (PDU), which he describes as a glorified power strip with intelligence. The duplicate PDU inside the cabinet carried on and kept the servers running.

“All of our servers have dual power supplies,” Mitchell said. “Inside the cabinet you will see a plug coming out of the server on one side and plugging into a power distribution unit. Then on the other side, there’s another one that comes off the other UPS. It does provide the redundancy we need. We have had networking outages at our remote locations, and the users haven’t even known about them because the redundant connection picked up and continued to provide images and clinical data.”

And if an air conditioner fails, Mitchell will not be roused in the middle of the night to train fans on the servers: a second unit will kick into action.

Design Features of the Wake Data Center

While the data center was being built, Mitchell visited the site daily to be sure it was built to specification. Mitchell provided the following descriptions, and accompanying images, of key design features:

Location. The data center was placed on the inside of the building, with no part open to an exterior wall. The subfloor is on slab, and there is nothing above the center, including no equipment on the roof.

Flooring. The center was built on a 2-ft raised floor throughout. In actuality, the raised floor is at the same level as the rest of the addition; the subfloor was sunk 2 feet. The subfloor is graded to channel any water that might collect there to a drain. All pipes and cables run under the floor, with power and data in separate troughs. The raised-floor tiles are concrete-filled.

Air Conditioning. Two 15-ton Liebert CRAC (computer room air conditioner) units push cold air down under the raised floor and up through perforated tiles into the equipment. Either unit could handle the full load of the data center equipment, but the redundancy provides for unit failure or a planned outage for maintenance.

[Image: Raised flooring showing equipment rows]

Layout. The data center layout allows for four rows of equipment cabinets (two rows are currently in use). Adjacent rows face in opposite directions, so the aisle between the first and second rows is a hot aisle (the backs of the cabinets face each other), and the aisle between the second and third rows is a cold aisle (the fronts of the cabinets face each other).

Power. The first two cabinets in each row consist of an APC Symmetra 40 kW UPS and its associated PDU (power distribution unit). Each UPS/PDU receives a 480 V supply from the utility, regulates and smooths it, and provides power to the cabinets in the row through both 120 V and 208 V outlets as needed. Each UPS also supplies power to the cabinets in the adjacent row. Since nearly all of the servers and other equipment in the cabinets have dual power supplies and receptacles, this arrangement provides power redundancy should any of the UPS units fail or need to be taken out of service for maintenance. For utility blackouts, the UPS can handle the load for about 20 minutes.
However, after less than 30 seconds of utility power failure, a 350 kW generator housed behind the building automatically engages. The generator can run for 48 hours without refueling.

Network. As with power and air conditioning, many aspects of the network, both inside and outside the data center, are redundant. For example, servers with dual NICs (network interface cards) are connected to two different switches. The Wake data center network core consists of dual Layer 3 switches connected at the backplane; if one fails, the other keeps going. For the WAN (wide area network) connections to the imaging centers, two different companies are used, and their connections come into the building from two different streets, eliminating the classic “backhoe through the cable” problem.

Monitoring. An APC InfraStruxure Manager provides environmental monitoring of the power, including the UPS/PDU cabinets and the power distribution units in each server cabinet, as well as temperature and humidity, door opening, air movement, and motion. The system also monitors for any water buildup on the subfloor. For any critical problems, emails and pagers are used to notify appropriate IT personnel (a minimal sketch of this kind of threshold alerting appears after this list). The monitoring system has also been extended to network closets in the imaging centers, where UPS units powering critical network equipment report a number of parameters back to the environment monitor, including time on battery due to power failures, battery health, self-test results, and closet temperature and humidity. Two additional systems monitor critical aspects of each server as well as overall network performance. The tools enable IT to preempt potentially critical issues and quickly locate network components that are causing problems.

Fire control. A three-hour fire-rated wall surrounds the entire data center and separates it from the offices outside, which are protected by a sprinkler system. Inside the data center, a very sophisticated early warning system, a VESDA (very early smoke detection apparatus), uses a series of porous pipes in the room, under the floor, and over the drop ceiling to constantly sniff the air for particulate matter. The VESDA units have a number of levels for reporting smoke. The fire suppression system in the data center is a dry-pipe sprinkler system: the pipes above the data center ceiling do not hold any water until there is a fire, and if a fire occurs, sprinkling is limited to the area of the fire. A policy also ensures that the data center is kept free of combustible material; any cardboard or paper is removed as soon as possible.

Security. The data center has only one door for access. It is protected by a biometric system that requires a passcode as well as fingerprint recognition, and access is restricted to a few individuals. A security camera can be viewed remotely, as can graphical displays of movement and door-opening activity. Entrance to the surrounding office space is restricted by passcode, and an alarm, including movement alarms, is set when the building is not occupied.

Backups & Replication. All the images are stored on servers in the data center. In addition, they are replicated to a storage system at one of the practice’s imaging centers. Database backups are done daily via a robotic tape library, and the tapes are taken to a secure location off site.
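The notification flow described above follows a simple pattern: compare sensor readings against thresholds and escalate by email or pager when something drifts out of range. The sketch below illustrates that pattern only; it is not Wake’s configuration or the APC product, and the sensor function, thresholds, addresses, and SMTP host are hypothetical placeholders.

```python
"""Threshold-based environmental alerting, in the style described above.

Illustrative sketch only: the sensor-reading function, thresholds, and
addresses are hypothetical stand-ins, not Wake Radiology's setup.
"""
import smtplib
from email.message import EmailMessage

TEMP_HIGH_F = 80.0             # assumed alert threshold, degrees Fahrenheit
HUMIDITY_RANGE = (35.0, 60.0)  # assumed acceptable relative-humidity band


def read_closet_sensors() -> dict:
    """Placeholder for a real sensor query (for example, polling a UPS card)."""
    return {"temperature_f": 82.4, "humidity_pct": 41.0, "on_battery": False}


def alert(subject: str, body: str) -> None:
    """Send a notification email; pagers are commonly reached via email gateways."""
    msg = EmailMessage()
    msg["From"] = "monitor@example.org"     # placeholder address
    msg["To"] = "it-oncall@example.org"     # placeholder address
    msg["Subject"] = subject
    msg.set_content(body)
    with smtplib.SMTP("mail.example.org") as smtp:  # placeholder SMTP host
        smtp.send_message(msg)


def check_environment() -> None:
    readings = read_closet_sensors()
    if readings["temperature_f"] > TEMP_HIGH_F:
        alert("Closet temperature high",
              f"Temperature is {readings['temperature_f']:.1f} F")
    low, high = HUMIDITY_RANGE
    if not low <= readings["humidity_pct"] <= high:
        alert("Closet humidity out of range",
              f"Humidity is {readings['humidity_pct']:.1f}%")
    if readings["on_battery"]:
        alert("UPS on battery", "Utility power appears to have failed.")


if __name__ == "__main__":
    check_environment()
```

In practice the vendor’s own agents run these checks on a schedule; the point is simply the pattern of thresholds and escalation that Mitchell describes.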
Current and Future Needs

Wake Radiology is currently producing and archiving approximately 4 terabytes of data a year, and all of those images are kept online all of the time. “We don’t have a hierarchical system as some practices and some hospitals do,” Mitchell explained. “I think that’s a great decision, because I have been in situations where there is a hierarchy, and it’s a problem because you really never can predict what it is the radiologist wants to see. And if they have to wait, obviously that is not good.”

The data center currently holds about 20 terabytes, and the practice is using only two of its four available equipment rows. “There is room in the data center for another two rows, so from a data center perspective, we are ready to expand as needed,” Mitchell said.

Mitchell is also actively shopping the space to potential tenants. “One of the things we can and are looking at is offering space in the computer room to other organizations that need perhaps a backup data center or some place to store images,” he said. “This center is a great environment for a server, no matter whose server it is.”
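As a rough illustration of how the 15-year estimate might pencil out, the sketch below projects online storage from the figures in the article: about 20 TB online today and roughly 4 TB of new images per year. Flat annual growth is an assumption made purely for illustration; real imaging volumes typically rise as modalities and study sizes change.

```python
# Rough capacity projection from the figures cited in the article:
# about 20 TB online today, roughly 4 TB of new images archived per year.
# Flat growth is an illustrative assumption, not Wake's actual forecast.

CURRENT_TB = 20.0
NEW_TB_PER_YEAR = 4.0
HORIZON_YEARS = 15

for year in range(1, HORIZON_YEARS + 1):
    total = CURRENT_TB + NEW_TB_PER_YEAR * year
    if year in (5, 10, 15):
        print(f"Year {year:2d}: about {total:.0f} TB online")
# Year  5: about 40 TB online
# Year 10: about 60 TB online
# Year 15: about 80 TB online
```

Even under that simple assumption, keeping everything online for 15 years is as much a storage-planning question as a floor-space one, which is where the two unused equipment rows come in.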