Feeding the Energy Hogs
For years, reliability has been the key word for data center power systems, but these days energy efficiency is a buzzword for this technology-hungry world.
By Lori Lovely
Data centers account for an increasing amount of power usage worldwide, according to Jeff Klaus, director of Data Center Solutions for Intel Corp. Describing the growth rate of data centers as "significant," he says, "Handhelds are everywhere; everyone has GPS...."
The rapid growth of Internet connectivity has created a sense of urgency in the data center field. The challenge, Klaus believes, lies in the “explosion” on the Internet that requires “more in the backroom.”
“Servers must grow to handle the load,” he adds. “We need more servers, more data centers.”
Data centers started with small sets of servers, reflects Arlon Martin, vice president of marketing, contracts, and government affairs for Kotura, but these days, large server farms with hundreds and thousands of servers for hundreds of thousands of personal computers are commonplace. A considerable amount of energy is required to power them all.
“Data centers are one of the highest users of energy, consuming roughly 3% of all the power in the US,” calculates Martin. “A Berkeley study commissioned by the government estimated that data centers consume 100 billion kilowatt-hours annually, at a cost of $7 billion a year—and it’s growing with more technology. A Cisco study predicts that global traffic will increase four times in the next five years. Some websites like Groupon and Facebook already add one million users a month. They need data centers that can expand rapidly.”
They also need reliable power capable of keeping up with that kind of growth. But beyond reliability, they need efficiency. Because the cost of power is up, efficiency is critical. Energy costs are the fastest-rising cost in a data center, Intel reports, and power consumption is a major concern, particularly because servers continue to use up to 60% of their maximum power even while doing nothing.
In fact, Klaus indicates that the greatest expense at a data center is power, which leads many of them to upgrade to newer, more efficient systems.
The Power of Friending
That’s exactly what Facebook did. Founded in 2004 by Harvard sophomore Mark Zuckerberg and currently valued at $50 billion, Facebook is the world’s largest social network, with more than 800 million active users around the world.
The company’s data center in Prineville, OR, is one of the most energy-efficient facilities in the world, having achieved Leadership in Energy and Environmental Design (LEED) Gold Certification from the US Green Building Council. A founding member of the Open Compute Project intended to help the entire industry achieve greater efficiency through open sharing of data center design, operational strategy, and hardware technology, Facebook is helping develop servers and data centers following a model traditionally associated with open source software projects.
With a goal of building one of the most efficient computing infrastructures at the lowest possible cost, Facebook designed and built its own software, servers, and data centers that are 38% more efficient and 24% less expensive to run than Facebook’s existing leased data centers.
Among the innovative systems Facebook pioneered at Prineville are custom energy-efficient servers that can operate at higher temperatures to reduce mechanical cooling needs; a low-energy, 100% outside air evaporative cooling system that eliminates the need for cooling towers or chillers because it draws cool outside air into the building for further cooling through evaporation; and electrical distribution from an onsite substation that cuts losses by requiring fewer transformations and conversions. Typical energy loss during conversion is 21 to 27%, but at Prineville, it’s only 7.5%.
DC backup and high-voltage (480 VAC) distribution eliminated the need for centralized UPS and 480V-to-208V transformation. A built-in penthouse contains the chiller-less air conditioning system that uses 100% airside economization and evaporative cooling to maintain the operating environment. Contributing to the facility’s gains in energy efficiency is its location in Oregon, where cool year-round temperatures reduce the amount of energy needed for cooling.
By incorporating these features, Facebook has reduced the data center’s power usage effectiveness (PUE), the ratio of total facility energy to the energy used by IT equipment, to between 1.06 and 1.11, in line with the 1.07 verified at full load during commissioning.
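The power usage effectiveness figures quoted for Prineville are ratios of total facility energy to the energy delivered to IT equipment. A minimal sketch of that arithmetic follows; the function names and the kilowatt-hour readings are hypothetical, chosen only so that the result falls inside the reported range, and are not Facebook's data.

```python
def power_usage_effectiveness(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """PUE = total facility energy / IT equipment energy (1.0 is the theoretical ideal)."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    if total_facility_kwh < it_equipment_kwh:
        raise ValueError("total facility energy cannot be less than IT energy")
    return total_facility_kwh / it_equipment_kwh


def overhead_fraction(pue: float) -> float:
    """Energy spent on cooling, conversion and other overhead, per unit of IT energy."""
    return pue - 1.0


# Hypothetical meter readings for one billing period (assumption, not measured data).
pue = power_usage_effectiveness(total_facility_kwh=1_070.0, it_equipment_kwh=1_000.0)
print(f"PUE = {pue:.2f}, overhead = {overhead_fraction(pue):.0%} of IT energy")
```

Read this way, a PUE of 1.07 means that for every 100 kWh the servers use, roughly 7 kWh more goes to everything else in the building.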
A challenge faced at Prineville was keeping the air handler lineups from “fighting” with each other when they dealt with rapid changes in temperature and humidity of the outside air from day to night. According to Facebook, if the outside air dampers of one lineup were at 70%, the adjacent lineups would have their outside air dampers at 20–30%. This alternate modulation often led to stratification of air streams.
A more serious issue encountered was an error in the sequence of operation controls that led to a complete closure of the outside air dampers, causing the one-pass airflow system to function like a re-circulatory system, Facebook reports. The data center was recirculating hot exhaust air at high temperature and low humidity when outside air conditions began changing rapidly in June and the incorrect control sequences drove economizer demand to zero. In reaction, the evaporative cooling system sprayed at 100% to maintain the maximum allowed supply temperature and dew point temperature, resulting in cold aisle supply temperature exceeding 80°F and relative humidity above 95%. Several servers were rebooted or automatically shut down due to power supply unit failure, which was linked to condensation.
Facebook investigated the failure by subjecting the server to rapidly changing temperature and humidity in a controlled test chamber. When the relative humidity level was raised to 97%, and the temperature increased from 59°F to 86°F, condensation was observed on the non-heated components. The erroneous control sequence was corrected and additional safeguards, such as the reevaluation of the minimum economizer demand setting to avoid complete closure of the outside air dampers and modification of monitoring points and alarm settings, were put in place to eliminate the possibility of repeated occurrences.
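The condensation seen in the test chamber is what a dew point estimate would predict. The sketch below uses the Magnus approximation with one commonly used pair of coefficients. It assumes, purely for illustration, that an unheated component is still near the 59°F starting temperature while the surrounding air has reached 86°F at 97% relative humidity. The function names are hypothetical and this is not Facebook's test procedure.

```python
import math


def fahrenheit_to_celsius(temp_f: float) -> float:
    return (temp_f - 32.0) * 5.0 / 9.0


def dew_point_celsius(air_temp_c: float, relative_humidity_pct: float) -> float:
    """Approximate dew point with the Magnus formula (coefficients over liquid water)."""
    if not 0.0 < relative_humidity_pct <= 100.0:
        raise ValueError("relative humidity must be in (0, 100]")
    a, b = 17.62, 243.12  # dimensionless, degrees C
    gamma = math.log(relative_humidity_pct / 100.0) + a * air_temp_c / (b + air_temp_c)
    return b * gamma / (a - gamma)


def will_condense(surface_temp_c: float, air_temp_c: float, relative_humidity_pct: float) -> bool:
    """Moisture condenses on a surface that is colder than the air's dew point."""
    return surface_temp_c < dew_point_celsius(air_temp_c, relative_humidity_pct)


air_c = fahrenheit_to_celsius(86.0)      # chamber air after the temperature ramp
surface_c = fahrenheit_to_celsius(59.0)  # assumed temperature of a lagging, unheated part
print(f"dew point: {dew_point_celsius(air_c, 97.0):.1f} C, "
      f"condensation expected: {will_condense(surface_c, air_c, 97.0)}")
```

At 97% humidity the dew point sits within about a degree of the air temperature, so any part that warms more slowly than the air becomes a condensing surface.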
Facebook isn’t alone in searching for approaches to reduce energy consumption. Because they use so much power, other data centers are trying to find alternative solutions . . . but they also want increased performance, according to Kotura’s Martin. “They all want more energy.”
In fact, he says, businesses depend on it. “Google is the largest advertising company in the world,” he notes. “If a server is down, it hurts their revenue.”
Using silicon photonics, Kotura, a leader in photonics with 100 patents and applications, is helping data centers reduce power consumption without compromising performance. Servers are connected through switches, routers, and cabling by cluster fabric. By increasing optical interconnections, Kotura can decrease power usage with photonics instead of traditional electronic signals.
Photonics is the generation and transmission of light and stems from the first practical semiconductor light emitters invented in the 1960s. In short, it means communicating with light instead of electrical signals. Martin explains that because copper wires aren’t fast enough to transmit 100 gigs and use too much power, data centers are now using optical fibers. With photonics, signals can be transmitted long distances over fiber optics at low power without amplification. Kotura builds the optical components in silicon—chips—that support 100 gigs, have a high data rate and high bandwidth and reduce power consumption by a factor of 10, while increasing the distance of transmission.
Data centers are searching for efficiency solutions that can also enhance performance.
Benefits of photonics include size and distance. Optical chips can be as small as the size of a thumbnail and since the fiber is thinner than a human hair, a large number of them can be bundled and are still smaller than a typical cable.
“The specification for optical components is 10 kilometers,” says Martin, adding that the first generation was about the size of an iPhone and used 20 W, while the new generation is the size of a memory stick and uses 5 W, which demonstrates the trend of reduction in size and power consumption.
The more compact size means reliability is even more critical than in the past.
“Larger size pipes are easier to manage and maintain than small pipes with lots of interconnections,” he explains.
Fortunately, reliability and power have been strong, Martin reports. Kotura has been selling its optical components for five years and has logged one billion device hours of service. “They last longer,” he states.
They also give off less heat, which saves money in cooling costs.
“If you can cut the number of watts to run a data center, you can also cut the number of watts to cool it,” says Martin. A feature Kotura is working on is the ability to run chips at higher temperatures. He says it could save even more energy if chips could run at 85°F versus 70°F.
Intel, IBM, and the military have done a lot of research on silicon photonics, Martin says.
“They show tremendous promise: lower power [usage], reduced size, increased reliability.”
In addition, he indicates that the management structure makes them easier to maintain and monitor and to optimize performance.
“You can use [fewer] servers for the same work, or you can provide faster service,” he adds.
In 2011, data centers saved $170 million by using optical transmitters, Martin notes, adding that photonics is still in the early stage of deployment and that Kotura continues to work on the next generation of optical interconnect.
“That number will double in the next few years,” states Martin.
Take It to the Bank
As important as Google searches and Facebook friendships are, there’s little argument that management of a secure and reliable data center for a bank is truly critical. When Synovus Financial, a large southeast regional bank based in Columbus, GA, needed a new dedicated data center facility as part of its growth plan, MTU Onsite Energy, one of the leading manufacturers of advanced standby power systems, provided the solution.
“They knew what they wanted,” states Eddie Oliver, MTU Onsite Energy distributor and power generation sales engineer. “We designed the equipment to meet their very specific site requirements.”
What they wanted was a fully redundant 2,250-kW emergency standby power system with the ability to expand with a third generator for future growth.
The project’s most distinctive aspect stems from Synovus’ rotary UPS, which provides only about 20 seconds of backup power before dropping the load. That imposed starting requirements well beyond the norm.
“It had to consistently start in eight seconds, which is faster than normal and emphasizes the importance of generator reliability,” says Oliver.
The MTU generator engine is fast-starting.
“It’s a couple seconds faster than many other large generators,” he adds.
Many mission critical applications choose to have redundant generator set(s), but the reliability of each generator set can be significantly increased by taking the redundancy requirement a bit further.
“How far do you take it?” ponders Oliver. A redundant starting system is a very sound investment. In typical large engine applications, two starters are connected in parallel and get power from a single battery string. If any part of the starting system fails, the whole system is down.
Other requirements involved sound. “The financial institution is in a noise-critical area and adjoins a residential property,” he says. “In most applications, considerable noise comes from the fan blowing air through the radiator to cool the system. We worked with the radiator manufacturer and acoustic engineer on a custom cooling package and oversized radiator with a low-noise fan. It’s so quiet, you can stand outside the engine room and carry on a conversation.”
One similarity the system shares with others is the need for routine maintenance. An extensive inspection is conducted quarterly by the local MTU Onsite Energy dealer, W. W. Williams, along with annual oil and filter changes and a coolant change every three years. The recommendation is for oil and filter changes at 250-hour intervals, but because standby applications rarely reach that many run hours in a year, Oliver says they do a change annually.
The other part of their maintenance involves exercising the units in the form of weekly testing under facility load for 30 minutes. Synovus’ system operates in a closed-transition transfer mode: power can be transferred from the utility to the generator and back with no power interruptions. Even with this capability, not all data centers do their routine testing under the actual facility load, Oliver says. “Some see it as a risk.”
The importance of the weekly operation under load comes from the fact that diesel engines are made to run, and they like to work. Letting them regularly get up to full operating temperatures and pressures helps ensure the long-term health of the engine. As a general rule, he says a diesel generator shouldn’t run with less than 30% of its rated load for an extended period of time. If they’re not exercised under load, diesel engines can “wet stack,” meaning oil and unburned fuel can accumulate in the exhaust and cause damage.
“It has to be burned off, or it can cause a mess or even engine failure,” states Oliver. “Running under load keeps things clean and helps seat the piston rings, extending engine life. You should get 30-plus years of operation out of a well maintained standby generator set. We pull 30- to 40-year-old engines out of hospitals on a regular basis. The number of hours they’re capable of running is more than the number of hours they normally get to run as standby.”
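Oliver's 30% rule of thumb is easy to check against a planned test load. The sketch below is only an illustration: the function names, the generator rating, and the test loads are hypothetical values, not figures from the Synovus installation.

```python
MIN_LOAD_FRACTION = 0.30  # rule of thumb quoted above for extended running


def load_fraction(test_load_kw: float, rated_kw: float) -> float:
    """Share of the generator's rated output that the test load represents."""
    if rated_kw <= 0:
        raise ValueError("rated output must be positive")
    if test_load_kw < 0:
        raise ValueError("test load cannot be negative")
    return test_load_kw / rated_kw


def wet_stacking_risk(test_load_kw: float, rated_kw: float,
                      min_fraction: float = MIN_LOAD_FRACTION) -> bool:
    """True when the load is light enough that unburned fuel may accumulate."""
    return load_fraction(test_load_kw, rated_kw) < min_fraction


# Hypothetical generator rating and weekly test loads (assumptions).
rated_kw = 2_250.0
for test_kw in (500.0, 900.0):
    share = load_fraction(test_kw, rated_kw)
    print(f"{test_kw:.0f} kW is {share:.0%} of rating, "
          f"wet-stacking risk: {wet_stacking_risk(test_kw, rated_kw)}")
```

A facility whose real load falls under the threshold would typically need a supplemental load, such as a load bank, to exercise the engine properly.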
Reliability often comes at the expense of efficiency.
“For a typical onsite power system, the power must be there 24/7,” says Ed Spears, manager of technical product marketing for Eaton. “It’s analogous to the criticality of a hospital.”
He defined the tiers of reliability as:
- Tier 1—battery backup—a UPS
- Tier 2—redundancy—multiple UPS
- Tier 3 and Tier 4—the ability to conduct concurrent maintenance while the system is running
Explaining that the higher tiers are more reliable but more costly, he says a facility must decide on the level needed and the type of system: modular or centralized. Dual-bus architecture is a popular option for critical data centers, he says, because there is no vulnerability to power problems.
“Every server and router has two power cords so the server can operate on either,” says Spears.
Another popular innovation is a multi-mode UPS, which merges the benefits of a static and a rotary UPS for high performance, high efficiency, and high protection.
“It’s new,” indicates Spears, but he says that because “tradition is king” in the large data center world, it’s difficult to be successful with new ideas.
“There’s an element of risk that overrides cost,” he continues. “If the power is down and information is lost, they have to pay a lot of money to their customers.”
Eaton offers a scalable multi-mode UPS.
“Buy what you need when you need it, add to it as you grow,” advises Spears. “You need to scale capacity to growth so that it expands and contracts with your needs. It’s not efficient to ‘buy big’ expecting to grow into it.” Modular, flexible, scalable designs are efficient, drawing less power. Its technology is compatible, which is beneficial during expansion since the system can be easily updated. “It’s highly firmware-controlled so you can keep the hardware longer.”
The multi-mode UPS is also “extremely high-efficiency,” says Spears.
“It’s 99% [efficient] versus the usual 70 to 75% in the industry. We just can’t be that wasteful now. The stakes are high.”
That efficiency means the unit throws off only about 1% of its energy as heat. He explains that previously, data centers might have spent 6–10% of their budget on heating and cooling, but now that power density is high, they spend 25–40% of their budget on heating and cooling.
“It’s a challenge to energy consumption because of the amount of power needed to power devices and the amount of power to remove heat—which is the biggest part,” he says.
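The gap between the efficiency figures Spears cites translates directly into heat that the cooling plant must remove. The following sketch works through that arithmetic for a hypothetical 1,000-kW IT load; the load value and function name are assumptions made for illustration.

```python
def ups_losses_kw(it_load_kw: float, efficiency: float) -> float:
    """Power dissipated in the UPS (mostly as heat) while delivering it_load_kw."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    if it_load_kw < 0:
        raise ValueError("load cannot be negative")
    input_kw = it_load_kw / efficiency  # power drawn from the utility side
    return input_kw - it_load_kw


it_load_kw = 1_000.0  # hypothetical load
for efficiency in (0.70, 0.75, 0.99):
    loss = ups_losses_kw(it_load_kw, efficiency)
    print(f"efficiency {efficiency:.0%}: {loss:,.1f} kW lost as heat")
```

Because every kilowatt lost in the UPS must also be cooled away, the higher-efficiency unit saves energy twice: once in conversion and again in air conditioning.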
Efficiency is important, but change cannot be at the expense of reliability, Spears mandates.
“Data centers have to be up and running; that’s part of the mission criticality,” he says.
Ensuring reliability through redundancy in a burgeoning environment results in large data centers (or “farms”). Even with today’s smaller devices that use less power, total power usage continues to increase every year as a result of the dot-com boom and expanding use of the Internet and other electronic devices. Acquiring more efficient power (how to use power to get the most out of it) may be an extra thing to worry about, Spears fears.
In fact, he explains, “Visionaries worry about a power limit. If data centers become more of an energy hog, it could be a problem: Facebook versus Greenpeace. The government could get involved—it’s already happening in Europe: cap and trade. To avoid hitting a wall, we may have to limit time on Facebook.”
Until that happens, Spears explains the roles of those charged with ensuring that data centers are sufficiently powered. Engineers and data center architects design the power network and the whole data center for performance, reliability, and flexibility. A data center must expand as a company’s needs increase, he says.
“Today’s cost of power and cooling means it’s too wasteful to start with big systems and a big room; it must grow with a data center’s needs in an effort to drive down the total operating costs,” says Spears.
Equipment manufacturers like Eaton “listen to the voice of the customer, meet their needs and innovate to improve efficiency.” Data center managers are involved in the decision making about the choice of systems. Once installed, it’s the facility manager’s job to keep things working and make necessary infrastructure decisions. In the last five years, Spears says the two have started to make joint decisions.
With the trend toward smaller, lighter, less intrusive systems, Spears says customers want the product to recede into the background.
“They want less worry about power costs and reliability,” he says.
Achieving reliability requires ongoing maintenance and diagnostics. As in the automotive world, data center diagnostics involves software rather than hardware. It’s important to be proactive in looking for advance warning of things trending in the wrong direction. It’s equally important to be proactive by implementing a routine maintenance schedule.
Routine maintenance that facility managers are responsible for includes testing the emergency power system in as realistic a fashion as possible. Spears recommends removing utility power to verify that the system starts up, runs properly, and feeds the load without interruption. In addition to regular “stress” tests, it’s important to keep the equipment clean and to check batteries and air filters.
“Even a static UPS needs maintenance of those items,” he adds.
Spears advises budgeting for sealed battery replacement every five years and points out that maintenance of heating/cooling systems contributes to battery life.
“Proper temperature can extend battery life; they’re more sensitive than the rest of the UPS,” he says.
If a regular maintenance routine isn’t maintained, it can be costly, particularly for a rotary UPS.
“Repair is extensive and expensive,” indicates Spears. “If you have to replace bearings, you incur shut-down time unless you have a generator or bypass system.”
A well-designed and fully integrated system can be easier to maintain, provide better efficiency and deliver better reliability. Schneider Electric uses whitespace monitoring systems and building management systems to understand server power usage in order to save on utility and cooling costs without risking reliability.
“Some designers feel that taking systems from different vendors and putting them together is optimal,” says Lee Featherstone, Schneider’s business development manager for the Energy Solutions Group. “You may get the best of breed in individual pieces, but it can cost you in efficiency, reliability, and price.”
Instead, he says a fully integrated solution is more efficient because it allows the client to obtain data from the cooling and power systems as well as the server, using less energy.
“A 1 to 2% energy savings at a large data center means a lot of difference in energy usage.”
Schneider believes the future is not just tying systems together; it’s fully integrated systems. “If designed properly, you can save up front because it’s more efficient,” says Featherstone, adding that a good design is still necessary with a fully integrated system.
Nevertheless, he contends that it’s more efficient to have all domains in a unified platform—security, building management, power, cooling, white space—as opposed to buying from multiple vendors and building a bridge between them.
In addition to making information gathering simpler, integrated systems make maintenance easier.
“With disparate systems, every change requires ‘touching’ in multiple places to update the system,” says Featherstone. Disparate systems can also mean multiple technicians.
One Schneider customer—Raging Wire, located in the California valley near Sacramento—reports better reliability and visibility into the power system with their system. The existing Square D power customer was already outfitted with power gear, a power monitoring system and a circuit monitoring system before adding a building management system.
“Tying systems together has benefits,” says Featherstone.
Offering break-fix service and maintenance contracts, Schneider helps data centers maintain their systems.
“The back room is as critical as the UPS or white space,” states Featherstone. “If the parallel switchgear goes bad, you have 30 minutes or less to fix if the UPS has no power. If a data center loses cooling, you have less than 5 minutes to shut down. A data center is a cash register: companies could lose millions of dollars if it’s down.”
In addition to maintaining the system, Featherstone says a key to reliability is to recommission it regularly, which could involve reprogramming the overall system for better efficiency. If the system is set to run and cool at full load, but only runs at half load, it’s wasting energy, he indicates.
“Most designers over-design the white space; most UPS run best at full load,” he adds. “The best efficiency is at the right load.” However, he says, they’re not always re-commissioned; more often, they’re “just maintained.”
Regardless of the system or the maintenance schedule, Featherstone concludes, if the wrong driver is in place, it may not be driven right. Schneider’s lead technicians are embedded at 30 data centers as operators.
“It adds to our break-fix and maintenance capability. The IT [information technology] guys at data centers may not know how to handle issues; we understand the system and have a relationship with support techs.”
Seeing the Heat
Energy savings can be confounded by lack of visibility at the server level. Power management of Internet connections is further complicated by calculations based on modeled data that is not in real time and doesn’t account for variations in energy usage throughout the day, which can deviate by as much as 40%, according to Jeff Klaus, director of Data Center Solutions for Intel Corp.
“Previous tools to extract data involved modeling to estimate, based on history,” he elaborates. “The challenge is the concern of overloading the circuit. Alternatively, data centers purchased expensive power distribution units, two for redundancy, to tell what the servers are consuming at the rack level.”
Data supplied by Intel reveals that an estimated 15% of servers in data centers are drawing power even when not in use, at an annual cost of $800 or more per 400-W server. It adds up to approximately $24.7 billion spent on server management, energy, and cooling for unused servers.
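A rough sense of where the per-server figure comes from can be had by multiplying out the energy a 400-W server draws in a year. The sketch below does only that; the electricity price and the overhead multiplier for cooling and power delivery are hypothetical assumptions, and Intel's $800 estimate also includes management costs that this sketch does not model.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def annual_energy_kwh(power_w: float) -> float:
    """Energy used in a year by a device drawing power_w continuously."""
    return power_w * HOURS_PER_YEAR / 1000.0


def annual_cost_usd(power_w: float, price_per_kwh: float,
                    overhead_multiplier: float = 1.0) -> float:
    """Yearly electricity cost; overhead_multiplier scales for cooling and losses."""
    if overhead_multiplier < 1.0:
        raise ValueError("overhead multiplier cannot be below 1.0")
    return annual_energy_kwh(power_w) * overhead_multiplier * price_per_kwh


# Assumed inputs: $0.10 per kWh and a facility overhead multiplier of 2.0.
energy = annual_energy_kwh(400.0)
cost = annual_cost_usd(400.0, price_per_kwh=0.10, overhead_multiplier=2.0)
print(f"{energy:,.0f} kWh per year, about ${cost:,.0f} including overhead")
```

Changing the assumed price or overhead moves the total considerably, which is why measured, per-server data matters more than modeled averages.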
Intel’s new tool is the Intel Data Center Manager SDK, a power management solution stack that provides power and thermal monitoring and management for individual servers, groups of servers, racks and PDUs in real time. Granular analytical ability in real time with software tools is new, Klaus says, but can be beneficial in managing hot spots and forecasting power usage.
Although each OEM has its own firmware methodology, Klaus adds, the SDK can be used on hardware from any of them. By translating all the “languages” into one, it allows customers to be multi-sourced, which he says is an important consideration.
“It allows someone to get real-time information on usage from various manufacturers that shows specific peak usage,” he says.
Customers such as BMW and Pixar like its ability to go to the device level, not just rack level, he indicates.
“It allows them to increase rack density. They can use fewer servers, which saves on energy used to cool and gives them room to expand as their business grows.”
Another benefit is power capping.
“You can limit the amount of power used by a server, a row, a rack of servers or an entire data center,” explains Klaus.
Limiting power allows a company to allocate its budget or could be mandated by law. He references a brownout in China, after which companies were limited to running manufacturing floors from eight to one, and the situation in Japan after the tsunami and nuclear plant disaster in 2011, when the power usage level was reduced by 25%.
“When the server is at idle, it still consumes 50% of peak capacity,” says Klaus. “From 9:30 [a.m.] to 4 [p.m.], Nasdaq is at full capacity, but they employ power capping after 4 p.m.”
Similarly, he says Pixar, which renders film at night, incorporates power margining.
“They put the cap just below peak or a guard rail just above peak to save power.”
In an emergency, the backup supply lasts longer when power capping is used.
“Jobs take longer, but the power supply can run longer because you’re not using as much,” he says.
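The trade-off Klaus describes, longer jobs in exchange for longer backup runtime, follows from dividing stored energy by the load. The sketch below is a simplified illustration with hypothetical numbers; it treats usable stored energy as constant regardless of discharge rate, which real batteries do not strictly obey, and it does not represent Intel's software or API.

```python
def capped_load_kw(uncapped_load_kw: float, cap_kw: float) -> float:
    """Load actually drawn once a power cap is enforced."""
    return min(uncapped_load_kw, cap_kw)


def backup_runtime_minutes(usable_energy_kwh: float, load_kw: float) -> float:
    """Minutes of runtime from a fixed energy store at a constant load."""
    if load_kw <= 0:
        raise ValueError("load must be positive")
    return usable_energy_kwh / load_kw * 60.0


# Hypothetical values: 100 kWh of usable backup energy, 400 kW uncapped load.
stored_kwh = 100.0
uncapped_kw = 400.0
capped_kw = capped_load_kw(uncapped_kw, cap_kw=300.0)

print(f"uncapped: {backup_runtime_minutes(stored_kwh, uncapped_kw):.0f} minutes")
print(f"capped:   {backup_runtime_minutes(stored_kwh, capped_kw):.0f} minutes")
```

In this example a 25% cut in load buys a third more runtime, time that can be used to start generators or shut down gracefully.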
The real innovation is the monitoring of inlet temperature data.
“Some operators find that data centers are cooler than necessary,” explains Klaus. “They can safely turn the thermostat up and save 4 to 6% on air conditioning.”
Due to new Energy Star requirements in July 2012 that will require data centers to meter IT devices, Klaus says the EPA is interested in Intel’s power management system. It provides continuous power readings and the new platform release this quarter will feature additional capabilities.
“Getting data through hard-wired metering—UPS with metering or hardware meter—is expensive, and it’s hard to upgrade, but with Intel’s perpetual licensing, maintenance updates are free,” he says.
Box-level readings help determine if there’s a need to upgrade at all.
“Maybe they could consolidate due to less power consumption,” suggests Klaus. “What power does a workload take? This gives a data center the ability to be more efficient.”
By aggregating efficiencies (adding five to seven servers to the rack versus adding a new rack, perhaps) a data center can save 10–15% on power usage and 4–6% on cooling usage.
Because each application is different, specific return on investment varies, but Klaus insists that the “paybacks are attractive. You don’t have to be a big data center to benefit.”
For Australian companies, it goes beyond savings. Klaus refers to a new carbon footprint law down under that will tax companies in 2012 for their carbon footprint. Many companies don’t know how to calculate, let alone reduce, their carbon footprint or energy use, Klaus says, so they turn to Intel.
“Our goal is to impact the grid.”
Author's bio: Lori Lovely, winner of a Society of Professional Journalists award for non-deadline news in 2011, writes authoritatively on transportation and technical subjects.