Data centers face mounting pressure to become more sustainable, and heat management through liquid cooling is a key part of the answer.
Cooling is one of the most important levers. But which cooling methods exist at all? And which of them can be both effective and sustainable?
The Problem: Efficient cooling of data centers is a major challenge for the IT industry. While compute, storage, and networking technologies have advanced rapidly, cooling technology has evolved only slowly, and that gap drives up energy costs.
Two Reasons for Change:
Recently, two factors have pushed the industry toward new approaches to data center cooling.
The first factor is the need to build more sustainable data centers and improve the energy efficiency of existing ones. The industry’s key players have committed to the Climate Neutral Data Centre Pact (CNDCP), which aims for data centers to be climate-neutral by 2030.
To achieve this goal, more efficient cooling methods are needed.
The second factor driving innovation in cooling technologies is the increasing density of data centers. High-performance systems, especially those used for artificial intelligence and advanced analytics, pack far more heat into each rack and demand correspondingly more cooling capacity.
However, there is often a fundamental problem: many data centers operate at much lower temperatures than necessary, which increases energy consumption and costs. Each new generation of microprocessors is more powerful and tolerates higher operating temperatures. Modern data centers no longer need to run as cold as in the past, and employees should not need to wear winter clothing on the floor. Data center operators should regularly check the temperatures their equipment actually requires to run reliably and efficiently, and adjust their cooling policies accordingly.
The more uniform the equipment in a data center, the easier this is. Colocation providers, who host a wide variety of devices, must cap the facility-wide temperature at what the least temperature-tolerant device can handle, unless the data center is divided into separate cooling zones.
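As a hedged sketch of that cap, here is a minimal Python example; the zone names, device names, temperature limits, and safety margin are all hypothetical:

# Hypothetical sketch: derive the highest safe cooling setpoint per zone
# from the least temperature-tolerant device, minus a safety margin.
# Device names and max inlet temperatures (in degrees C) are illustrative only.

zones = {
    "zone_a": {"legacy_storage_array": 27.0, "1u_web_servers": 35.0},
    "zone_b": {"gpu_training_nodes": 30.0, "nvme_flash_shelf": 32.0},
}

SAFETY_MARGIN_C = 2.0  # assumed buffer below the tightest device limit

def zone_setpoint(device_limits: dict[str, float]) -> float:
    """A zone can only run as warm as its most sensitive device allows."""
    return min(device_limits.values()) - SAFETY_MARGIN_C

for zone, devices in zones.items():
    print(f"{zone}: max supply-air setpoint = {zone_setpoint(devices):.1f} C")

The point is simply that one intolerant device drags down the setpoint for its whole zone, which is why dividing a facility into cooling zones pays off.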
Data center demand is ‘heating up’
As a result of rapid and widespread digital transformation, data centers need more on-premises hardware, along with more space and more energy to run it all smoothly, which creates challenges for businesses.
Data center operations are already a sustainability concern: the IEA reports that global electricity demand grew 2.2 percent in 2023 and warns that data centers’ electricity consumption could double by 2026.
Liquid cooling isn’t just for high-performance computing (HPC) and artificial intelligence (AI).
Currently, liquid cooling is still a niche technology, mainly seen as beneficial for HPC (64.4 percent), dense server configurations (60.6 percent), and to a lesser extent, AI workloads (46.2 percent).
Traditionally, liquid cooling has been used in densely packed supercomputing cabinets from companies like Eviden, HPE Cray, and Lenovo. These systems are complex and rely on large coolant distribution units, chillers, and facility water systems. In contrast, most AI systems have been air-cooled until recently.
However, this trend may change soon. At GTC, Nvidia showed that while its HGX B100 and HGX B200 systems will still be available in air-cooled versions, its most powerful accelerators, such as the 2,700-watt Grace-Blackwell Superchip, will require liquid cooling.
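A back-of-the-envelope sketch shows why a 2,700-watt package strains air cooling. Using Q = m·cp·dT with an assumed 10 K coolant temperature rise (the fluid properties are textbook values; the temperature rise is an assumption, not a vendor specification), water needs orders of magnitude less volume flow than air to carry the same heat:

# Back-of-the-envelope comparison: mass flow needed to carry away 2,700 W
# with air versus water at an assumed 10 K coolant temperature rise.
# Q = m_dot * c_p * dT  =>  m_dot = Q / (c_p * dT)

Q_WATTS = 2700.0   # heat load cited for the Grace-Blackwell Superchip
DELTA_T = 10.0     # assumed allowable coolant temperature rise (K)

CP_AIR = 1005.0    # specific heat of air, J/(kg*K)
CP_WATER = 4186.0  # specific heat of water, J/(kg*K)
RHO_AIR = 1.2      # air density, kg/m^3
RHO_WATER = 998.0  # water density, kg/m^3

for name, cp, rho in [("air", CP_AIR, RHO_AIR), ("water", CP_WATER, RHO_WATER)]:
    m_dot = Q_WATTS / (cp * DELTA_T)  # required mass flow, kg/s
    vol_flow = m_dot / rho            # volume flow, m^3/s
    print(f"{name}: {m_dot:.3f} kg/s = {vol_flow * 1000:.2f} L/s")

Moving roughly 220 liters of air per second through a single package is impractical, while well under 0.1 liters of water per second through a cold plate is routine.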
Despite the excitement around AI, adding AI capabilities was not the top priority for enterprises. About 58 percent of respondents said improving facility security was their biggest concern, followed by reducing energy consumption, increasing the utilization of existing hardware, and acquiring higher performance systems, all ranking above AI capabilities at 27 percent.
Liquid cooling is not limited to AI and HPC systems. It is also much more efficient at removing waste heat than air cooling. As previously discussed, 15-20 percent of power consumption can be attributed to the fans used to move air through these systems. Switching to liquid cooling can largely eliminate the need for high RPM fans, significantly reducing power consumption.
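As a rough, hedged illustration of what that 15-20 percent figure can mean in practice (the rack power and the residual pump overhead below are invented assumptions):

# Illustrative arithmetic for the 15-20 percent fan-power figure cited above.
# The rack power and the residual pump/low-RPM-fan overhead are assumed values.

RACK_POWER_KW = 40.0      # hypothetical dense rack
FAN_SHARE = (0.15, 0.20)  # share of power spent on air movers (from the text)
RESIDUAL_SHARE = 0.03     # assumed remaining pump and low-RPM fan overhead

for share in FAN_SHARE:
    saved_kw = RACK_POWER_KW * (share - RESIDUAL_SHARE)
    annual_kwh = saved_kw * 24 * 365
    print(f"fan share {share:.0%}: ~{saved_kw:.1f} kW saved, "
          f"~{annual_kwh:,.0f} kWh/year per rack")

Even with a conservative allowance for residual pump power, the per-rack savings compound quickly across a whole facility.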
With the opportunistic boost algorithms found on most modern processors, liquid-cooled systems should theoretically sustain higher clock speeds than their air-cooled counterparts, because the extra thermal headroom lets the silicon stay in its boost states for longer.
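A toy model makes the mechanism concrete; the frequencies, throttle threshold, and taper below are invented for illustration, and real vendor boost algorithms are far more involved:

# Toy model of opportunistic boost: sustain higher clocks while die
# temperature stays below a throttle threshold. All numbers are invented.

BASE_GHZ, BOOST_GHZ = 2.0, 3.5
THROTTLE_C = 85.0

def boost_clock(die_temp_c: float) -> float:
    """Hold full boost below the throttle point, taper linearly above it."""
    if die_temp_c < THROTTLE_C:
        return BOOST_GHZ
    # taper back toward the base clock as temperature overshoots
    overshoot = min((die_temp_c - THROTTLE_C) / 15.0, 1.0)
    return BOOST_GHZ - (BOOST_GHZ - BASE_GHZ) * overshoot

# Liquid cooling keeps the die cooler, so it stays on the boost plateau longer.
for temp in (70.0, 85.0, 92.0, 100.0):
    print(f"{temp:.0f} C -> {boost_clock(temp):.2f} GHz")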
For data center operators, however, the main concerns are maintenance, complexity, and the initial cost of implementation. Liquid-cooled systems require additional infrastructure such as facility water, CDUs (coolant distribution units), and rack manifolds to distribute the coolant to individual systems.
Existing data centers can be retrofitted to support liquid cooling using in-aisle coolant reservoirs and liquid-to-air CDUs. However, these methods generally have lower cooling capacity.
Besides cost and complexity, 48.6 percent of respondents say they lack experience with the technology, and 41 percent express concerns about leaks and spills. Additionally, 21.4 percent mention the cost of buying and replacing coolant as a challenge.
Although liquid cooling adds complexity and potential points of failure, the technology is not new. It has been used for decades in supercomputers, HPC clusters, render farms, and more recently in large-scale GPU and accelerator farms for AI training.
The increased interest in liquid cooling has led to preventative measures like negative pressure coolant loops designed to minimize spills in case of a leak, and in-rack CDUs with redundant or modular pumps.
The change is happening faster than many thought, with liquid cooling becoming more common in data centers.
Sixteen percent of data center operators stated they plan to fully implement direct-to-chip (DTC) liquid cooling by 2026. This solution replaces heat sinks with cooling plates through which warm or chilled water or coolants are pumped. In comparison, 6.5 percent said they plan to adopt 100 percent immersion cooling. This technology involves submerging the entire system in either single-phase fluids like synthetic oils or two-phase fluids designed to boil at or near the chips’ operating temperature.
About one-sixth of respondents said they plan to use a mix of DTC and immersion cooling in their data centers, while 61.7 percent said they have no plans to use either technology in the next two years.
Unsurprisingly, the largest enterprises expect to adopt liquid cooling the fastest.