NVidia proposes new energy metrics for datacenters

Sheila Zabeu

May 28, 2024

Datacenters need a new energy-efficiency metric that reflects the useful work done by real-world applications. That is the argument made by NVidia, the leading supplier of processors for current Artificial Intelligence systems in datacenters, on its blog.

The most widely used formula today, known by the acronym PUE (Power Usage Effectiveness), measures how effectively a facility uses energy. ‘For the past 17 years, PUE has sought to bring operators closer to an ideal in which almost no energy is wasted on processes such as conversion and cooling,’ says NVidia, commenting that this metric has served the demands of datacenters well during the rise of cloud computing and will continue to be useful.

The PUE metric was developed by The Green Grid and has been widely used in the datacenter industry since its publication in 2007. In 2016 the consortium published a complementary metric, the Performance Indicator, which focuses more on cooling systems.
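PUE is defined as total facility energy divided by the energy delivered to the IT equipment, so a value of 1.0 would mean no overhead at all. A minimal sketch of the calculation (the figures below are illustrative, not from the article):

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power Usage Effectiveness: total facility energy / IT energy.

    1.0 is the theoretical ideal, meaning no energy is spent on
    cooling, power conversion, lighting and so on; real facilities
    report values above 1.
    """
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh

# Illustrative: a facility drawing 1,500 kWh overall while its
# IT equipment consumes 1,200 kWh has a PUE of 1.25.
print(pue(1500, 1200))  # 1.25
```

Note that, as NVidia's criticism implies, nothing in this ratio says what the IT equipment actually accomplished with its share of the energy.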

For NVidia, PUE has proved insufficient in the current era of Generative AI, because it doesn’t measure the useful output of datacenters, only the energy consumed. ‘This would be equivalent to measuring the volume of petrol an engine consumes without taking into account the distance travelled by the car,’ explains the blog. NVidia cites a 2017 document that lists almost three dozen standards with specific objectives associated with datacenter performance, such as cooling, water use, security and costs.

Metrics expressed in watts capture only the input power to a given environment, not the work the systems actually deliver with that energy. NVidia points out that modern datacenters do in fact report higher input power in watts, but this does not necessarily mean that they are less energy efficient.

NVidia suggests that metrics for modern datacenters be expressed in units of energy, such as the kilowatt-hour (kWh) or the joule, which measure work done rather than instantaneous power.
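The distinction the article draws is between power (watts, an instantaneous rate) and energy (joules or kWh, power integrated over time), and between energy consumed and useful output delivered. A hedged sketch with made-up numbers, where "work" is an arbitrary illustrative unit such as inference requests served:

```python
def energy_kwh(power_watts: float, hours: float) -> float:
    """Energy is power integrated over time: 1 kWh = 1,000 W for 1 hour."""
    return power_watts * hours / 1000.0

# Hypothetical servers: the newer one draws more power (watts),
# but delivers far more useful work per unit of energy (kWh).
old_server = {"watts": 300, "work_per_hour": 1_000}
new_server = {"watts": 500, "work_per_hour": 4_000}

for name, s in [("old", old_server), ("new", new_server)]:
    kwh = energy_kwh(s["watts"], hours=1)
    print(f"{name}: {s['work_per_hour'] / kwh:.0f} units of work per kWh")
```

On these illustrative figures the old server delivers about 3,333 units per kWh and the new one 8,000, which is the article's point: a higher wattage reading alone says nothing about efficiency.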

What datacenters use as a metric

Datacenters that work with AI systems often use MLPerf benchmarking tools. Supercomputing centres working on scientific research usually use other measures. And with the growing set of applications, more MLPerf tests are emerging using two new Generative AI models.

According to Christian Belady, the engineer who came up with the original idea for PUE, we need to focus on other metrics that are more relevant to today’s datacenters. ‘The Holy Grail would be performance metrics. You can’t compare different workloads directly, but if we segment by types of workloads, I think we’ll have a better chance of success,’ says Belady, who continues to work on initiatives to boost the sustainability of datacenters.

Another researcher agrees with this view. ‘To make good decisions in terms of efficiency, datacenter operators need benchmarking tools that measure the energy implications of the most commonly used AI workloads today,’ says Jonathan Koomey, from the area of computational efficiency and sustainability.

Koomey explains that companies will need to participate in open discussions, share information about their own workloads and carry out realistic tests to ensure that the metrics accurately characterise the energy consumption of hardware running real-world applications. ‘We need an open public forum to accomplish this important task,’ comments the researcher.

More than energy efficiency

Last month, the Uptime Institute announced a service for assessing the sustainability of digital infrastructures. With this tool, it is possible to evaluate and compare the sustainable characteristics of datacenters. The data obtained from the assessment can be used to promote continuous improvements in 14 main categories and more than 50 subcategories of sustainability. The main areas include energy and water use, carbon emissions and waste generation.

As it takes into account local and regional requirements, the assessment can be applied worldwide, covering a single site or a distributed hybrid set.

Recent and recurring surveys by Uptime Intelligence suggest that many operators of datacenters and IT environments are still at an early stage in their sustainability journeys. According to the latest report, ‘Sustainability strategies face greater pressure in 2024’, less than half of digital infrastructure operators are compiling and reporting on water use (41 per cent) and only 23 per cent do the same for the three scopes (1, 2 and 3) of carbon emissions.