Since the first paper studying the environmental impact of this technology was published three years ago, a movement has grown among researchers to self-report the energy consumed and the emissions generated by their work. Having accurate numbers is an important step toward making changes, but actually collecting those numbers can be challenging.
“You can’t improve what you can’t measure,” says Jesse Dodge, a researcher at the Allen Institute for AI in Seattle. “The first step for us if we want to make progress in reducing emissions is to get a good measurement.”
To that end, the Allen Institute recently collaborated with Microsoft, the artificial intelligence company Hugging Face, and three universities to create a tool that measures the electricity consumption of any machine learning program that runs on Azure, Microsoft’s cloud service. With it, Azure users building new models can see the total electricity consumed by graphics processing units (GPUs), computer chips specialized to perform calculations in parallel, during every phase of their project, from selecting a model to training it and putting it to use. Microsoft is the first major cloud provider to give users access to information about the energy impact of their machine learning programs.
Tools that measure the energy consumption and emissions of machine learning algorithms running on local servers already exist, but they do not work when researchers use cloud services provided by companies such as Microsoft, Amazon, and Google. These services don’t give users direct visibility into the GPU, CPU, and memory resources their activities are consuming, and existing tools like Carbontracker, Experiment Tracker, EnergyVis, and CodeCarbon need those values to provide accurate estimates.
The new Azure tool, which debuted in October, currently reports energy consumption, not emissions. So Dodge and other researchers figured out how to map energy use to emissions, and they presented a companion paper on that work at FAccT, a major computer science conference, in late June. The researchers used a service called WattTime to estimate emissions based on the zip codes of cloud servers running 11 machine learning models.
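The idea of mapping energy use to emissions can be illustrated with a minimal sketch. It assumes that energy is logged per time interval in kilowatt-hours and that a carbon-intensity figure (grams of CO2 per kilowatt-hour) is available for each interval at the server's location. The function name and all numbers are hypothetical; this is not the researchers' code and it does not call any real emissions service.

```python
def estimate_emissions_g(energy_kwh, intensity_g_per_kwh):
    """Sum energy x carbon intensity over matching time intervals.

    Both arguments are equal-length sequences, one entry per interval.
    Returns estimated emissions in grams of CO2.
    """
    if len(energy_kwh) != len(intensity_g_per_kwh):
        raise ValueError("energy and intensity series must have the same length")
    return sum(e * c for e, c in zip(energy_kwh, intensity_g_per_kwh))


# Hypothetical values for three one-hour intervals of a training job
energy = [2.0, 2.5, 1.5]            # kWh consumed in each hour
intensity = [400.0, 300.0, 500.0]   # gCO2 per kWh on the local grid in each hour

print(estimate_emissions_g(energy, intensity))
```

The same energy use yields different emissions depending on where and when it occurs, which is why the estimate needs location-specific and time-specific intensity values.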
They found that emissions could be significantly reduced if researchers used servers in specific geographic locations and at specific times of day. Emissions from training small machine learning models can be reduced by up to 80% if training starts at times when more renewable electricity is available on the grid, while emissions from large models can be reduced by over 20% if training is paused when renewable electricity is scarce and restarted when it is more abundant.
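The two timing strategies described here can be sketched in a few lines. This is a simplified illustration, not the method from the paper: it assumes an hourly carbon-intensity forecast is available, that the job draws constant power (so each hour of training costs the same energy), and that a job can be paused and resumed at hour boundaries at no cost. The function names and forecast values are hypothetical.

```python
def best_start_hour(intensity, duration):
    """Flexible start: pick the start hour whose contiguous window of
    `duration` hours has the lowest total carbon intensity."""
    if duration <= 0 or duration > len(intensity):
        raise ValueError("duration must be between 1 and the forecast length")
    starts = range(len(intensity) - duration + 1)
    return min(starts, key=lambda s: sum(intensity[s:s + duration]))


def pause_and_resume_hours(intensity, duration):
    """Pause and resume: run only in the `duration` lowest-intensity hours,
    returned in chronological order."""
    if duration <= 0 or duration > len(intensity):
        raise ValueError("duration must be between 1 and the forecast length")
    ranked = sorted(range(len(intensity)), key=lambda h: intensity[h])
    return sorted(ranked[:duration])


# Hypothetical 8-hour forecast in gCO2 per kWh
forecast = [500, 450, 200, 180, 220, 480, 150, 520]

print(best_start_hour(forecast, 3))         # start of the cleanest 3-hour block
print(pause_and_resume_hours(forecast, 3))  # the 3 cleanest hours, possibly non-adjacent
```

A short job fits entirely inside one clean window, so shifting its start time helps most. A long job spans many clean and dirty periods regardless of when it starts, so pausing during the dirty ones is the more useful option.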