Track the CO2 emissions of your Python code the same way you time it. Here is how!


Nothing exists until it is measured

Niels Bohr

Just think of the following absurdity: we live in a digital era where almost everything can be measured and tracked, yet we still struggle to reliably measure the carbon footprint of AI computing. Needless to say, this becomes increasingly important as the amount of computation keeps rising.

In the AI community there is already an ongoing effort to encourage responsible research and to start measuring its environmental impact. For instance, NeurIPS, one of the most eminent conferences, encourages researchers to report the CO2 emissions of their work. The Green AI initiative calls for efficiency measures in order to foster innovation in AI without skyrocketing computational costs.

To make this possible, there are at least a couple of existing open-source solutions to track the CO2 emissions of AI computing, even though they are not yet on par with the pace of developments in AI. One of these initiatives is Code Carbon. It is built on the same premise as the quote from Niels Bohr above: the CO2 emissions of AI computing remain hidden until we uncover them by measuring.

In this blog post we will take a look at the CodeCarbon Python library and its importance in the mission to track the carbon footprint of AI. Finally, we will experiment a bit and demonstrate how to track the CO2 emissions of training a toy neural network in Keras on the IMDb sentiment analysis dataset.

What is CodeCarbon?

Code Carbon is an initiative with the aim of finally starting to track and report the CO2 emissions of AI computing. It is a lightweight open-source Python library that lets you track the CO2 emissions produced by running your code.

To achieve this, it executes the following two tasks:

  1. Tracks the electricity consumption of the machine on which the code is executed. This is measured in kilowatt-hours (kWh).
  2. Estimates the CO2 emissions per kWh of the electricity in the same geolocation where the machine resides.

The first task is less prone to errors, as the environment is predictable. CodeCarbon measures the energy consumption of the CPU, the GPU (if available) and the RAM by taking samples every 15 seconds by default.
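If the default 15-second interval is too coarse or too fine for your use case, it can be adjusted when the tracker is constructed. Here is a minimal sketch, assuming the measure_power_secs parameter of the current CodeCarbon API (check the documentation for your version):

from codecarbon import EmissionsTracker

# Sample the hardware power draw every 5 seconds instead of the default 15
tracker = EmissionsTracker(measure_power_secs=5)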

There is a multitude of tools to precisely measure the energy consumption of CPUs; this excellent blog goes over many of them. Currently, CodeCarbon uses either Intel Power Gadget or Intel RAPL. If neither of these energy profilers is available, it falls back to a handcrafted technique: using the CPU load to estimate the CPU power.

For GPUs it uses the well-established PyNvml Python library. To track the energy consumption of the RAM it relies only on handcrafted rules.
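As a rough illustration of what the GPU side of such a measurement looks like (a sketch using PyNvml directly, not CodeCarbon's actual internals), the instantaneous power draw of a GPU can be read like this:

import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)                 # first GPU on the machine
power_watts = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000   # the API reports milliwatts
print(f"GPU power draw: {power_watts:.1f} W")
pynvml.nvmlShutdown()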

The second task, estimating the CO2 emissions of the electricity, is far trickier. To do so, CodeCarbon calculates the carbon intensity of the electricity: a weighted average of the emissions of the energy sources in the current grid. Ideally this should be computed dynamically, but even a static estimate is already a good approximation.

To compute the carbon intensity, the library relies on the CO2 Signal API, which reports the energy sources in the region where the computation is taking place. For cloud-based computing this is even more accurate, because both the geolocation of the data center and the energy sources it uses are precisely known.
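To make the weighted average concrete, here is a minimal sketch with made-up emission factors and an illustrative energy mix (the real factors and the real mix come from CodeCarbon's data and the CO2 Signal API):

# Hypothetical emission factors in kgCO2eq per kWh (illustrative values only)
emission_factors = {"coal": 0.995, "gas": 0.469, "wind": 0.011, "solar": 0.041}

# Hypothetical share of each source in the local grid
energy_mix = {"coal": 0.30, "gas": 0.40, "wind": 0.20, "solar": 0.10}

carbon_intensity = sum(emission_factors[s] * share for s, share in energy_mix.items())
print(f"Carbon intensity: {carbon_intensity:.3f} kgCO2eq/kWh")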

Finally, the total carbon footprint is calculated as a simple multiplication of the hardware energy consumption and the carbon intensity of the electricity.
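In other words, if a training run consumed, say, 2 kWh on a grid with a carbon intensity of 0.4 kgCO2eq/kWh, the estimate is just:

energy_consumed_kwh = 2.0   # measured hardware energy consumption
carbon_intensity = 0.4      # kgCO2eq per kWh of the local grid
emissions_kg = energy_consumed_kwh * carbon_intensity
print(f"Estimated emissions: {emissions_kg:.2f} kgCO2eq")  # 0.80 kgCO2eq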

How can we use CodeCarbon?

Now we are ready to demonstrate how we can track the CO2 emissions of a toy neural network training process. Let’s dive in.

As a first step we load the IMDb sentiment analysis dataset. It contains two classes, positive and negative sentiment, meaning this is a straightforward binary classification task. Loading it is fairly easy, as it is one of the Keras built-in datasets:

from keras.datasets import imdb
from keras.utils import pad_sequences

max_features = 50000 # vocabulary size
maxlen = 512 # The length of every input sequence

(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
x_train = pad_sequences(x_train, maxlen=maxlen)
x_test = pad_sequences(x_test, maxlen=maxlen)

As a second step we build a simple neural network, typical for a text classification task. It consists of an embedding layer followed by a 1D convolution and a bi-directional LSTM layer, finishing with a single neuron that predicts the sentiment.

from keras.layers import Dense, Dropout, Activation
from keras.layers import Embedding
from keras.layers import LSTM, Bidirectional
from keras.layers import Conv1D, MaxPooling1D
from keras.metrics import Precision, Recall
from keras.models import Sequential

def make_model():
    model = Sequential([
        Embedding(50000, 256, input_length=512),  # vocabulary size, embedding dimension
        Dropout(0.1),
        Conv1D(128, 5, padding='valid', activation='relu'),
        MaxPooling1D(pool_size=4),
        Bidirectional(LSTM(64), merge_mode='ave'),
        Dense(1),
        Activation('sigmoid'),  # single neuron predicting the sentiment
    ])

    model.compile(
        optimizer='adam',
        loss='binary_crossentropy',
        metrics=[
            Precision(name='precision'),
            Recall(name='recall'),
        ]
    )

    return model

Finally, we define the training procedure, specifying the batch size and the number of epochs.

def train_model(model):
    model.fit(
        x_train,
        y_train,
        batch_size=32,
        epochs=50,
        verbose=1,
    )

    test_metrics = model.evaluate(
        x=x_test,
        y=y_test,
        batch_size=32,
    )

    print(f"test loss: {test_metrics[0]}")
    print(f"test precision: {test_metrics[1]}")
    print(f"test recall: {test_metrics[2]}")

And now it is time to train this toy neural network and track the CO2 emissions. With CodeCarbon this is as simple as measuring the elapsed training time. All we have to do is instantiate an EmissionsTracker object and sandwich the training procedure between its start and stop methods. CodeCarbon will take care of the rest, as shown below:

from codecarbon import EmissionsTracker

tracker = EmissionsTracker(project_name="imdb_sentiment_classification")
tracker.start()
train_model(make_model())
tracker.stop()
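If you prefer not to manage the tracker object yourself, the library also offers a decorator-based interface. Here is a minimal sketch (the project_name argument mirrors the one above; the exact signature may vary between versions):

from codecarbon import track_emissions

@track_emissions(project_name="imdb_sentiment_classification")
def run_training():
    train_model(make_model())

run_training()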

Indeed, CodeCarbon tracked and logged many aspects of the training process. The summary of every run is saved as a row in a file named emissions.csv by default. In my opinion, the main thing still missing is more precise techniques for tracking the CPU energy consumption.
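For a quick look at the results, that file can be loaded like any other CSV. The column names below (emissions, energy_consumed, duration) reflect the file produced by the version I used and may differ in yours:

import pandas as pd

runs = pd.read_csv("emissions.csv")
# 'emissions' is in kgCO2eq, 'energy_consumed' in kWh, 'duration' in seconds
print(runs[["project_name", "duration", "energy_consumed", "emissions"]].tail(1))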

The library also comes with a command-line tool named carbonboard that produces a dashboard showing everyday equivalents of the carbon emissions produced by the experiment. An example for the experiment we did above is shown below:

Fig. 1: Dashboard showing CO2 equivalents
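The dashboard above is generated from the emissions file; with the version I used, the invocation looked roughly like this (flag names may change between releases):

carbonboard --filepath="emissions.csv" --port=3333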


The source code for the implementation can be found on GitHub. If this is something you like and would like to see similar content, you can follow me on LinkedIn or Twitter. Additionally, you can subscribe to the mailing list below to get similar updates from time to time.


Appendix: other initiatives to track the CO2 emissions

CodeCarbon is not the only effort to help track CO2 emissions. There are at least two others with the same goal:

  • Experiment Impact Tracker: similar to CodeCarbon, a simple drop-in tool to track the energy usage, carbon emissions, and compute utilization of the underlying system.

  • ML CO2 Impact Calculator: a simple web interface that lets you calculate the CO2 emissions yourself.
