Authors

Raghavendra Selvan, Nikhil Bhagwat, Lasse F. Wolff Anthony, Benjamin Kanding, Erik B. Dam

Abstract

The increasing energy consumption and carbon footprint of deep learning (DL) due to growing compute requirements have become a cause of concern. In this work, we focus on the carbon footprint of developing DL models for medical image analysis (MIA), where volumetric images of high spatial resolution are handled. We present and compare the features of four tools from the literature to quantify the carbon footprint of DL. Using one of these tools, we estimate the carbon footprint of a medical image segmentation pipeline. We choose nnU-net as the proxy for a medical image segmentation pipeline and experiment on three common datasets. With our work we hope to inform on the increasing energy costs incurred by MIA. We discuss simple strategies to cut down the environmental impact that can make model selection and training processes more efficient.

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16443-9_49

SharedIt: https://rdcu.be/cVRy3

Link to the code repository

N/A

Link to the dataset(s)

N/A

Reviews

Review #1

Please describe the contribution of the paper

The paper raises awareness and provides informative results about the carbon footprint of deep learning in medical image analysis.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

Timely and important topic. Presenting the carbon footprint of DL models in medical image analysis in terms of distance travelled by car is an excellent way to convey the results.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

No major weaknesses. The empirical part is limited in scope, but sufficient to make an important contribution.
Please rate the clarity and organization of this paper

Excellent
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Can be reproduced, but probably shouldn’t in order to reduce CO2 emissions.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

One way, as a community, to reduce CO2 emissions would be to avoid unnecessary, repeated experiments. Many works present similar baseline results (e.g., running the same U-net model on the same data), over and over again. The paper discusses this briefly as part of the idea of open science. I think this point is important and could be made stronger and highlighted a bit more.

Maybe a way forward would be to construct a library of trained models that can be shared and re-used for comparative analyses, avoiding many training runs of similar models and thus reducing CO2 emissions. But this would need to go hand in hand with recommendations for publications, reviewing, etc. I would suspect that many reviewers are asking for (sometimes unnecessary) comparisons which require the authors to run many more model trainings than needed. We should be more careful, as a community, about asking for ablation studies, etc. Any additional experiment should come with a clear justification and trade-off analysis of added (scientific) value over the increased carbon footprint. The same applies to cross-validation, etc.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

7
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper makes a strong point that should be of interest to the whole MICCAI community and beyond.
Number of papers in your stack

5
What is the ranking of this paper in your review stack?

1
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #2

Please describe the contribution of the paper

The authors propose clearly defined guidelines for reducing carbon emissions during the development of machine learning models. They use a well-known segmentation framework and well-known datasets in the medical image analysis (MIA) community to estimate the energy consumption used by the community during training.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The paper originally addresses an extremely important concern for the community and the general global population, from the point of view of the medical image analysis (MIA) community. This is very interesting because it is not a topic frequently discussed within the medical community, despite its relevance. It presents and compares multiple methods that can be used to measure energy consumption. The manuscript is well written and structured, and experiments correctly corroborate the authors’ hypotheses, showing that training using large 3D/4D images means a larger energy consumption.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The methodology itself is not novel, as different existing tools to measure carbon emissions are used. However, I appreciate that the point of the paper is to make the community aware of this issue and recommend guidelines.
Please rate the clarity and organization of this paper

Excellent
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
1. For all models and algorithms, check if you include A clear declaration of what software framework and version you used. [Yes]
Software versions are not reported.
1. For all code related to this work that you have made available or will release if this work is accepted, check if you include: Specification of dependencies. [Yes] Training code. [Yes] Evaluation code. [Yes] (Pre-)trained model(s). [Yes] Dataset or link to the dataset needed to run the code. [Yes] README file including a table of results accompanied by precise command to run to produce those results. [Yes]
I have not seen any references to shared code or data in the manuscript.
1. For all reported experimental results, check if you include:
The average runtime for each result, or estimated energy cost. [Yes] The training of models in this work is estimated to use 39.948 kWh of electricity contributing to 11.426 kg of CO2eq. This is equivalent to 94.898 km travelled by car.

Excellent! I hope reporting this will soon be a trend.
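
The three figures quoted in the energy statement above can be cross-checked with simple arithmetic. The following is a minimal sketch, not the authors' tool: it only derives the conversion factors implied by the quoted numbers (grams of CO2eq per kWh, and grams of CO2eq per km driven) and reuses them for a hypothetical run. The function names and the 10 kWh example are assumptions made for illustration.

```python
# Figures quoted in the review from the paper's energy statement.
ENERGY_KWH = 39.948          # electricity used for training
EMISSIONS_KG_CO2EQ = 11.426  # resulting emissions
DISTANCE_KM = 94.898         # equivalent distance travelled by car


def implied_carbon_intensity(energy_kwh: float, emissions_kg: float) -> float:
    """Carbon intensity (g CO2eq per kWh) implied by an energy/emissions pair."""
    return 1000.0 * emissions_kg / energy_kwh


def implied_car_factor(emissions_kg: float, distance_km: float) -> float:
    """Car emission factor (g CO2eq per km) implied by an emissions/distance pair."""
    return 1000.0 * emissions_kg / distance_km


def footprint(energy_kwh: float, intensity_g_per_kwh: float, car_g_per_km: float):
    """Convert energy to (kg CO2eq, equivalent km by car) using the given factors."""
    emissions_kg = energy_kwh * intensity_g_per_kwh / 1000.0
    distance_km = 1000.0 * emissions_kg / car_g_per_km
    return emissions_kg, distance_km


if __name__ == "__main__":
    intensity = implied_carbon_intensity(ENERGY_KWH, EMISSIONS_KG_CO2EQ)
    car_factor = implied_car_factor(EMISSIONS_KG_CO2EQ, DISTANCE_KM)
    print(f"Implied carbon intensity: {intensity:.1f} g CO2eq/kWh")
    print(f"Implied car factor: {car_factor:.1f} g CO2eq/km")

    # Hypothetical 10 kWh training run, reusing the implied factors.
    kg, km = footprint(10.0, intensity, car_factor)
    print(f"10 kWh -> {kg:.2f} kg CO2eq, about {km:.1f} km by car")
```

The implied factors depend on the region and time of the original measurement, so they should be treated as specific to this statement rather than as general constants.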
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
Please double-check for grammar, e.g., “an year”. Please replace “f.x” with “e.g.,”. In Table 1, abbreviations are probably not necessary.
Fig. 3:
- It says “total […] energy […] over the five-fold cross validation”, but it seems to me that it is the mean consumption, not the total.
- On the right graph, please modify the bars so they are separated instead of stacked (i.e., three groups of three bars, where each group has a bar for each region). The current visualization suggests the consumptions add up and each region is responsible for a certain percentage.
- Some labels are barely visible when printing the manuscript in black and white.
The paper claims AMP made a large difference when training with brain images, but this is not clear in Table 2. I think either the table or the text should be corrected. In Section 6, it is not clear where 2.89 comes from (the number used to calculate the carbon emissions from MICCAI papers).
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

7
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper is very original and relevant for the community. It is very well written and structured. Results are reported clearly. I am glad that teams within the community are looking into this.
Number of papers in your stack

6
What is the ranking of this paper in your review stack?

1
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #3

Please describe the contribution of the paper

This paper presents the carbon footprint of selecting and training deep learning models for medical image analysis, which seems to work from the test results.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The experiment results are provided.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

1. The main contributions of this paper must be further summarized and clearly demonstrated. This reviewer cannot distinguish the new findings of this paper from the existing methods/approaches in the literature. This reviewer suggests the authors state exactly what is new compared with existing approaches and why the proposed approach needs to be used instead of the existing methods.
2. The theoretical depth of this paper must be strengthened. The principle of the proposed approach is not clearly explained, and there is no equation throughout the manuscript.
3. There is no comparison with other state-of-the-art methods in the literature. The effectiveness and superiority of the presented method in this paper should be verified through such comparisons.
4. This reviewer would like to suggest the authors add a flowchart of the presented method and the corresponding description to enable readers to have a better grasp of the approach as a whole.
5. The novelty and contribution of the present work need further justification. Authors need to add more results with more discussions to thoroughly support the main findings.
6. The practicality of the approach should be further discussed.
Please rate the clarity and organization of this paper

Poor
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The reproducibility of the paper is not good.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

The presented approach seems to work from the test results. However, the contributions of the paper are not clear and not sufficiently significant to be published. In addition, the overall technical quality of this paper is below average as this work is not well presented.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

1
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The contributions of the paper are too limited. In addition, the overall technical quality of this paper is poor.
Number of papers in your stack

4
What is the ranking of this paper in your review stack?

4
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.
Summary & Contribution: This paper informs about the carbon footprint of machine learning in medical image analysis and proposes clearly defined guidelines for reducing carbon emissions. Reducing carbon emissions during the development of deep learning based methods is important to increase awareness of the increasing energy and carbon costs of developing machine learning models for medical imaging. The authors use a widely used segmentation framework (nnU-net) and three popular datasets to estimate the energy consumption used during training, using four open-source tools available from the literature to quantify the carbon footprint.

The main contribution of this paper is the recommendation of 5 simple steps (THETA-guidelines) to ensure good practices when developing and training machine learning models for medical image analysis.

Key strengths:
- Novel, timely and important topic.
- Original way of presenting the carbon footprint (as distance travelled by car) which is an easy unit to understand.
- The use of open-source tools to measure energy consumption depending on the country is important and relevant.
Key weaknesses:
- Limited empirical analysis
- Limited technical novelty
- Although discussed, there is no clear evidence that the 5 proposed steps have a substantial impact on energy consumption.
Evaluation & Justification: Reviewers agree that the technical novelty of this work is limited, but it is a novel and important topic of interest in the medical imaging community and beyond. The proposed set of recommendations could have a positive impact on society by reducing energy and carbon costs. The only missing information in the paper is an estimated energy cost before and after implementing the 5 proposed steps, which could have been added in the discussion.
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

2

Author Feedback

N/A


Carbon Footprint of Selecting and Training Deep Learning Models for Medical Image Analysis