
Authors

Chong Wang, Yuanhong Chen, Yuyuan Liu, Yu Tian, Fengbei Liu, Davis J. McCarthy, Michael Elliott, Helen Frazer, Gustavo Carneiro

Abstract

State-of-the-art (SOTA) deep learning mammogram classifiers, trained with weakly-labelled images, often rely on global models that produce predictions with limited interpretability, which is a key barrier to their successful translation into clinical practice. On the other hand, prototype-based models improve interpretability by associating predictions with training image prototypes, but they are less accurate than global models and their prototypes tend to have poor diversity. We address these two issues with the proposal of ProtoPNet++, which adds interpretability to a global model by ensembling it with a prototype-based model. ProtoPNet++ distills the knowledge of the global model when training the prototype-based model with the goal of increasing the classification accuracy of the ensemble. Moreover, we propose an approach to increase prototype diversity by guaranteeing that all prototypes are associated with different training images. Experiments on weakly-labelled private and public datasets show that ProtoPNet++ has higher classification accuracy than SOTA global and prototype-based models. Using lesion localisation to assess model interpretability, we show ProtoPNet++ is more effective than other prototype-based models and post-hoc explanation of global models. Finally, we show that the diversity of the prototypes learned by ProtoPNet++ is superior to SOTA prototype-based approaches.

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16437-8_2

SharedIt: https://rdcu.be/cVRsO

Link to the code repository

N/A

Link to the dataset(s)

https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=70230508


Reviews

Review #1

  • Please describe the contribution of the paper

    This work proposes and describes a network that adds interpretability to a global model by ensembling it with a prototype-based model. The proposed approach was tested on the authors' own database and on a publicly available database. Results are similar to state-of-the-art approaches, with the added advantage of interpretable results.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • A new extension of ProtoPNet.
    • Interpretable results from the network.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The accuracy improvement is insignificant.
  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Results presented in the paper are reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    In my opinion this is an excellent work and I do not have any suggestion for improvement.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    8

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    In my opinion this is an excellent work and I do not have any suggestion for improvement. The improvement of ProtoPNet is very welcomed.

  • Number of papers in your stack

    4

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    To integrate the advantages of accurate global models and interpretable prototype-based models, the proposed ProtoPNet++ distills the knowledge from the global model to the prototype-based model. The performance and interpretability of the proposed ProtoPNet++ are validated on private and public datasets.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The proposed method is simple but effective to improve the model performance and interpretability.
    • The paper is well organized and description of the motivation is clear.
    • The experimental results support the validity of the proposed method.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Perhaps limited by the paper length, this paper lacks important ablation studies regarding the hyper-parameters, such as those in Eq. (1)-(3) and the number of prototypes.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    • The authors declare they will share all the code for the experiments once the paper is accepted.

    • The authors also presented most of the relevant parameter settings for the experiments.
  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    Although this paper is well-written and concise, it lacks important ablation studies regarding the hyper-parameters, such as those in Eq. (1)-(3) and the number of prototypes. I understand such experiments might be missing due to the length constraints of the MICCAI conference. However, it would be better if the authors could provide more detailed ablation studies in the supplementary file.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    7

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper is well-written and easy to read. The motivation and method are clearly demonstrated. The experimental results support the method.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper proposed ProtoPNet++, which ensembles a global model with a prototype-based model for mammogram classification tasks (cancer/no cancer). This combination is claimed to be both more accurate than the baselines and to provide more interpretability.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    There are three major contributions of this work: (1) adding interpretability; (2) higher accuracy, benefiting from ensemble learning; and (3) improved performance from the introduction of a diversified training strategy. The research topic is highly interesting, since the interpretation of deep learning models has been a heated topic in recent years.

    The authors implement knowledge distillation from GlobalNet to ProtoPNet by minimizing a KD loss, forcing the networks to produce similar results on the same sample. The training signals, i.e. the loss functions, are well designed, with thoughtful consideration of margins, diversity, and over-fitting prevention.
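    For reference only, and not the paper's actual implementation: a distillation loss of this general kind can be sketched as below. The temperature `T`, the teacher/student naming, and the Hinton-style `T**2` scaling are assumptions for illustration, not details taken from the paper.

    ```python
    import numpy as np

    def softmax(z, T=1.0):
        # Temperature-scaled softmax; higher T gives softer distributions.
        z = np.asarray(z, dtype=float) / T
        z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)

    def kd_loss(student_logits, teacher_logits, T=2.0):
        # KL(teacher || student) on temperature-softened outputs,
        # scaled by T^2 as in the classic distillation formulation.
        p = softmax(teacher_logits, T)   # teacher (here: global model) soft targets
        q = softmax(student_logits, T)   # student (here: prototype-based model)
        return float((T ** 2) * np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12))))
    ```

    With identical logits the loss is zero; it grows as the student's softened posterior diverges from the teacher's, which is the "produce similar results on the same sample" pressure described above.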

    The studied dataset, though private, is relatively well-sized. What interests me most are the activation maps that visualize cancer localisation across the different methods, which allow deeper model interpretation.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    One issue unclear to me is in Section 2.1: “The prototype layer has M learnable class-representative prototypes….with M/2 prototypes for each class”. In this paper, the label space is only binary: with cancer or without cancer. Could the authors clarify this for their specific application?

    Since multiple loss functions are used to update the parameters, it is essential to show the loss curves and evaluate how they change over the training epochs.

    In Figure 4, there seems to be more than one activation for ProtoPNet++ KD, which is less concentrated and accurate than that of the plain ProtoPNet. Is there any explanation for this phenomenon?

    For the experimental sections, the authors did not report any statistical tests regarding the performance, so we cannot tell whether the improvement is significant.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    In the reproducibility checklist, the authors claim to release the training/evaluation codes with pretrained models which is a plus. Since the proposed method is a sophisticated ensemble learning framework with multiple loss functions, source code release could be a great help.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    I would encourage the authors to pay attention to the PDF formatting.

    The authors include several weight parameters (alpha, beta, lambda 1/2); it is also critical to show a hyper-parameter tuning analysis. Loss curves for each term would also be welcome.

    The current project is a binary classification problem. The authors could consider studying a multi-category classification task to see how ProtoPNet++ performs, especially on some public dataset competitions (CheXpert, ROCC, etc.).

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is of high quality in writing and reasoning.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    2

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    All reviewers consider the technical novelty of the approach sufficient and unanimously recommend acceptance. The final version of the paper should address the reviewers’ comments, in particular the training loss issues and more comprehensive ablation experiments.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    1




Author Feedback

We would like to thank the reviewers for their insightful comments on our paper. We will provide the training loss curves for the ProtoPNet and discuss the effect of the number of prototypes on the AUC results in the supplementary file. For R3-5, the reason for having more than one activation in ProtoPNet++ KD is that it tends to have higher activation values (and larger activation regions) than the original ProtoPNet, which may produce some small false-positive predictions. For R3-8, our method is general and can easily be extended to multi-category classification tasks; we will explore this in future work.


