
Authors

Zhixiong Yang, Junwen Pan, Yanzhan Yang, Xiaozhou Shi, Hong-Yu Zhou, Zhicheng Zhang, Cheng Bian

Abstract

Medical image classification has been widely adopted in medical image analysis. However, due to the difficulty of collecting and labeling data in the medical area, medical image datasets are usually highly imbalanced. To address this problem, previous works utilized class samples as a prior for re-weighting or re-sampling, but the feature representation is usually still not discriminative enough. In this paper, we adopt contrastive learning to tackle the long-tailed medical imbalance problem. Specifically, we first propose the category prototype and adversarial proto-instance to generate representative contrastive pairs. Then, a prototype recalibration strategy is proposed to address the highly imbalanced data distribution. Finally, a unified proto-loss is designed to train our framework. The overall framework, named Prototype-aware Contrastive learning (ProCo), is unified as a single-stage, end-to-end pipeline to alleviate the imbalance problem in medical image classification, a distinct advance over existing works, which follow the traditional two-stage pipeline. Extensive experiments on two highly imbalanced medical image classification datasets demonstrate that our method outperforms existing state-of-the-art methods by a large margin.

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16452-1_17

SharedIt: https://rdcu.be/cVRYZ

Link to the code repository

https://github.com/skyz215/ProCo

Link to the dataset(s)

https://challenge.isic-archive.com/data/

https://www.kaggle.com/competitions/aptos2019-blindness-detection/data


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper studies the long-tailed imbalance problem in medical image classification. Specifically, the authors adopt contrastive learning to tackle such an imbalance problem. The category prototype and adversarial proto-instance are used for generating representative contrastive pairs with the prototype recalibration strategy. The authors apply such a learning scheme to highly imbalanced data for medical image classification. Experiments validated their claims on different datasets.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Long-tailed medical image classification is studied in this paper, and the authors provide an investigation of this task.
    2. Contrastive learning and prototype learning are jointly considered in the overall learning framework.
    3. A recalibration idea is introduced to achieve alignment.
    4. Experiments show the effectiveness of the proposed method on different datasets.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The main contributions of this work are built on contrastive learning and prototype learning. For the widely studied image classification task, these have been well tested across different vision-based learning schemes. By contrast, such a learning model has not been extensively studied in the medical field. This is both an advantage and a disadvantage.
    2. The concept of a category prototype has been widely used in few-shot learning, detection, segmentation, and other tasks. It is not a new concept.
    3. Regarding recalibration, the reviewer doubts whether its effect on the whole learning scheme is more than marginal.
    4. The motivation of this work is also less attractive. Technically, the proposed method can be adapted to any situation, rather than being specific to the long-tailed one.
    5. The experiments are insufficient in their present form.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The overall learning scheme is clear, but it is hard to reproduce, and the reviewer highly suggests the authors release their code.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    Please see the weakness.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The authors aim to tackle the long-tailed medical classification task by using prototype learning and contrastive learning. Experiments seem good in the present form.

  • Number of papers in your stack

    4

  • What is the ranking of this paper in your review stack?

    2

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    After reading the comments from the other reviewers and the rebuttal from the authors, I still keep my opinion on this paper. However, I do not agree with the 3rd reviewer's SA, which I do not find reasonable.



Review #2

  • Please describe the contribution of the paper

    The paper proposes a prototype-based contrastive learning framework for long-tailed medical image classification. A mixup-style synthesis is adopted to generate adversarial instances, a prototype recalibration strategy is proposed, and a proto-loss is designed. The reported numbers show that the method outperforms the baselines.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The paper addresses an important problem in practical medical image classification, class imbalance. The motivation is good.
    2. The experiments show promising results.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    I think the paper is not in a complete form. Many details needed to fully understand the contributions are missing.

    1. The problem of interest is long-tailed classification, while there is no discussion on the inference phase in Sec. 3 at all. For a technical presentation, I think it is good to present a complete pipeline.
    2. The idea of synthesizing hard examples for contrastive learning has been proposed before. I think more discussion and analysis for Sec. 3.1 are needed. A good example can be found in [1].
    3. What’s the intuition behind Eq. 4? I think Sec. 3.2 could also be expanded.
    4. Similarly, Sec. 3.3, as a major contribution, should be expanded. I am a bit confused: how do you train the classifier? Based on my understanding, F and G are only projectors, right?

    [1] Hard Negative Mixing for Contrastive Learning, NeurIPS 2020

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    I think the paper could be reproduced but I have no access to source code.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    Small Issues:

    1. In the title, I think Medical Classification should be Medical Image Classification, unless there are additional experiments presented (e.g. clinical text, audio, video).
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I don’t think the paper is in a complete form. But I believe the future version could be improved.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    4

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered



Review #3

  • Please describe the contribution of the paper

    To address the long-tailed dataset problem, the authors propose a novel end-to-end framework using prototype learning and contrastive learning, namely prototype-aware contrastive learning (ProCo). Specifically, an adversarial proto-instance is generated from the combination of a learnable category prototype and the feature of a representative instance to enhance the robustness of contrastive learning over all classes on a long-tailed dataset. A prototype recalibration strategy is adopted to alleviate prototype bias. Two long-tailed medical datasets are used to evaluate the proposed framework, and the experimental results support the effectiveness of ProCo.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The proposed framework, ProCo, is a novel contrastive learning method for the long-tailed medical classification problem. By introducing synthesized adversarial proto-instances into contrastive learning, ProCo encourages the network to rectify the decision boundaries of the tailed categories. A prototype recalibration strategy is also proposed to address the prototype bias problem during training.
    2. Sufficient experiments are conducted on two public long-tailed medical datasets (ISIC2018 and APTOS2019). Accuracy and F1-score are reported on these two datasets, and the experimental results are consistent with the conclusions.
    3. Ablation studies were performed on the three proposed modules. The results show that each component contributes to improvement.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The test accuracy of each category is not presented. Because this work is proposed to address the long-tailed problem, a confusion matrix or a table containing the test accuracy of each category is a better way to reflect the effectiveness of the proposed method. Improvements in tailed categories are expected.
    2. The number of instances to compose adversarial proto-instances and the random interpolation coefficient are important hyperparameters in ProCo. However, they are not specified in the paper, and no sensitivity analysis or selection criteria are provided.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The proposed method is evaluated using two publicly available datasets, and most of the hyperparameters are included. The authors claimed in the Reproducibility Response that they will release code once the paper is accepted, but no reference to that is included in the text.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    1. Why does the calibration factor reflect the difficulty of each category? Why is the factor added to the prototype? Please provide more details or proper references.
    2. The kappa metric is a popular metric for imbalanced datasets. The authors should consider reporting it.
    3. As stated before, the hyperparameters gamma and E are important for ProCo. The authors should elaborate on them.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    7

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed ProCo is a novel method to address the long-tailed problem. The paper is well written, well organized, and sufficient experiments are conducted. There are no major weaknesses in this work. The sensitivity analysis on hyperparameters might increase the importance of the method.

  • Number of papers in your stack

    4

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper studies the long-tailed medical image classification problem by using prototype learning and contrastive learning. The category prototype and adversarial proto-instance are used for generating representative contrastive pairs with the prototype recalibration strategy. Experimental results are reported to support the proposed method.

    This paper received mixed ratings, two positive and one slightly negative. The reviewers all agreed that the studied problem is important for the community, the presented method has contributions in terms of methodology, and the experimental results are sufficient to validate the proposed method. Besides addressing the questions and critiques listed in all the review comments, the weaknesses listed by R#2 should be particularly addressed, including more detailed/clearer descriptions to explain the formulations and the experimental procedures.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    3




Author Feedback

We appreciate all constructive comments and will give point-by-point responses below:

Q1: Source code (R1&R2&R3) Our inference code and test images are now available at https://github.com/skyz215/ProCo. The entire project will be released after acceptance.

Q2: Intuition of Prototype Recalibration (R2&R3) The long-tail phenomenon curtails the representativeness of the prototypes for the tailed categories while over-focusing on the head ones (Ref. Balanced Meta-Softmax for Long-Tailed Visual Recognition). In Eq. 4, the calibration factor is the mean distance (rectified by a sigmoid) between a prototype and its related samples, reflecting the representativeness of the prototype, or equivalently the learning difficulty of the corresponding category. In Eq. 6, adding the calibration factor increases the importance of the tailed prototypes, emphasizing the corresponding terms in Eq. 7.
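The description above can be sketched as follows. This is a minimal NumPy illustration only: the function name, the use of Euclidean distance, and the toy data are assumptions for clarity, not the authors' exact Eq. 4.

```python
import numpy as np

def calibration_factor(prototype, features):
    """Sigmoid-rectified mean distance between a category prototype and the
    features of its samples. A larger value indicates a less representative
    prototype (i.e., a harder category), which is then up-weighted."""
    dists = np.linalg.norm(features - prototype, axis=1)  # per-sample distance
    return 1.0 / (1.0 + np.exp(-dists.mean()))            # sigmoid rectification

# Toy example: a tail class whose samples sit far from the prototype
# receives a larger factor than a well-represented head class.
proto = np.zeros(4)
head_feats = np.random.default_rng(0).normal(0.0, 0.1, size=(100, 4))
tail_feats = np.random.default_rng(1).normal(1.0, 0.1, size=(5, 4))
assert calibration_factor(proto, tail_feats) > calibration_factor(proto, head_feats)
```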

Q3: The category prototype is not new (R1). The category prototype was indeed not first proposed by us; however, rather than simply applying it, we further use it to enhance the robustness of contrastive learning by generating adversarial proto-instances in the long-tailed setting.

Q4: The improvement of the recalibration module may be limited (R1). The motivation of Prototype Recalibration is given in Q2. In Table 3, our method achieves a 1.37% improvement in accuracy and 1.6% in F1 score with the recalibration module.

Q5: The method can be adapted to any situation and may not be a specific design for the problem (R1). The generation of hard positive and negative proto-instances and the recorrection of category prototypes are specific designs for the long-tailed setting. Apart from these modules, the contrastive framework and proto-loss can be applied to general situations.

Q6: Inference phase & How to train the classifier (R2). Our prototypes are category-aware and can be treated directly as classifiers without additional classification layers, despite being trained with a contrastive loss. Therefore, at the inference stage, the category of an input image x can be predicted from the category prototypes P as argmax(dot(F(x), [P_1, P_2, …, P_C])), where F(x) are the latent features of x and C is the number of classes. Details will be provided in the camera-ready version.
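The inference rule stated in the rebuttal can be sketched in a few lines of NumPy. Here the features stand in for F(x) and are precomputed; the orthogonal toy prototypes are an illustrative assumption, not the learned ones.

```python
import numpy as np

def predict(features, prototypes):
    """Predict class indices via the dot product between latent features
    F(x) (shape [N, D]) and the C category prototypes (shape [C, D])."""
    logits = features @ prototypes.T   # [N, C] similarity scores
    return logits.argmax(axis=1)       # argmax over categories

# Toy check: a feature aligned with prototype k is assigned class k.
prototypes = np.eye(3)                 # 3 orthogonal prototypes in R^3
features = np.array([[0.9, 0.1, 0.0],
                     [0.0, 0.2, 0.8]])
print(predict(features, prototypes))   # → [0 2]
```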

Q7: Discussion & Analysis for Sec. 3.1 (R2). The concepts of sample synthesis in [1] and in our work are totally different. Specifically, the hard negative sample in [1] is generated from the query and negative samples. Without ground truth, such a sample may contradict supervised training in the long-tailed problem, causing inferior performance. In contrast, our synthesized samples, interpolated from category prototypes, include adversarial positive and negative proto-instances rather than simply hard negative ones, ensuring the consistency of the supervised and contrastive optimization objectives and thereby boosting performance.
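The contrast with [1] can be made concrete with a small sketch. The formulation below is assumed for illustration: a mixup-style interpolation between a category prototype and an instance feature, with the label inherited from the prototype's category so supervision stays consistent. The function name, the uniform sampling range, and the unit-norm projection are illustrative choices, not the paper's exact recipe.

```python
import numpy as np

def proto_instance(prototype, instance_feat, rng, eps_max=0.4):
    """Synthesize a proto-instance by interpolating an instance feature
    toward a category prototype. The label follows the prototype's class,
    keeping supervised and contrastive objectives aligned."""
    eps = rng.uniform(0.0, eps_max)                 # random interpolation coefficient
    mixed = (1.0 - eps) * prototype + eps * instance_feat
    return mixed / np.linalg.norm(mixed)            # keep features on the unit sphere

rng = np.random.default_rng(0)
proto = np.array([1.0, 0.0])                        # prototype of the target class
feat = np.array([0.0, 1.0])                         # feature of another instance
z = proto_instance(proto, feat, rng)                # hard sample near the boundary
assert np.isclose(np.linalg.norm(z), 1.0)
```

Because eps is bounded away from 0.5, the synthesized sample stays closer to its own prototype than to the mixed-in feature, which is what distinguishes a labeled proto-instance from the unlabeled hard negatives of [1].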

Q8:Title suggestion (R2). We will modify the title in the camera ready.

Q9: Test accuracy of each category / kappa metric (R3). We analyzed the per-class test accuracy and the kappa metric on the ISIC2018 dataset over MEL, NV, BCC, AIEC, BKL, DF, and VASC, which consist of 1113, 6705, 514, 327, 1099, 115, and 142 instances, respectively. The results below show that ProCo also ranks best on most classes, especially on the tailed ones.

| Method      | MEL   | NV    | BCC   | AIEC  | BKL   | DF    | VASC  | Kappa |
|-------------|-------|-------|-------|-------|-------|-------|-------|-------|
| CE+resample | 0.676 | 0.923 | 0.735 | 0.652 | 0.683 | 0.734 | 0.908 | 0.728 |
| Focal loss  | 0.665 | 0.917 | 0.718 | 0.647 | 0.686 | 0.692 | 0.882 | 0.703 |
| CL+resample | 0.683 | 0.942 | 0.741 | 0.660 | 0.709 | 0.707 | 0.926 | 0.732 |
| ProCo       | 0.698 | 0.947 | 0.748 | 0.657 | 0.714 | 0.784 | 0.913 | 0.747 |

Q10: Number of instances & Random interpolation coefficient (R3). The hyperparameters, i.e., the number of instances, gamma, and the random interpolation coefficient, E, are set to 20 and 0.4, respectively, via a grid-search experiment. We will deliver more details in a future version.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors’ rebuttal has reasonably addressed the reviewers’ concerns/comments. The authors should address the reviewers’ comments, such as those answered in the rebuttal, in the final paper.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    5



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Interesting idea of contrastive prototype-based learning to deal with long-tail issues in medical classification.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    9



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The paper leverages the prototype idea to address the imbalanced-class problem. Two of the reviewers are supportive. All reviewers gave good comments on how to improve the paper. The meta-reviewer wonders: how can the number of prototypes for a medical task be known in advance? How will the performance change with too many or too few prototypes? The prototypes are feature-vector based; how do they span the feature space? Has any thought been given to comparing the generated prototypes with basis vectors from PCA?

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    2


