
Authors

Wenlong Hang, Yecheng Huang, Shuang Liang, Baiying Lei, Kup-Sze Choi, Jing Qin

Abstract

The self-ensembling framework has proven to be a powerful paradigm for semi-supervised medical image classification by leveraging abundant unlabeled data. However, the unlabeled data used in most self-ensembling methods are equally weighted, which adversely affects classification performance when differences exist among unlabeled data acquired from different populations, equipment, and environments. To address this issue, we propose a novel reliability-aware contrastive self-ensembling framework, which can selectively leverage reliable unlabeled data. Concretely, we introduce a weight function into the mean teacher paradigm that maps the probability predictions of unlabeled data to corresponding weights reflecting their reliability. Hence, we can safely leverage the predictions of related unlabeled data under different perturbations to construct a reliable consistency loss. Besides, we design a novel reliable contrastive loss to achieve better intra-class compactness and inter-class separability for the normalized embeddings derived from related unlabeled data. As a result, our reliability-aware scheme enables the contrastive self-ensembling framework to concurrently capture both reliable data-level and data-structure-level information, thereby improving the robustness and generalization power of the model. Experiments on two publicly available medical image datasets demonstrate the superiority of the proposed method. Our model is available at https://github.com/Mwnic/RAC-MT.
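
The abstract's central mechanism, scaling each unlabeled sample's mean teacher consistency term by a weight learned from the teacher's probability prediction, can be sketched as follows. This is a minimal illustration under assumptions (the weight network's architecture, the use of the teacher's softmax output as its input, and an MSE consistency term are choices made for this sketch), not the authors' released implementation; see the GitHub repository for the actual code.

```python
# Minimal sketch of a reliability-weighted consistency loss in a mean teacher
# setup. WeightNet and the exact weighting scheme are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightNet(nn.Module):
    """Maps a probability prediction to a scalar reliability weight in [0, 1]."""
    def __init__(self, num_classes, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_classes, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )

    def forward(self, probs):                    # probs: (B, C)
        return self.net(probs).squeeze(-1)       # (B,)

def reliable_consistency_loss(student_logits, teacher_logits, weight_net):
    """Per-sample MSE between student and teacher predictions under different
    perturbations, scaled by reliability weights from the teacher's output."""
    student_probs = F.softmax(student_logits, dim=1)
    teacher_probs = F.softmax(teacher_logits, dim=1).detach()
    weights = weight_net(teacher_probs)                            # (B,)
    per_sample = ((student_probs - teacher_probs) ** 2).mean(dim=1)
    return (weights * per_sample).mean()

@torch.no_grad()
def ema_update(teacher, student, alpha=0.99):
    """Standard mean teacher update: teacher weights are an EMA of the student's."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(alpha).add_(s_param, alpha=1 - alpha)
```

In training, this weighted consistency term would be added to the supervised cross-entropy loss on labeled data, with `ema_update` applied after each student step; the bi-level optimization that updates the weight function itself (noted by Reviewer #1 below) is omitted from this sketch.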

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16431-6_71

SharedIt: https://rdcu.be/cVD7r

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper describes a semi-supervised classification method based on reliability analysis. The mean teacher (MT) method is used to contrastively analyze the reliability of the classification results, which are then used as pseudo-labels for further training. The proposed method achieves state-of-the-art performance.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Detailed formulation of the problem and methods; the formulas are easy to understand and follow. The corresponding code is also published for access.
    • Bi-level optimization is addressed properly, which would be interesting to explore in other tasks.
    • Solid experiments and analysis.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    For me, it is a nice MICCAI paper submission without obvious weaknesses.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors provided code for reproducing the results.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    In the future, it would also be interesting to see results on segmentation tasks.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    8

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    A well-organized and well-explained paper; for me it is wonderful.

  • Number of papers in your stack

    3

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The manuscript presents a novel reliability-aware contrastive self-ensembling framework, which can selectively leverage reliable unlabeled data. The authors introduce a weight function into the mean teacher paradigm that maps the probability predictions of unlabeled data to corresponding weights reflecting their reliability, and they also design a novel reliable contrastive loss to achieve better intra-class compactness and inter-class separability for the normalized embeddings derived from related unlabeled data. Extensive experiments are conducted on two public datasets to verify the effectiveness of the proposed method.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The proposed method is novel since it can concurrently capture both reliable data-level and data-structure-level information of the images, thereby improving the robustness and generalization power of the model. It achieves state-of-the-art performance on two public datasets.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Some details about the experiments are missing, which should be added in the revision.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The reproducibility of the method is good, since the authors provide almost all implementation details and the code is also released on GitHub.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    (1) There is a typo in “Input of min-batch of images” in Fig. 1; please correct it. (2) How many training iterations are performed between updates of the weight function parameters and the network parameters on the two datasets? (3) Which dataset is used for the ablation study investigating the role of each component in RAC-MT? Please specify this in the manuscript. (4) Please discuss the limitations of the proposed method and possible solutions/future directions in the manuscript.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    7

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed method is of great novelty, and the experiments are extensive and sufficient. The proposed method achieves very good performance on two datasets, outperforming compared methods significantly. The manuscript is well organized, clear and easy to follow.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #4

  • Please describe the contribution of the paper

    This paper aims to effectively use unlabeled data for semi-supervised medical image classification. The challenge in using unlabeled data is that it can be acquired from different populations or equipment, which may result in differences among the data. To address this challenge, this paper proposes to assign different weights to different unlabeled data, instead of assigning equal weights. The weight function is learned together with the consistency loss and contrastive loss. Experimental results on two datasets show improved accuracy over other semi-supervised learning methods.
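
To make the weighting described above concrete on the contrastive side as well, below is a minimal sketch of how such per-sample reliability weights could scale a contrastive term over L2-normalized embeddings, with positives defined by teacher pseudo-labels. The loss form, the pseudo-labeling, and all names are assumptions made for illustration, not the paper's exact formulation.

```python
# Minimal, illustrative reliability-weighted contrastive loss over normalized
# embeddings; positives share a pseudo-label, and each anchor's term is scaled
# by its reliability weight. Not the authors' implementation.
import torch
import torch.nn.functional as F

def reliable_contrastive_loss(embeddings, pseudo_labels, weights, temperature=0.1):
    """embeddings: (B, D), assumed L2-normalized; pseudo_labels: (B,) ints;
    weights: (B,) reliability weights in [0, 1]."""
    sim = embeddings @ embeddings.t() / temperature    # (B, B) scaled cosine similarities
    eye = torch.eye(len(embeddings), dtype=torch.bool, device=embeddings.device)
    pos_mask = (pseudo_labels.unsqueeze(0) == pseudo_labels.unsqueeze(1)) & ~eye
    # Exclude self-similarity from the softmax denominator.
    log_prob = F.log_softmax(sim.masked_fill(eye, float('-inf')), dim=1)
    # Average log-probability of positives per anchor; anchors without positives are skipped.
    pos_counts = pos_mask.sum(dim=1)
    has_pos = pos_counts > 0
    masked_log_prob = torch.where(pos_mask, log_prob, torch.zeros_like(log_prob))
    mean_log_prob_pos = masked_log_prob.sum(dim=1)[has_pos] / pos_counts[has_pos]
    # Anchors judged unreliable contribute less to the loss.
    return -(weights[has_pos] * mean_log_prob_pos).mean()
```

In such a setup, `pseudo_labels` could be taken as `teacher_probs.argmax(dim=1)` and `weights` from the same weight function used for the consistency term, so unreliable samples contribute little to either loss; both choices are assumptions of this sketch.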

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The idea of assigning a different weight to each data point via a learned model is novel.
    • The paper is well-written and easy to follow. I enjoy reading this paper.
    • By employing the proposed weight function in the consistency loss and contrastive loss, improved accuracy is observed on two datasets.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • While this paper describes the proposed learnable weight function from the perspective of reliability, no evidence or evaluation of how this weight function improves reliability is provided; only accuracy metrics are used.
  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The experimental setup, such as the data pre-processing and learning hyper-parameters, is provided in detail. These descriptions are sufficient to reproduce the reported results.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    • It could be better to replace the word reliability with a word such as importance. The learned weight function is novel, but it is not designed for data reliability.
    • Could you also include a comparison with contrastive-learning-based methods such as SimCLR on the skin dataset, following the protocol of contrastive pre-training and fine-tuning with labeled data?
    • From the ablation study in Fig. 2, without the weight function, the simple combination in CST-MT (consistency loss and contrastive loss without weighting) already achieves good accuracy, which suggests the improvement from the weighting function is not as large as it appears. Could you explain this result in more detail?
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed learnable weighting function for each data point is novel. Applying this weight to the consistency and contrastive losses improves the overall model performance.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper studies the problem of semi-supervised medical image classification via reliability-aware contrastive self-ensembling, which can selectively leverage reliable unlabeled data. Sufficient experimental results are reported to support the proposed method.

    The reviewers consistently agree this is a good paper, which the AC echoes as well. The proposed method is novel since it can concurrently capture both reliable data-level and data-structure-level information of the images, thereby improving the robustness and generalization power of the model. In addition, the performance of the proposed method appears strong in the experiments. Thus, the AC recommends acceptance of this paper. In the final paper, the AC hopes the authors will read the review comments carefully and address the questions/concerns raised by the reviewers, including better explanation of the experimental procedures and results, the wording of the method, and more description of the key points of the experimental datasets.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    1




Author Feedback

Response to comments from Reviewer #1: (1) Thank you for your valuable suggestion. The proposed reliability-aware scheme indeed has the potential to provide more reliable supervision for segmentation tasks, which will be a direction of our future research.

Response to comments from Reviewer #3: Many thanks for your careful observations. Following your kind reminders, we revise our paper as follows: (1) We correct the phrase to “Input mini-batch of B images” in the camera-ready paper. (2) The number of training iterations was set to 180; we add this detail to the camera-ready paper. (3) The skin dataset was used for the ablation study; we state this in the camera-ready paper. (4) Extending the proposed reliability-aware scheme to medical image segmentation tasks is an interesting topic, which will be a direction of our future research.

Response to comments from Reviewer #4: Thank you for your valuable suggestions. We revise our paper as follows: (1) We are carefully considering your suggestion and will choose an appropriate word to replace the word reliability. (2) We plan to add this comparison in the camera-ready paper. (3) Since the images in the publicly available medical image datasets have been pre-screened, CST-MT is able to achieve acceptable performance; we believe the proposed reliability-aware scheme may bring larger benefits in practical scenarios. In the future, we will test our framework on real-world medical image classification tasks.


