
Authors

Jiuwen Zhu, Yuexiang Li, Lian Ding, S. Kevin Zhou

Abstract

Limited training data and annotation shortage are the main challenges for the development of automated medical image analysis systems. As a potential solution, self-supervised learning (SSL) has attracted increasing attention from the community. The key component of SSL is its proxy task, which defines the supervisory signals and drives the learning toward effective feature representations. However, most SSL approaches focus on a single proxy task, which greatly limits the expressive power of the learned features and therefore deteriorates the network's generalization capacity. In this regard, we hereby propose two aggregation strategies, based on the complementarity of various forms, to boost the robustness of self-supervised learned features. We first propose a principled framework of multi-task aggregative self-supervised learning from limited medical samples to form a unified representation, with the intent of exploiting feature complementarity among different tasks. Then, in self-aggregative SSL, we propose to self-complement an existing proxy task with an auxiliary loss function based on a linear centered kernel alignment metric, which explicitly promotes exploring what is left uncovered by the features learned from the proxy task at hand, further boosting the modeling capability. Our extensive experiments on 2D and 3D medical image classification tasks under limited data and annotation scenarios confirm that the proposed aggregation strategies successfully boost the classification accuracy.
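For context on the linear centered kernel alignment (CKA) metric mentioned above, here is a minimal NumPy sketch of the standard linear-CKA similarity between two feature representations (following the common formulation of Kornblith et al.; the paper's exact auxiliary loss construction may differ):

```python
import numpy as np

def linear_cka(X, Y):
    """Linear centered kernel alignment between two feature matrices.

    X: (n_samples, d1) and Y: (n_samples, d2) feature representations
    extracted on the same samples. Returns a similarity in [0, 1];
    1 means the representations are identical up to an orthogonal
    transform and isotropic scaling.
    """
    # Center each feature dimension (column) over the samples.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    # HSIC-style numerator and Frobenius-norm normalizers.
    cross = np.linalg.norm(X.T @ Y, ord="fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, ord="fro")
    norm_y = np.linalg.norm(Y.T @ Y, ord="fro")
    return cross / (norm_x * norm_y)
```

A complementarity-promoting auxiliary loss, as described in the abstract, would penalize a high CKA similarity between the features of the proxy task at hand and those of its complement, encouraging the two branches to encode different information.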

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16452-1_6

SharedIt: https://rdcu.be/cVRYJ

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    The manuscript presents two strategies of aggregation, in terms of complementarity of various forms, to boost the robustness of self-supervised learned features: (i) a principled framework of multi-task aggregative self-supervised learning from limited medical samples to form a unified representation, with the intent of exploiting feature complementarity among different tasks, and (ii) an auxiliary loss function based on a linear centered kernel alignment metric to self-complement an existing proxy task. These two strategies can effectively boost the modeling robustness and capability.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The proposed method is novel and effective. The authors conducted extensive experiments to verify the effectiveness of the two proposed strategies. The manuscript is well organized, clear and easy to follow.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The dataset split and backbone selection might not be appropriate.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The reproducibility of the paper is good since most of the important implementation details are provided in the manuscript.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    (1) There is a typo in Eq.(2); please correct it.
    (2) In Fig. 1(b), it is suggested to highlight the training iteration; please modify the figure accordingly.
    (3) Please correct line 8 in Algorithm 2.
    (4) The two datasets are split into training and testing sets at an 80:20 ratio. However, the training process requires an extra validation set to decide whether a further training iteration should continue, and it is not appropriate to use the testing set for model tuning during training. Please elaborate on this.
    (5) The authors use the average classification accuracy as the evaluation metric. Why not use the overall accuracy (i.e., the number of correctly predicted samples / the total number of testing samples)? Please also provide the confusion matrix of the proposed method on the two datasets.
    (6) From Table 2, we can observe that the VGG network outperforms ResNet18, yet the authors mention in the Implementation details section that they use ResNet18 as the backbone. Why not VGG?

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed method is of technical novelty, and the experiments are very comprehensive to verify the effectiveness of the proposed method.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    2

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper
    • The authors propose strategies for aggregating pre-defined self-supervised learning (SSL) tasks in various forms to boost performance. Prior papers mostly combine all SSL tasks together, which can potentially harm performance.
    • This framework is helpful in general when we have several options for SSL tasks and want a way to combine them.
  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Their multi-task aggregative SSL (MT-ASSL) uses linear centered kernel alignment (LCKA) to align the feature representations of two neural networks, which looks interesting and novel.
    • Self-aggregative SSL makes sense and can be applied to other settings.
    • In general, the Reviewer thinks the novelty of the method in this paper is good enough.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    One of the main weaknesses of this paper is the baseline setting. In particular, the main contribution of this work is the strategy to combine SSL tasks. The authors also mention that this work differs from conventional methods in which all SSL tasks are combined and trained together. Therefore, one of the crucial experiments is to show the difference between their approach and the default setting. For instance, in Table 1, it would be useful if the authors could provide an experiment that trains all SSL tasks together and compare it with the proposed method (SRC, SimCLR, 2D Rot). A similar experiment should be conducted on the 3D brain hemorrhage dataset as well. With this evidence, the contributions of this paper would be more convincing.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors described the experiments in this paper well; thus, it should be possible to reproduce them.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    As mentioned above, the Reviewer strongly recommends that the authors provide further experiments for the conventional approach in which all SSL tasks are combined. The experiment described in Table 1 under ‘MT-ASSL ACC’ is initially confusing to the reader; the authors should detail how these results are computed. It is also unclear how the method trains the feature representation \theta in Eq.(5) given a selected subset A. Is \theta trained from scratch on all SSL tasks in A, or is \theta fine-tuned as a new SSL task is added to A after each iteration? What is the difference between these two options? After each iteration, rather than simply combining all SSL tasks in A with equal weighting, the authors could set proper weights for each SSL task, e.g., using a grid search to find the best combination.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    In general, the method in this paper is good enough for MICCAI. It can bring further benefits by adding new SSL tasks into the pool for consideration. However, the experiment comparing against the standard method (training all SSL tasks together) is missing; thus, it is hard to fully validate the effectiveness of this approach.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    2

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper targets self-supervised feature learning (SSL) in the low-data regime. To achieve this goal, two aggregation techniques that exploit the complementarity of various SSL tasks are proposed. The first technique is a principled framework of multi-task aggregative SSL, in which tasks are iteratively added to exploit feature complementarity among different tasks. The second technique self-complements an existing proxy task with an auxiliary loss function. Experimental results on two medical image classification datasets show improved accuracy over existing works.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • This paper is well motivated. SSL needs a large number of samples to train, and collecting medical images is expensive, so achieving good SSL performance with limited data is necessary.
    • The idea of aggregating multiple SSL tasks to improve the representation learning is interesting.
    • The writing is clear.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The training cost of SSL aggregation is high. It needs iterative training of each SSL method.
    • The reported accuracy of existing SSL works is low and inconsistent with the results published in those works.
  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The experimental setup is described in detail; following the description should allow the results to be reproduced.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    • If I understand correctly, minimizing Eq.(9) minimizes L_com and maximizes the similarity between \phi and \phi^\prime, which contradicts the goal of learning a self-complementary representation. Could you please explain this equation in more detail?
    • The data augmentation (i.e., horizontal flip) is very weak compared to the ones used in SOTA self-supervised learning approaches such as [1]. By using stronger data augmentations, [1] can be greatly improved. Could you compare the accuracy of the proposed method against [1] with strong augmentations?
    • SSL training has a high computational cost, and iterative SSL training will cost even more. Could you compare the total training cost (in terms of training time or FLOPs) of the proposed method and the baselines?

    [1] A simple framework for contrastive learning of visual representations, ICML 2020

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The overall method is well designed, and my main concern is the reported performance in the experiments. The performance of the baseline method SimCLR is lower than expected. For example, the accuracy of all the SSL baselines in Table 2 is almost the same as w/o SSL, which is inconsistent with results reported by existing works. Carefully tuning the hyperparameters of SimCLR may improve its performance and even outperform the proposed method. Therefore, the effectiveness of the proposed methods is not well evaluated by the current experimental results.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    3

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The reviewers have consistent reviews of your paper. I believe that their comments are of great value, so please take them into careful consideration to further enhance your paper.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    2




Author Feedback

N/A


