
Authors

Shoukun Sun, Min Xian, Aleksandar Vakanski, Hossny Ghanem

Abstract

Robust self-training (RST) can augment the adversarial robustness of image classification models without significantly sacrificing the models’ generalizability. However, RST and other state-of-the-art defense approaches fail to preserve generalizability and to reproduce their strong adversarial robustness on small medical image sets. In this work, we propose Multi-instance RST with a drop-max layer (MIRST-DM), which involves a sequence of iteratively generated adversarial instances during training to learn smoother decision boundaries on small datasets. The proposed drop-max layer eliminates unstable features and helps learn representations that are robust to image perturbations. The proposed approach was validated using a small breast ultrasound dataset with 1,190 images. The results demonstrate that the proposed approach achieves state-of-the-art adversarial robustness against three prevalent attacks.
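
To make the multi-instance idea concrete, the sketch below shows one plausible way to generate the sequence of iteratively stronger adversarial instances described in the abstract, using a PGD-style signed-gradient loop in PyTorch. It is an illustrative assumption, not the authors' implementation: the paper's exact robust-training objective, perturbation budget, and schedule are not reproduced here, and all names are hypothetical.

```python
import torch
import torch.nn.functional as F

def multi_instance_adversarial(model, x, y, eps=8/255, steps=4):
    """Return the clean batch plus progressively stronger adversarial
    instances of it, one per signed-gradient step (hypothetical sketch)."""
    alpha = eps / steps                       # per-step perturbation size
    instances = [x.clone().detach()]          # start from the clean images
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # One PGD step; each iterate is a stronger adversarial instance of x
        x_adv = (x_adv + alpha * grad.sign()).detach()
        x_adv = torch.clamp(torch.min(torch.max(x_adv, x - eps), x + eps), 0.0, 1.0)
        instances.append(x_adv.clone())
    return instances                          # list of (N, C, H, W) tensors

# Training would then apply the robust-training loss to every instance, e.g.
# loss = sum(F.cross_entropy(model(xi), y) for xi in instances) / len(instances)
```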

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16440-8_39

SharedIt: https://rdcu.be/cVRv4

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    The authors propose Multi-instance RST with a drop-max layer, which incorporates a sequence of iteratively generated adversarial instances during training to learn smoother decision boundaries on small datasets.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The proposed multi-instance RST and drop-max layer appear novel.

    • The proposed approach performs notably well against adversarial attacks.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • I’m not convinced about the clinical motivation for adversarial defenses in medical images. It is possible that this problem might gain traction and significance in the future; however, I don’t see significant interest in this problem for a wider audience.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The paper seems reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    • The related work section of this paper is quite long and could be shortened significantly to make more space for explaining the proposed approach.

    • Generally speaking, the term ‘multiple instance’ is used for semi-supervised (SS) learning methods. However, RST itself is an SS approach, so the use of ‘multiple instance’ with RST is not clear.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The technical novelty and impressive quantitative results lead me to recommend acceptance.

  • Number of papers in your stack

    4

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The authors investigate the active research topic of adversarial learning, specifically robust self-training for image classification. They consider how well existing adversarially robust approaches preserve generalizability and reproduce their robustness on small medical image sets, and make improvements in this setting. Specifically, they propose multi-instance robust self-training with a drop-max layer to learn smoother decision boundaries on small datasets.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The authors investigate the adversarial robustness of the deep models on small medical image classification tasks.
    2. A multi-instance robust self-training method with a drop-max layer is proposed for robust training.
    3. The proposed drop-max layer can remove unstable features to learn robust representations.
    4. Experiments validate these claims on medical image classification.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. In computer vision, and especially in adversarial machine learning, prior work on adversarial examples for small datasets exists. The authors may be among the first to attempt this particular task, but the overall idea is still somewhat weak.
    2. Adversarial training or co-training is a useful technique for improving the robustness of a medical image classification task, but the overall learning scheme struggles to avoid severe overfitting.
    3. Could the proposed method be adapted to targeted attacks? It seems most of these efforts are devoted to untargeted attacks.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors should release their code; otherwise, this paper is not easy to reproduce.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    See weakness.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The authors focus on adversarial-example robustness for small medical image sets, and there is useful work here. They present a multi-instance robust self-training method with a drop-max layer, which is simple yet potentially effective.

  • Number of papers in your stack

    4

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The paper proposes a method to defend against adversarial attacks for the task of breast tumor classification on ultrasound B-mode images. The method extends Robust Self Training by adding multiple instances of adversarial examples with gradually increasing perturbation during training and a dropmax layer to smooth the decision boundaries and achieve higher adversarial robustness.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The lack of large datasets is very common in the medical field; therefore, adapting approaches to work well with less data is beneficial.
    • The paper is well written.
    • Cross-validation is used, which is critical for such small datasets.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The dropmax layer is an interesting and intuitive idea. However, choosing the second-largest value does not guarantee that the perturbations are ignored. To give a simple example of an alternative approach: one could ignore or ‘drop’ the upper quantile of values instead of ignoring just the top-1. An interesting experiment would be showing how the performance changes as more of the max-values are ignored (a sketch of such a parameterized pooling layer is given after this list).
    • The standard deviation across the five folds could have been reported.
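
    To make the suggested experiment concrete, here is a minimal, hypothetical PyTorch sketch of a pooling layer with a drop_k parameter: drop_k=1 keeps the second-largest activation in each window (one reading of the drop-max idea), and larger values approximate dropping an upper quantile, as proposed above. This is an assumption about how such a layer could be implemented, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DropTopKPool2d(nn.Module):
    """Pooling that ignores the drop_k largest activations in each window and
    keeps the next one. drop_k=1 mimics a drop-max layer; sweeping drop_k
    shows how performance changes as more of the max-values are ignored."""

    def __init__(self, kernel_size=2, stride=2, drop_k=1):
        super().__init__()
        assert drop_k < kernel_size ** 2
        self.kernel_size, self.stride, self.drop_k = kernel_size, stride, drop_k

    def forward(self, x):                                  # x: (N, C, H, W)
        n, c, h, w = x.shape
        # Extract pooling windows: (N, C, k*k, number_of_windows)
        patches = F.unfold(x, self.kernel_size, stride=self.stride)
        patches = patches.view(n, c, self.kernel_size ** 2, -1)
        # Keep the (drop_k + 1)-th largest value in every window
        kept = patches.topk(self.drop_k + 1, dim=2).values[:, :, self.drop_k, :]
        out_h = (h - self.kernel_size) // self.stride + 1
        out_w = (w - self.kernel_size) // self.stride + 1
        return kept.view(n, c, out_h, out_w)

pool = DropTopKPool2d(kernel_size=2, stride=2, drop_k=1)
print(pool(torch.randn(8, 64, 32, 32)).shape)             # torch.Size([8, 64, 16, 16])
```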
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
    • The hyperparameters for the model training are clearly given.
    • The datasets used are publicly available.
    • There is no mention that the code will become publicly available upon acceptance.
  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    • There is a typographical error: CIFA-10 instead of CIFAR-10.
    • In Table 2, the highest value in every row should be made bold. Specifically, in the ‘No Attack’ scenario, the baseline has an F1-score of 0.831 versus 0.830 for MI-RST. I understand that the comparison is focused on RST versus MI-RST, but in my opinion making a lower value bold is confusing to the reader.
    • Fig. 2 has low resolution; I would recommend adding a high-resolution version to the manuscript.
    • The name multi-instance is a bit misleading since it could refer to multi-instance learning which is not related to the paper. An alternative could be multi-adversary.
    • The word ‘significantly’ is used widely in the discussion of the results, but no statistical tests are performed, so I would replace it with ‘substantially’ (a sketch of a simple cross-fold test is given after this list).
    • To my understanding, all methods were trained using the same hyperparameters but that could be unfair for some of the baselines. I would recommend using the exact hyperparameters stated in the papers of the comparative methods to achieve a fair comparison of all approaches.
    • An interesting experiment for future work would be to train on one of the two datasets and test on the other to see if the method is able to generalize better after robust training. That would show the benefits of the method in a more realistic scenario than adversarial attacks.
    • Another experiment for future work would be to test whether it generalizes well in larger datasets or if it is really tailored to smaller datasets and the multiple instances are not required when a larger dataset is present.
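
    As a concrete illustration of the statistical-testing point above, the sketch below compares per-fold F1-scores of two methods across the five cross-validation folds using the mean, standard deviation, and a paired Wilcoxon signed-rank test. The per-fold numbers are hypothetical placeholders, not results from the paper.

```python
import numpy as np
from scipy.stats import wilcoxon

# Hypothetical per-fold F1-scores (placeholders, not values from the paper)
baseline_f1 = np.array([0.81, 0.83, 0.82, 0.84, 0.80])
mirst_f1 = np.array([0.82, 0.84, 0.83, 0.85, 0.81])

print(f"baseline: {baseline_f1.mean():.3f} +/- {baseline_f1.std(ddof=1):.3f}")
print(f"MIRST-DM: {mirst_f1.mean():.3f} +/- {mirst_f1.std(ddof=1):.3f}")

# Paired non-parametric test across folds; with only five folds the power is
# limited, which is worth stating alongside any claim of significance.
stat, p = wilcoxon(mirst_f1, baseline_f1)
print(f"Wilcoxon signed-rank: statistic={stat}, p-value={p:.3f}")
```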
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is well-written and the results show a clear improvement between the baseline and the proposed approaches. Some aspects of the method, like the dropmax layer, perform well heuristically, but it would be nice to better formulate the selection of the element used by the pooling.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    2

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    Strengths:

    • The paper is well written.
    • The proposed idea of the drop-max layer is interesting and quite novel.
    • The proposed method is intuitive and has clear advantages in robustness for small medical image classification.

    Weaknesses:

    • Statistical evaluation is missing from the experiments.
    • More justification and analysis of the design choices would make the paper stronger.
    • More discussion of the clinical use cases would be needed.

    Overall: The authors investigate the interesting topic of robustness for medical image classification. The reviewers agree on the innovative and novel aspects of the paper. Comparative experiments show the effectiveness of the proposed method. This paper could prompt an interesting discussion at the MICCAI conference. As a result, I suggest the acceptance of this paper.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    1




Author Feedback

N/A


