
Authors

Lie Ju, Yicheng Wu, Lin Wang, Zhen Yu, Xin Zhao, Xin Wang, Paul Bonnington, Zongyuan Ge

Abstract

Most medical tasks naturally exhibit a long-tailed distribution due to complex patient-level conditions and the existence of rare diseases. Existing long-tailed learning methods usually treat each class equally to re-balance the long-tailed distribution. However, considering that some challenging classes may present diverse intra-class distributions, re-balancing all classes equally may lead to a significant performance drop. To address this, in this paper, we propose a curriculum learning-based framework called Flexible Sampling for the long-tailed skin lesion classification task. Specifically, we initially sample a subset of training data as anchor points based on the individual class prototypes. These anchor points are then used to pre-train an inference model to evaluate the per-class learning difficulty. Finally, we use a curriculum sampling module to dynamically query new samples from the remaining training samples with a learning difficulty-aware sampling probability. We evaluated our model against several state-of-the-art methods on the ISIC dataset. The results under two long-tailed settings demonstrate the superiority of our proposed training strategy, which achieves a new benchmark for long-tailed skin lesion classification.
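The pipeline described in the abstract can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' code (no repository is provided); the anchor ratio, the use of the class-mean embedding as the prototype, and the per-class difficulty scores are all assumptions.

```python
import numpy as np

def select_anchors(features, labels, ratio=0.5):
    """For each class, keep the `ratio` fraction of samples closest to the
    class prototype (here assumed to be the mean embedding) as anchor points."""
    anchor_idx = []
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        prototype = features[idx].mean(axis=0)
        dists = np.linalg.norm(features[idx] - prototype, axis=1)
        n_keep = max(1, int(len(idx) * ratio))
        anchor_idx.extend(idx[np.argsort(dists)[:n_keep]])
    return np.array(sorted(anchor_idx))

def curriculum_probs(class_difficulty, labels):
    """Difficulty-aware sampling probability: samples from classes the
    pre-trained inference model finds harder are queried more often."""
    per_sample = np.array([class_difficulty[c] for c in labels], dtype=float)
    return per_sample / per_sample.sum()

# Toy example: 2-D embeddings for a head class (10 samples) and a tail class (4).
rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(0, 1, (10, 2)), rng.normal(5, 1, (4, 2))])
labels = np.array([0] * 10 + [1] * 4)
anchors = select_anchors(feats, labels, ratio=0.5)
probs = curriculum_probs({0: 0.2, 1: 0.8}, labels)
```

In this sketch the tail class is assigned a higher difficulty score, so its samples receive a larger sampling probability when new samples are queried from the remaining pool.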

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16437-8_44

SharedIt: https://rdcu.be/cVRuv

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

The authors present a curriculum learning-based sampling strategy to improve classification performance on dermoscopy images. They initially train an embedding representation model and determine anchor samples within each class. These anchor samples are then used to train a classification model, which is in turn used to select the next set of samples to be added to the training set for further training. The authors show that iteratively training a model with such a curriculum can help boost classification performance.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The paper is well written and the ideas are clearly presented. The curriculum idea in sample selection generally makes sense. The authors support their idea with extensive experimental results.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The anchor sample selection model assumes that the embedding representations follow a unimodal Gaussian distribution. However, this assumption is not supported by theoretical or experimental evidence.

    A few minor details of the model are missing.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    -

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    Overall the paper is very well written and the presentation is clear. Extensive experiments and comparisons are conducted to validate the presented method.

    The only major issue that requires more in depth analysis is the selection of the anchor points. The paper reads like authors assume that the initial embedding representation of the samples are distributed around a mean anchor point (or in more general sense unimodal gaussian) However, no evidence is presented if this assumption holds. Did the authors conduct any analysis on this issue?

In the text, it is not clear how the authors calculate the entropy for the further selection of samples. Figure 2 makes things clearer, but this reviewer thinks the authors could have explained this more clearly in the text.
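For context on the quantity the reviewer is asking about: a common difficulty/uncertainty score in sample selection is the Shannon entropy of the model's softmax output. The paper does not confirm this exact formulation, so the following is only one plausible reading.

```python
import numpy as np

def predictive_entropy(logits):
    """Shannon entropy of the softmax distribution per sample;
    higher values indicate samples the model is less certain about."""
    z = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return -(p * np.log(p + 1e-12)).sum(axis=1)

# A confident prediction vs. a near-uniform (uncertain) one.
logits = np.array([[8.0, 0.0, 0.0],
                   [0.1, 0.0, 0.05]])
ent = predictive_entropy(logits)
```

Under this reading, samples with higher entropy would be prioritized when querying new training samples.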

It would have been good if the authors could show how many new samples are introduced at every step of the curriculum for an example run.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    7

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper is well written and the ideas are well presented. The idea makes a lot of sense. The experimental results are extensive.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #4

  • Please describe the contribution of the paper

The authors propose a class re-sampling method for learning from long-tailed distributions. The idea is to first use SSL to learn balanced representations, and then filter out an anchor dataset of balanced difficulty. Finally, using this anchor dataset to initialize a model, new samples are drawn according to the model's evaluated difficulty/uncertainty and used to train it in a recurrent manner. Experiments are conducted on skin lesion classification, comparing with a wide range of long-tailed distribution learning methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. the presentation and organization of the article are nice, and the figures are clear.
    2. the topic of learning from long-tailed distributions is hot and important; long-tailed data is a common problem in medical imaging.
    3. the authors provide a novel approach to long-tailed distribution learning. It establishes an anchor set based on the balanced representation learned from SSL. This anchor set is key to the method outperforming the others: such a subset helps to fairly evaluate the difficulty/uncertainty of the samples, and thus facilitates learning from imbalanced data.
    4. the experiments comparing with SOTA methods are comprehensive.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The authors include too many techniques in one paper. The relations between the modules and the effectiveness of each module are not well explored.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    yes

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    1. focus on one main technique and give detailed analysis/motivation/discussion, OR
    2. give a detailed ablation study to verify the effectiveness of each module, and discuss/analyze the relations between them. If page space is insufficient, I think the SOTA comparison can be shortened; comparing with SOTA resampling methods is enough.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The proposed method is intriguing, but further analysis/discussion/evaluation is needed.

  • Number of papers in your stack

    1

  • What is the ranking of this paper in your review stack?

    2

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

This paper presents a curriculum sampling-based method to improve automatic skin lesion classification on imbalanced datasets. The paper introduces a strategy to mitigate class imbalance in several stages: pre-training a CNN backbone with a self-supervised loss, sampling anchor points to highlight the “key” elements of the dataset per class, and curriculum sampling to incorporate unsampled elements on the fly in the rest of the training.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Overall, the paper is well written, the experiments are easy to follow, and the results are presented clearly and concisely. The paper is technically sound, and all the equations reflect the key elements to understanding the proposed method. The comparative study is well executed and contains adequate baselines to compare thoroughly with the state-of-the-art.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

Although the author’s method does show improved results, they are relatively close to those of some methods in the comparative study, e.g., RW and ELF. The latter also lies in the curriculum learning-based category, making it particularly close to the proposed method.

The paper is in general well written. However, some parts of the manuscript describing the method (Section 2) are not entirely clear. For instance, some symbols and mathematical variables are used in equations without being introduced.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Section 3 of the paper includes enough information to reproduce the method and architecture presented in the manuscript. The authors explain in detail what datasets they used, the network’s hyper-parameters, the architecture, and the training recipe.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    My main comments are:

    1) The results of some of the methods in the comparative study (RW and ELF) are relatively close to those of the proposed method. However, these results do not seem to be explained or discussed in the paper. If the authors have additional experiments or different metrics showing the improvement of their method over these baselines, I recommend including them as well. See Section 3.3.
    2) Table 2 only highlights the proposed method's performance when it does better than the rest. I suggest highlighting the best results overall, to show both the advantages and weaknesses of the proposed method properly.
    3) Section 2.1 is slightly challenging to read. Although it becomes clear later in the paper what x and v represent and what the purpose of Equation 1 is, I recommend rewriting that subsection to improve the overall clarity of the manuscript.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper’s method does show improvements on skin lesion classification through a well-executed experimental setup.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    2

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #5

  • Please describe the contribution of the paper

    1) A curriculum learning-based framework called Flexible Sampling for the long-tailed skin lesion classification task. 2) A new benchmark for long-tailed skin lesion classification.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1) The approach is interesting. The work uses self-supervised learning (SSL) to determine the representative features of each class, and samples neighboring points in the feature space as anchor points to concentrate the training. 2) The evaluation is good. Different types of approaches have been considered, and ablation studies are included.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) The approach is new in lesion classification, but not in classification in general. For example, [1] proposed dynamic curriculum learning for imbalanced classification; that work also proposes anchor points to guide the model to learn from easy to hard. 2) The comparison methods are representative, but not state-of-the-art. For example, the decoupling-based approach of [2] shows state-of-the-art performance on this problem. 3) No visualization examples (e.g., good cases, bad cases) are given to show the improvement over other methods. 4) No results of SSL are shown to demonstrate the distribution of classes; t-SNE can be used for visualization.

    [1] Wang, Yiru, Weihao Gan, Jie Yang, Wei Wu, and Junjie Yan. “Dynamic curriculum learning for imbalanced data classification.” In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 5017-5026. 2019 [2] Kang, Bingyi, Saining Xie, Marcus Rohrbach, Zhicheng Yan, Albert Gordo, Jiashi Feng, and Yannis Kalantidis. “Decoupling representation and classifier for long-tailed recognition.” arXiv preprint arXiv:1910.09217 (2019).

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

No code is provided. The datasets are publicly available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

The paper aims to alleviate the long-tailed problem in skin lesion classification. The paper proposes a curriculum learning-based framework for the long-tailed skin lesion classification task. The approach uses self-supervised learning (SSL) to determine the representative features of each class, and samples neighboring points in the feature space as anchor points to concentrate the training. Further, class-wise and instance-wise sampling are considered. The dynamic sampling approach is demonstrated on the task of long-tailed skin lesion classification. I think the approach is interesting, but there are some comments: 1) dynamic curriculum learning for imbalanced classification has been proposed before in [1]. 2) I suggest the work include more analysis of good and bad cases. 3) I suggest a comparison with the state-of-the-art decoupling approach to the long-tailed classification problem. 4) t-SNE can be helpful for understanding the features learned in the SSL procedure.

    [1] Wang, Yiru, Weihao Gan, Jie Yang, Wei Wu, and Junjie Yan. “Dynamic curriculum learning for imbalanced data classification.” In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 5017-5026. 2019

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    1) The method is interesting and new in skin lesion classification. 2) The approach demonstrates good performance on skin lesion classification. 3) More analysis of the method (feature visualization) and the results (good and bad cases) would be helpful.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

This paper proposes a class re-sampling method for learning from long-tailed distributions. It establishes an anchor set based on the balanced representation learned from SSL. This anchor set is key to the method outperforming the others: such a subset helps to fairly evaluate the difficulty/uncertainty of the samples, and thus facilitates learning from imbalanced data. Experiments are conducted on skin lesion classification, comparing with a wide range of long-tailed distribution learning methods.

    The paper is well written, the experiments are easy to follow, and the results are presented clearly and concisely. I recommend accepting this submission. The authors should address the detailed comments from the reviewers in the camera-ready manuscript.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    1




Author Feedback

We would like to thank all the reviewers for their positive comments and constructive suggestions. We have carefully summarized the reviewers' concerns and give detailed responses below:

  1. More details should be added and the effectiveness of each component should be discussed. R: Thanks for this valuable comment. The missing details and more related discussion will be added in the camera-ready paper, e.g., the selection of anchor points.

  2. The selection of the comparison methods. R: Thanks for pointing this out. As we stated in Sec. 3.3, the selected comparison methods can be grouped into 4 categories according to their relation to our proposed method. For the curriculum learning-based methods [1], we selected two representative methods, and ELF achieved state-of-the-art performance. For the two-stage methods [2], classifier re-training is used as a pluggable and effective component for further improvement in many state-of-the-art methods, which have been included in our comparison study, e.g., ELF. To shorten the benchmark, we did not explicitly present these results. In the camera-ready paper, we will add references to these two papers for more discussion, since they are highly related to our work.

[1] Wang, Yiru, Weihao Gan, Jie Yang, Wei Wu, and Junjie Yan. “Dynamic curriculum learning for imbalanced data classification.” In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 5017-5026. 2019 [2] Kang, Bingyi, Saining Xie, Marcus Rohrbach, Zhicheng Yan, Albert Gordo, Jiashi Feng, and Yannis Kalantidis. “Decoupling representation and classifier for long-tailed recognition.” arXiv preprint arXiv:1910.09217 (2019).

  3. Some descriptions are not clear. R: Thanks for this comment. We will carefully revise the unclear descriptions in the manuscript to make it easier to follow, e.g., the use of some symbols and mathematical variables. More analysis of the components used will be added to show how they make our proposed method performant.


