Authors
Yilan Zhang, Jianqi Chen, Ke Wang, Fengying Xie
Abstract
Skin image datasets often suffer from imbalanced data distribution, exacerbating the difficulty of computer-aided skin disease diagnosis. Some recent works exploit supervised contrastive learning (SCL) for this long-tailed challenge. Despite achieving significant performance, these SCL-based methods focus more on head classes while ignoring the information in tail classes. In this paper, we propose class-Enhancement Contrastive Learning (ECL), which enriches the information of minority classes and treats different classes equally. For information enhancement, we design a hybrid-proxy model to generate class-dependent proxies and propose a cycle update strategy for parameter optimization. A balanced-hybrid-proxy loss is designed to exploit relations between samples and proxies with different classes treated equally. Taking both “imbalanced data” and “imbalanced diagnosis difficulty” into account, we further present a balanced-weighted cross-entropy loss following a curriculum learning schedule. Experimental results on the classification of imbalanced skin lesion data demonstrate the superiority and effectiveness of our method. The code is publicly available at https://github.com/zylbuaa/ECL.git.
Link to paper
DOI: https://doi.org/10.1007/978-3-031-43895-0_23
SharedIt: https://rdcu.be/dnwx5
Link to the code repository
https://github.com/zylbuaa/ECL.git
Link to the dataset(s)
N/A
Reviews
Review #2
- Please describe the contribution of the paper
This paper proposes class-Enhancement Contrastive Learning (ECL) for long-tailed skin lesion classification. The proposed method consists of several innovative components, including the hybrid-proxy model, balanced-hybrid-proxy (BHP) loss, and balanced-weighted cross-entropy loss. Experimental results demonstrate that the proposed method outperforms several state-of-the-art methods on two imbalanced dermoscopic image datasets.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The paper is well written and easy to follow.
- The proposed method is novel and sound. It consists of several innovative components, including the hybrid-proxy model, balanced-hybrid-proxy (BHP) loss, and balanced-weighted cross-entropy loss specifically designed for the long-tailed classification problem. The effectiveness of each component is validated through ablation studies.
- Extensive experiments on two datasets demonstrate the effectiveness of the proposed method for long-tailed skin lesion classification. ECL also outperforms several state-of-the-art methods, especially on the ISIC2019 dataset.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- The comparison with TSC [13] is missing. TSC is a targeted supervised contrastive learning method for long-tailed classification, which has a similar concept to the proposed method.
- It is unclear how hyper-parameters are selected such as E1, E2, τ, λ, and µ.
- In the cycle update strategy, the proxy vectors are only updated once per epoch, which might be suboptimal. Have the authors tried other strategies, such as using a weighted moving average to update the proxy vectors more frequently?
- Please rate the clarity and organization of this paper
Very Good
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
The paper uses public datasets and the authors agree to make the codes public, which will make this work easy to reproduce.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
The authors are encouraged to address the comments in the weakness section.
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
6
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The proposed method is novel and the paper is well written with extensive experiments. I tend to accept this paper but will encourage the authors to address the comments in the weakness section.
- Reviewer confidence
Confident but not absolutely certain
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
6
- [Post rebuttal] Please justify your decision
The rebuttal addresses most of my concerns. However, it seems the improvement compared to TSC is not significant. The authors are encouraged to discuss this in the final version.
Review #3
- Please describe the contribution of the paper
The manuscript addresses the important problem of simultaneously improving performance on i) rare (“tail”) disease classes and ii) particularly difficult-to-diagnose disease classes. These correspond to the two problems of class imbalance and diagnosis difficulty imbalance.
The authors propose a combined method that employs, firstly, “class-enhancement contrastive learning”, a supervised contrastive learning approach that addresses the class imbalance issue, and, secondly, a curriculum learning strategy that increases the loss weight on samples from under-represented and under-performing disease classes.
The approach is validated on the ISIC2018 and ISIC2019 datasets, with the proposed method (using a ResNet50 backbone) outperforming a range of alternative approaches on metrics including accuracy, F1 score, and AUROC. An ablation study investigates the importance of different aspects of the proposed method.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The task of simultaneously addressing class representation imbalance and imbalance in task difficulty is highly important, and this is the first work that I see attempting to address this challenge. The developed methods are - to my knowledge - novel and interesting, and the results of the performance evaluation are promising.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
My main criticism concerns the clarity of exposition - in many places, I found it hard to follow the main text. It may be that this was partially because I am not very familiar with some of the relevant background literature, but I believe that the clarity can be improved significantly. Some specific instances:
- I believe the notion of (especially) “head” and “tail” classes will not be familiar to many readers.
- The notion of a “proxy” is referred to frequently throughout the abstract, introduction, and the rest of the paper, but it is never really explained what a proxy is and what it is used for. The word can refer to many different concepts in different areas of machine learning.
- In Fig. 1, what do “over-treatment” and “equal treatment” mean?
- In the introduction, the authors write about designing a loss for “an unbiased classifier”. Unbiased in which way? Again, the word “bias” can refer to a very large number of different concepts in machine learning.
- In Fig. 2, what is “a reverse imbalanced way [to generate proxies]”? This phrasing never occurs throughout the main text.
- The first paragraph of the Methods section was near-unreadable for me. This should be the paragraph making the reader familiar with the whole concept, but it starts with very low-level details and complex terminology. How does the classifier learning branch “provide abundant data representations for CL”? The “class-dependent proxies generated by HPM” are referred to before it is explained what they are.
- In some places, it was unclear to me whether a concept is newly introduced by the authors, or whether it has previously been described. For instance, is the notion of a “Hybrid-Proxy Model” new? (What makes it “hybrid”, by the way?) How do the concepts proposed by the authors relate to Balanced Contrastive Learning (BCL)? In the introduction, the authors write that “we propose a balanced-hybrid-proxy loss (BHP), besides introducing balanced contrastive learning (BCL) [23]”, which leaves me confused as to the relation of the present work to BCL.
- The large number of abbreviations (SCL, ECL, BCL, HPM, BHP, BWCE), many of which are non-standard, makes the manuscript harder to read than necessary.
- In section 2.1, the authors write about “categories” in some places. What are these supposed to be? Is it the same as “class”?
- Can the authors provide a textual description of the meaning of Eqs. (2) and (3), including the intuition behind them? What is the motivation for this particular loss function?
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
The datasets used in the study are publicly available, and the authors have indicated that they will make all necessary code available after publication.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
Besides my clarity-related concern above, I only have a few minor remarks.
- Could the authors elaborate on their rationale for the specific formula in Eq. (1)? It seems like this is just one of many possible ways to choose the number of proxies?
- In Eq. (5), should there be parentheses around the fractions? I assume the exponent relates to the whole fraction?
- Would it be possible to add uncertainty metrics to the evaluation tables, e.g., standard deviation? If re-running all experiments multiple times is too costly, some UQ could already be achieved by simply resampling (bootstrapping) the test set?
- Especially given the focus on rare and difficult diseases, it might be very interesting to evaluate the proposed method on a more diverse dataset, such as the Diverse Dermatology Images (DDI) dataset (see Daneshjou et al., Disparities in dermatology AI performance on a diverse, curated clinical image set).
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
5
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
While I find the presentation of the paper lacking in clarity, it is still reasonably well-written, the evaluation is reasonable, and - in particular - I consider the proposed method an interesting, novel, and practical approach to an important open problem. If the authors can alleviate some of my concerns and improve the clarity of the presentation, I believe this will be a great contribution to the conference.
- Reviewer confidence
Confident but not absolutely certain
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
6
- [Post rebuttal] Please justify your decision
The proposed method represents an interesting, novel, and practical approach to an important open problem. The rebuttal convinced me that the authors will address most of my (largely exposition-related) concerns in the final, camera-ready version. I believe this will be a good contribution to the conference.
Review #4
- Please describe the contribution of the paper
The authors propose class-Enhancement Contrastive Learning (ECL) for long-tailed data classification.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The hybrid-proxy model and balanced-hybrid-proxy loss are novel formulations.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
Sounds good
- Please rate the clarity and organization of this paper
Very Good
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
Sounds good
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
What is the computational complexity of ECL? Does it have any advantage over other CL methods?
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
7
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The hybrid-proxy model and balanced-hybrid-proxy loss are novel formulations.
- Reviewer confidence
Confident but not absolutely certain
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
N/A
- [Post rebuttal] Please justify your decision
N/A
Primary Meta-Review
- Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.
The proposed method includes several innovative components, such as the hybrid-proxy model, balanced-hybrid-proxy (BHP) loss, and balanced-weighted cross-entropy loss. The experimental results demonstrate that the proposed method outperforms several state-of-the-art methods on two imbalanced dermoscopic image datasets. However, the reviewer notes that the clarity of exposition could be improved significantly. There are several instances where it was hard to follow the main text, and many concepts are not adequately explained. The reviewer suggests that the authors provide a clear definition of the head and tail classes, explain what a proxy is and what it is used for, and provide a textual description of the meaning of equations (2) and (3), including the intuition behind them. Additionally, the reviewer notes that the large number of abbreviations used in the paper makes it harder to read than necessary. The authors should provide a textual description of the meaning of these abbreviations. Overall, the paper proposes an innovative method that outperforms several state-of-the-art methods in long-tailed skin lesion classification. However, the clarity of exposition should be improved to make the paper more accessible to readers.
Author Feedback
We would like to thank all the reviewers for their favorable comments and constructive suggestions. We give detailed responses as follows:
#R2
Q1: Comparison with TSC. Following your suggestion, we have evaluated this method, with the following results: 74.94% F1 and 85.94% Acc on ISIC2018; 75.13% F1 and 84.75% Acc on ISIC2019. We will add them in the final version.
Q2: Hyper-parameter selection. We use grid search. The parameters are divided into three groups: temperature τ; weights λ and μ; training stages E1 and E2. When optimizing each group, we fix the others and choose the values that give the best results on the validation set. The recommended parameter ranges will be released publicly with the code.
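For concreteness, a minimal sketch of the grouped grid search described in the response above; the candidate values, the initial settings, and the train_and_validate() stub are illustrative assumptions, not taken from the paper.

```python
# Illustrative coordinate-style grid search over grouped hyperparameters
# (temperature tau; loss weights lambda/mu; curriculum stages E1/E2).
# All candidate values and the train_and_validate() stub are hypothetical.
from itertools import product

groups = [
    [{"tau": t} for t in (0.05, 0.07, 0.1)],                               # temperature
    [{"lam": l, "mu": m} for l, m in product((0.5, 1.0, 2.0), repeat=2)],  # loss weights
    [{"E1": e1, "E2": e2} for e1, e2 in ((20, 60), (30, 80), (40, 100))],  # stages
]

params = {"tau": 0.07, "lam": 1.0, "mu": 1.0, "E1": 30, "E2": 80}  # starting point

def train_and_validate(p):
    """Stub: train with hyperparameters p and return validation F1."""
    return 0.0  # replace with a real (short) training run + validation score

for candidates in groups:
    # Optimize one group at a time while keeping the other groups fixed.
    best = max(candidates, key=lambda c: train_and_validate({**params, **c}))
    params.update(best)

print(params)
```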
Q3: Cycle update strategy. With the motivation of updating the proxies in view of the whole data distribution, we tried strategies such as the exponential moving average (EMA) and simple moving average (SMA), but performance did not improve significantly (e.g., EMA achieved 78.05% F1 and 85.74% Acc on ISIC2019, lower than ours). We therefore chose the cycle update as a hyperparameter-friendly and effective strategy.
#R3
Q1: Clarity of exposition. Regarding the concepts, abbreviations, and expositions you highlighted, e.g., “head”, “proxy”, “unbiased classifier”, “ECL”, etc., we will revise carefully and ensure they are clear in the updated paper, which also addresses the following concerns. First, in Fig. 1 “over-treatment” means that the supervised contrastive learning loss focuses more on majority classes (a theoretical proof can be found in reference [23]); our goal is to learn a well-structured representation space by treating all classes equally, hence “equal-treatment”. Regarding the complex terminology in Fig. 2, we will simplify the figure to help readers better understand it. The “reverse imbalanced way” in the caption is introduced in Section 2.1 and is the method used to calculate the number of proxies per class. Also in Fig. 2, the feature embeddings z^{1} are extracted by the classifier branch and reused for contrastive learning, which is why this branch can “provide abundant data representations for CL”. Regarding whether concepts are new or existing, existing concepts are labeled with references; the “Hybrid-Proxy Model” is a new notion that combines ideas from proxy-based learning, CL, and optimization so that the proxies better approximate the data space, which gives it its name. Finally, concerning the relation of our work to BCL, we use the class-averaging idea of BCL so that each class makes an approximately equal contribution (Eq. 3), but our method differs by adding proxy-to-sample and proxy-to-proxy relations to CL (Eq. 2). The intuition behind Eqs. 2 and 3 is to pull points of the same class together (the fraction in Eq. 2) while pushing samples of different classes apart in the embedding space (Eq. 3), using the dot product as the similarity measure.
Q2: Rationale for Eq. 1. Yes, it is one possible way, but we believe it is an appropriate one. We choose the proxy numbers by computing the imbalance factors N_{max}/N_{c} of each class and reducing the factors tenfold, since the original factors are too high. We set 1 proxy for the largest class and 2 or more proxies for the other classes, which alleviates imbalance issues within a batch. In the ablation study, this scheme gave better results than assigning an equal number of proxies to every class (see the sketch after this rebuttal).
Q3: Eq. 5. Yes, we will add parentheses.
Q4: Uncertainty metrics. Standard deviations are reported in the supplementary material.
Q5: Rare and difficult diseases. We look forward to further validating our method on DDI and other datasets in future studies!
#R4
Q1: Computational complexity of ECL. We do not introduce any additional computational complexity at inference, and our method performs better than other CL methods.
#Meta-Reviewer: The clarity issues will be revised in the final version (see #R3 for details) to ensure reader accessibility.
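As a rough illustration of the “reverse imbalanced” proxy-count rule described in the response to R3 Q2 above, the sketch below reproduces only the verbal description from the rebuttal (one proxy for the largest class, at least two for the others, more for rarer classes). The exact rule is Eq. (1) in the paper; the rounding choice and the example class sizes here are assumptions.

```python
import math

def proxy_counts(class_sizes):
    """Sketch of the proxy-count rule described in the rebuttal: compute the
    imbalance factor N_max / N_c per class, reduce it tenfold, give the largest
    class a single proxy and every other class at least two. The exact rule is
    Eq. (1) in the paper; this only mirrors the verbal description."""
    n_max = max(class_sizes)
    counts = []
    for n_c in class_sizes:
        if n_c == n_max:
            counts.append(1)                               # largest class: one proxy
        else:
            factor = n_max / n_c                           # imbalance factor N_max / N_c
            counts.append(max(2, math.ceil(factor / 10)))  # reduced tenfold, at least 2
    return counts

# Illustrative long-tailed class sizes (not official dataset statistics):
print(proxy_counts([12875, 4522, 3323, 2624, 867, 628, 253, 239]))
# -> [1, 2, 2, 2, 2, 3, 6, 6]: rarer classes receive more proxies per batch
```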
Post-rebuttal Meta-Reviews
Meta-review # 1 (Primary)
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
The paper under consideration addresses a crucial challenge in the field of medical diagnostics: improving performance in diagnosing both rare (“tail”) disease classes and particularly difficult-to-diagnose disease classes. These issues pertain to the broader problem of class imbalance and diagnosis difficulty imbalance. The authors have proposed a novel method combining “class-enhancement contrastive learning,” a supervised contrastive learning approach addressing the class imbalance issue, with a curriculum learning strategy that heightens the loss weight on samples from under-represented and hard-to-diagnose disease classes. The effectiveness of this method is demonstrated through validation on the ISIC2018 and ISIC2019 datasets, where the proposed method outperforms several alternative approaches on various performance metrics. The authors further investigate the importance of different components of the proposed method in an ablation study.
The paper’s strength lies in its innovative approach to tackling class representation imbalance and task difficulty imbalance concurrently. The methods presented are unique and promising, and the results of the performance evaluation are impressive. However, the paper does have some weaknesses, primarily concerning the clarity of its exposition. Several parts of the paper could be difficult to follow for readers. Some concepts used, such as “head” and “tail” classes, might not be well-known. Additionally, many terms, including “proxy”, “over-treatment”, “equal treatment”, and “unbiased classifier”, lack clear explanations. The paper is also laden with complex terminology and an excess of non-standard abbreviations. It is occasionally unclear whether a concept is newly introduced or previously described.
In response to the reviewers’ feedback, the authors have acknowledged the need for revisions and clarified the terms, concepts, and abbreviations in the rebuttal. They provide explanations for terms like “over-treatment”, “equal-treatment”, and “Hybrid-Proxy Model”. They clarify their work’s relationship to Balanced Contrastive Learning (BCL) and the intuition behind Eqs. (2) and (3). They also defend their selection of proxy numbers and assert that their method outperforms setting proxies with equal numbers. The authors also address various technical questions and suggestions put forward by the reviewers and promise to make necessary adjustments and additions in the final version of the paper.
Given the thoughtful responses from the authors and the significant contributions of the paper, I recommend the acceptance of the paper. However, it is essential that the authors fulfill their promise to revise and clarify the paper’s terminology and concepts, making it more reader-friendly.
Meta-review #2
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
Pros:
- Topic: The topic on long-tail disease classification is significant.
- Novelty: The proposed method is novel and sound. Several modifications are specifically designed for the long-tailed classification problem.
- Experiments: Extensive experiments on two datasets demonstrate the effectiveness of the proposed method for long-tailed skin lesion classification compared with several SOTA methods.
- Style: The paper is well written and easy to follow.
Cons:
- Experiments: A similar method should be compared.
- Clarity: The clarity of exposition could be improved significantly.
After Rebuttal:
- Reviews are more consistent and one reviewer changed to a higher score.
- Major issues are well explained and acknowledged.
Meta-review #3
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
I agree with Reviewer 2 that many important concepts in the paper are poorly explained. Some basic terms such as proxy and bias are ambiguous, and explanations of Equations (1)-(3) are missing. As a result, I cannot verify the technical correctness of the paper.
I think the review provided by Reviewer 3 is not informative and his/her score should be disregarded.
Figure 3 is practically useless because it cannot be read.
Overall, this is a paper that attacks an important problem, but the presentation is so convoluted that it will have little use for the MICCAI audience. Hence, I recommend rejection of this paper.