Authors

Nannan Wu, Li Yu, Xin Yang, Kwang-Ting Cheng, Zengqiang Yan

Abstract

Federated learning (FL), training deep models from decentralized data without privacy leakage, has shown great potential in medical image computing recently. However, considering the ubiquitous class imbalance in medical data, FL can exhibit performance degradation, especially for minority classes (e.g. rare diseases). Existing methods towards this problem mainly focus on training a balanced classifier to eliminate class prior bias among classes, but neglect to explore better representation to facilitate classification performance. In this paper, we present a privacy-preserving FL method named FedIIC to combat class imbalance from two perspectives: feature learning and classifier learning. In feature learning, two levels of contrastive learning are designed to extract better class-specific features with imbalanced data in FL. In classifier learning, per-class margins are dynamically set according to real-time difficulty and class priors, which helps the model learn classes equally. Experimental results on publicly-available datasets demonstrate the superior performance of FedIIC in dealing with both real-world and simulated multi-source medical imaging data under class imbalance. Code is available at https://github.com/wnn2000/FedIIC.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43895-0_65

SharedIt: https://rdcu.be/dnwzx

Link to the code repository

https://github.com/wnn2000/FedIIC

Link to the dataset(s)

https://www.fc.up.pt/addi/ph2%20database.html

https://derm.cs.sfu.ca/Welcome.html

https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/DBW86T

https://www.kaggle.com/c/rsna-intracranial-hemorrhage-detection

https://challenge.isic-archive.com/data/#2019

Reviews

Review #3

Please describe the contribution of the paper

The paper aims to address the class-imbalanced federated learning problem (including inter-client and intra-client imbalance). To this end, it employs inter- and intra-client contrastive learning coupled with a logit adjustment scheme that negates the impact of local class imbalance.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The paper combines the well-known ideas of contrastive learning and logit adjustment in the federated learning setting to tackle the class imbalance problem. It applies the model to three real-world medical imaging datasets and achieves better performance than the chosen baseline methods.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
1. The title of the paper suggests ‘Robust Federated Learning for Class-Imbalanced…’. However, the robustness of the proposed method is not validated by varying the class imbalance in different datasets. It is not evident how robust the proposed model is against varied class imbalance situations.
2. The paper lacks technical novelty. Inter-client and intra-client contrastive learning has already been proposed in federated learning (e.g., (a) FedProc: Prototypical Contrastive Federated Learning on Non-IID data, (b) Dual Class-Aware Contrastive Federated Semi-Supervised Learning). Logit adjustment and orthogonalization are also widely used methods.
3. The paper claims that the method alleviates attribute bias and class bias simultaneously. However, there is no evidence as to how attribute bias is being addressed.
4. The paper does not explain the reason behind the choice of values for k1 and k2 in Eqn 7. These need to be validated via ablation analysis.
5. The proposed method has not been compared with state-of-the-art class-imbalanced federated learning papers. Most of the baseline papers included are not targeted towards solving class imbalance in federated learning. Instead, the authors might consider comparing with class-imbalanced federated learning papers such as (a) Addressing Class Imbalance in Federated Learning, (b) Federated learning with class imbalance reduction, (c) Self-Balancing Federated Learning With Global Imbalanced Data in Mobile Systems.
6. The paper does not state the imbalance ratio of any of the tasks/datasets. While Fig 2 provides a qualitative idea, it is hard to determine the exact imbalance ratio from it. Since the work claims to tackle class imbalance, this is important information to include. It will help determine the impact of attribute bias, as the algorithm might perform differently on different datasets with the same class imbalance level.
7. The paper does not report the minority class accuracy (or recall), which is a standard performance metric in most other class-imbalanced learning works. Without the accuracy per class, or at least the minority class accuracy, it is hard to see how much real improvement is achieved in classifying the minority classes, which is the main target of class-imbalanced learning.
8. The Dirichlet coefficient is kept at 1 for all datasets, which is not sufficient for class imbalance problems. It is rather a standard FL setting for mild non-IIDness. The authors might want to consider decreasing alpha (to 0.5, 0.2) and validating the method in higher class imbalance situations. Most existing works (e.g., Optimized Federated Learning on Class-biased Distributed Data Sources, Federated Learning on Non-IID Data Silos: An Experimental Study, etc.) demonstrate model effectiveness under higher non-IID conditions in the federated learning setting.
9. The ablation study lacks an important case with Intra and Inter but no DALA. I believe DALA won’t improve results after the feature extractor has already been calibrated.
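The Dirichlet coefficient and imbalance ratio discussed in points 6 and 8 can be made concrete with a small sketch. It assumes the common label-skew protocol in which, for each class, client shares are drawn from a symmetric Dirichlet(alpha) distribution; whether the paper's split follows exactly this procedure is not stated on this page. The helper names `dirichlet_partition` and `imbalance_ratio` and the toy label list are hypothetical.

```python
import random
from collections import Counter


def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Split sample indices across clients, class by class.

    For each class, client proportions are one Dirichlet(alpha) draw,
    obtained here by normalizing independent Gamma(alpha, 1) samples.
    Smaller alpha gives more skewed (more non-IID) client label sets.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)

    clients = [[] for _ in range(num_clients)]
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)
        gammas = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(gammas)
        start, cum = 0, 0.0
        for k in range(num_clients):
            cum += gammas[k] / total
            # The last client takes the remainder so every index is assigned.
            end = len(idxs) if k == num_clients - 1 else round(cum * len(idxs))
            clients[k].extend(idxs[start:end])
            start = end
    return clients


def imbalance_ratio(labels):
    """Size of the largest class divided by the size of the smallest."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


# Toy data (hypothetical): 90 majority samples, 10 minority samples.
toy_labels = [0] * 90 + [1] * 10
parts = dirichlet_partition(toy_labels, num_clients=4, alpha=1.0)
print([Counter(toy_labels[i] for i in p) for p in parts])
print("global imbalance ratio:", imbalance_ratio(toy_labels))
```

Re-running the split with alpha set to 0.5 or 0.2, as the reviewer suggests, typically concentrates each class on fewer clients, which is one way to probe robustness under stronger heterogeneity.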
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Some of the model hyperparameters and experimental design considerations are missing. Neither code nor pseudocode/algorithm has been provided, which might make it harder to reproduce the work.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

The paper would benefit from robustness tests, i.e., conducting additional experiments by varying the data heterogeneity (via Dirichlet coefficient). See weakness section for more comments.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

3
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

(a) Lack of technical novelty, (b) lack of experiments and results to justify the claims made in title, abstract, introduction, and conclusion, (c) lack of comparison with state-of-the-art class imbalanced federated learning papers, (d) lack of proper ablation study.
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #2

Please describe the contribution of the paper

a) A new perspective is proposed for realistic medical federated learning scenarios with class imbalance in the global training data. b) A new privacy-preserving framework, FedIIC, is proposed for handling the class imbalance problem in federated learning. c) FedIIC is shown to have superior performance in handling class imbalance in real-world and simulated multi-source decentralized settings.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The FedIIC approach proposed in this paper addresses the class imbalance problem from two aspects: calibration of feature learning and calibration of classifier learning. This approach differs from existing methods and is highly innovative. In addition, the paper applies FL to medical image processing using real multi-source dermoscopic datasets, among others, which demonstrates its feasibility in practical medical and clinical applications.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The paper offers little visual presentation of its results, which may make the findings difficult for readers to understand. It is recommended that the authors add appropriate charts or graphs to represent the study results visually, helping readers better understand the data while improving the readability and credibility of the paper.
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

This paper has acceptable reproducibility.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

First, your paper provides a clear and concise summary of the problem you are solving (class imbalance in medical image classification) and the approach you have developed (FedIIC). You do a good job of explaining the motivation behind your work and how it differs from previous federated learning approaches. In addition, your experimental results show that FedIIC outperforms state-of-the-art FL methods in both real-world and simulated medical FL scenarios.

A major strength of your paper is the approach of addressing class imbalance from the perspective of both feature and classifier learning. By using two levels of contrastive learning to extract better class-specific features from imbalanced data in FL, you are able to improve model performance while preserving privacy. In addition, your use of difficulty-aware logit adjustment helps to balance the decision boundaries across classes.

However, I think the paper could also be improved in some ways. For example, while you provide a clear explanation of how FedIIC works, I sometimes found myself struggling to understand the technical details. It might be helpful to provide more specific examples or visual aids to help the reader understand how each component of FedIIC works.

In terms of presentation and organization, I found your paper to be generally well structured and easy to understand. However, there are a few places where I feel certain parts could have been more clearly defined or explained. For example, more background information on federated learning for readers less familiar with the topic might have been helpful.

Finally, in terms of reproducibility, I believe that your paper provides enough detail for other researchers to replicate your experiments. However, it might be helpful to provide more information about the specific datasets and models used in the experiments, as well as any hyperparameters or other settings used.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

6
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
1. This paper addresses an important problem in medical image classification: class imbalance. The authors clearly explain why this problem is particularly challenging in the context of rare diseases, and how their approach helps to solve it.
2. The authors propose FedIIC, a new federated learning method that addresses the class imbalance problem from the perspective of feature and classifier learning. They use two levels of contrastive learning to extract better class-specific features from imbalanced data in FL and use difficulty-aware logit adjustment to balance the decision boundaries of different classes.
3. Experimental results show that FedIIC outperforms state-of-the-art FL methods in both real-world and simulated medical FL scenarios. However, there are also some areas where the paper could be improved. For example, while the authors provide a clear explanation of how FedIIC works, there were times when I struggled to follow the technical details.
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #1

Please describe the contribution of the paper

Federated learning performance can degrade, especially for minority classes. The authors tackle the class imbalance problem in FL using feature learning (with contrastive learning) and classifier learning (by setting per-class margins to help the model learn each class). Class imbalance causes two biases: attribute bias (minority classes have more imbalanced background attributes) and class bias (differences in prior probabilities, biased towards majority classes). The paper presents a new class-balancing FL method, called FedIIC, to tackle those biases.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Novel approach: Instead of only balancing the classes from the classifier perspective, this paper explores better representations with class-imbalanced data to improve performance.
- Extendibility: The authors propose intra- and inter-client contrastive learning, which is shown to improve on the FedAvg baseline.
- Strong evaluation: Extensive comparison with other methods on 3 datasets.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- Limited novelty: although the overall method brings novelty to federated learning, the components are not that new (contrastive learning) and are tweaks of losses (DALA, Intra and Inter loss).
- Limited explanation: please explain more clearly how the per-class margins act on the logits.
- Although FedIIC achieves the best overall performance (outperforming FedAvg by large margins), Table 1 shows that the improvement is not significant, especially on the ICH dataset.
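The question of how per-class margins act on logits can be illustrated with a minimal sketch. It shows plain prior-based logit adjustment only, assuming each margin is `tau * log(class prior)` added to the logit before the softmax; it is not FedIIC's difficulty-aware variant (DALA), whose exact formula does not appear on this page. The function names, `tau`, and the toy class counts are hypothetical.

```python
import math


def log_priors(class_counts):
    """Log of the empirical class prior for each class."""
    total = sum(class_counts)
    return [math.log(c / total) for c in class_counts]


def adjusted_cross_entropy(logits, label, class_counts, tau=1.0):
    """Cross-entropy computed on logits shifted by tau * log(prior).

    Rare classes receive a large negative shift during training, so the
    model must output a larger raw logit for them to reach the same loss.
    """
    shifted = [z + tau * lp for z, lp in zip(logits, log_priors(class_counts))]
    m = max(shifted)  # subtract the max for a numerically stable log-sum-exp
    log_norm = m + math.log(sum(math.exp(s - m) for s in shifted))
    return log_norm - shifted[label]


# Toy case (hypothetical): identical raw logits, 90/10 class counts.
counts = [90, 10]
print(adjusted_cross_entropy([0.0, 0.0], 0, counts))  # majority-class sample
print(adjusted_cross_entropy([0.0, 0.0], 1, counts))  # minority-class sample
```

With identical raw logits, the minority-class sample incurs the higher loss, so training pushes its raw logit up until a margin over the majority class is established. With balanced counts the shift is the same for every class and the loss reduces to ordinary cross-entropy.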
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

no code given
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

Well done for presenting an interesting paper on federated learning with a new approach to calibrating the feature extractor and the classifier. It is important to reduce biases under class imbalance, especially for minority classes. The authors have also carried out an extensive evaluation, comparing against various methods on several datasets. However, the component-wise and parameter-wise studies could be explained in a simpler and more detailed way.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

7
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Federated learning is applicable in real clinical settings. Also, this paper addresses a real problem: class imbalance across the various sites. Although the improvement is not that large, we see that reducing biases in FL could help to improve model performance.
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

The paper proposes a solution to one of the major problems in federated learning in the medical domain, namely class imbalance. The components of the method are not novel, but the combination of various ideas to solve the problem is unique. It would be great if the authors could address the comments from all the reviewers. Experiments are done on three datasets.

Author Feedback

We sincerely thank all reviewers for the high-quality reviews and constructive feedback. We have carefully studied the review comments and will revise the manuscript as suggested. Due to the page limitation, some comments will be taken into serious consideration and addressed in our future work. The major concerns are addressed as follows.

[R3: Lack of technical novelty.] Compared to previous FL studies [1, 2] using contrastive learning (CL), CL in this paper was designed in a different way and with a different motivation. Specifically, we present intra- and inter-client CL to obtain a better feature extractor in the context of class imbalance, instead of addressing non-IID data [1] or semi-supervision [2]. In terms of design details, components such as dynamic temperature and orthogonalization, which are not considered in [1, 2], are combined with CL to adapt it to class imbalance. As for logit adjustment, difficulty is introduced to compute per-class margins for better adaptation to medical images with large intra-class variations, which differs from many studies on imbalanced natural images [3].

[R3: there is no evidence as to how attribute bias is being addressed.] A t-SNE visualization of feature distributions is given in the supplementary materials. Compared to CReFF, the feature distributions of FedIIC have better intra-class compactness, indicating better recognition of the class-specific features of each class. This phenomenon validates, to some degree, our claim of alleviating attribute bias.

[R3: Explain the reason behind the choice of value for k1 and k2.] In the component-wise ablation study, we remove the two CL components by separately setting k1=0 and k2=0 to validate the effectiveness of each component. Due to the page limitation, we did not conduct ablation studies on the choice of k1 and k2, which will be added in our future work.

[R3: Comparison with the SOTA class imbalanced federated learning papers [4, 5, 6].] Some methods were selected as baselines due to similar research content (e.g. contrastive learning), though they were not designed for class-imbalanced FL. For class-imbalanced FL, we have chosen several of the most recent and related studies (e.g. CLIMB and CReFF) for comparison. As for the three papers suggested, [4, 5] are not suitable as baselines because they depend on an auxiliary balanced dataset, which is infeasible under the setting of this paper. We will add the comparison against [6] in our future work.

[R3: State the imbalance ratio of any of the datasets.] The imbalance ratio will be reported in the camera-ready version.

[R3: Report the minority class accuracy (or recall).] We believe the most significant target of class-imbalanced learning is to obtain a balanced classifier, hence metrics averaged over all classes (i.e. BACC and F1) are reported. We will report per-class accuracy in our future work as you suggested.

[R3: Dirichlet coefficient is kept as 1 for all datasets, which is not sufficient for class imbalance problems.] The primary focus of this paper is on class imbalance rather than the non-IID problem, and hence we simply set the Dirichlet coefficient to 1 instead of discussing more complicated cases. In fact, this setting is complicated enough in medical scenarios compared with existing studies like [7], where the Dirichlet coefficient was set to 1.5.

[R3: Missing ablation studies without DALA.] More comprehensive ablation studies will be conducted in our future work.
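The difference between class-averaged metrics such as BACC and the per-class recall requested by R3 can be shown with a short sketch. It assumes BACC means the unweighted mean of per-class recalls, which is the usual definition; the function names and the toy predictions are hypothetical and are not results from the paper.

```python
def per_class_recall(y_true, y_pred, num_classes):
    """Recall of each class: correct predictions / samples of that class."""
    hits = [0] * num_classes
    totals = [0] * num_classes
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    # Classes with no samples have undefined recall, marked as NaN.
    return [h / n if n else float("nan") for h, n in zip(hits, totals)]


def balanced_accuracy(y_true, y_pred, num_classes):
    """Unweighted mean of per-class recalls, skipping undefined ones."""
    recalls = [r for r in per_class_recall(y_true, y_pred, num_classes) if r == r]
    return sum(recalls) / len(recalls)


# Toy predictions (hypothetical): 8 majority samples, 2 minority samples.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [0, 1]  # one of the two minority samples is missed

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print("accuracy:", accuracy)                                  # 0.9
print("per-class recall:", per_class_recall(y_true, y_pred, 2))  # [1.0, 0.5]
print("BACC:", balanced_accuracy(y_true, y_pred, 2))          # 0.75
```

In this toy case plain accuracy looks high while BACC already reflects the missed minority sample, but only the per-class recall shows which class is responsible.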

References:
[1] FedProc: Prototypical Contrastive Federated Learning on Non-IID data
[2] Dual Class-Aware Contrastive Federated Semi-Supervised Learning
[3] Balanced Contrastive Learning for Long-Tailed Visual Recognition
[4] Addressing Class Imbalance in Federated Learning
[5] Federated learning with class imbalance reduction
[6] Self-Balancing Federated Learning with Global Imbalanced Data in Mobile Systems
[7] Dynamic Bank Learning for Semi-supervised Federated Image Diagnosis with Class Imbalance


FedIIC: Towards Robust Federated Learning for Class-Imbalanced Medical Image Classification