
Authors

Meirui Jiang, Hongzheng Yang, Xiaoxiao Li, Quande Liu, Pheng-Ann Heng, Qi Dou

Abstract

class distributions among unlabeled clients is still unsolved for real-world use. In this paper, we study a practical yet challenging problem of class imbalanced semi-supervised FL (imFed-Semi), which allows all clients to have only unlabeled data while the server just has a small amount of labeled data. This imFed-Semi problem is addressed by a novel dynamic bank learning scheme, which improves client training by exploiting class proportion information. This scheme consists of two parts, i.e., the dynamic bank construction to distill various class proportions for each local client, and the sub-bank classification to impose the local model to learn different class proportions. We evaluate our approach on two public real-world medical datasets, including the intracranial hemorrhage diagnosis with 25,000 CT slices and skin lesion diagnosis with 10,015 dermoscopy images. The effectiveness of our method has been validated with significant performance improvements (7.61% and 4.69%) compared with the second-best on the accuracy, as well as comprehensive analytical studies. Code is available at https://github.com/med-air/imFedSemi.

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16437-8_19

SharedIt: https://rdcu.be/cVRs5

Link to the code repository

https://github.com/med-air/imFedSemi

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper aims to address the challenging problem of class imbalanced semi-supervised FL. It proposes a dynamic bank learning scheme consisting of the dynamic bank construction and the sub-bank classification. Extensive experiments on two benchmark datasets demonstrate the superior performance of the proposed method.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The idea is relatively novel.
    2. The workflow of the proposed method is clear.
    3. This paper is well written.
    4. The results of the proposed method are very good.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Some figures and equations are not well explained, for example:

    1. In Fig. 1, it is not clear what \pi_1, \cdots, \pi_k denote. Additionally, it would be better to briefly explain the workflow in the caption.
    2. In Eq. 2, it would be better to explain the function 1(\cdot).
  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    This paper gives the implementation details. Additionally, the code is provided in the supplemental material and will be released. So I think its reproducibility is very good.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    1. Some figures and equations are not well explained, for example: 1) In Fig. 1, it is not clear what \pi_1, \cdots, \pi_k denote. Additionally, it would be better to briefly explain the workflow in the caption. 2) In Eq. 2, it would be better to explain the function 1(\cdot).
    2. It would be better to briefly analyze, in the experimental section, why the proposed method outperforms the comparative methods.
    3. The paper might miss some related works as follows: [1] Laine, Samuli, and Timo Aila. “Temporal ensembling for semi-supervised learning.” ICLR (2017). [2] Shi, Xiaoshuang, et al. “Graph temporal ensembling based semi-supervised convolutional neural network with noisy labels for histopathology image analysis.” Medical image analysis 60 (2020): 101624.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    1. Well written.
    2. The idea is relatively novel.
    3. Good results.
  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    This work proposes a dynamic bank learning scheme to address the class imbalance in the semi-supervised FL. This scheme consists of two parts at the client side, including the dynamic bank construction to distill various class proportions for each client, and the sub-bank classification to guide the local model to learn different class proportions. On two public datasets, the method achieves remarkable improvements.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. This work aims to address the class imbalance in the semi-supervised FL, which is an interesting task. Moreover, it considers a setting where the server has a small amount of labeled data and all clients have only unlabeled data, which is relatively new in medical imaging.
    2. Significant performance improvements.
    3. The work is well written.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The technical improvement is limited to the local training at the client side.
    2. In the imbalanced case, accuracy is not suitable as a primary analytic metric.
    3. The experimental results in Fig. 2 (d) are somewhat strange.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Hyper-parameters are complete. Authors are going to release the code. The details of thresholds h_m/h_c may be improved.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    1. According to the S_k of the experimental setting, the server side has a small number of class-balanced samples, which is a huge advantage over the other semi-supervised FL algorithms being compared. However, the proposed method only improves the training on the client side, without making full use of the server-side knowledge provided by the setup. Compared to the upper bound of performance, there is still a lot of room for improvement.

    2. The authors should state which dataset the analysis in Section 3.3 is performed on. Furthermore, accuracy is not suitable as a primary analytic metric due to the severe effects of imbalance, as shown by the large difference in numerical values between accuracy and F1.

    3. The experimental results shown in Fig. 2 (d) are not intuitive. (1) When S_k=15, the standard deviation is reduced significantly compared to the adjacent values. What makes the method significantly more robust at this particular S_k? (2) In addition, a larger S_k (i.e., more class-balanced samples on the server side) should not hurt the performance of FL, so why is the performance at S_k=20 apparently lower than at S_k=10 and S_k=15?

    4. It seems the method requires the assumption of IID for data at server and clients. In other words, the method does not consider the data heterogeneity (shift on p(x)) among participants. In practice, due to the small number of samples on the server side, data heterogeneity will inevitably be involved.

    5. Since the Dirichlet distribution is used to construct the imbalanced clients, the authors should state that the work is validated on a simulated imbalanced scenario, to avoid potential ambiguity with “real-world use”.

    6. Does this work use the same threshold (h_m/h_c) for clients with different class distributions? If so, this may not be reasonable.
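
    The metric concern in point 2 can be illustrated with a minimal sketch. The toy labels below are hypothetical and are not taken from the paper; they only show how accuracy can stay high while macro-averaged F1 collapses when a classifier ignores the minority class.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground truth."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (each class counts equally)."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical imbalanced test set: 95 majority samples, 5 minority samples.
y_true = [0] * 95 + [1] * 5
# A degenerate classifier that always predicts the majority class.
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy = {acc:.3f}, macro-F1 = {f1:.3f}")
```

    In this toy case the accuracy is 0.95 although the minority class is never detected, which is why a class-averaged metric is usually the safer headline number under imbalance.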

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The interesting task and setting for medical FL. Significant performance improvements. Improper primary metric for evaluation.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    2

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper aims to achieve effective class-imbalanced semi-supervised federated learning, where the server has labeled data and the clients have only unlabeled data. To this end, this paper proposes a dynamic bank learning method to leverage the class proportion information. The dynamic bank construction distills class proportions for each client, and a sub-bank classification task is used for local training. The proposed method is evaluated on two datasets and shows improved performance over existing works.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The idea of using bank learning as a proxy task on clients is novel and interesting.
    • The method achieves a clear performance improvement over other methods.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The labeled data on the server is a subset of the client-collected data, which can raise privacy concerns.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The description of the experimental setup and the hyperparameters are clear, which is sufficient to reproduce the reported results.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    • When building the memory bank by Eq. 2, the thresholds for the majority and minority classes need to be different. However, the class distribution may be unavailable, and which class is the minority class may also be unknown.
    • When estimating the class prior for each sub-bank, it is assumed that the class distributions of the sub-banks are not exactly the same. However, the sub-banks are obtained by randomly splitting the bank, so they may have the same class distribution.
    • If the data are distributed among clients in an i.i.d. manner, does the proposed method still outperform other methods?
    • The proposed method requires training on the server, but baseline methods such as FedAvg do not. Could you please add this server-side training to FedAvg-FM and compare the results?
    • In Fig. 2 (e), it is interesting to see that the accuracy of the proposed method increases with the number of unlabeled clients. Intuitively, with more clients, each client will have fewer samples, which should degrade the accuracy. Could you please explain this result further?
    • Some related works on self-supervised federated learning [1][2] need to be discussed. [1] Federated Contrastive Learning for Decentralized Unlabeled Medical Images, MICCAI 2021 [2] Federated Contrastive Learning for Volumetric Medical Image Segmentation, MICCAI 2021
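
    The i.i.d. question above can be framed through the Dirichlet-based client construction that Review #2 mentions. The following is a minimal sketch of a generic Dirichlet label-skew partition, not the authors' confirmed procedure: the function names, the toy labels, and the concentration values are all assumptions for illustration. A small concentration parameter gives strongly imbalanced clients, while a large one gives nearly identical class proportions across clients (an approximately i.i.d. split).

```python
import random


def dirichlet_sample(alpha, k, rng):
    """Draw one sample from a symmetric Dirichlet(alpha) over k entries
    by normalizing independent Gamma(alpha, 1) draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def partition_by_dirichlet(labels, num_clients, alpha, seed=0):
    """Split sample indices across clients, class by class, using
    Dirichlet-distributed client proportions (hypothetical helper)."""
    rng = random.Random(seed)
    clients = [[] for _ in range(num_clients)]
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    for idxs in by_class.values():
        rng.shuffle(idxs)
        props = dirichlet_sample(alpha, num_clients, rng)
        start, cum = 0, 0.0
        for c in range(num_clients):
            cum += props[c]
            # The last client takes the remainder so every index is assigned once.
            end = len(idxs) if c == num_clients - 1 else round(cum * len(idxs))
            clients[c].extend(idxs[start:end])
            start = end
    return clients


# Hypothetical toy dataset: 400 samples over 4 balanced classes.
labels = [i % 4 for i in range(400)]
skewed = partition_by_dirichlet(labels, num_clients=5, alpha=0.5)
near_iid = partition_by_dirichlet(labels, num_clients=5, alpha=100.0)

for name, split in (("alpha=0.5", skewed), ("alpha=100", near_iid)):
    counts = [[sum(labels[i] == y for i in c) for y in range(4)] for c in split]
    print(name, counts)
```

    Comparing results across several concentration values, including a large one, would be one way to report how the method behaves as the split moves from imbalanced toward i.i.d.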
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The overall quality of this paper is good. The only concern is that the server needs data for training, which does not follow the standard FL protocol and raises privacy concerns. Besides, in the experiments, the labeled data on the server seem to be a subset of the client-side data, which may cause privacy leakage.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper aims to address the class imbalance in semi-supervised federated learning. The reviewers thought that this is an interesting direction to pursue, and they agreed on the novelty of the proposed method. They also all commented positively on the clarity of the writing. Furthermore, experiments demonstrate significant improvements over other methods. There was some concern about using accuracy as the main metric for an unbalanced dataset; however, since other metrics including F1-score are presented and follow the same trends, this is not seen as a major weakness. Another minor concern was related to privacy, due to the labeled data on the server side being a subset of the client-collected data.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    2




Author Feedback

N/A


