
Authors

Qin Zhou, Guoyan Zheng

Abstract

Federated learning is a promising strategy for performing privacy-preserving, distributed learning for medical image segmentation. However, data-level as well as system-level heterogeneity makes it challenging to optimize. In this paper, we propose to improve federated optimization via local Contrastive learning and Global Process-aware Aggregation (referred to as FedContrast-GPA), aiming to jointly address both data-level and system-level heterogeneity. Specifically, to address data-level heterogeneity, we propose to learn a unified latent feature space via an intra-client and inter-client local-prototype-based contrastive learning scheme, in which intra-client contrastive learning improves the discriminative ability of the feature embedding learned at each client, while inter-client contrastive learning achieves cross-client distribution perception and alignment in a privacy-preserving manner. To address system-level heterogeneity, we further propose a simple yet effective process-aware aggregation scheme for straggler mitigation. Experimental results on six prostate segmentation datasets demonstrate a large performance boost over existing state-of-the-art methods.
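For readers of this review record, a minimal sketch may help make the abstract concrete. The following is a generic prototype-based contrastive (InfoNCE-style) loss of the kind the abstract describes; the function name, temperature value, and tensor shapes are illustrative assumptions, not the paper's exact formulation.

import torch
import torch.nn.functional as F

def prototype_contrastive_loss(features, labels, prototypes, temperature=0.1):
    # features:   (N, D) embeddings sampled from the segmentation feature map
    # labels:     (N,)   class index of each embedding
    # prototypes: (C, D) one prototype vector per class
    features = F.normalize(features, dim=1)
    prototypes = F.normalize(prototypes, dim=1)
    # Pull each embedding toward its own class prototype and push it away
    # from the prototypes of the other classes.
    logits = features @ prototypes.t() / temperature  # (N, C) similarities
    return F.cross_entropy(logits, labels)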

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43895-0_62

SharedIt: https://rdcu.be/dnwzu

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper proposes a prototype-based federated learning method.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Well organized.
    2. Good experimental results.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The details of the experiments are not clear.
    2. Lack of related work on prototype-based federated learning methods.
    3. The authors seem to include the validation set in training via Eq. 11. Please clarify this.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Code not available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. Prototype aggregation for FL is not a new idea. Please properly discuss this work in relation to “FedProto: Federated Prototype Learning across Heterogeneous Clients”, “Deep Federated Anomaly Detection for Multivariate Time Series Data”, etc.

    2. How does the performance compare for centralized training and separate per-client training?

    3. The hyperparameters are crucial for the final performance. The authors are encouraged to provide detailed settings for their method and the compared methods. Also, the number of communication rounds is important for federated learning, and the authors should run experiments to show its influence.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The method is not novel, and related work is not properly referenced. The experimental settings are not clear.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    I thank the authors for the response. I agree to raise my recommendation to weak accept, although the paper still lacks sufficient experiments with respect to federated settings and discussion of related work.



Review #3

  • Please describe the contribution of the paper

    The article discusses the challenges of optimizing federated learning for privacy-preserving, distributed learning for medical image segmentation due to data-level heterogeneity and system-level heterogeneity. The article proposes a new approach called FedContrast-GPA, which combines local contrastive learning and global process-aware aggregation to address both issues. The approach uses intra-client and inter-client contrastive learning to learn a unified latent feature space and achieve cross-client distribution perception and alignment while preserving privacy. Additionally, a process-aware aggregation scheme is used to mitigate stragglers. Experimental results on six prostate segmentation datasets show that FedContrast-GPA outperforms existing state-of-the-art methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • A federated optimization algorithm combining local contrastive learning and global process-aware aggregation to address data-level and system-level heterogeneity.
    • Reasonable motivation statement.
    • Extensive case studies.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The experimental results need to be more convincing.
    • The paper requires another round of polishing for grammatical errors and typos.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors did not provide code or data for reproduction, but the method can be reproduced from the algorithm described in the manuscript.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    This paper presents a well-written and comprehensive evaluation of the proposed method. However, the following suggestions may enhance the quality of the paper and better serve the community of researchers:

    Firstly, to facilitate the reproducibility of the research results, the authors should provide complete code and data, along with detailed instructions for reproduction. This will enable researchers to verify the findings and further explore the proposed method.

    Secondly, the paper’s client-scale evaluation is limited to only six clients, which may not reflect the algorithm’s performance at larger client scales. Therefore, it is recommended that the authors expand the client scale, for example by evaluating the algorithm’s effectiveness with 20 clients. Additionally, the authors should consider using more datasets for evaluation to ensure the robustness and generalizability of the proposed method.

    In summary, addressing these concerns will help enhance the paper’s accessibility, reproducibility, and generalizability, and ultimately, contribute further to the advancement of the field.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper presents a well-written and comprehensive evaluation of the proposed method. However, the following suggestions may enhance the quality of the paper and better serve the community of researchers:

    Firstly, to facilitate the reproducibility of the research results, the authors should provide complete code and data, along with detailed instructions for reproduction. This will enable researchers to verify the findings and further explore the proposed method.

    Secondly, the paper’s client-scale evaluation is limited to only six clients, which may not reflect the algorithm’s performance at larger client scales. Therefore, it is recommended that the authors expand the client scale, for example by evaluating the algorithm’s effectiveness with 20 clients. Additionally, the authors should consider using more datasets for evaluation to ensure the robustness and generalizability of the proposed method.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #4

  • Please describe the contribution of the paper

    The paper proposes federated optimization via local contrastive learning and global process-aware aggregation (referred to as FedContrast-GPA) to address the straggler and client drift issues of federated learning methods such as FedAvg. The approach is tested on prostate segmentation in MR images.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Well written
    • Well motivated
    • Sound formalism
    • Comparative experiments
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Most recent works in FL have not been discussed. For example, what is the relationship between the straggler issue and fairness? See Hosseini et al., “Proportionally Fair Hospital Collaborations in Federated Learning of Histopathology Images,” IEEE Trans Med Imaging, 2023.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    FL results are generally difficult to reproduce, but it would help if the authors shared their code.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Please clarify the relationship between the straggler issue and fairness.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The idea is new, though not fully separated from fairness in FL. The formulations are sound and concise, and the results are sufficient.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The reviewers concur that the paper is eloquently written and commend the robust experimental outcomes. However, they express concerns regarding insufficient experimental detail, clarity of the formulas, and the discussion of related work. Specifically, the authors should consider including a discussion and comparison with prototype-based or global adjustment-based Federated Learning (FL) methods. This would offer a more complete and fair evaluation of the effectiveness of FedContrast-GPA.

    I would also like to highlight my own concerns regarding the comparison with baseline methods. The authors need to explicitly clarify that the methods they are comparing with, such as FedAvg (2016), FedLG (2017), FedProx (2018), and the more recent FedDG and MOON (2021), represent state-of-the-art techniques. Given the rapid proliferation of FL methods, the authors should contemplate including more recent non-iid FL baselines in their comparison. For instance, VHL (2022) is a recently introduced anchor-based FL method.

    In addition, the sharing of local prototypes across clients introduces potential privacy risks and communication costs, which should be addressed in the paper.

    The authors are encouraged to respond to both the review comments and this meta-review during the rebuttal phase.

    Reference: VHL (2022): Defending Against Data Heterogeneity in Federated Learning.




Author Feedback

We thank meta-reviewer (MR) and all reviewers for their comments.

MR: clarifying SOTA methods. FedAvg, FedProx and MOON are mentioned in the introduction. An introduction to FedDG and FedAvg-LG will be added.

MR, R1: discussion of prototype-based methods and VHL. We differ from FedProto and “Deep Federated Anomaly Detection (DFAD)” in two aspects: 1) the global aggregation of prototypes in FedProto and DFAD is sensitive to learning failures (e.g., caused by noisy labels) at a local client, whereas ours allows cross-client matching via Inter-LPCL, which is more robust to failure at a client; 2) neither FedProto nor DFAD considers intra-class variations, which are critical in medical image segmentation. Our local prototype design properly handles intra-class variations.

As for VHL, cross-client features are aligned by matching the virtual and natural distributions conditioned on labels. However, since the segmentation labels of 3D medical images have similar but varying shapes, structures and sizes, it is not easy to craft virtual labels with the same distribution, and collecting and annotating an additional dataset is laborious. Besides, VHL is orthogonal to ours and can be integrated into our method to improve performance.

MR: privacy risks and communication costs of sharing prototypes. As pointed out in FedProto, “prototypes naturally protect data privacy, as they are 1D-vectors generated by averaging the low-dimension representations of samples from the same class, which is an irreversible process”. In terms of communication costs (CC), the size of the prototypes in our method is 8.0 KB, while the model parameter size is 14 MB (~1800 times larger than the prototypes). Thus we can achieve global distribution perception with a small add-on CC.
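To make the two claims in this paragraph concrete, here is a hedged sketch: a class prototype formed as the mean of same-class embeddings (the standard recipe, which may differ from the paper's exact pooling), plus the communication-cost arithmetic stated above.

import torch

def class_prototype(features, labels, cls):
    # features: (N, D) low-dimensional embeddings; labels: (N,) class indices.
    # Averaging collapses many samples into a single 1-D vector, which cannot
    # be inverted to recover individual samples (the privacy argument above).
    return features[labels == cls].mean(dim=0)

# Communication-cost arithmetic from the rebuttal: 8.0 KB of prototypes
# versus 14 MB of model parameters per round.
proto_kb, model_mb = 8.0, 14.0
print(model_mb * 1024 / proto_kb)  # 1792.0, i.e. the "~1800 times" claim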

MR, R1: hyper-parameter settings. The hyper-parameters (HP) of our method are presented in the implementation details. For the compared methods, the base settings are kept the same as ours, and the remaining HPs are chosen by grid search (in FedAvg-LG, the number of layers for global aggregation is set to 13; the hyper-parameters of FedDG are the same as in the original paper; the weight of the proximal term in FedProx is 2.5e-4; and the model-contrastive coefficient in MOON is 0.01).

R1: performance of centralized and separate training. The per-client Dice (%) in centralized training (CL) is [73, 77, 84, 72, 86, 76], versus [85, 79, 86, 73, 91, 27] in separate training. We can see that directly pooling data in CL does not bring a performance gain, due to data heterogeneity. Moreover, separate training suffers severe performance drops (e.g., a Dice of 27 for client 6) at clients that lack sufficient training data and receive no knowledge from others.

R1: validation set used in Eq. 11. The dataset at each client is split into train/val/test sets, and the train and val sets are utilized to update Eq. 11 for robustness.

R3: evaluation on more datasets. We carried out experiments on federated fundus segmentation (as used in FedDG); the average Dice scores for FedAvg, FedDG, FedProx, MOON and our method are 86.3, 84.7, 86.3, 86.0 and 87.6, respectively.

R4: relationship between stragglers and fairness (Prop-FFL). The straggler effect (SE) is mainly caused by heterogeneity in hardware and network connections: some clients may fail to complete their local training and upload partially-trained models, becoming “stragglers”. From this perspective, the SE may lead to unfairness among local models, since a straggler may degrade the performance of the others. Therefore, our method shares a similar spirit with Prop-FFL in that both encourage the global model not to perform poorly on any client while minimizing the total training loss. The difference is that rather than enforcing uniform performance by modifying the model parameters as in Prop-FFL, we pursue a better global model first and guide the training of stragglers via intra- and inter-client contrastive learning.
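As an illustration only: one plausible reading of "process-aware aggregation" is to weight each client's uploaded model by how much of its local training it actually completed, so that partially-trained straggler models contribute less. The sketch below is an assumption about the general idea, not the paper's actual scheme.

def process_aware_aggregate(client_states, steps_done, steps_target):
    # client_states: list of per-client state dicts {name: tensor}
    # steps_done[i]: local steps client i completed; steps_target: the goal.
    # Hypothetical weighting: proportional to training progress, capped at 1.
    weights = [min(s / steps_target, 1.0) for s in steps_done]
    total = sum(weights)
    return {k: sum(w * st[k] for w, st in zip(weights, client_states)) / total
            for k in client_states[0]}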




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors have made efforts to address some concerns raised by the reviewers. I acknowledge these efforts; however, several significant concerns persist and the authors’ rebuttal lacks sufficient justification.

    Particularly, the issue of selecting appropriate methods for comparison was not adequately addressed in the rebuttal, a point that is also noted by Reviewer 1. I had initially asked for clarification on why more advanced non-iid FL methods were not included in the comparison. The authors’ response, while indicating additional introduction of FedDG and FedAvg-LG, failed to provide a clear plan for improving the selection or justification of comparison methods.

    Moreover, there seems to be a discrepancy in the authors’ interpretation of related work. Specifically, the DFAD method employed a balanced loss to mitigate the problems associated with noisy data. This contrasts with the authors’ assertion that DFAD is susceptible to learning failures, particularly those triggered by noisy labels at a local client.

    Given these reasons, I believe that the manuscript would benefit significantly from a more thorough revision and reevaluation before it can be considered for acceptance.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The paper proposes a federated optimization algorithm combining local contrastive learning and global process-aware aggregation for data-level and system-level heterogeneity issues. The paper is generally well written, with the methods being easy to understand and follow. It reaches the minimum requirements for publication.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    After carefully considering the authors’ response and the opinions of the other reviewers, I am pleased to recommend the acceptance of the manuscript.


