
Authors

Meirui Jiang, Yuan Zhong, Anjie Le, Xiaoxiao Li, Qi Dou

Abstract

Despite recent progress in enhancing the privacy of federated learning (FL) via differential privacy (DP), the DP trade-off between privacy protection and performance remains underexplored for real-world medical use. In this paper, we propose to optimize this trade-off in the context of client-level DP, which focuses on privacy during communication. However, FL for medical imaging typically involves far fewer participants (hospitals) than other domains (e.g., mobile devices), so making clients differentially private is much more challenging. To tackle this, we propose an adaptive intermediary strategy to improve performance without harming privacy. Specifically, we show theoretically that splitting clients into sub-clients, which serve as intermediaries between hospitals and the server, can mitigate the noise introduced by DP without harming privacy. Our proposed approach is empirically evaluated on both classification and segmentation tasks using two public datasets, and its effectiveness is demonstrated with significant performance improvements and comprehensive analytical studies.
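For intuition on why adding intermediaries can reduce the effective DP noise, here is a minimal sketch of client-level DP aggregation in the DP-FedAvg style. This is illustrative only; the function name, parameters, and noise calibration are assumptions for exposition, not the paper's exact algorithm.

```python
import numpy as np

def dp_fedavg_aggregate(client_updates, clip_norm, noise_multiplier, rng=None):
    """Client-level DP aggregation (DP-FedAvg style, illustrative sketch).

    client_updates   : list of 1-D numpy arrays (flattened model deltas)
    clip_norm        : L2 clipping bound C (per-client sensitivity)
    noise_multiplier : z, noise std expressed as a multiple of C
    """
    rng = rng or np.random.default_rng(0)
    n = len(client_updates)
    # Clip each client's update to L2 norm at most C.
    clipped = [u * min(1.0, clip_norm / max(np.linalg.norm(u), 1e-12))
               for u in client_updates]
    avg = sum(clipped) / n
    # Gaussian noise on the average has std z * C / n: increasing the number
    # of participating (sub-)clients n shrinks the effective per-parameter
    # noise -- the intuition behind splitting hospitals into intermediaries.
    noise = rng.normal(0.0, noise_multiplier * clip_norm / n, size=avg.shape)
    return avg + noise

# Example: the same pool of data split into more sub-clients sees less noise.
updates = [np.random.default_rng(i).normal(size=10) for i in range(4)]
noisy_avg = dp_fedavg_aggregate(updates, clip_norm=1.0, noise_multiplier=1.5)
```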



Link to paper

DOI: https://doi.org/10.1007/978-3-031-43895-0_47

SharedIt: https://rdcu.be/dnwy0

Link to the code repository

https://github.com/med-air/Client-DP-FL

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper proposes a novel method for enhancing the privacy-performance trade-off in federated learning (FL) for medical imaging, by using an adaptive intermediary strategy that splits the original clients into sub-clients to reduce the noise level and improve the utility. The paper shows some novelty and significance in the field of federated learning.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper develops a novel and effective method to balance the trade-off between privacy protection and performance in federated learning with client-level differential privacy. It provides new insights and analysis on the relations among noise level, training diversity, intermediary number, and privacy budget.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    My major concern is the reproducibility of the paper. The experimental details are not very clear.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

This paper claims to have theoretical and empirical evidence to support its feasibility. It also provides analytical studies to demonstrate the effectiveness and stability of its method. However, reproducibility might be improved by providing more details on how the method compares with other methods.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

In general, it is a well-written paper, easy to follow, and the key contribution is clear.

1. In line 7 of the 2nd paragraph of Sec. 3.1, the broken reference “Appendix ??” should be fixed.
2. How does the method deal with data imbalance and heterogeneity?
3. How does it generalize to other methods?
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper develops a novel and effective method to balance the trade-off between privacy protection and performance in federated learning with client-level differential privacy. It provides new insights and analysis on the relations among noise level, training diversity, intermediary number, and privacy budget.

  • Reviewer confidence

    Not confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #4

  • Please describe the contribution of the paper

    The main contribution of this work is to show that partitioning the client data into multiple intermediaries and treating them as sub-clients can achieve better privacy-utility tradeoff in the case of client-side differential privacy. A method has also been proposed to determine the optimal number of intermediaries.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1) The motivation is clear and the proposed approach appears to be intuitively correct.

    2) It has also been backed up both theoretically and empirically.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) The overall approach falls under the umbrella of privacy amplification in DP. It would therefore be good to compare the proposed method against other privacy amplification techniques, such as random sub-sampling of local training data in each round. (For context, the standard subsampling amplification bound is sketched after this list.)

    2) Some key notations have not been explained. For example, what is \zeta in eq. 2?

    3) The reported privacy budgets range from 36 to 600, which appear to be too high.
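For context on point 1): the standard privacy-amplification-by-subsampling result (a known bound from the DP literature, e.g. Balle et al., NeurIPS 2018; stated here as background, not taken from the paper) quantifies how much random sub-sampling tightens the budget:

```latex
% If a mechanism M is (\epsilon, \delta)-DP, then running M on a uniformly
% random q-fraction subsample of the data satisfies (\epsilon', \delta')-DP with
\epsilon' = \log\!\bigl(1 + q\,(e^{\epsilon} - 1)\bigr), \qquad \delta' = q\,\delta .
% For small \epsilon this is roughly (q\epsilon, q\delta)-DP, i.e. the
% effective budget shrinks about linearly with the sampling rate q.
```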

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Appears to be reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Please see comments provided under weaknesses.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed idea is interesting and has been backed by both theory and experiments.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #5

  • Please describe the contribution of the paper

    The authors propose a method to reduce the noise introduced with Differential Privacy protection for Federated Learning, thereby improving accuracy whilst retaining privacy.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The authors make some deductions and formulate some theorems about noise and diversity in federated clients and then empirically test them. This is a nice theory-led approach. The paper goes into the details of their assumptions and provides some useful and insightful metrics with which to analyse the client data contributions (e.g. noise and diversity). The supplementary materials elaborate on some of the underlying theorems used to construct the basis for this work, which is useful information.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) The paper builds on a lot of previous work and adds some novelty with respect to the intermediaries and the experimental exploration, but whether this is sufficient is marginal.

    2) “Our objective is not to protect a single data point, but rather to guarantee that a learned model does not reveal whether a client participated in decentralized training.” - It is not clear what the value of such an objective is unless it also protects single data points. This may also be much easier with IID data than with real-world data.

    3) The proposed method seems to involve more communication between the client and server, which could be a problem in the real world.

    4) Other papers have previously shown that DP accuracy deficits are usually recovered after an initial slower convergence. This dimension is not explored experimentally (all experiments train for 100 epochs).

    5) The actual privacy protection is not really tested in this paper; this side of the investigation is left to theory, which is not ideal, especially in light of some implications of the method (e.g., more clients means fewer individuals per model update, and more data exchanges also increase the attack surface for any infiltrator).

    6) The IID assumption of the experiments is perhaps a weakness, although this is acknowledged in the conclusion.

    7) “In this regard, we propose to split the original client into disjoint sub-clients, which act as intermediaries for exchanging information between the hospital and the server. This strategy increases the number of client updates against queries, thereby consequently reducing the magnitude of noise.” - Ablation studies comparing this method against simply aggregating with the server more frequently would have been useful.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors state that the code will be made available and they explain the backbone models used + the training regime. The datasets are also in the public domain, so their tests should be reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    1. “This strategy increases the number of client updates against queries, thereby consequently reducing the magnitude of noise” - Are ‘queries’ here polls to the server? It’s not clear what is meant.
    2. “Splitting a client into more sub-clients may increase the diversity of FL training” - Surely the diversity is the same? The use of the word “diversity” needs clarifying here. Also, each client having fewer data points would, intuitively, seem to increase the chances of an individual data point being recoverable; the assumption that the reverse is true needs some stronger evidence.
    3. “Through an analysis of the DP accountant…” - Is this a specific tool or method? A reference would be helpful.
    4. “Definition 1. ((ϵ, δ)-Differential Privacy [7,8]) For a randomized learning mechanism M: X → R, where X is the collection of datasets it can be trained on, and Y is the collection of model it can generate, it is (ϵ, δ)-DP if:” - Y doesn’t seem to appear in the subsequent formulae, although it is later defined as “all possible datasets”.
    5. “We experimentally investigated the relationships between the final performance and the number of intermediaries, and find the optimal ratio lies in the range of 1/N.” - 1/N is neither a range (region?) nor a ratio. Also, isn’t N the number of clients? This doesn’t make much sense.
    6. “We regard each data source as an independent client,…” - What does this mean in practical terms? It’s not clear.
    7. I found the paper interesting, but the rather lengthy textual descriptions might have benefitted from some more (or better) diagrams to illustrate the points being made, and definitely from more experimental evidence to provide confidence in the largely theoretical assumptions about the privacy of the data.
    8. Figure 2d is not clearly explained. What do the three levels on the three sub-plots relate to?
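For readers' convenience, the inequality truncated in the quoted Definition 1 is presumably the standard (ϵ, δ)-DP condition (reproduced here from the DP literature, not from the paper itself):

```latex
% M is (\epsilon, \delta)-DP if, for every pair of adjacent datasets
% X, X' \in \mathcal{X} and every measurable set S of outputs,
\Pr\bigl[\mathcal{M}(X) \in S\bigr] \;\le\; e^{\epsilon}\,\Pr\bigl[\mathcal{M}(X') \in S\bigr] + \delta .
```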

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Although the method has been tested on some medical data, it is really motivated by some theory on FL, and so its appeal might be of less interest to a typical MICCAI attendee, who is possibly more interested in the practical value in a medical setting. Whilst there is some value here, it needs more in the way of robust privacy testing to make it a sufficiently rounded submission.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #6

  • Please describe the contribution of the paper

    The paper proposes an “adaptive intermediary” method to enhance the performance of FL without affecting the patient privacy. The approach splits clients into sub-clients, which serve as intermediaries between hospitals and the server in order to “mitigate the noises introduced by DP” without harming patient privacy.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper is well-written and the ideas are formulated well. Two medical imaging datasets are used (ICH diagnosis from brain CT, and prostate segmentation in MR images). The comparisons are sufficient to establish the benefit.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Comparisons with recent approaches are missing, e.g. with [2].

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    If the authors share the code, the results may be reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    The selection of methods for comparison in Table 1 has not been motivated.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Results show considerable improvements.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper presents a novel approach to improve the trade-off between privacy and performance in federated learning (FL) for medical imaging. The method introduces an adaptive intermediary strategy that partitions clients into sub-clients, reducing the noise introduced by differential privacy and enhancing utility while maintaining client-level privacy. The paper also proposes a method to determine the optimal number of intermediaries. By mitigating DP noise without compromising patient privacy, the “adaptive intermediary” method achieves a better privacy-utility balance, making it a valuable contribution to enhancing FL performance in medical imaging.

    Strengths:

    • The paper presents a novel method that balances privacy protection and performance in federated learning with client-level differential privacy.
    • The approach provides new insights on noise level, training diversity, intermediary number, and privacy budget, supported by theoretical deductions, theorems, and empirical testing.
    • The inclusion of supplementary materials enhances the understanding of the underlying theorems used in the research.
    • The paper is well-written, utilizes two medical imaging datasets, and provides comparisons to demonstrate the benefits of the proposed approach.

    Weaknesses:

    • Reproducibility concerns arise due to unclear experimental details.
    • Comparisons with other privacy amplification techniques and recent approaches are missing.
    • Ambiguity exists regarding key notations and the practicality of the reported privacy budgets.
    • The actual privacy protection provided by the proposed method is not thoroughly tested, relying heavily on theory rather than experimental validation.




Author Feedback

We thank the AC and reviewers for their time. Most reviewers are positive and supportive, highlighting our “novel and effective method” with “new insights and analysis”. Our work is “backed up both theoretically and empirically”, and experiments on two datasets are “sufficient to establish the benefit.”

To R1: Regarding the suggestions on experimental details, we will address them in the final version. We release our code (anonymous link) containing all comparisons and details: https://shorturl.at/joIQ7

To R4: R4 requests a comparison with the sub-sampling method; we have added results. DP-FedAvg with subsampling has Dice of 40.90, 16.54, 13.24 (z=0.3, 0.5, 0.7) on prostate MRI, which is inferior to ours. R4 mentions some unclear notations: \zeta is introduced in line 4 of Sec. 2.2, and we will improve the notation in the final version. Regarding R4’s concern about the high privacy budget, we clarify that this is due to our aim for client-level DP under the cross-silo setting, which typically has a small number of clients, causing larger budgets than sample-level DP. We discuss sample- and client-level DP in Sec. 2.3. We further show our method is effective against inversion attacks for practical use: the average structural similarity between original and reconstructed images on prostate MRI is 5e-2, 1e-2, 7e-3 for DP-FedAvg, and 1e-2, 1e-2, 1e-2 for ours. Our method thus shows effective protection.

To R5: Regarding R5’s concern about the method contribution, our method is novel in tackling the client-level DP problem; it follows the standard DP setting without relying on previous methods. The table is intended to show compatibility with existing DP methods for a further boost, but our design is orthogonal to these methods. The first grey line represents our pure method, which is better than all compared methods, with 80.77 AUC (z=1.5) and 58.14 Dice (z=0.7) on the two tasks. R5 requests clarifying our objective: we aim to protect individual client privacy, which is crucial in cross-silo/device FL. FL protects data by only sharing models, but differential attacks could identify which client contributes to FL and then target the identified clients. It is therefore critical to protect client privacy, and we validate our solution on non-IID data (prostate MRI). Regarding R5’s concern about communication costs, the cost of our method is much lower than that of IoT settings in the real world, which involve hundreds to thousands of clients. Additionally, in our cross-silo setting, clients are hospitals with stronger bandwidth and communication stability than IoT devices, making communication less likely to become a bottleneck. R5 requests more training to validate the recovery of DP accuracy deficits: extending to 300 rounds, DP-FedAvg achieves 57.65, 30.04, 18.84 on prostate MRI, while ours achieves 82.13, 75.98, 76.96 (z=0.3, 0.5, 0.7). Ours is still better; however, more rounds require larger privacy budgets, increasing the risk of privacy leakage. Regarding R5’s comment on actual privacy protection, we evaluated it by deploying a model inversion attack: the average structural similarity between original and attack-reconstructed images across all prostate MRIs is 5e-2, 1e-2, 7e-3 for DP-FedAvg and 1e-2, 1e-2, 1e-2 for ours (z=0.3, 0.5, 0.7). Our method is effective against the attack and has higher performance. Regarding R5’s concern about the IID assumption in the experiments, we have considered non-IID data (prostate MRI): each client’s data is collected from a different hospital, resulting in heterogeneous data distributions. R5 asks for a comparison with more frequent aggregation: we increased the aggregation frequency of DP-FedAvg by 3 times; the Dice on prostate MRI is 55.40, 31.46, 32.22 (z=0.3, 0.5, 0.7). Despite the higher frequency, our method clearly outperforms this baseline. In addition, higher frequency also incurs higher communication cost and privacy leakage risk.

To R6: R6 requests a comparison with recent methods such as [2]. We have compared with a recent method, DP2-RMSProp (Li et al., ICLR 2023). [2] applies DP-FedAvg to medical data, and we have compared with it in the first row of the table.
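The rebuttal reports structural similarity (SSIM) between original and attack-reconstructed images as its empirical privacy metric. A minimal sketch of how such a comparison might be computed, using scikit-image (illustrative only; `mean_ssim` and its interface are assumptions, not the authors' evaluation code):

```python
import numpy as np
from skimage.metrics import structural_similarity

def mean_ssim(originals, reconstructions, data_range=1.0):
    """Average SSIM between paired original and attack-reconstructed images.

    originals, reconstructions : iterables of 2-D numpy arrays scaled to [0, 1].
    Lower SSIM indicates the inversion attack recovers less of the image,
    i.e. stronger empirical protection.
    """
    scores = [structural_similarity(o, r, data_range=data_range)
              for o, r in zip(originals, reconstructions)]
    return float(np.mean(scores))

# Example with random stand-ins for real MRI slices:
rng = np.random.default_rng(0)
orig = [rng.random((64, 64)) for _ in range(3)]
recon = [rng.random((64, 64)) for _ in range(3)]
print(mean_ssim(orig, recon))
```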




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    After carefully considering the authors’ response and the opinions of the other reviewers, I am pleased to recommend the acceptance of the manuscript.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The paper is generally well written, with methods that are easy to understand and follow. The authors have addressed the reviewers’ concerns in the rebuttal. It meets the minimum requirement for publication.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    None of the reviewers updated their scores; therefore the paper remains a candidate with mixed reviews. I feel that the authors have responded well to some of the criticism raised by R5, so I would recommend acceptance.


