
Authors

Hyeongyu Kim, Yejee Shin, Dosik Hwang

Abstract

The rapid advancements in deep learning have revolutionized multiple domains, yet the significant challenge lies in effectively applying this technology to novel and unfamiliar environments, particularly in specialized and costly fields like medicine. Recent deep learning research has therefore focused on domain generalization, aiming to train models that can perform well on datasets from unseen environments. This paper introduces a novel framework that enhances generalizability by leveraging transformer-based disentanglement learning and style mixing. Our framework identifies features that are invariant across different domains. Through a combination of content-style disentanglement and image synthesis, the proposed method effectively learns to distinguish domain-agnostic features, resulting in improved performance when applied to unseen target domains. To validate the effectiveness of the framework, experiments were conducted on a publicly available Fundus dataset, and comparative analyses were performed against other existing approaches. The results demonstrated the power and efficacy of the proposed framework, showcasing its ability to enhance domain generalization performance.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43898-1_24

SharedIt: https://rdcu.be/dnwAW

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper designs a framework for enhancing domain generalization performance by utilizing transformer-based disentanglement learning and style mixing to identify domain-invariant features.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The paper is well-written and easy to follow.
    2. The problem is well formulated, and extensive results demonstrate the method's effectiveness.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The paper has many hyperparameters. How are they balanced?
    2. Are all comparison methods based on vision transformers? The FedDG results in this paper seem to be a little lower than those in the original paper.
    3. You mention amplitude mixing and style mixing in Fig. 2, but do not compare their experimental results. I am curious about the performance of amplitude mixing.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors provide sufficient implementation details, which help in reproducing the results.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Ablation studies on hyperparameters should be added.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper solves a practical problem and proposes a novel method.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The contribution of this paper is the proposal of a novel approach for domain generalization in medical image segmentation that addresses the limitations of existing methods. The proposed method is based on disentanglement training strategy to learn invariant features across different domains. It leverages transformer-based disentanglement learning and style mixing to identify domain-invariant features, resulting in increased performance on unseen target domains. The paper also introduces a novel patch-wise discriminator to extract variant features in a more specific manner. Validation on a public dataset alongside comparative works demonstrates the effectiveness and powerful performance of the proposed framework.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The main strength of the proposed framework is its ability to successfully identify domain-invariant features through a combination of content-style disentanglement with image synthesis, resulting in increased performance on unseen target domains. The framework leverages transformer-based disentanglement learning and style mixing, which allows for the identification of domain-agnostic features. Additionally, the proposed method introduces a novel patch-wise discriminator to extract variant features in a more specific manner. Validation on a public dataset alongside comparative works demonstrates the effectiveness and powerful performance of the proposed framework.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The proposed method may require significant computational resources due to its use of transformer-based disentanglement learning and style mixing. While the paper focuses on medical image segmentation, it is unclear how well this method would generalize to other domains beyond medicine.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Not claimed.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Overall, this paper presents an innovative approach that has the potential to significantly improve domain generalization performance in medical image segmentation. With further validation and refinement, this method could have important implications for improving medical diagnosis and treatment.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Overall, the proposed framework for disentangle-based domain generalization for medical image segmentation is a promising approach that addresses the limitations of existing methods. The paper provides a clear and detailed description of the proposed method, including its use of transformer-based disentanglement learning and style mixing to identify domain-invariant features. The introduction of a novel patch-wise discriminator to extract variant features in a more specific manner is also a valuable contribution.

    However, it would be helpful if the paper provided more details on the computational resources required to implement this method. Additionally, while the validation on a public dataset alongside comparative works demonstrates the effectiveness and powerful performance of the proposed framework, it would be useful to see additional experiments on other datasets or tasks to further validate its generalizability.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #4

  • Please describe the contribution of the paper

    A method is proposed to improve generalisation of a segmentation method applied to optic discs. It is based on a transformer backbone used to disentangle the images into style and content and a generator to create domain invariant features.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The question is important and is a key barrier to greater impact of deep learning methods.
    • The network is compared to a range of baseline methods.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Unfortunately, there are multiple mistakes throughout the paper which make it hard to understand and follow the message of the paper. These detract from the paper and make it hard to tell whether the proposed approach is sensible or not. These are detailed below.
    • The dataset, and thus the domain shift, is not described in detail, so it's not clear which domain shifts are being dealt with.
  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    An open dataset is used, but the details are limited; a reader would have to consult the dataset paper. Hyperparameters are reported, but there is no indication of how they were selected.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    The approach looks like it could be good, but the paper needs sharpening up to be able to convey this. Here are the main things I spotted but there are also typos and grammar issues throughout.

    • The components in the figure and in the methods are labelled differently. Pick one convention?
    • The encoder and segmentation network are described as discriminators and then there is also a generator with two discriminators? This section was unclear and D and PD are not shown in the figure.
    • Equation 1 is wrong; it's currently I – I.
    • What is E*?
    • The legend of figure 2 is too small to read.
    • How were the lambda values chosen?
    • Fig 3: it is really hard to see the circles online and impossible when printed.
    • The best result is not always the one highlighted in Table 1, which means that the proposed approach is actually best for only 2 of the experiments, and DCAC performs better in more categories.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    3

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Despite spending a long time trying to read this paper it was very unclear what was going on due to the number of mistakes throughout.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    The authors have clarified many issues. My main concern now is around the hyperparameters and how readily these could be selected by other researchers for their task, making this a weak accept.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper tackles domain generalization in medical image segmentation, with a method designed around disentanglement training. The three reviewers have mixed comments on the manuscript, with specific questions on the method and experimental details. In the rebuttal, the authors should verify the correctness and rationale of the method and clearly address the questions raised about it.




Author Feedback

We sincerely appreciate the valuable feedback provided by the reviewers. We have addressed most of the comments with comprehensive responses. We would like to express our gratitude for the insights shared by the reviewers, which significantly contributed to the improvement of our research.

Major Comments:

R1 (Q3.1) & R4 (Q5, Q6): Hyperparameters. Initially, we set the hyperparameters based on relevant references [15, 18, 19], which provided valuable insights and recommendations. Through empirical experiments, we fine-tuned and adapted these hyperparameters to optimize performance for our specific task and dataset.

R3 (Q3, Q8): Computational Resources. During training, our model required 16.8GB vRAM for batch size 6, and the training process took ~18 hours. To assess the computational workload during inference, we computed the FLOPs for the segmentation models and compared them to RAM-DSIR [21] and DCAC [5]. Our model showed a comparable computational burden (12.5 FLOPs) with superior performance compared to [21] (6.1 FLOPs) and [5] (13.1 FLOPs), all measured at an input size of 256×256. The results indicate promising performance with reasonable computational requirements.

R4 (Q3, Q5): Dataset. The Fundus dataset, introduced in [14], is a prominent dataset for medical image segmentation generalization, comprising four domains from different patients and acquisition institutions. There are two segmentation classes (OC, OD), and the number of training/test samples varies by domain (50/51, 99/60, 320/80, and 320/80). We apologize for the inadequate description and will include the details.
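The split sizes quoted above can be tallied with a short sketch. The leave-one-domain-out helper is an assumption based on common domain generalization practice, not something stated in the rebuttal, and the names `DOMAIN_SPLITS`, `totals`, and `leave_one_out_train_size` are hypothetical.

```python
# (train, test) sample counts for the four Fundus domains, as quoted in the rebuttal.
DOMAIN_SPLITS = [(50, 51), (99, 60), (320, 80), (320, 80)]


def totals(splits):
    """Return the total number of training and test samples over all domains."""
    return sum(train for train, _ in splits), sum(test for _, test in splits)


def leave_one_out_train_size(splits, held_out):
    """Training samples available when one domain is held out as the unseen target.

    Assumes (not confirmed by the source) that the model is trained on the
    training splits of the remaining domains.
    """
    return sum(train for i, (train, _) in enumerate(splits) if i != held_out)


if __name__ == "__main__":
    print("total train/test:", totals(DOMAIN_SPLITS))
    for d in range(len(DOMAIN_SPLITS)):
        print(f"held-out domain {d + 1}:", leave_one_out_train_size(DOMAIN_SPLITS, d))
```

The first two domains are much smaller than the last two, so the amount of source data changes noticeably depending on which domain is held out.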

R3 (Q3, Q8): Further Generalizability. We acknowledge the importance of further validation of generalizability, not only within the medical domain but also across other domains. Constrained by the rebuttal period, we evaluated our method only on a subset of the MR Prostate segmentation dataset (Domain 5), another well-known dataset for medical image generalization [*]. Our method demonstrates superior performance compared to RAM-DSIR [21] and DCAC [5]. Dice score: ours (86.32), [21] (85.92), [5] (80.68). We plan to expand on this aspect in future research.

[*] Liu, et al.: Shape-aware meta-learning for generalizing prostate MRI segmentation to unseen domains. (MICCAI 2020)
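The Dice scores quoted above measure the overlap between a predicted mask and a reference mask. A minimal sketch of the standard Dice coefficient for binary masks follows; the function name, the `eps` smoothing term, and the toy masks are illustrative assumptions, and the rebuttal does not specify how the scores were computed or averaged.

```python
import numpy as np


def dice_score(pred, target, eps=1e-7):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks.

    `eps` (an assumed smoothing term) avoids division by zero when both
    masks are empty, in which case the score is 1.0.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)


if __name__ == "__main__":
    pred = [[1, 1], [0, 0]]    # toy prediction: two foreground pixels
    target = [[1, 0], [0, 0]]  # toy reference: one foreground pixel
    # Multiply by 100 to express the score on the percentage scale used above.
    print(100.0 * dice_score(pred, target))
```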

Minor Comments for Each Reviewer:

R1: Q3.2: Only our method is based on transformers; the other methods were run from their official GitHub implementations. The FedDG results are a little lower than in the original paper. However, by convention, most papers report results from their own experiments, and these vary considerably. For example, in [5], OC of Domain 1 for FedDG is reported as 81.95 (our paper: 82.96). In [21], OC of Domain 4 for FedDG is reported as 81.9 (our paper: 83.21). We will carefully re-examine our implementation to ensure the validity of the results.

R1: Q3.3: Please see ‘RAM-DSIR’ in our Table 1, which is based on amplitude mixup.
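For readers unfamiliar with amplitude mixup, the general idea can be sketched as follows: the Fourier amplitude spectrum of one image is interpolated with that of another while the original phase is kept, which changes low-level appearance but largely preserves structure. This is a generic illustration, not the RAM-DSIR implementation or the style mixing used in the paper; the function name `amplitude_mix` and the interpolation weight `lam` are assumptions.

```python
import numpy as np


def amplitude_mix(source, target, lam):
    """Blend the Fourier amplitude of `source` with that of `target`.

    source, target: real-valued arrays of identical shape (H, W) or (H, W, C).
    lam: weight in [0, 1]; 0 returns the source unchanged, 1 uses the
         target amplitude entirely. The source phase is always kept.
    """
    src_fft = np.fft.fft2(source, axes=(0, 1))
    tgt_fft = np.fft.fft2(target, axes=(0, 1))

    src_amp, src_phase = np.abs(src_fft), np.angle(src_fft)
    tgt_amp = np.abs(tgt_fft)

    # Linear interpolation of the amplitude spectra.
    mixed_amp = (1.0 - lam) * src_amp + lam * tgt_amp

    # Recombine the mixed amplitude with the source phase and invert.
    mixed_fft = mixed_amp * np.exp(1j * src_phase)
    return np.real(np.fft.ifft2(mixed_fft, axes=(0, 1)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.random((8, 8))  # stand-in for a source-domain image
    b = rng.random((8, 8))  # stand-in for an image from another domain
    mixed = amplitude_mix(a, b, lam=0.5)
    print(mixed.shape)
```

Published methods often mix only the low-frequency part of the spectrum or sample `lam` at random; the sketch mixes the full spectrum for simplicity.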

R3: We have addressed all the comments provided above. We appreciate the reviewer’s feedback! :)

R4: We apologize for any typos or grammatical issues that may have affected the readability of our manuscript. We will carefully edit the manuscript to enhance clarity. Regarding specific comments:

  • Equation 1: “I - I’”.
  • We will add both D and PD to Fig. 1.
  • E* was meant to highlight the cyclic path. We will add further explanation.
  • We will fix the bolding error in the OD results for domain A, where DCAC should be bold. However, it cannot be said that DCAC is the best-performing model. Although DCAC shows the best results in a few domains, ours shows the highest average in OC segmentation and a comparable OD average, leading to the best result when averaged over all domains and both classes, in a similar manner as in [21]. We will include this too. (All-class average Dice: DCAC 87.99, ours 88.54.)

We sincerely apologize again for any confusion.

We appreciate the reviewers’ comments and will strive to improve the readability and quality.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The rebuttal addresses the detailed comments, including hyperparameter setting, computation cost, and experiments on generalization. After the rebuttal, all three reviewers recommend acceptance, so the decision for this paper is clear. In the final version, the authors should incorporate all the clarifications provided in the rebuttal so that the published paper is in good shape.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The paper is clearly written, with some novelty w.r.t. transformer-based disentanglement learning and style mixing. It does offer some novel insights into this problem. The rebuttal addressed the major concerns well. I would suggest the authors carefully revise their paper accordingly.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    On one hand, the specific techniques proposed in this paper seem to be interesting. On the other hand, disentanglement based methods have been well explored. The overall innovation of the work is unclear. The work is then convoluted with mixing-based methods for performance improvement.


