
Authors

Ziyang Chen, Yongsheng Pan, Yiwen Ye, Hengfei Cui, Yong Xia

Abstract

Although recent years have witnessed the great success of convolutional neural networks (CNNs) in medical image segmentation, the domain shift caused by the highly variable quality of medical images hinders the deployment of CNNs in real-world clinical applications. Domain generalization (DG) methods aim to address this issue by training a robust model on the source domain so that it generalizes well to unseen domains. Many previous DG methods are based on feature-space domain randomization, but they suffer from a limited and unordered search space of feature styles. In this paper, we propose a multi-source DG method called Treasure in Distribution (TriD), which constructs an unprecedented search space to obtain a model with strong robustness by randomly sampling from a uniform distribution. To learn domain-invariant representations explicitly, we further devise a style-mixing strategy in TriD, which mixes feature styles by randomly combining the augmented and original statistics channel-wise and can be extended to other DG methods. Extensive experiments on two medical segmentation tasks with different modalities demonstrate that TriD achieves superior generalization performance on unseen target-domain data. Code is available at https://github.com/Chen-Ziyang/TriD.
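The two ideas named in the abstract, sampling feature statistics from a uniform distribution and mixing augmented with original statistics channel-wise, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). The function name `randomize_and_mix`, the sampling range U(0, 1), the Bernoulli-style per-channel choice, and the default `mix_prob=0.5` are all assumptions made for illustration.

```python
import random
from statistics import fmean, pstdev


def randomize_and_mix(channels, mix_prob=0.5, eps=1e-6, rng=None):
    """Re-style a feature map given as a list of channels (flat lists of floats).

    For each channel, the activations are normalized with the channel's own
    mean and standard deviation, then rescaled with either
      - augmented statistics drawn from U(0, 1) (assumed range), or
      - the original statistics,
    chosen independently per channel with probability `mix_prob` (assumed).
    """
    rng = rng or random.Random()
    restyled = []
    for values in channels:
        mu = fmean(values)
        sigma = pstdev(values) + eps  # eps avoids division by zero
        if rng.random() < mix_prob:
            # Augmented style: statistics sampled from a uniform distribution.
            new_mu = rng.uniform(0.0, 1.0)
            new_sigma = rng.uniform(0.0, 1.0)
        else:
            # Original style: keep this channel's own statistics.
            new_mu, new_sigma = mu, sigma
        restyled.append([(v - mu) / sigma * new_sigma + new_mu for v in values])
    return restyled


if __name__ == "__main__":
    features = [[0.5, 2.0, -1.0, 3.5], [10.0, 12.0, 11.0, 9.0]]
    print(randomize_and_mix(features, rng=random.Random(0)))
```

Choosing per channel means a single feature map can carry a mixture of original and augmented styles, which is one plausible reading of "mixing along the channel dimension". The actual mixing rule used in TriD may differ.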

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43901-8_9

SharedIt: https://rdcu.be/dnwCN

Link to the code repository

https://github.com/Chen-Ziyang/TriD

Link to the dataset(s)

https://zenodo.org/record/8009107

https://liuquande.github.io/SAML/


Reviews

Review #3

  • Please describe the contribution of the paper

    A multi-source domain generalization method is proposed to obtain strong robustness by randomly sampling from a uniform distribution and to learn the domain-invariant representations explicitly.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Clear, concise, and easy to understand

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Similar to Fig. 1, what if a Gaussian distribution were estimated directly from the already known samples?

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors have declared related options as found from the checklist.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. Besides the maximum and minimum, mean and std values would be more representative.
    2. ASD should also be reported in the other tables, not only in Table 3.
    3. If the images are split into training and testing sets, how do the authors ensure that the model is neither overfitted nor underfitted?
    4. What would happen if other distributions were adopted for the SR and SM modules?
    5. Similar to Fig. 1, what if a Gaussian distribution were estimated directly from the already known samples, since the distribution of image features extracted by the proposed model looks Gaussian?
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The contributions, representation, and results.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The author presents a domain generalization method for training a CNN model. The method is composed of two modules. The first module is randomization (SR), which aims to modify and increase the feature space of the model’s convolutional blocks. The second module is mixing (SM), which aims to combine the original features with the output features of the SR module.

    The method was validated on two datasets (MRI and Optic Disc), each composed of several domains. It was compared to 7 reference methods (SAN-SAW, RandConv, MaxStyle, DCAC, MixStyle, EFDM, DSU). The proposed method outperformed the reference methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The validation method used in the article is well conducted. Firstly, the method is validated on two datasets and compared to several domain generalization methods from the literature. The results are provided for all methods and clearly detailed.

    To evaluate the contribution and added value of the two modules of the proposed method, an ablation study is conducted. Additionally, the scalability of the method, especially the SM module, is evaluated on two methods from the literature.

    Finally, a last study is conducted to determine the position in the CNN model where the proposed method should be applied. In conclusion, the paper is understandable, with a clear focus and well-defined results.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The form of the article could be improved, particularly in the following ways:

    The abstract should be structured to include sections on introduction, methods, results, and conclusions. However, in the current article, the abstract only provides an explanation of the proposed method without giving any results.

    The method proposed by the author is almost more understandable in the introduction than in the method section. It would be helpful to provide more detailed explanations of the SR and SM modules in the method section.

    The MRI dataset is only used for comparison with the methods from the literature and not for the rest of the study, which is confusing.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    It is mentioned in the abstract that the code will be made available, and the study is primarily based on public datasets.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Thank you for your feedback on the article. I will take note of the following points for improvement:

    1. Ensuring the consistency of tenses (present, past), particularly in the introduction.
    2. Adding citations when discussing the weaknesses of a method, such as when discussing the weaknesses of studies on domain generalization in part 2.
    3. Providing more detailed explanations of the SR and SM modules in the Treasure in Distribution section would be helpful.
    4. Conducting the ablation, scalability, location, and distribution studies on the MRI dataset or at least explaining why these studies were not conducted.

    Thank you for your suggestions, and I will incorporate them into future revisions of the article.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The comparison method used in the article appears to be quite robust. The proposed method is compared to several other domain generalization methods from the literature on two datasets, MRI and Optic Disc, each composed of multiple domains. The comparison is made using several evaluation metrics, including accuracy and area under the curve (AUC), which provide a comprehensive view of the performance of each method.

    Furthermore, an ablation study is conducted to evaluate the contribution of the two modules of the proposed method, and the scalability of the method is evaluated on two methods from the literature. The position in the CNN model where the proposed method should be applied is also determined. These additional evaluations provide a more comprehensive understanding of the strengths and weaknesses of the proposed method compared to other methods in the literature.

    Overall, the comparison method used in the article appears to be well-designed and provides a thorough assessment.

  • Reviewer confidence

    Somewhat confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #1

  • Please describe the contribution of the paper

    The paper proposes a new domain generalization method called Treasure in Distribution (TriD) that addresses the variable image quality of medical images in real-world clinical applications. TriD uses a multi-source approach to construct an unprecedented search space for obtaining strong robustness, and a style-mixing strategy for learning domain-invariant representations explicitly. The evaluation of TriD on two medical segmentation tasks with different modalities shows that it achieves superior generalization performance on unseen target-domain data, and the availability of the code makes it a promising method for real-world clinical applications.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The main strength of the paper is the proposal of a multi-source domain generalization method, called Treasure in Distribution (TriD), which constructs an unprecedented search space to obtain a model with strong robustness by randomly sampling from a uniform distribution.
    2. The paper also presents a style-mixing strategy in TriD to learn domain-invariant representations explicitly, which can be extended to other DG methods.
    3. The extensive experiments on two medical segmentation tasks with different modalities demonstrate that TriD achieves superior generalization performance on unseen target-domain data.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Although the results presented are comprehensive, the extensive use of abbreviations in the result section may hinder reader comprehension and make it difficult to follow the performance of the proposed method.
    2. While the paper successfully demonstrates the effectiveness of the proposed method on two medical segmentation tasks, there is a lack of detailed descriptions of the different domains used in the experiments. Providing more information about the specific characteristics of each domain would be helpful for readers to better understand the performance of the proposed method.
  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors will share code. However, I don’t see a link to a downloadable version of the dataset in the paper, although the authors indicated “Yes” in the reproducibility checklist.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. In Table 4, 82.70 in the TriD row for domain 2 needs to be bolded.
    2. It is not clear why domain 4 in segmenting OD and OC has inferior performance compared to other domains.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    7

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Novel methodology, comprehensive evaluation and good performance. The topic of domain generalization is important, and the proposed method will be of interest to the community.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper is clear, focused and well validated, demonstrating good performance. The paper could be made a little clearer, with more explanation of the domains used and their characteristics, and some revision according to the reviewer comments, particularly in the abstract and methods sections.




Author Feedback

We thank all the reviewers and AC for their invaluable comments.

Reviewer 1

  1. The extensive use of abbreviations. Due to the space limitation, we had to replace the full names with their corresponding abbreviations.
  2. Lack of detailed descriptions of different domains. The images collected from different domains vary greatly in tone, contrast, and brightness. More details can be found on the public website [https://liuquande.github.io/SAML/].
  3. In Table 4, 82.70 needs to be bolded. We apologize for this mistake and will correct it in the camera-ready version.
  4. Domain 4 in segmenting OD and OC has inferior performance. Since domain 4 contains the largest number of images, the training set is relatively small when domain 4 is used as the target domain, resulting in inferior performance. We will add this explanation in the camera-ready version.
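
The argument in point 4 can be made concrete with a small sketch. It assumes a leave-one-domain-out protocol (train on all domains except the target), which the rebuttal implies but does not spell out. The per-domain image counts below are hypothetical placeholders, not the real dataset sizes.

```python
def leave_one_domain_out(domain_sizes):
    """Map each target domain to the number of images left for training."""
    total = sum(domain_sizes.values())
    return {target: total - size for target, size in domain_sizes.items()}


# Hypothetical image counts per domain, chosen only for illustration.
sizes = {"domain 1": 100, "domain 2": 150, "domain 3": 120, "domain 4": 400}

train_sizes = leave_one_domain_out(sizes)
# The target domain with the fewest remaining training images.
smallest = min(train_sizes, key=train_sizes.get)

if __name__ == "__main__":
    for target, n_train in train_sizes.items():
        print(f"target={target}: {n_train} training images")
    print("smallest training set when target is", smallest)
```

Under this protocol, holding out the largest domain always leaves the smallest training set, which is the effect the authors point to.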

Reviewer 2

  1. The abstract does not provide any results. We will add results in the abstract in the camera-ready version.
  2. Provide more detailed explanations in the method section. Due to the space limitation, we will release our code to show more details.
  3. The MRI dataset is not used for ablation, scalability, location, and distribution studies. Due to the space limitation, we are unable to include the results of these studies on the MRI dataset; the results on the OD/OC segmentation task are sufficient to support our conclusions.
  4. Ensuring the consistency of tenses (present, past). We will check it and improve our manuscript in the camera-ready version.
  5. Adding citations when discussing the weaknesses. We will revise accordingly in the camera-ready version.

Reviewer 3

  1. Mean and std values are more representative. Due to the space limitation, we are unable to list the std values in the tables. We will add them in the journal version of this work.
  2. ASD is suggested to be adopted. Thanks for this suggestion. Since the ASD is not commonly used for the OD/OC segmentation task, we only used it for the prostate segmentation task.
  3. How to ensure that the model is not over fitted or under fitted? Our TriD can augment the styles of features during training, which can alleviate over-fitting. We trained the model for enough epochs to avoid under-fitting.
  4. Adopting other distributions for the SR and SM modules. In Table 6, we list the results of using a Gaussian distribution in SR and SM modules, which are significantly inferior to our TriD.
  5. Similar to Fig. 1, what if a Gaussian distribution is estimated directly from the already known samples? Unlike with our TriD, augmented samples drawn from a Gaussian distribution will still resemble a limited search space, with more data points in the middle and fewer toward the edges, resulting in inferior performance.
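
The density claim in point 5 can be checked numerically with a short sketch. It only illustrates how samples spread over an interval. It says nothing about segmentation accuracy. The interval [0, 1], the Gaussian parameters (mean 0.5, std 0.15), the bin count, and the helper names are assumptions for illustration and are not taken from the paper.

```python
import random


def bin_counts(samples, n_bins=5, lo=0.0, hi=1.0):
    """Histogram of the samples that fall inside [lo, hi), as a list of counts."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for s in samples:
        if lo <= s < hi:
            # Clamp the index to guard against floating-point rounding at the edge.
            counts[min(int((s - lo) / width), n_bins - 1)] += 1
    return counts


def compare_coverage(n=10000, seed=0):
    """Return bin counts for uniform and Gaussian sampling over [0, 1)."""
    rng = random.Random(seed)
    uniform = [rng.uniform(0.0, 1.0) for _ in range(n)]
    gaussian = [rng.gauss(0.5, 0.15) for _ in range(n)]  # assumed parameters
    return bin_counts(uniform), bin_counts(gaussian)


if __name__ == "__main__":
    u, g = compare_coverage()
    print("uniform :", u)
    print("gaussian:", g)
```

With these assumed parameters, the uniform counts stay roughly flat across bins, while the Gaussian counts pile up in the central bin and thin out toward the edges.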


