
Authors

Yuhan Zhang, Kun Huang, Cheng Chen, Qiang Chen, Pheng-Ann Heng

Abstract

Cross-domain distribution shift is a common problem in medical image analysis, because medical images from different devices usually have different domain distributions. Test-time adaptation (TTA) is a promising solution that efficiently adapts source-domain distributions to target-domain distributions at test time in an unsupervised manner, and it has attracted increasing attention. Previous TTA methods applied to medical image segmentation tasks usually carry out a global domain adaptation for all semantic categories, but global domain adaptation can be sub-optimal because domain shift may affect different semantic categories differently. To obtain improved domain adaptation results for different semantic categories, we propose Semantic-Aware Test-Time Adaptation (SATTA), which individually updates the model parameters to adapt to target-domain distributions for each semantic category. Specifically, SATTA deploys an uncertainty estimation module to measure the discrepancies of semantic categories under domain shift. Then, a semantic adaptive learning rate is derived from the estimated discrepancies to achieve a personalized degree of adaptation for each semantic category. Lastly, semantic proxy contrastive learning is proposed to individually adjust the model parameters with the semantic adaptive learning rate. SATTA is extensively validated on retinal fluid segmentation in SD-OCT images. The experimental results demonstrate that SATTA consistently improves domain adaptation performance on semantic categories over other state-of-the-art TTA methods.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43895-0_14

SharedIt: https://rdcu.be/dnwxW

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #2

  • Please describe the contribution of the paper

    Previous TTA methods applied to medical image segmentation tasks usually carry out a global domain adaptation for all semantic categories, but global domain adaptation would be sub-optimal as the influence of domain shift on different semantic categories may be different. This paper proposes Semantic-Aware Test-Time Adaptation (SATTA) to individually update the model parameters to adapt to target-domain distributions for each semantic category. The experimental results demonstrate that the proposed method consistently improves domain adaptation performance on semantic categories over other state-of-the-art TTA methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. This paper proposes to update the model parameters to adapt to target-domain distributions for each semantic category, since the authors found global domain adaptation would be sub-optimal as the influence of domain shift on different semantic categories may be different.

    2. A semantic adaptive learning rate is developed based on the estimated discrepancies to achieve a personalized degree of adaptation for each semantic category. Besides, semantic proxy contrastive learning is proposed to individually adjust the model parameters with the semantic adaptive learning rate.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Evaluation is only done on retinal fluid segmentation. It would be better to evaluate the method on more datasets to show the generalization of the proposed method.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Good. Code is provided.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Is the proposed method limited to specific tasks? If not, it would be better to evaluate the method on more datasets to show how well it generalizes.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed method performs individual domain adaptation for each semantic category at test time. But the method is only evaluated on retinal fluid segmentation based on spectral-domain optical coherence tomography.

  • Reviewer confidence

    Somewhat confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #1

  • Please describe the contribution of the paper

    This paper proposed Semantic-Aware Test-Time Adaptation (SATTA) for cross-domain medical image segmentation, aiming to perform individual domain adaptation for each semantic category at test time. SATTA was evaluated on retinal fluid segmentation based on spectral-domain optical coherence tomography (SD-OCT) images.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper proposes the SATTA method for cross-domain medical image segmentation, which provides a semantic adaptive parameter optimization scheme at test time.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    It would be better to compare the proposed method with DLTTA, a TTA method with dynamic learning rate adjustment, on the same dataset.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    I think the obtained results can be reproduced.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    According to the title, this work appears to be a general cross-domain medical image segmentation method, but it is evaluated on only one medical dataset. In addition, the paper does not describe the clinical value of the method.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper proposes the SATTA method for cross-domain medical image segmentation, which provides a semantic adaptive parameter optimization scheme at test time. The experimental results look good.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper
    • The paper proposes a Semantic-Aware Test-Time Adaptation (SATTA) method to tackle the problem of cross-domain distribution shift in medical image segmentation.
    • SATTA adapts the model parameters to target-domain distributions for each semantic category individually by introducing the adaptive learning rate, instead of applying global domain adaptation to all semantic categories.
    • The proposed method uses an uncertainty estimation module to measure the discrepancies of semantic categories in domain shift and a semantic adaptive learning rate to achieve personalized adaptation for each semantic category.
    • The proposed method is validated on retinal fluid segmentation based on SD-OCT images and demonstrates improvements over other state-of-the-art TTA methods.
  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • SATTA adapts the model parameters to target-domain distributions for each semantic category individually by introducing the adaptive learning rate, which is novel and different from existing TTA methods.
    • Code has been attached for reproduction.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Why can the discrepancy of adaptation degrees required by different semantic classes be addressed by using different learning rates? Please provide more in-depth analysis and intuition.
    • In Eq. (7), it seems that a larger uncertainty score leads to a larger learning rate for class c. The motivation of this design is not clear.
    • What if all learning rates are set as the average of \eta^{c}? It is necessary to verify the importance of assigning adaptive learning rate for each class, so as to demonstrate the rationality of the proposed method.
    • It is hard to follow the methodology due to some unclear explanations, as listed in the following section 9.
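
The ablation requested in the weaknesses above (per-class learning rates versus a single rate equal to their average) can be sketched as follows. This is a minimal illustration, not the paper's Eq. (7): the monotone rule `base_lr * (1 + u)`, the function names, and the uncertainty scores are all assumptions chosen only to show that a larger uncertainty score yields a larger learning rate and how the uniform baseline would be built.

```python
from statistics import mean


def per_class_learning_rates(uncertainty, base_lr=1e-3):
    """Map each class's uncertainty score in [0, 1] to a learning rate.

    Assumed rule (hypothetical, not the paper's Eq. (7)):
    eta_c = base_lr * (1 + u_c), so higher uncertainty gives a larger step.
    """
    return {c: base_lr * (1.0 + u) for c, u in uncertainty.items()}


def uniform_baseline(rates):
    """Ablation baseline: every class receives the mean of the per-class rates."""
    avg = mean(rates.values())
    return {c: avg for c in rates}


# Hypothetical uncertainty scores for the three fluid types (illustrative only).
uncertainty = {"IRF": 0.6, "SRF": 0.2, "PED": 0.4}
adaptive = per_class_learning_rates(uncertainty)
uniform = uniform_baseline(adaptive)
```

Comparing segmentation quality under `adaptive` and `uniform` would isolate the effect of the per-class assignment, since both settings share the same average step size.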
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Code has been attached in the supplementary.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    • Some notations lack explanation, such as \theta_{n}^{c}.
    • In Eq. (8), inputs of L_{spc} should be x, W, c, but in the right term, x is not involved.
    • In Eq. (10), how are the positive pairs decided for target-domain data, given that no labels are available in the target domain?
    • In the last paragraph of section 2, ‘The updated model parameters are stored in a memory bank and will be loaded for the next domain adaptation of category c.’ What does ‘the next domain adaptation’ mean?
    • Because of the lack of space, the authors modified the latex template of supplementary materials, which does not meet requirements in https://conferences.miccai.org/2023/en/PAPER-SUBMISSION-AND-REBUTTAL-GUIDELINES.html
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    n/a

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    Standard test-time adaptation methods typically carry out a global domain adaptation for all the categories, which might be sub-optimal as the influence of the domain shifts on different semantic categories may vary. This paper investigates a category-aware test-time adaptation, which updates the model parameters for each semantic category. The idea is plausible and the experiments demonstrate competitive performances over state-of-the-art test-time adaptation methods in the context of retinal fluid segmentation. All reviewers recommended acceptance, and I concur with this. The paper is clear, well-written and has merit.

    One minor criticism is that the method, which seems to be a general-purpose method, is evaluated only on retinal fluid segmentation. Evaluation on other medical imaging tasks would strengthen the work and broaden its scope.




Author Feedback

We sincerely appreciate the time and effort invested in reviewing our paper and providing us with valuable feedback. We are delighted to hear that three reviewers recommended acceptance and that the meta-reviewer concurs with their recommendation. We are grateful for all reviewers’ kind words regarding the clarity, quality, and merit of our work. We would like to address several issues raised by the meta-reviewer and the three reviewers.

(1) (Reviewer #1; Reviewer #2; Meta-review). We acknowledge the importance of evaluating our method on a wider range of medical imaging tasks beyond retinal fluid segmentation. We agree that such an evaluation would not only strengthen the evidence for the generalizability of our approach but also broaden its applicability to various domains within the medical imaging field. In response to this valuable suggestion, we have begun working on expanding the evaluation of our method to include other relevant medical imaging tasks, such as OC/OD segmentation on fundus images. We are actively collecting datasets and collaborating with domain experts to ensure a comprehensive assessment of our method across different semantic categories. By doing so, we aim to provide a more thorough understanding of the method’s performance in diverse medical imaging applications.

(2) (Reviewer #3: 3-(1)). Because we find that domain shift may affect different semantic categories differently, a unified learning rate for all semantic categories may be sub-optimal. In the supplementary materials, we use t-SNE to visualize the features of three fluid types (IRF, SRF and PED) on three different domains, where the features are extracted from the pre-trained FCN network. We observe that the features of the three fluid types show obvious distribution discrepancies across the three domains, which supports our motivation. We have added the feature visualization results to our paper to highlight our motivation.

(3) (Reviewer #3: 3-(3)). Thank you for your valuable suggestion. We acknowledge the importance of verifying the significance of assigning adaptive learning rates for each class in order to demonstrate the rationality of our method. We understand that experiments and evaluations can provide concrete evidence of the effectiveness and benefits of adaptive learning rates. In our future work, we will incorporate this suggestion and conduct thorough experiments to compare models adapted with adaptive learning rates against those adapted with average learning rates. By analyzing the results and evaluating the performance of the models, we will be able to assess the impact and advantages of assigning adaptive learning rates to individual classes. We appreciate your input and are committed to continuously improving the efficiency and accuracy of our method. By following your suggestion and conducting the necessary verification, we aim to provide a robust and rational approach for assigning adaptive learning rates for each class in the future.

(4) (Reviewer #3: 6-(3)). For semantic aggregation, we first use the pseudo-labeling method to assign a label to each pixel, and then an uncertainty estimation module selects the pixels with high confidence to form the positive pairs with the corresponding category proxy weights for the semantic proxy contrastive learning.
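
The selection step described in point (4) might look roughly like the sketch below. It is only an illustration under stated assumptions: the maximum softmax probability stands in for the authors' uncertainty estimation module, and the threshold of 0.9, the function names, and the toy logits are hypothetical rather than taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_positive_pairs(pixel_logits, threshold=0.9):
    """Return (pixel_index, pseudo_label) for confidently predicted pixels.

    The pseudo-label is the argmax class. A pixel is kept only if its maximum
    softmax probability reaches the (assumed) confidence threshold. Each kept
    pixel would then be paired with the proxy weights of its pseudo-label class.
    """
    pairs = []
    for idx, logits in enumerate(pixel_logits):
        probs = softmax(logits)
        label = max(range(len(probs)), key=probs.__getitem__)
        if probs[label] >= threshold:
            pairs.append((idx, label))
    return pairs


# Toy logits for three pixels and three classes (illustrative only).
pixel_logits = [
    [4.0, 0.0, 0.0],  # confident prediction of class 0
    [0.2, 0.1, 0.0],  # ambiguous, filtered out
    [0.0, 0.0, 5.0],  # confident prediction of class 2
]
pairs = select_positive_pairs(pixel_logits)
```

Filtering before pairing keeps noisy pseudo-labels out of the contrastive objective, at the cost of adapting on fewer pixels when the threshold is high.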


