
Authors

Zheang Huai, Xinpeng Ding, Yi Li, Xiaomeng Li

Abstract

In the domain adaptation problem, source data may be unavailable to the target client due to privacy or intellectual-property issues. Source-free unsupervised domain adaptation (SF-UDA) aims to adapt a model trained on the source side to the target distribution using only the source model and unlabeled target data. The source model usually produces noisy and context-inconsistent pseudo-labels on the target domain, i.e., neighbouring regions with similar visual appearance are annotated with different pseudo-labels. This observation motivates us to refine pseudo-labels with context relations. Another observation is that features of the same class tend to form a cluster despite the domain gap, which implies that context relations can be readily calculated from feature distances. To this end, we propose a context-aware pseudo-label refinement method for SF-UDA. Specifically, a context-similarity learning module is developed to learn context relations. Next, pseudo-label revision is designed utilizing the learned context relations. Further, we propose calibrating the revised pseudo-labels to compensate for wrong revisions caused by inaccurate context relations. Additionally, we adopt a pixel-level and class-level denoising scheme to select reliable pseudo-labels for domain adaptation. Experiments on cross-domain fundus images indicate that our approach yields state-of-the-art results. Code is available at https://github.com/xmed-lab/CPR.
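
To make the refinement idea in the abstract concrete, below is a minimal, hypothetical sketch (not the authors' implementation) of revising pseudo-labels with context relations derived from feature distances: each pixel's soft pseudo-label is replaced by a similarity-weighted vote over its spatial neighbours, where the relation is the cosine similarity between per-pixel features. All function and parameter names are illustrative; in the paper the relations are learned by a dedicated module and the revised labels are further calibrated and denoised.

```python
# Illustrative sketch only: context-relation-based pseudo-label revision.
# Similarity between a pixel and its neighbours is computed from feature
# distances and used to re-weight the neighbours' pseudo-labels.
import numpy as np

def revise_pseudo_labels(features, probs, radius=2):
    """features: (H, W, C) per-pixel features; probs: (H, W, K) soft pseudo-labels.
    Returns revised (H, W, K) pseudo-label probabilities."""
    # L2-normalise features so dot products become cosine similarities.
    feat = features / (np.linalg.norm(features, axis=-1, keepdims=True) + 1e-8)
    revised = np.zeros_like(probs)
    weight_sum = np.zeros(probs.shape[:2] + (1,))
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy == 0 and dx == 0:
                continue
            # Circular shift stands in for neighbourhood indexing in this toy example.
            nb_feat = np.roll(np.roll(feat, dy, axis=0), dx, axis=1)
            nb_prob = np.roll(np.roll(probs, dy, axis=0), dx, axis=1)
            # Context relation: cosine similarity, clipped so that
            # dissimilar neighbours do not vote.
            sim = np.clip((feat * nb_feat).sum(-1, keepdims=True), 0.0, 1.0)
            revised += sim * nb_prob
            weight_sum += sim
    return revised / (weight_sum + 1e-8)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(64, 64, 16))
    probs = rng.dirichlet(np.ones(3), size=(64, 64))  # shape (64, 64, 3)
    refined = revise_pseudo_labels(feats, probs)
    print(refined.shape)  # (64, 64, 3); each pixel still sums to ~1
```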

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43990-2_58

SharedIt: https://rdcu.be/dnwMd

Link to the code repository

https://github.com/xmed-lab/CPR

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    The authors propose a method to refine and calibrate pseudo-labels, adopting pixel-level and class-level denoising schemes to select reliable pseudo-labels for domain adaptation.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The comparison with other methods as reported in Table 1.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The segmentation task has strong state-of-the-art methods; the authors could therefore compare their proposed method with others such as UNet or DRIU (Deep Retinal Image Understanding).

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Reproducing the paper would not be easy. Although the method is reasonably clear, its trainable parts would present several issues in obtaining the same reported results.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    The authors should compare against stronger state-of-the-art methods on larger databases, and perform an ablation study to confirm the results.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The experimental setup could be improved with larger databases and comparisons with more state-of-the-art methods.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper
    1. Development of a novel context-aware pseudo-label refinement (CPR) framework for source-free unsupervised domain adaptation.
    2. Introduction of a context-similarity learning module to compute context relations from feature distances and take advantage of the intrinsic clustered feature distribution under domain shift.
    3. Design of a context-aware revision method to leverage adjacent pseudo-labels for revising bad pseudo-labels using learned context relations.
    4. Proposal of a calibration strategy to mitigate the negative effect of inaccurate learned context relations.
    5. Development of a denoising method that considers model knowledge and feature distribution to select reliable pseudo-labels for domain adaptation.
  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The paper presents a clear and well-organized description of the proposed framework.
    2. The paper presents a novel framework for source-free unsupervised domain adaptation, which combines several techniques including context-similarity learning, context-aware revision, calibration, and denoising.
    3. The proposed CPR framework is evaluated through experiments on cross-domain fundus image segmentation and outperforms the state-of-the-art source-free methods.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The introduction lacks a clinical background on optic disc and cup segmentation.
    2. The conclusion that the target features generated by the source model still form clusters is based only on experimental observation; theoretical justification and supporting literature are lacking.
    3. There are many variables in the method, which may pose difficulties in network optimization.
    4. The comparative experiments and visualizations are lacking, as described in the recommendations.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The method in the paper is trained on publicly available datasets and should be relatively easy to reproduce.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. In the Introduction, the paper positions itself as source-free domain adaptation for medical image segmentation in general, yet only fundus datasets are used for validation. This is a little confusing, and the authors should clarify the scope of the study in the paper.
    2. It is suggested that the variables (hyperparameters) in the method be discussed; for example, the value of r in Equation 5 may depend on the dataset.
    3. In the domain adaptation experiment for REFUGE-Drishti-GS, the results of the DPL experiments in Table 1 do not match those of [3]; please explain why.
    4. The experiments compare against only three advanced unsupervised segmentation methods; more comparisons are needed to verify the method's effectiveness.
    5. The experiments lack visualizations of the segmentation results produced by the other modules.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed method and experiment design of this paper.

  • Reviewer confidence

    Somewhat confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    The authors have responded to most of the questions raised and have added some important comparative experiments and hyperparameter studies.



Review #3

  • Please describe the contribution of the paper

    The authors claim that their method achieves a clear improvement over the state of the art, and this is the case according to the results in Table 1 of this manuscript. However, for the baseline methods (e.g., DPL), the results in Table 1 of this manuscript differ from those in Table 1 of Ref [3], even though the same datasets were used. Do the comparison results in this manuscript fail to truly reflect the ability of the baseline method (e.g., DPL in Ref [3])?

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Learning contextual relations to refine pseudo-labels for source-free unsupervised domain adaptation, which may be an alternative path to address the domain gap problem.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The authors claim that their method achieves a clear improvement over the state of the art, and this is the case according to the results in Table 1 of this manuscript. However, for the baseline methods (e.g., DPL), the results in Table 1 of this manuscript differ from those in Table 1 of Ref [3], even though the same datasets were used. Do the comparison results in this manuscript fail to truly reflect the ability of the baseline method (e.g., DPL in Ref [3])?

    2. Figure 2 is not clear; the stages (e.g., (a) or (b)) should be marked at the corresponding positions in the figure.

    3. Too many hyperparameters need to be set manually, which greatly reduces the clinical practicality of the method.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The source code is not open, which may affect the reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    The results of all methods should be reported truthfully and impartially.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Although there are doubts about the results of the baseline methods, this paper is well organized and the demonstration process is relatively specific and clear.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    6

  • [Post rebuttal] Please justify your decision

    The authors’ response has largely resolved the issues I raised, so I recommend acceptance.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    • The reviewers generally find the proposed approach for source-free unsupervised domain adaptation in cross-domain medical image segmentation to be interesting and novel, but they also identify several weaknesses. These include the lack of comparison with stronger state-of-the-art methods, a limited experimental setup with small databases, and the complexity of the method’s trainable parts. Reviewers also suggest improvements such as more comparative experiments, larger ablation studies, and clearer figures. Overall, the clarity and organization of this paper are rated as good, but the reproducibility of the work is doubtful due to the large number of manually set hyperparameters and the lack of open-source code. This paper is therefore recommended for rebuttal.




Author Feedback

We thank the meta-reviewer and reviewers for valuable feedback and insightful suggestions to improve our paper. Overall, the reviewers found the paper well-organized, and the method was considered clear (R2, R3). R2 appreciated the novelty of our idea and recognized the effectiveness of our method. R1’s main concern involves the need for additional comparative experiments, larger ablation studies, and datasets. R3 raised concerns regarding the reproducibility of our method and manually set hyperparameters. We have carefully considered their concerns and addressed them accordingly.

1. More comparisons with existing methods (R1, R2). We thank the reviewers for the suggestions, but we kindly confirm that the methods compared in our paper are the SOTA methods in SF-UDA for medical image segmentation, including U-D4R (MICCAI 2022) [24], FSM (MIA 2022) [25], and DPL (MICCAI 2021) [3]. To resolve the concerns, we further compare our method with [Ref-a] (MIA 2022); the Dice results are:
[Ref-a]: cup 72.07% | disc 92.78% | avg 82.43%
Ours: cup 75.02% | disc 95.03% | avg 85.03%
[Ref-a] yields a lower outcome compared to U-D4R, thereby further confirming that U-D4R is indeed the previous best-performing method. As shown in our paper, our method surpasses U-D4R by a notable margin of 1.7%.

2. Experiments on larger datasets (R1). In our paper, we employ three datasets: Drishti-GS, RIM-ONE-r3, and REFUGE. These datasets are widely recognized as standard benchmarks in previous SF-UDA methods for fundus image segmentation [3, 24], as well as in a highly cited UDA method for fundus segmentation [Ref-b]. We believe that running experiments on these datasets enables a reliable and fair comparison with prior SOTAs, and we consider experiments on larger datasets as our future work.

3. Comparison with UNet and DRIU (R1). R1 suggested that we compare our method with UNet and DRIU. However, it is important to note that our task revolves around addressing domain differences without access to source data, whereas UNet and DRIU are designed mainly as segmentation backbones. In our paper, we utilize a MobileNetV2-adapted DeepLabv3+ segmentation backbone to ensure a fair comparison with prior SF-UDA methods [3, 24].

4. Reproducibility (R1, R3). The code has been made anonymously available on GitHub at https://github.com/anonymousreleasecode/SFUDA-CPR. We have also provided the pseudo-labels and model weights at intermediate steps for further investigation.

5. Correctness of baseline results (R2, R3). Concerns were raised by R2 and R3 regarding the inconsistency between the results of DPL in Table 1 and those reported in [3]. In the original DPL implementation, the training set of REFUGE is used as the source data, while our study utilized the validation set of REFUGE, as stated in Section 3 - Datasets. Nonetheless, we followed the official code of DPL and carefully tuned the parameters to ensure fair and comparable results. In the interest of transparency and reproducibility, we have made our DPL experiments available at https://github.com/anonymousreleasecode/DPL_fixed.

6. Hyperparameters and larger ablation studies (R3). Despite the presence of several hyperparameters, the overall performance of our approach is robust against their variations. When altering the iteration rounds t from 4 to 2, 8, 16, or 32, or modifying the value of β in Eq. 5 from 2 to 1 or 3, or adjusting the radius r from 4 to 3, 5, or 6, the average Dice shows a negligible decrease of less than 0.4%. Similarly, when the probability thresholds γ_low and γ_high fall within the ranges (0.2, 0.7) and (0.8, 0.95) respectively, the average Dice demonstrates a marginal decrease of less than 1%. Therefore, our method is not sensitive to hyperparameters.

[Ref-a] Source-free domain adaptation for image segmentation. MedIA 2022.
[Ref-b] Patch-based output space adversarial learning for joint optic disc and cup segmentation. IEEE TMI 2019.
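
For context, one common realization of dual-threshold, pixel-level pseudo-label selection, consistent with the γ_low / γ_high notation above, is sketched below. This is an assumption about how such thresholds are typically used to keep only confident foreground and background pixels, not the authors' implementation; all names are illustrative.

```python
# Illustrative sketch only: dual-threshold pixel-level pseudo-label denoising.
# Pixels with probability >= gamma_high (confident foreground) or
# <= gamma_low (confident background) are kept; ambiguous pixels are ignored.
import numpy as np

def select_reliable_pixels(prob, gamma_low=0.5, gamma_high=0.9):
    """prob: (H, W) foreground probability for one class.
    Returns (pseudo_label, mask): hard labels and a mask of pixels
    reliable enough to be used in the adaptation loss."""
    pseudo_label = (prob >= 0.5).astype(np.uint8)   # hard pseudo-label
    confident_fg = prob >= gamma_high                # reliable foreground
    confident_bg = prob <= gamma_low                 # reliable background
    mask = confident_fg | confident_bg               # ambiguous pixels excluded
    return pseudo_label, mask

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    prob = rng.random((8, 8))
    labels, mask = select_reliable_pixels(prob)
    print(f"{mask.mean():.0%} of pixels kept for the adaptation loss")
```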




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This work addresses the source-free domain adaptation problem by aligning the target distribution using only the source model and unlabeled target data. It refines the pseudo-labels generated by the source model by incorporating context relations and applying denoising techniques, resulting in improved performance in cross-domain fundus image experiments. The rebuttal has adequately addressed the major concerns of the three reviewers, including comparisons with existing methods, the use of larger databases, comparisons with UNet and DRIU (further clarification), ablation studies, etc. Thus, this paper is recommended for acceptance.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The paper presents a clear and well-organized description of the context-aware pseudo-label refinement (CPR) framework for source-free unsupervised domain adaptation. The rebuttal has addressed the major concerns of the reviewers, and the rating of R3 was raised from 5 to 6. Therefore, I recommend acceptance.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Pros:

    • The proposed approach is interesting and novel.
    • The clarity and organization are good.

    Cons:

    • A limited experimental setup with small databases.
    • The complexity of the method’s trainable parts.

    After Rebuttal:

    • The authors failed to convince the reviewer who gave a low score, but to me, the clinical need and novelty are sufficient for a conference paper;
    • More details on the experiments would help improve the revision;
    • The two positive reviews are generally consistent in acknowledging the contribution of this work.


