
Authors

Zhe Xu, Donghuan Lu, Yixin Wang, Jie Luo, Dong Wei, Yefeng Zheng, Raymond Kai-yu Tong

Abstract

Recently, unsupervised domain adaptation (UDA) has been actively explored for multi-site fundus image segmentation with domain discrepancy. Despite relaxing the requirement of target labels, typical UDA still requires the labeled source data to achieve distribution alignment during adaptation. Unfortunately, due to privacy concerns, the vendor side often cannot provide the source data to the targeted client side in clinical practice, making the adaptation more challenging. To address this, in this work, we present a novel uncertainty-rectified denoising-for-relaxing (U-D4R) framework, aiming at completely relaxing the source data and effectively adapting the pretrained source model to the target domain. Considering the unreliable source model predictions on the target domain, we first present an adaptive class-dependent threshold strategy as the coarse denoising process to generate the pseudo labels. Then, the uncertainty-rectified label soft correction is introduced for fine denoising by taking advantage of estimating the joint distribution matrix between the observed and latent labels. Extensive experiments on cross-domain fundus image segmentation showed that our approach significantly outperforms the state-of-the-art source-free methods and encouragingly achieves comparable or even better performance than the leading source-dependent methods.
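
As an illustration of the coarse denoising step described above, here is a minimal, hypothetical sketch of per-class adaptive thresholding on a source model's softmax output. The function name, the percentile heuristic, and the ignore-label convention are our assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def class_adaptive_pseudo_labels(probs, keep_percentile=75):
    """Hypothetical sketch of class-dependent thresholding (coarse denoising).

    probs: (H, W, C) softmax output of the frozen source model on a target image.
    Returns an (H, W) pseudo-label map with unreliable pixels set to -1 (ignored).
    """
    hard = probs.argmax(axis=-1)                  # per-pixel argmax prediction
    conf = probs.max(axis=-1)                     # confidence of that prediction
    pseudo = np.full(hard.shape, -1, dtype=np.int64)
    for c in range(probs.shape[-1]):
        mask = hard == c
        if not mask.any():
            continue
        # The threshold adapts to each class's own confidence distribution,
        # so rare classes are not wiped out by a single global cutoff.
        tau_c = np.percentile(conf[mask], keep_percentile)
        pseudo[mask & (conf >= tau_c)] = c
    return pseudo
```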

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16443-9_21

SharedIt: https://rdcu.be/cVRyz

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper presents a method for unsupervised domain-adaptive segmentation via a coarse-to-fine label denoising scheme. With pseudo labeling and uncertainty-rectified label soft self-correction, a model trained on labeled source-domain data can be further fine-tuned without access to the source data. Extensive experiments demonstrate its effectiveness.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The authors address an interesting and practical problem in domain adaptation.
    2. The paper is well written and enjoyable to read.
    3. The problem is well formulated, and extensive results demonstrate the method's effectiveness.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The paper builds on previous UDA methods, and the innovation is incremental; the accuracy gains are likewise modest. How clinically significant is the accuracy boost?

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The data is from an open challenge and the code will be open-sourced. So the reproducibility is great.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    How clinically significant is the accuracy boost?

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Overall a very good paper to read. It’s well written, properly formulated and motivated.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    2

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    To address the source-free domain adaptation (SFDA) problem, the authors present a novel uncertainty-rectified denoising-for-relaxing (U-D4R) framework. The contributions can be summarized in three aspects:

    C1: Considering the unreliable pseudo labels of target data generated by the source model, the authors present an adaptive class-dependent threshold strategy as a coarse denoising process to generate the pseudo labels.

    C2: The authors introduce uncertainty-rectified label soft correction for fine denoising, taking advantage of an estimate of the joint distribution matrix between the observed and latent labels.

    C3: Extensive experiments on cross-domain fundus image segmentation showed that the proposed approach outperforms SOTA SFDA methods and achieves performance comparable to source-dependent methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    S1: The contribution of uncertainty-rectified label soft correction is novel: the authors estimate a class-conditional label-error map through the confident joint matrix and obtain an uncertainty map through Monte Carlo dropout. Both the estimated label-error map and the uncertainty map are then used to correct the pseudo labels (see the sketch after S2 below).

    S2: Although the proposed method targets the SFDA problem, it could benefit many self-training-based methods by alleviating noisy-pseudo-label issues. Moreover, it could also be applied to unsupervised domain adaptation and semi-supervised learning tasks. Hence, the proposed method is of great application value.
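
    A minimal sketch of the Monte Carlo dropout step mentioned in S1, assuming a PyTorch segmentation network that contains dropout layers; the predictive-entropy uncertainty measure and all names here are illustrative, not the authors' exact formulation.

```python
import torch

def mc_dropout_uncertainty(model, image, n_samples=10):
    """Illustrative MC-dropout sketch: predictive mean and per-pixel entropy.

    image: (1, 3, H, W) tensor; model is assumed to contain dropout layers.
    """
    model.train()  # keep dropout stochastic at inference
                   # (note: this also puts any BatchNorm layers in train mode)
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(image), dim=1) for _ in range(n_samples)]
        )                                        # (n_samples, 1, C, H, W)
    mean = probs.mean(dim=0)                     # (1, C, H, W) averaged prediction
    # Predictive entropy as a per-pixel uncertainty map: high where the
    # stochastic forward passes disagree.
    entropy = -(mean * torch.log(mean + 1e-8)).sum(dim=1)   # (1, H, W)
    return mean, entropy
```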

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    W1: This paper lacks a review of related work on self-training-based SFDA methods. The authors review only one self-training-based SFDA method [3], which makes the proposed approach seem more novel than it is. However, there are many self-training-based SFDA methods, such as [*1-*3]. For example, [*1] also proposes an uncertainty-aware self-training method to generate reliable pseudo labels. It is suggested to compare the proposed method with more self-training-based SFDA methods, so as to better delineate the contributions of the proposed method.

    [*1] Ye M, Zhang J, Ouyang J, et al. Source Data-free Unsupervised Domain Adaptation for Semantic Segmentation[C]//Proceedings of the 29th ACM International Conference on Multimedia. 2021: 2233-2242.
    [*2] Prabhu V, Khare S, Kartik D, et al. S4T: Source-free domain adaptation for semantic segmentation via self-supervised selective self-training[J]. arXiv preprint arXiv:2107.10140, 2021.
    [*3] Kim Y, Cho D, Han K, et al. Domain adaptation without source data[J]. arXiv preprint arXiv:2007.01524, 2020.

    W2: As for contribution C1, the novelty of the proposed adaptive class-dependent threshold strategy is limited. It is similar to [*4], and this idea has been widely used in self-training-based unsupervised domain adaptation methods.

    [*4] Zou Y, Yu Z, Kumar B V K, et al. Unsupervised domain adaptation for semantic segmentation via class-balanced self-training[C]//Proceedings of the European conference on computer vision (ECCV). 2018: 289-305.

    W3: As for contribution C2, the definitions of the symbols are not clear, making the details of the whole process hard to understand. For example, it is not clear what ‘the out-of-sample predicted probabilities P_{t}^{hat}’ means, what ‘out-of-sample’ refers to, or what the shape of P_{t}^{hat} is.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors state that the code will be made available after review.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    • It is suggested that the authors give a specific definition of each symbol used.

    • It would be better if each symbol used were also shown in the overall framework figure.

    • The authors should compare with other label-correction methods from the unsupervised domain adaptation or learning-with-label-noise literature, such as [*5], to demonstrate the superiority of the proposed label self-correction method.

    [*5] Zhang P, Zhang B, Zhang T, et al. Prototypical pseudo label denoising and target structure learning for domain adaptive semantic segmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021: 12414-12424.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I gave the overall score considering the novelty and application value of the paper.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    2

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper suggests a method for unsupervised domain adaptation for the scenario where the source data is not available. Pseudo labels are generated on the target domain via an adaptive class-dependent threshold strategy; then, uncertainty-rectified label soft correction is introduced for fine denoising. They apply their approach to multi-site retinal fundus images for optic disc and cup segmentation and show that it outperforms existing models. Various ablation studies are performed.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Unsupervised domain adaptation in cases where the source domain is not available is of great interest in clinical applications. The motivation of this work is clear and well described in the introduction. Related work is very well embedded throughout the paper. The evaluation of the method and the comparison to other methods is extensive and clear.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    • In Section 2.1, not all parameters are described before they are used. For example, in Equation 6, what does v stand for?

    • Details about the complexity of the model (e.g., number of model parameters or training time) are missing.

    • The split into training and test data on the source domain is unclear to me. Are there 50 images in the training set and 51 images in the test set?

    • I guess Table 1 shows the mean and the standard deviation over all subjects of the test set? If so, this needs to be stated in the caption. The standard deviations are quite high; is this due to the small number of images in the test set? How big is the test set?

    • No hyperparameters were mentioned. If they are taken from other implementations, please indicate that in the paper.

    • In Section 2.2, in the last sentence, the mislabeling only applies for i not equal to j, which is not stated in this section.

    • For the ablation study, it is not clear exactly which parts are ablated in cases a) to e). I suggest referring to the subsections of Section 2 to make clear which part of the pseudo labeling is left out.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors state in the abstract that they will make the source code available upon acceptance of the paper. The used datasets are cited and publicly available. The backbone architecture is also cited.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    Please address the points listed under “weaknesses”. In Section 3.1, a better description of the split into training and testing data on the source and target domains would be helpful. Perhaps a table could better visualize the different allocations.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is well motivated, clearly written, and well organized. The tackled problem is important for real-world applications, where the source data is not available for domain adaptation. The evaluation seems complete, as many comparison methods and ablation studies are included. However, some explanations are missing from the formulas, such as undefined variables.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    3

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The contribution of uncertainty-rectified label soft correction is novel: the authors estimate a class-conditional label-error map through the confident joint matrix and obtain an uncertainty map through Monte Carlo dropout. Both the estimated label-error map and the uncertainty map are then used to correct the pseudo labels. Although the proposed method targets the SFDA problem, it could benefit many self-training-based methods by alleviating noisy-pseudo-label issues. Moreover, it could also be applied to unsupervised domain adaptation and semi-supervised learning tasks. Hence, the proposed method is of great application value.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    2




Author Feedback

We are glad that the reviewers and ACs found our work “novel”, “of great interest in clinical applications” and of “great application value”, “well-written and enjoyable to read”, with a “well-formulated problem and extensive results” and “broader potential applications such as other self-training tasks or SSL”. We are also grateful that R2 suggested many interesting and relevant works on self-training-based SFUDA and noisy-label learning; we will go through these papers and investigate them further in our extended version. Regarding the definition of symbols: ‘out-of-sample’ means that the predicted probability from the target model is independent of the sample, i.e., the image and the pseudo label generated by the source model. The shape of P_{t}^{hat} is H×W×C, where C is the number of classes. Regarding some questions from R3:

  • v stands for an element of the tensor.
  • Yes, the source domain is divided into 50/51 images for training and testing, respectively.
  • Yes, the standard deviation is computed over all subjects in the test set; we will clarify this in the caption. The datasets are elaborated in Sec. 3.1. We thank the reviewers for all the constructive comments; other suggestions will be carefully considered for clearer presentation.
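
To make the ‘out-of-sample’ clarification above concrete, here is a minimal, hypothetical sketch of a confident-learning-style joint count between observed pseudo labels and latent labels, computed over flattened pixels. It follows the general confident-joint recipe (per-class mean-probability thresholds) and is not the authors' code; all names are illustrative.

```python
import numpy as np

def confident_joint(pseudo_labels, probs):
    """Sketch of the confident joint between observed and latent labels.

    pseudo_labels: (N,) observed pseudo labels for N (flattened) pixels.
    probs: (N, C) out-of-sample predicted probabilities, i.e. probabilities
           that are independent of the samples the pseudo labels came from.
    """
    N, C = probs.shape
    # Per-class confidence threshold: mean probability of class j over the
    # pixels currently labeled j (1.0 if a class never occurs).
    thresholds = np.array([
        probs[pseudo_labels == j, j].mean() if np.any(pseudo_labels == j) else 1.0
        for j in range(C)
    ])
    above = probs >= thresholds[None, :]      # (N, C) confident candidates
    keep = above.any(axis=1)                  # pixels confident for some class
    latent = np.where(above, probs, -np.inf).argmax(axis=1)
    cj = np.zeros((C, C), dtype=np.int64)
    for i in range(C):                        # observed label i
        for j in range(C):                    # estimated latent label j
            cj[i, j] = np.sum((pseudo_labels == i) & keep & (latent == j))
    return cj

# Toy usage with random probabilities:
rng = np.random.default_rng(0)
p = rng.dirichlet(np.ones(3), size=1000)      # fake (N=1000, C=3) probabilities
y = p.argmax(axis=1)                          # fake observed pseudo labels
print(confident_joint(y, p))                  # off-diagonal mass ~ label noise
```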


