
Authors

Wenjing Lu, Jiahao Lei, Peng Qiu, Rui Sheng, Jinhua Zhou, Xinwu Lu, Yang Yang

Abstract

Semi-supervised learning (SSL) has emerged as a promising approach for medical image segmentation, but its capacity is still limited by the difficulty of quantifying the reliability of unlabeled data and the lack of effective strategies for exploiting unlabeled regions with ambiguous predictions. To address these issues, we propose an Uncertainty-informed Prototype Consistency Learning (UPCoL) framework, which judiciously learns fused prototype representations from labeled and unlabeled data by incorporating an entropy-based uncertainty mask. The consistency constraint enforced on prototypes leads to a more discriminative and compact prototype representation for each class, thus optimizing the distribution of hidden embeddings. We experiment with two benchmark datasets for two-class semi-supervised segmentation, left atrium and pancreas, as well as a three-class multi-center dataset of type B aortic dissection. On all three datasets, UPCoL outperforms state-of-the-art SSL methods, demonstrating the efficacy of the uncertainty-informed prototype learning strategy.
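To make the entropy-based uncertainty mask concrete, here is a minimal PyTorch sketch (not the authors' implementation; the threshold `tau` and the tensor shapes are illustrative assumptions):

```python
import math

import torch
import torch.nn.functional as F

def entropy_uncertainty_mask(logits: torch.Tensor, tau: float = 0.75) -> torch.Tensor:
    """Threshold per-voxel prediction entropy into a reliability mask.

    logits: (B, C, D, H, W) raw network outputs for a 3D volume.
    tau:    fraction of the maximum entropy log(C); an assumed
            hyper-parameter, not a value taken from the paper.
    Returns a (B, D, H, W) boolean mask that is True where the
    prediction is considered reliable.
    """
    probs = F.softmax(logits, dim=1)
    entropy = -(probs * torch.log(probs + 1e-8)).sum(dim=1)  # (B, D, H, W)
    return entropy < tau * math.log(logits.shape[1])
```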

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43901-8_63

SharedIt: https://rdcu.be/dnwEb

Link to the code repository

https://github.com/VivienLu/UPCoL

Link to the dataset(s)

Pancreas dataset: https://wiki.cancerimagingarchive.net/display/Public/Pancreas-CT

Left atrium dataset: http://atriaseg2018.cardiacatlas.org

Type B Aorta Dissection dataset: https://github.com/XiaoweiXu/Dataset_Type-B-Aortic-Dissection


Reviews

Review #2

  • Please describe the contribution of the paper

    The paper presents a prototype-based approach for semi-supervised learning in medical image semantic segmentation. It uses uncertainty as a metric to learn better prototypes for the different classes within the datasets, and further enhances them via consistency regularization. The paper presents strong results across three different datasets from the medical imaging literature.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The paper is well-motivated and written for the reader to understand.
    • The paper contributes a new way to utilize uncertainty from the predicted class-wise distributions at the per-pixel level to learn better prototypes (see the sketch after this list).
    • The paper also contributes a variant of contrastive learning (Eqn. 6) to further improve the learnt mappings.
    • The proposed approach yields substantial performance gains over prior methods.
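    As a reading aid for the uncertainty-driven prototype idea above, the following is a minimal sketch of reliability-weighted prototype pooling (the shapes and the weighting scheme are illustrative assumptions, not the paper's exact formulation):

```python
import torch

def weighted_prototype(features: torch.Tensor,
                       class_mask: torch.Tensor,
                       reliability: torch.Tensor) -> torch.Tensor:
    """Pool the voxel embeddings of one class into a single prototype,
    weighting each voxel by its reliability (e.g. 1 - normalized entropy).

    features:    (N, F) voxel embeddings of one volume.
    class_mask:  (N,)   1.0 where the (pseudo-)label equals this class.
    reliability: (N,)   per-voxel reliability weight in [0, 1].
    """
    w = class_mask * reliability  # zero out voxels of other classes
    return (w.unsqueeze(1) * features).sum(dim=0) / w.sum().clamp_min(1e-8)
```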
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The paper states: “As the training progresses, the predicted labels become more and more reliable, and the quality of the unlabeled prototypes also gets improved.”

    There is no backing for this claim anywhere in the paper. It is assumed from the nature of learning in mean-teacher frameworks and, as such, should be properly cited.

    • The last point of comparison for this paper (across Table 1 and Table 2) is the URPC paper. URPC uses a 3D U-Net instead of a V-Net, and SGD instead of Adam. In addition, it is unclear whether the datasets were resampled to the URPC standard (96 × 96 × 96), as this is missing from the cited references for resampling.

    It is unclear to the reviewer whether the gains achieved over URPC, even after the contributions in this paper, result from a stronger set of hyper-parameters.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
    • The paper addresses the relevant points with respect to reproducibility.
  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    The paper is well written, clearly superseding the FussNet paper, which it uses as a foundation for uncertainty quantification. The proposed framework leads to better results across all three datasets used in the paper. The two main comments from this reviewer are given in Section 6 of the review.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is well-presented except for the points discussed in Section 6. The reviewer is inclined to change the recommendation based on the rebuttal.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #1

  • Please describe the contribution of the paper

    This paper proposes a novel uncertainty-informed prototype consistency learning framework for semi-supervised learning. The framework provides a solution for jointly learning from labeled and unlabeled data embeddings via a fused prototype learning scheme.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper proposes a novel uncertainty-informed prototype consistency learning framework for semi-supervised learning. Furthermore, the experiments on three datasets show the effectiveness and superiority of the proposed framework.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1. This paper does not provide validation that the proposed method can capture the embedding distribution by considering all voxels.

    2. Regarding the Uncertainty-informed Prototype Fusion part: why did the authors not choose the part with the highest confidence in the uncertainty map for prototype extraction on unlabeled data?

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The paper appears reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Please see the main weaknesses of the paper

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    There is a certain novelty in the method, and the experiments show the effectiveness and superiority of the proposed method.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper studies the semi-supervised medical image segmentation task. The paper proposes an Uncertainty-informed Prototype Consistency Learning (UPCoL) framework, which learns prototypes from labeled images to supervise the unlabeled images. The writing is acceptable and the experiments are complete. However, the proposed method is very similar to a recent related work that is not discussed, which heavily limits the novelty. I think the authors should clarify the difference in the revision. I may change my rating if the authors can answer my questions well.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Motivation is clear;
    2. Organization and writing are acceptable;
    3. Experiments are complete.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Weakness:

    1. My biggest concern is that the proposed method is very similar to [1]. [1] is the first work to propose learning prototypes from labeled images and using them to guide the learning of unlabeled images, but it is not discussed in this paper. In addition, the formulations, e.g., prototype generation with pooling and consistency learning based on cosine similarity, are also very similar to those of [1]. It seems that the authors directly introduce [1] to the medical field. It would be better if the authors discussed and compared with [1] in the revision and highlighted the differences.

    Refer:

    1. Querying Labeled for Unlabeled: Cross Image Semantic Consistency Guided Semi-Supervised Semantic Segmentation. TPAMI 2022
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    It would be better if the authors released their code for reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    This paper studies the semi-supervised medical image segmentation task. The paper proposes an Uncertainty-informed Prototype Consistency Learning (UPCoL) framework, which learns prototypes from labeled images to supervise the unlabeled images. The writing is acceptable and the experiments are complete. However, the proposed method is very similar to a recent related work that is not discussed, which heavily limits the novelty. I think the authors should clarify the difference in the revision. I may change my rating if the authors can answer my questions well.

    Pros:

    1. Motivation is clear;
    2. Organization and writing are acceptable;
    3. Experiments are complete.

    Weakness:

    1. My biggest concern is that the proposed method is very similar to [1]. [1] is the first work to propose learning prototypes from labeled images and using them to guide the learning of unlabeled images, but it is not discussed in this paper. In addition, the formulations, e.g., prototype generation with pooling and consistency learning based on cosine similarity, are also very similar to those of [1]. It seems that the authors directly introduce [1] to the medical field. It would be better if the authors discussed and compared with [1] in the revision and highlighted the differences.

    Refer:

    1. Querying Labeled for Unlabeled: Cross Image Semantic Consistency Guided Semi-Supervised Semantic Segmentation. TPAMI 2022

    Recommendation:

    1. The authors should discuss related papers and highlight the differences between their method and others, which would show the novelty clearly.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The motivation is clear and the writing is good. However, discussion of some related works is missing.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The paper proposes an uncertainty-informed prototype consistency learning framework for semi-supervised learning, which is shown to be effective and superior in experiments conducted on three datasets. The paper is well-written, and the proposed approach contributes a new way of utilizing uncertainty and a variant of contrastive learning to improve the learnt mappings. Reviewers find the motivation clear, the organization and writing acceptable, and the experiments complete.

    Although all reviewers have recommended weak accept, one reviewer noted that the proposed method is similar to a previous work and recommended discussing and comparing the method in the revision. Moreover, one reviewer pointed out that the paper lacks validation to demonstrate that the proposed method can capture the embedding distribution by considering all voxels, and another reviewer questioned why the authors did not choose the part with the highest confidence in the uncertainty map for prototype extraction on unlabeled data. In the final version, these need to be strengthened.




Author Feedback

We express our sincere gratitude to the reviewers and meta-reviewer for their valuable feedback. Their positive assessments of our paper as well-written (AC), satisfactory (R1), well-motivated (R2), and acceptable (R3) are truly encouraging. We are pleased that our approach is recognized as novel (R1), supported by complete experiments (R3), contributing a new way of utilizing uncertainty (AC), and yielding effective and superior results (AC, R2). We now provide detailed responses addressing the main concerns raised.

Q1: Lack of validation of the ability to capture the embedding distribution by considering all voxels. (R1)
A1: Through ablation experiments on prototype learning (Rows 4, 5, and 6 in Tab. 3), we have demonstrated the effectiveness of “UPCoL” in capturing the embedding distribution by considering all voxels. The superiority of “UPCoL” over “MT+PL” (which uses only labeled prototypes) highlights the importance of incorporating unlabeled voxel embeddings for improved segmentation. This demonstrates that considering all voxels allows a better capture of the embedding distribution than considering only labeled voxels. On the other hand, “CPCL*”, which utilizes both labeled and unlabeled prototypes, performs even worse than “MT+PL” with labeled prototypes alone. This emphasizes the significance of the uncertainty-informed fusion strategy employed by “UPCoL”, which better exploits the embedding distribution of unlabeled data while ensuring the quality of the labeled embedding distribution.

Q2: Why not choose the most confident part of the uncertainty map for prototype extraction on unlabeled data? (R1)
A2: The efficacy of selecting the most confident area of the uncertainty map may be restricted in two ways: 1) it can be challenging to define a reliable uncertainty metric for various tasks; 2) most consistency constraints assume low-density decision boundaries, overlooking the potential of the feature space of unlabeled data. As a result, we have chosen an alternative approach that accounts for the overall distribution of unlabeled data and employs uncertainty-based attention to extract prototypes.

Q3: Lack of evidence to support the claim of improved predicted labels and unlabeled prototypes, requiring proper citation. (R2)
A3: We have provided visualized results in the supplementary information (SI) showing that the predicted labels and unlabeled prototypes improve as training progresses. In the final version, we will include appropriate citations to further support our findings and provide a comprehensive understanding of the MT framework. Fig. 1 in SI demonstrates that, as training progresses, the model output and the prototype-based prediction of unlabeled samples become more reliable. Fig. 2 in SI further confirms the overall increase in voxel reliability of unlabeled samples over iterations.

Q4: Missing cited references for resampling and settings for reproducing URPC for comparison. (R2)
A4: The pancreas dataset was used with the URPC standard size (96 × 96 × 96), and the TBAD dataset was resampled to (128 × 128 × 128) with reference to related studies. Optimization was performed using Adam.

Q5: Need to discuss and compare with [1] in the revision and highlight the difference. (R3)
A5: We will add the discussion of [1] in the final version. UPCoL differs from [1] by: a) using both labeled and unlabeled data for prototype generation, whereas [1] relies solely on labeled data to generate prototypes; b) employing uncertainty-based attention pooling specifically for unlabeled prototypes; and c) utilizing reliability-based weights instead of distance-based weights in consistency learning. Using both labeled and unlabeled data is a fundamental difference between UPCoL and [1].

Q6: Release the code for reproducibility. (R3)
A6: The code will be released on GitHub.

We sincerely appreciate the reviewers’ valuable feedback. We will diligently incorporate their comments to enhance our work. Their time and effort are greatly acknowledged.
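To illustrate point (c) of A5, a reliability-weighted prototype consistency loss could look like the following sketch (all names, shapes, and the weighting scheme are illustrative assumptions, not the released implementation):

```python
import torch
import torch.nn.functional as F

def prototype_consistency_loss(features: torch.Tensor,
                               prototypes: torch.Tensor,
                               pseudo_labels: torch.Tensor,
                               reliability: torch.Tensor) -> torch.Tensor:
    """Reliability-weighted consistency between voxel embeddings and
    fused class prototypes, scored by cosine similarity.

    features:      (N, F) voxel embeddings.
    prototypes:    (C, F) fused class prototypes.
    pseudo_labels: (N,)   hard per-voxel predictions (long dtype).
    reliability:   (N,)   per-voxel reliability in [0, 1].
    """
    # Cosine similarity of every voxel embedding to every prototype: (N, C).
    sims = F.cosine_similarity(features.unsqueeze(1), prototypes.unsqueeze(0), dim=2)
    log_probs = F.log_softmax(sims, dim=1)
    nll = F.nll_loss(log_probs, pseudo_labels, reduction="none")  # (N,)
    # Reliable voxels contribute more; unreliable ones are down-weighted.
    return (reliability * nll).sum() / reliability.sum().clamp_min(1e-8)
```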


