
Authors

Peng Liu, Guoyan Zheng

Abstract

Medical image segmentation is a prerequisite for many clinical applications, including disease diagnosis, surgical planning, and computer-assisted interventions. Because expert-level, densely annotated multi-organ datasets are difficult to obtain, existing datasets for multi-organ segmentation either contain a small number of samples or annotate only a few organs rather than all of them; the latter are termed partially labeled data. Previous work has developed label-efficient segmentation methods that use these partially labeled datasets to improve multi-organ segmentation. However, most of these methods use only the labeled information in the datasets and do not take advantage of the large amount of unlabeled data. To this end, we propose a context-aware voxel-wise contrastive learning method that takes full advantage of both labeled and unlabeled data in partially labeled datasets to improve multi-organ segmentation performance. Experimental results demonstrated that our proposed method outperformed other state-of-the-art methods.

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16440-8_62

SharedIt: https://rdcu.be/cVRwP

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper develops a voxel-wise contrastive learning (CL) method that uses both labeled and unlabeled data in partially labeled datasets to improve multi-organ segmentation.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    – This work aims to solve partially labeled multi-organ segmentation, which is an important problem in medical image segmentation.

    – The contrastive learning loss is applied for unlabeled voxels to enhance the learned features.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    – The methodological contribution seems relatively minor. The main contribution is adding a contrastive learning loss for unlabeled voxels. In addition, the organization and description of the paper are not clear: it is hard to see how the authors handle the partially labeled dataset using CL and why it works. There are also too many math symbols in the main text, which hurts readability.

    – There are several major concerns about the experimental evaluation. (1) The description of the experimental setup is not clear. It seems that the baseline nnUNet results on D1 to D4 come from direct inference. If that is the case, the results of nnUNet retrained on D1 to D4 should also be reported to show the upper limit of training directly on the target single-organ dataset. (2) The improvement over previous work is minor, e.g., a 1.7% Dice improvement over [12]. This kind of improvement might come from the nnUNet backbone rather than from the proposed CL method. In other words, if [12] were also equipped with nnUNet, what would the results be? (3) The ablation study on the network architecture is odd. When the original UNet is used as the baseline to segment D0, it achieves better performance than nnUNet in both Dice and HD95. I do not think this can be true. The UNet training details are also not described.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    I did not check the reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    Please see my detailed comments above.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Considering that the paper’s methodological contribution is relatively minor, that there are quite a few major concerns about the experimental evaluation, and that the clarity of the work is relatively low, I recommend weak reject.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    3

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered



Review #2

  • Please describe the contribution of the paper

    1) A novel loss function that can be used to train a multi-organ segmentation network based on both labeled and unlabeled information in partially labeled datasets. 2) A contrastive learning method is proposed to learn better feature representation. 3) Comprehensive study for the proposed method.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1) The proposed method outperforms previous state-of-the-art loss functions [12], [21] on the task of partially labeled segmentation. 2) The motivation to learn from the unlabeled information is good.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) To leverage the unlabeled information, the authors propose a new training approach: extract two overlapping patches and use a contrastive loss to minimize the difference between features of the corresponding area. The motivation is that most previous methods adopted a patch/subvolume-based strategy. However, for organ segmentation in CT volumes, it is not always necessary to adopt subvolume-wise patches for training. 2) The work reads more like incremental work. Contrastive learning for enhancing feature representation is not new in the medical imaging area.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    No code is provided. But the paper provides the details for reimplementing the work.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    The motivation of introducing unlabeled information as complementary information for training is interesting. The proposed use of contrastive learning for better feature representation also proves helpful in network training. However, several weaknesses have to be addressed: a) the motivation to introduce contrastive learning is weak. b) Some typos, e.g. “it still challenging” -> “is still challenging” in Section 1.3. c) Statistical tests are required to demonstrate the performance improvement.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    1) The motivation to introduce contrastive learning is weak. It is not always necessary to sample two overlapping patches for training in organ segmentation. 2) The innovation is not sufficient. Contrastive learning has already been shown to be useful for feature representation learning. The work is incremental.

  • Number of papers in your stack

    4

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered



Review #3

  • Please describe the contribution of the paper

    The authors propose a context-aware voxel-wise contrastive learning method to train a network on partially labeled datasets.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The motivation (utilizing partially labeled datasets) is practical.
    2. The idea of “context-aware” seems interesting. The authors want the same voxels from different patches (contexts) to form the positive pairs.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The effectiveness of “context-aware”, i.e., pulling the features of a voxel from different patches closer, needs to be justified in the ablation studies. What if the positive pairs were instead built from the same voxels in rotation-augmented patches rather than from different cropping areas?
    2. The authors mention a recent work, DoDNet (Ref. [13]), which is also designed for partially labeled deep learning segmentation. However, a comparison with this method is lacking.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors will make the code available upon acceptance of paper.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    1. As mentioned in the weaknesses part, it would be good to add ablation studies on the effectiveness of “context-aware”, as well as a comparison to DoDNet.
    2. Some minor corrections (1) “a organ” -> “an organ” (2) Introduction section: “Alternative, one can design” -> “Alternatively, one can design” (3) Introduction section: “it still challenging” -> “it is still challenging”
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The use of contrastive learning in partially labeled segmentation problem and the improvement of the performance.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    4

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper receives mixed scores: 2 weak rejects and 1 weak accept. All reviewers acknowledge the importance of addressing the unlabeled entries in partially labeled datasets and of using a context-aware contrastive learning method to do so. However, the most critical issue is the lack of technical novelty: the key contribution of using contrastive learning to enhance unlabeled voxels seems to have already been extensively studied in the medical area, making the proposed approach somewhat incremental. Other issues include the missing comparison with DoDNet, the unclear experimental setup, multiple typos, etc. Given the importance of the studied topic and the potential of the approach, the meta-reviewer decides to invite this paper for a rebuttal and hopes the authors carefully address all concerns from the reviewers during the rebuttal phase to improve the quality of this paper.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    6




Author Feedback

We thank the meta-reviewer (MR) and all reviewers for their comments.

MR, R1&2: Incremental novelty. As far as we know, we are the first to introduce a context-aware voxel-wise contrastive loss (CAVWCL) to address the partially labeled multi-organ segmentation problem. This is our main contribution, and it is not incremental. Additionally, our CAVWCL differs from existing contrastive losses (ECLs) in three ways: 1) to allow for semi-supervised segmentation (SSS) of partially labeled data (PLD), we propose context-aware consistency (CAC) between voxels under different sampled patches to make models robust to contextual variance; 2) for contextual alignment, we design a directional contrastive loss (DCL), which applies contrastive learning in a voxel-wise manner, whereas ECLs are applied to image-level features; and 3) ECLs don’t consider the confidence of features and simply align them with each other bilaterally, which may even corrupt the better feature by forcing it to align towards the worse one. In contrast, our DCL forces the less confident feature to be aligned towards the more confident counterpart.
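
The directional idea in point 3 can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the per-voxel confidence inputs, and the use of a plain cosine distance in place of a full contrastive term with negatives are all assumptions made for illustration. For each voxel in the overlap of two sampled patches, the more confident view is picked as the alignment target, so only the other view would be pulled towards it.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def directional_alignment(feats_a, conf_a, feats_b, conf_b):
    """Hypothetical helper: for each voxel seen in two patches (views a and b),
    report which view acts as the alignment target and how far apart the
    two features are.

    feats_a, feats_b: lists of feature vectors, one per overlapping voxel.
    conf_a, conf_b:   lists of confidences (e.g. max class probability).
    """
    results = []
    for fa, ca, fb, cb in zip(feats_a, conf_a, feats_b, conf_b):
        # The more confident view is the target; the other view is the one
        # that would be moved. Ties go to view "a" (an arbitrary choice).
        target = "a" if ca >= cb else "b"
        results.append((target, cosine_distance(fa, fb)))
    return results


# Two overlapping voxels with 2-dimensional toy features.
pairs = directional_alignment(
    feats_a=[[1.0, 0.0], [0.0, 1.0]],
    conf_a=[0.9, 0.4],
    feats_b=[[1.0, 0.0], [1.0, 0.0]],
    conf_b=[0.6, 0.8],
)
for target, dist in pairs:
    print(f"target view: {target}, distance: {dist:.2f}")
```

In a training framework, the target feature would presumably be treated as a constant when computing gradients, so that only the less confident feature moves; that detail is outside what this sketch shows.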

MR, R3: Comparison with DoDNet. We did not compare with DoDNet [13] because this method can only use PLD (D1-D3) to train the model. In order to compare with DoDNet, we split D0 and D4 into multiple single-class datasets and used them together with D1-D3 to train DoDNet. Evaluated on the same test data, DoDNet achieved a mean Dice of 90.5% and a mean HD95 of 2.81mm. In contrast, our method achieved a mean Dice of 92.5% and a mean HD95 of 2.24mm.

MR, R1: Experimental setup. As nnUNet is not designed for SSS of PLD, in our study we train nnUNet on D0 and directly apply the trained model to D1-D4. It is true that we can train nnUNet on the training split of each dataset and then test it on the testing split of the same dataset in order to get an upper bound (UB). We tried this idea and trained nnUNet separately on D1 and D3 to get the UBs for liver and pancreas, respectively. In terms of Dice, the UBs for liver and pancreas are 95.5% and 80.3%, respectively, which are worse than our results (95.7% vs. 95.5% for liver and 83.6% vs. 80.3% for pancreas). This can be explained by the fact that, by incorporating CAVWCL, our method can not only use more labeled data (e.g., D0+D1 for liver and D0+D3 for pancreas) but also utilize the unlabeled voxels of D1 and D3 to boost the segmentation performance, which is a clear advantage.

MR, R2&3: Typos. We will fix them.

R1: How CL handles partially labeled data. For each 3D patch sampled from PLD, we can divide its voxels into a labeled part and an unlabeled part according to whether a voxel has an annotation or not. For the labeled part, we use a supervised loss to guide the learning process. For the unlabeled part, we use our CAVWCL to guide the learning process. In Section 2.2, we presented how to use CAVWCL for SSS of PLD, which is based on context-aware consistency regularization. See Fig. 1 for an illustration.
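
The labeled/unlabeled split described above can be sketched as follows. This is only an illustration under stated assumptions: the `UNLABELED` marker, the function name, the per-voxel loss inputs, and the weighting factor are hypothetical and are not taken from the paper. Per-voxel supervised terms are averaged over annotated voxels, per-voxel consistency terms are averaged over unannotated voxels, and the two averages are summed.

```python
UNLABELED = -1  # hypothetical marker for voxels without an annotation


def combined_loss(labels, supervised_terms, consistency_terms, weight=1.0):
    """Combine per-voxel losses for one flattened patch.

    labels:            class index per voxel, or UNLABELED.
    supervised_terms:  per-voxel supervised loss (only used where labeled).
    consistency_terms: per-voxel consistency loss (only used where unlabeled).
    weight:            assumed trade-off factor for the unlabeled part.
    """
    labeled = [i for i, y in enumerate(labels) if y != UNLABELED]
    unlabeled = [i for i, y in enumerate(labels) if y == UNLABELED]

    # Average each part over its own voxels; an empty part contributes 0.
    sup = sum(supervised_terms[i] for i in labeled) / len(labeled) if labeled else 0.0
    con = sum(consistency_terms[i] for i in unlabeled) / len(unlabeled) if unlabeled else 0.0

    return sup + weight * con, len(labeled), len(unlabeled)


# Four voxels: two annotated, two not. The 9.0 entries are placeholders that
# must be ignored because they belong to the other part.
total, n_lab, n_unlab = combined_loss(
    labels=[1, UNLABELED, 0, UNLABELED],
    supervised_terms=[0.2, 9.0, 0.4, 9.0],
    consistency_terms=[9.0, 0.1, 9.0, 0.3],
    weight=0.5,
)
print(n_lab, n_unlab, round(total, 3))
```

Masking by annotation status keeps a single patch usable even when only one organ is annotated in its source dataset, which is the situation the rebuttal describes.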

R1&2: Improvement over [12] is minor. Please note that [12] also used nnUNet as its backbone. We performed a paired t-test on Dice to compare the results achieved by [12] with ours. We obtained a p-value of 0.028, which demonstrates that the improvement is statistically significant.
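
For readers unfamiliar with the test, a paired t-test compares two methods on the same cases by looking at the per-case differences. The sketch below computes the test statistic with the Python standard library only. The Dice values are invented for illustration and are not results from the paper or from [12]; the function name is likewise hypothetical. Turning the statistic into a p-value requires the Student t distribution with n - 1 degrees of freedom, which a statistics package can provide.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for paired samples.

    t = mean(d) / (stdev(d) / sqrt(n)), where d are per-case differences
    and stdev is the sample standard deviation.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)
    return mean_d / (sd_d / math.sqrt(n)), n - 1


# Hypothetical per-case Dice scores for two methods on the same four cases.
method_a = [0.93, 0.91, 0.95, 0.90]
method_b = [0.92, 0.89, 0.92, 0.90]
t_value, dof = paired_t_statistic(method_a, method_b)
print(f"t = {t_value:.3f}, degrees of freedom = {dof}")
```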

R1: Weird ablation study results. The nnUNet baseline was trained only on the fully labeled dataset D0. In contrast, for the UNet baseline in the ablation study, in order to evaluate the effectiveness of our proposed CAVWCL, we modified the loss such that it can be trained on D0+D1+D2+D3+D4. Since the UNet baseline was trained on more data, its performance is better than that of the nnUNet baseline.

R2: Why use patches for training. To segment 3D volume data under GPU memory limitations, patch-based methods are often used; they save GPU memory and also serve as a form of data augmentation.

R3: How about rotation-augmented patches? Rotation is a low-level augmentation that does not change the contextual cues much, because computing consistency on a voxel requires rotating it back.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The rebuttal provides an additional comparison with DoDNet and a statistical test of the improvement over [12], which look quite solid. Apart from the empirical results, the authors also clearly state the technical contribution that differentiates the work from existing contrastive learning works, which is to use context-aware learning modules along with tailored loss terms to address the multi-organ segmentation problem, an important medical imaging task. Overall, the rebuttal addresses the major concerns from all reviews and looks very solid. I recommend accept.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    10



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This work introduces an effective approach to learn from partially segmented datasets. There were concerns as to novelty, but I think these concerns can be largely assuaged by the interesting application of existing contrastive learning approaches and the tailoring that was done for this particular problem. I found that the authors addressed most concerns convincingly, and the experiments do demonstrate performance improvements over SOTA. However, I do not agree with them that nnUNet can only be applied to fully supervised data, as it can be adapted to admit any appropriate segmentation loss, even unsupervised ones. This, combined with clarity concerns, makes me consider this work a borderline accept.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    8



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    After carefully reading the paper, the reviews and the responses from the authors, I side with the reviewers in that the technical contribution is rather limited. The authors argue that their main contribution is being the first to use the context-aware voxel-wise contrastive loss (CAVWCL) for the partially labeled multi-organ segmentation problem, which relies on what they present as ‘their directional contrastive loss (DCL)’. However, this loss was already proposed in [a], which the authors fail to reference. The work in [a] focuses on semi-supervised segmentation, where the DCL term is applied over unlabeled pixels. This is exactly the same as the term proposed in this work. Therefore, I refute the authors' arguments regarding novelty and confirm that the technical contribution is very marginal, if any. Furthermore, the authors argue that the proposed loss differs from existing contrastive losses in three ways, but this is never supported by the empirical validation. Given that the experimental section is not exhaustive and the technical contribution is marginal, and that the relevant work from which this paper is inspired is not properly acknowledged, I recommend rejection.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Reject

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    NR


