
Authors

Yuhui Song, Xiuquan Du, Yanping Zhang, Chenchu Xu

Abstract

Despite the remarkable achievements made by deep convolutional neural networks in medical image segmentation, the limitation that they rely heavily on high-precision and intensively annotated samples makes it difficult to adapt to novel classes that have not been seen before. Few-shot learning is introduced to solve these challenges by learning the generalized representation of a semantic class from very few annotated support samples that can be used as a reference for unannotated query samples. In this paper, instead of averaging multiple support prototypes, we propose a multi-shot prototype contrastive learning and semantic reasoning network (MPSNet) for medical image segmentation. The multi-shot learning network exists independently within the support set, obtains effective semantic features for support images and gives priority to training the core segmentation model of prototype contrastive learning. We also propose a semantic reasoning network that takes the prior semantic features and prior segmentation model learned from the support set as the immediate and necessary conditions for the query image to deduce its segmentation mask. The proposed method is verified to be superior to the state-of-the-art methods on three public datasets, revealing its powerful segmentation and generalization abilities. Code: https://github.com/H51705/FSS_MPSNet.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43901-8_55

SharedIt: https://rdcu.be/dnwD3

Link to the code repository

https://github.com/H51705/FSS_MPSNet

Link to the dataset(s)

N/A


Reviews

Review #3

  • Please describe the contribution of the paper

    This paper proposes a multi-shot prototype contrastive learning strategy to enhance segmentation robustness with limited training samples and fully supervised co-training.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The problem setup is interesting, and the idea seems to demonstrate significant improvement by extending the averaged-prototype approach to a multi-shot approach with multiple support images. Sufficient experiments are presented to show the significant improvement over the current baselines.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The method figure (figure 1) is difficult to follow with the clarity in method section. I had to read it several times to finally grasp the novelty.

    Another major concern is the generation of foreground predictions. Although the computation of the mask prediction is from equation 5, I am wondering of what is the advantages of using equation 5, instead a decoder network? Is your method limited to do single organ segmentation each time? The method section needs to be clearer and use simpler wording.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Good reproducibility

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Major comments: 1) Link a simple description to the method figure using simpler words. It is pretty hard to follow. More equations do not mean that the innovation is clearly presented.

    2) Why is the encoder network ResNet101? Does the generic backbone design further enhance the performance? More ablation studies with different generic network structures may be needed to evaluate the generalizability of your proposed approach.

    3) The description of the semantic reasoning network is really confusing. As in the introduction, “By learning the generalized representation of a certain semantic class from a few labeled samples (i.e., support images) to guide the segmentation of that class in a large number of unlabeled images (i.e., query images).”. If the query images are referring to the unlabeled images, why there is query mask can compute a loss for backpropagation? More clarity is needed.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Lack of clarity between the written description of the innovation and the figures, although the performance seems to improve significantly.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #1

  • Please describe the contribution of the paper

    The authors describe a novel method for few-shot medical image segmentation based on prototypical contrastive learning. A thorough experimental procedure is conducted on multiple datasets and against strong baselines, reaching results that are very competitive with, if not superior to, the SOTA.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Overall the paper is very well written, with just some minor details remaining to be fixed.

    The methodology is sound and innovative. The manuscript presents an architecture with some quite novel features, such as the idea of prototype contrastive learning and the clever use of support and query sets, departing from traditional few-shot methods that use them in more naive strategies.

    Results are compelling and show rather competitive results with strong baselines for few-shot medical image segmentation.

    Reproducibility should be very easy with code made available and only public datasets being used for the evaluation.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    My concerns with this manuscript are very small, as I did consider this to be an interesting read and a well-written one.

    Some mathematical notation and a few points about the structuring of the test can be improved, though. More details are given in my constructive comments.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The method seems relatively straightforward and the pipeline is not overly complicated. Additionally, code is made available and all datasets are public, so there should be no concerns regarding reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    • Defining acronyms in the abstract should be avoided.

    “Especially for medical images with lower image intensity and contrast, prototype alignment regularization and the mean prototype strategy do not significantly increase the discriminant ability and prediction accuracy of prototypes.”

    • This should be backed up by either experiments or a citation.

    • Figure 1 could be placed in page 4 or below in order to be closer to the text description of its inner modules.

    “$C$ denotes the channel number of the feature maps” “Assumes that the semantic classes in $D_{train}$ is $C_{train}$ and in $D_{test}$ is $C_{test}$”

    • Some notation in Section 2 might be confusing: $C$ and $C_{train}$/$C_{test}$ are easily mixed up. I suggest that the authors use $K_{train}$ and $K_{test}$, or something similar, to avoid confusion between these notations.

    • As PCLM is an inner module of both MLN and SRN, Section 2.3 should be placed before Section 2.2 for a more readable bottom-up presentation of the architecture.

    • As there is still space for writing in the MICCAI template, authors could extend the conclusion, which is quite short. Additionally, it would be helpful to include a graphical abstract.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    8

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper should meet all criteria for publication in MICCAI, including a well-motivated problem, an intuitive and sound methodology, very easy reproducibility, good clarity, novelty, and strong results on multiple datasets and in comparison to multiple baselines.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    This article addresses few-shot segmentation of medical images. It introduces a multi-shot learning network that mines patterns from support images which serve as guidance for query image segmentation. The method is validated on three datasets and results look promising.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The experimental design is comprehensive and the reported results are of high quality. The authors have provided a link to the code, which potentially enhances the reproducibility.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The manuscript requires improvement in its writing quality and figure explanations. Figure 1 could be re-drawn or more clearly explained, as it currently does not effectively aid comprehension of the method.

    One major insight of the method is treating the prototype of each support image as a classifier. However, this is not novel and has been studied in prior prototype-based segmentation works such as “Rethinking Semantic Segmentation: A Prototype View”. It is essential to discuss relevant works so as to accurately position the study in the literature.

    Additionally, the “prototype contrastive learning module” needs clarification as it does not appear to involve any learning process beyond the segmentation losses.

    Some statements appear overclaimed, such as the method being a “new breakthrough” in medical image analysis. The method follows the prototype-aware few-shot segmentation paradigm. It introduces some improvements, but they are not fundamental. Overclaiming the contributions should be avoided.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Given the availability of code, it appears that reproducibility of the presented results could be relatively straightforward.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Please see the weaknesses listed above.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper presents an approach to few-shot segmentation (FSS) of medical images. It builds on the existing prototype-aware FSS paradigm and introduces a multi-shot learning strategy to mine more informative features from support images. The effectiveness of the method finds evidence in the experiments. Nonetheless, the contributions are still incremental in comparison to the existing literature. Moreover, the writing should be significantly improved to enhance clarity and readability.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper proposes a novel approach for few-shot medical image segmentation using prototypical contrastive learning. The validation on three datasets shows promising results. All reviewers praised the methodology, which departs from traditional few-shot methods, as well as the compelling and competitive results achieved with strong baselines. The experimental design is comprehensive, and the authors have provided a link to the code for potential reproducibility.

    However, Reviewer 1 suggests some improvements in mathematical notation and test structuring. Reviewer 2 suggests improvements in writing quality and figure explanations, and highlights the need to clarify the novelty of the proposed method and avoid over-claiming contributions. Reviewer 3 finds the method figure difficult to follow and calls for more clarity in the method section, particularly in the generation of foreground predictions. In particular, the authors need to carefully revise Figure 1 and provide sufficient responses to the major comments from Reviewer 3 in the final camera-ready version.




Author Feedback

We thank the meta-reviewer and all reviewers for recognizing the strengths of our paper (“All reviewers praised the methodology, … the compelling and competitive results”).

  1. Meaningful motivations (“Well-motivated”-R1).
  2. Novel innovations (“The methodology is sound and innovative”-R1).
  3. Substantial improvements (“Results are compelling and show rather competitive results”-R1, “The experimental design is comprehensive and the reported results are of high quality, the contributions are incremental”-R2, “The problem setup is interesting, and the idea seems to demonstrate significant improvement”-R3).
  4. Good writing (“My concerns with this manuscript are very small, as I did consider this to be an interesting read and a well-written”-R1).

All the constructive suggestions will be adopted and the writing will be checked carefully in the final version.

The questions are clarified below:

Q1: “Figure 1 could be re-drawn or more clearly explained, as it currently does not effectively aid comprehension of the method”-R2, “The method figure (figure 1) is difficult to follow with the clarity in method section”-R3. A1: Thank you for your valuable advice; we are very sorry for the difficulty this caused in reading. The legend in Figure 1 describes the network inputs, outputs, important methods and parameters. To make our method easier to understand, we will redraw Figure 1 in the final version and describe the method flow in a more direct and effective way.

Q2: “The “prototype contrastive learning module” needs clarification as it does not appear to involve any learning process beyond the segmentation losses”-R2. A2: Thank you for raising this question. The prototype contrastive learning module (PCLM) is structurally a non-parametric module, but it is embedded in complete deep learning networks, namely MLN and SRN. In the training stage, both MLN and SRN are trained with segmentation losses. The input of PCLM is the set of multi-shot prototypes formed from the support features; because the support features are updated throughout training, the multi-shot prototypes are updated as well. Thus, a continuous contrastive learning relationship is formed between the multi-shot prototypes and the hypothetical prototypes (please see Figure 1). This learning process is driven by the segmentation losses in MLN and SRN.

Q3: “Although the computation of the mask prediction is from equation 5, I am wondering of what is the advantages of using equation 5, instead a decoder network? Is your method limited to do single organ segmentation each time?”-R3. A3: Thank you for these meaningful questions. We address them separately. 1) The mask prediction obtained by equation 5 is directly equivalent to the output of the activation layer in a decoder network. Therefore, using equation 5 to obtain the mask prediction saves computing time compared with using a decoder network. 2) Like the SOTA methods, our method follows the “1-way 1-shot” few-shot learning paradigm, which segments one organ at a time. However, our method is not limited to segmenting one organ at a time when the dataset allows.
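As an illustration of the kind of decoder-free prediction discussed in A3, the following is a minimal sketch of prototype-based mask prediction as it is commonly done in prototype-style few-shot segmentation. Equation 5 itself is not reproduced on this page, so this is not the authors' formula: the use of masked average pooling, cosine similarity, a two-way softmax, and the names `masked_average_prototype`, `foreground_probability` and `scale` are all assumptions made for illustration.

```python
import math

def masked_average_prototype(features, mask):
    """Mean feature vector over the pixels selected by a binary mask."""
    selected = [vec for vec, m in zip(features, mask) if m]
    if not selected:
        raise ValueError("mask selects no pixels")
    dim = len(selected[0])
    return [sum(vec[i] for vec in selected) / len(selected) for i in range(dim)]

def cosine(a, b, eps=1e-8):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + eps)

def foreground_probability(query_features, fg_proto, bg_proto, scale=20.0):
    """Two-way softmax over scaled cosine similarities (hypothetical choice).

    No learnable weights are involved: the prediction is a fixed function of
    the encoder features and the prototypes, so no decoder is needed.
    """
    probs = []
    for vec in query_features:
        margin = scale * (cosine(vec, fg_proto) - cosine(vec, bg_proto))
        probs.append(1.0 / (1.0 + math.exp(-margin)))
    return probs

# Toy data: each "pixel" is a 2-D feature vector (flattened image).
support_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
support_mask = [1, 1, 0, 0]

fg_proto = masked_average_prototype(support_features, support_mask)
bg_proto = masked_average_prototype(support_features, [1 - m for m in support_mask])

query_features = [[0.8, 0.2], [0.2, 0.8], [1.0, 0.1]]
probs = foreground_probability(query_features, fg_proto, bg_proto)
pred_mask = [int(p > 0.5) for p in probs]
print(pred_mask)
```

In this sketch the only cost at prediction time is one similarity computation per pixel and prototype, which is the sense in which a prototype-based head can be cheaper than a learned decoder.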

Q4: “If the query images are referring to the unlabeled images, why there is query mask can compute a loss for backpropagation?”-R3. A4: Thank you for raising this point of confusion, which gives us the opportunity to improve our paper. The SRN shown in Figure 1 depicts the training stage. Specifically, in the training stage, both the support image and the query image are labeled. The label of the support image is used as an input to the network, and the label of the query image is used as the supervisory signal for the final segmentation result. In the testing stage, the query image has no label, and it is segmented by the trained few-shot segmentation model. Following your suggestion, we will clarify this in the final version.
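The distinction drawn in A4 between the training and testing stages can be sketched as a single episode function in which the query mask is optional. This is a hedged illustration only: `run_episode`, `toy_predict` and the binary cross-entropy loss are hypothetical stand-ins, not the authors' SRN or their loss.

```python
import math

def binary_cross_entropy(probs, labels, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)

def toy_predict(support_image, support_mask, query_image):
    """Hypothetical stand-in model: thresholds query intensities at the midpoint
    between the support foreground and background mean intensities."""
    fg = [p for p, m in zip(support_image, support_mask) if m]
    bg = [p for p, m in zip(support_image, support_mask) if not m]
    midpoint = (sum(fg) / len(fg) + sum(bg) / len(bg)) / 2.0
    return [1.0 / (1.0 + math.exp(-10.0 * (q - midpoint))) for q in query_image]

def run_episode(predict, support_image, support_mask, query_image, query_mask=None):
    """Run one support/query episode.

    The support mask is always an input. The query mask is used only as
    supervision: when it is given (training) a loss is returned, and when it
    is None (testing) only the prediction is returned.
    """
    probs = predict(support_image, support_mask, query_image)
    loss = None if query_mask is None else binary_cross_entropy(probs, query_mask)
    return probs, loss

support_image = [0.9, 0.8, 0.1, 0.2]
support_mask = [1, 1, 0, 0]
query_image = [0.7, 0.3]

# Training stage: the query label supervises the prediction.
train_probs, train_loss = run_episode(
    toy_predict, support_image, support_mask, query_image, query_mask=[1, 0]
)
# Testing stage: the query image is unlabeled, so no loss is computed.
test_probs, test_loss = run_episode(toy_predict, support_image, support_mask, query_image)
print(train_loss, test_loss)
```

The prediction path is identical in both calls; only the presence of the query label changes whether a loss is available for backpropagation.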


