Authors

Tianyi Ling, Chengyi Wu, Huan Yu, Tian Cai, Da Wang, Yincong Zhou, Ming Chen, Kefeng Ding

Abstract

Colorectal polyps detected during colonoscopy are strongly associated with colorectal cancer, making polyp segmentation a critical clinical decision-making tool for diagnosis and treatment planning. However, accurate polyp segmentation remains a challenging task, particularly in cases involving diminutive polyps and other intestinal substances that produce a high false-positive rate. Previous polyp segmentation networks based on supervised binary masks may have lacked global semantic perception of polyps, resulting in a loss of capture and discrimination capability for polyps in complex scenarios. To address this issue, we propose a novel Gaussian-Probabilistic guided semantic fusion method that progressively fuses the probability information of polyp positions with the decoder supervised by binary masks. Our Probabilistic Modeling Ensemble Vision Transformer Network (PETNet) effectively suppresses noise in features and significantly improves expressive capabilities at both pixel and instance levels, using just simple types of convolutional decoders. Extensive experiments on five widely adopted datasets show that PETNet outperforms existing methods in identifying polyp camouflage, appearance changes, and small polyp scenes, and achieves a speed of about 27 FPS on edge computing devices. Codes are available at: https://github.com/Seasonsling/PETNet.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43990-2_54

SharedIt: https://rdcu.be/dnwL9

Link to the code repository

https://github.com/Seasonsling/PETNet

Link to the dataset(s)

CVC-ClinicDB: https://polyp.grand-challenge.org/CVCClinicDB/

Kvasir-SEG: https://datasets.simula.no/kvasir-seg/

CVC-ColonDB: http://mv.cvc.uab.es/projects/colon-qa/cvc-colondb

CVC-300: http://adas.cvc.uab.es/endoscene

ETIS-larib: https://polyp.grand-challenge.org/EtisLarib/


Reviews

Review #1

  • Please describe the contribution of the paper

    The authors of this manuscript propose a new polyp segmentation method, a Gaussian-Probabilistic guided semantic fusion method that progressively fuses the probability information of polyp positions with the decoder supervised by binary masks. Their Probabilistic Modelling Ensemble Vision Transformer Network (PETNet) effectively suppresses noise in features and significantly improves expressive capabilities at both pixel and instance levels, using just simple types of convolutional decoders. Extensive experiments on five widely adopted datasets show that PETNet outperforms existing methods in identifying polyp camouflage, appearance changes, and small polyp scenes, and achieves a speed of about 27 FPS on edge computing devices (NVIDIA Jetson).

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper is well-written and -organised. The figures are clearly presented.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    There are several minor weaknesses:

    1. In Figure 2, what do the black lines mean? What is the difference between the black and red dotted lines?
    2. Could the authors compare the runtime of PETNet with that of other competitors on an Nvidia Jetson device?
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors claimed “The source code will be available upon acceptance of the paper” in the abstract.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Please refer to comment#6

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper is well presented. The authors build three core modules for polyp segmentation. It has good novelty and motivation. Thus, the reviewer recommends accepting this paper.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    This work proposes a novel Gaussian-Probabilistic guided semantic fusion method that progressively fuses the probability information of polyp positions with the decoder supervised by binary masks.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    A novel transformer-based polyp segmentation framework is proposed to address the aforementioned challenges and achieves satisfactory performance in locating polyps with high precision.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    (1) Almost all the comparative methods used in the manuscript are outdated.

    (2) The novelty is limited.

    (3) Some relevant works are missing.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    n/o

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    (1) More related works for Polyp Segmentation should be added to the related work sections. Almost all the comparative methods used in the manuscript are outdated.

    (2) The novelty is limited. The proposed approach may have been appropriate at the time of your research, but recent advancements in the field suggest that newer methods could yield more robust results.

    (3) Some relevant works are missing. Most of the cited works were published before 2021. The authors should add some advanced and recent work published in more reliable journals or conferences.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    3

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Some relevant works are missing.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper proposes PETNet (Probabilistic Modeling Ensemble Vision Transformer Network), which consists of three key module groups: the Encoder Group, the Gaussian-Probabilistic Modeling Group, and the Ensemble Binary Decoders Group. The model architecture is original and it outperforms the previous methods for polyp segmentation.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    By combining a Transformer with the three key groups, the method achieves better segmentation results. The ablation study in Table 3 is discussed with clear comparisons.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Although the smaller number of FPs in PETNet is good, Polyp-PVT is sometimes better in mDice, nFP, or nSen. The performance of PETNet is almost the same as that of Polyp-PVT in Table 1. Performance on ETIS is still 85%, leaving room for further improvement.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Reproduction is possible, but providing code would make it easier.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    The originality of the proposed method is that the authors introduce a new model with a transformer-based architecture built from three key groups. It would be better to show a larger difference from Polyp-PVT, since the performance in Table 1 is almost the same.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper offers original points, but it is only weakly accepted since the performance is not much improved compared to Polyp-PVT.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The paper proposes a novel approach called Probabilistic Modeling Ensemble Vision Transformer Network (PETNet) for polyp segmentation. It has received three reviews, and based on their feedback, the following meta-review summarizes the key strengths and weaknesses of the paper and provides a recommendation.

    Strengths: 1. Well-Written and Organized: Reviewer #1 commends the paper for being well-written and organized, with clear figures and presentation. 2. Evaluation: The paper provides a relatively comprehensive evaluation of the proposed method, including an ablation study with clear comparisons to other methods.

    Weaknesses: 1. Limited Novelty: Reviewers #2 and #3 highlight that the novelty of the proposed approach is limited, as recent advancements in the field could potentially yield more robust results. 2. Performance Improvement: Reviewer #3 suggests that the performance improvement of PETNet compared to Polyp-PVT is not significant, indicating the need for further enhancement. 3. Outdated Comparative Methods: Reviewer #2 points out that almost all the comparative methods used in the paper are outdated, suggesting the need to include more recent and reliable references.

    Based on the reviewers’ feedback, it is recommended to provide the paper with a rebuttal. The weaknesses regarding outdated comparative methods, limited novelty, and the need for performance improvement should be addressed in the rebuttal to strengthen the paper further.




Author Feedback

We appreciate the reviewers’ insightful feedback. In this rebuttal, we address major concerns regarding novelty, performance enhancement, and comparison methods.

Concern 1: Novelty and Model Innovations

We acknowledge concerns about our method’s novelty and highlight these key aspects:

  1. Clinical significance: Our approach addresses two paramount challenges identified in numerous multi-center randomized controlled clinical trials: a) high false-positive predictions, and b) imprecise segmentation of polyps with indistinct boundaries and diminutive size. To our knowledge, this is the first paper systematically tackling these critical issues in clinical practice and proposing targeted improvements.
  2. Model Innovations: We introduce novel components, such as Gaussian-Probabilistic modeling, ensemble decoding, and MTA, enabling our model to preserve distinct polyp boundary segmentation while suppressing polyp-like camouflages and focusing on global polyp pattern perceptions. Moreover, our model is designed for rapid convergence, lightweight real-time operation, and efficient deployment on edge computing devices to meet clinical real-time requirements. Our model’s originality and innovation were also appreciated by Reviewers #1 and #3.
  3. Novel Evaluation Metric: We propose an innovative evaluation metric based on detected polyp count, providing a more intuitively accurate assessment in real-world situations.

Concern 2: Performance Improvement and Evaluation Metrics

We emphasize that the most commonly used pixel-based segmentation evaluation metrics may not accurately reflect the model’s performance on complex colonoscopy images (see the mismatch between the pixel-level and instance-level evaluations in Table 1). Although PETNet shows only a minor mDice improvement over Polyp-PVT on some datasets, the significant overall improvement in other metrics essential for clinical application (nPre, nF1, and nFP, as noted in our response to Concern 1) demonstrates our model’s efficacy in enhancing clinical applicability.
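The mismatch between pixel-level and instance-level evaluation can be illustrated with a small sketch. It assumes, hypothetically, that a predicted region counts as a true positive when it overlaps any ground-truth region; the paper's exact matching rule is not stated in this discussion, and the helper names `dice`, `components`, and `instance_counts` are illustrative only.

```python
from collections import deque


def dice(pred, gt):
    """Pixel-level Dice coefficient between two binary masks (lists of 0/1 rows)."""
    inter = sum(p & g for pr, gr in zip(pred, gt) for p, g in zip(pr, gr))
    total = sum(map(sum, pred)) + sum(map(sum, gt))
    return 2 * inter / total if total else 1.0


def components(mask):
    """Return 4-connected foreground regions as sets of (row, col) pixels."""
    h, w = len(mask), len(mask[0])
    seen, regions = set(), []
    for r in range(h):
        for c in range(w):
            if mask[r][c] and (r, c) not in seen:
                region, queue = set(), deque([(r, c)])
                seen.add((r, c))
                while queue:
                    y, x = queue.popleft()
                    region.add((y, x))
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            queue.append((ny, nx))
                regions.append(region)
    return regions


def instance_counts(pred, gt):
    """Count (TP, FP, FN) regions under the assumed any-overlap matching rule."""
    pred_regions, gt_regions = components(pred), components(gt)
    tp = sum(any(p & g for g in gt_regions) for p in pred_regions)
    fp = len(pred_regions) - tp
    fn = sum(not any(g & p for p in pred_regions) for g in gt_regions)
    return tp, fp, fn


# Toy example: one 4x4 polyp, predicted perfectly, plus one stray false-positive pixel.
gt = [[0] * 8 for _ in range(6)]
for r in range(1, 5):
    for c in range(1, 5):
        gt[r][c] = 1
pred = [row[:] for row in gt]
pred[0][7] = 1  # isolated pixel, not connected to the polyp

tp, fp, fn = instance_counts(pred, gt)
print(f"Dice = {dice(pred, gt):.3f}")
print(f"TP = {tp}, FP = {fp}, FN = {fn}, instance precision = {tp / (tp + fp):.2f}")
```

In this toy case a single stray pixel leaves Dice near 0.97 while halving the instance-level precision, which is the kind of divergence between the two evaluation levels that the rebuttal describes.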

Concern 3: Comparative Methods

We offer the following clarifications and updates in response to Reviewer #2’s feedback on comparison methods:

  1. Publication time does not necessarily indicate outdated performance. We selected comparative methods after conducting a comprehensive literature review. Polyp-PVT remains a leading contender in real-time polyp segmentation, as evidenced by a recent benchmark paper titled “Benchmarking Polyp Segmentation Methods in Narrow-Band Imaging Colonoscopy Images” published on April 26, 2023.
  2. We performed additional experiments with recent models, LDNet (MICCAI 2022) and SSFormer-S (MICCAI 2022). Updated results on two generalization test sets, CVC-ColonDB and ETIS, reveal PETNet outperforms both models in all key evaluation metrics. Detailed metrics:

    CVC-ColonDB

    • LDNet: mDice 0.75, nSen 0.90, nPre 0.73, nF1 0.81
    • SSFormer: mDice 0.79, nSen 0.90, nPre 0.84, nF1 0.87
    • PETNet: mDice 0.82, nSen 0.93, nPre 0.87, nF1 0.90

    ETIS

    • LDNet: mDice 0.70, nSen 0.91, nPre 0.66, nF1 0.76
    • SSFormer: mDice 0.74, nSen 0.86, nPre 0.67, nF1 0.75
    • PETNet: mDice 0.78, nSen 0.90, nPre 0.82, nF1 0.85
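As a quick sanity check on the figures above, the following sketch recomputes F1 from the reported sensitivity and precision. It assumes that nF1 is the usual harmonic mean of nPre and nSen, which this discussion does not state explicitly; the dictionary layout and the function name `f1` are illustrative.

```python
# Values copied from the rebuttal: (nSen, nPre, reported nF1).
reported = {
    ("CVC-ColonDB", "LDNet"): (0.90, 0.73, 0.81),
    ("CVC-ColonDB", "SSFormer"): (0.90, 0.84, 0.87),
    ("CVC-ColonDB", "PETNet"): (0.93, 0.87, 0.90),
    ("ETIS", "LDNet"): (0.91, 0.66, 0.76),
    ("ETIS", "SSFormer"): (0.86, 0.67, 0.75),
    ("ETIS", "PETNet"): (0.90, 0.82, 0.85),
}


def f1(sensitivity, precision):
    """Harmonic mean of sensitivity and precision (assumed definition of nF1)."""
    return 2 * sensitivity * precision / (sensitivity + precision)


# Absolute gap between the recomputed F1 and the reported nF1 for each entry.
gaps = {
    key: abs(f1(sen, pre) - nf1)
    for key, (sen, pre, nf1) in reported.items()
}

for (dataset, model), gap in gaps.items():
    sen, pre, nf1 = reported[(dataset, model)]
    print(f"{dataset:12s} {model:9s} recomputed {f1(sen, pre):.3f} "
          f"reported {nf1:.2f} gap {gap:.3f}")
```

Under that assumption, every recomputed value falls within 0.01 of the reported nF1, which is consistent with the inputs having been rounded to two decimals.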

Response to Reviewer #1’s Concerns

  1. In Figure 2, black lines indicate variables that are not processed at that step. The red and black dotted lines represent the Gaussian features (red) and binary features (black) obtained by a chunk split along the channel dimension.
  2. Due to the complexity of deploying models on Nvidia Jetson devices, we did not have adequate time to compare runtime with competitors. However, the benchmark paper mentioned in Concern 3 offers a comprehensive runtime comparison, and our model’s runtime is in line with that of Polyp-PVT.

In summary, we provided a focused response to the major concerns raised by the reviewers. Our work addresses real-world clinical practice challenges and presents a novel solution for polyp segmentation. We showed that our model outperforms existing SOTA models, including the latest models published at MICCAI, in critical evaluation metrics.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Based on reviews and authors’ feedback, this work needs further improvement. The paper does not meet MICCAI’s standard presently. Thus, I recommend rejecting this work.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper presents a transformer-based polyp segmentation approach. With additional novel components such as Gaussian probabilistic modeling, ensemble decoding, and MTA, this approach focuses on global polyp context while preserving distinct polyp boundaries and suppressing pseudo-polyps. The rebuttal addresses all major concerns from reviewers and hence, merits publication.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    I find this paper has a mixture of interesting aspects and downsides. It appears to me that the authors are themselves close to realizing that polyps should be dealt with using object detection systems, and not segmentation. I believe the polyp segmentation problem has gained popularity in our community simply because there are open datasets available, but it has no real clinical relevance. Like, no one cares about the Dice score of a polyp segmentation, getting a handful of pixels right or wrong at the boundary. Now, detecting polyps, without a hell of a lot of false positives, and doing it fast, is clinically relevant. Even more relevant would be to deal with videos, not single frames. In this sense, the authors do care about efficiency (they include computation on an NVIDIA Jetson), and consider hard polyp cases (small ones, or easy-to-miss polyps).

    That said, I believe that in this case the authors received some unfair reviews. For instance, R2 (the only one recommending rejection) limits himself to saying that the compared works are outdated, because they are from 2021, but he does not provide any guidance on what newer works should be considered. He also complains about novelty in a very vague manner (“The novelty is limited. The proposed approach may have been appropriate at the time of your research, but recent advancements in the field suggest that newer methods could yield more robust results.”), again giving no clue of what the authors should do to address this concern. In addition, no reviewer went back after rebuttal to discuss anything.

    I have read the authors’ feedback and found it compelling and well-written. I would like to recommend acceptance of this paper, and also kindly suggest that the authors consider transitioning from polyp segmentation to polyp detection on video data, which I find is the way forward in this area, forgetting about pixel-wise performance metrics that do not bring any relevant insight.


