Authors

Ao Wang, Ming Wu, Hao Qi, Wenkang Fan, Hong Shi, Jianhua Chen, Sunkui Ke, Yinran Chen, Xiongbiao Luo

Abstract

Automatic segmentation of intestinal lesions (e.g., polyps and adenomas) in colonoscopy is essential for early diagnosis and treatment of colorectal cancers. Current deep learning-driven methods still get trapped in inaccurate colonoscopic lesion segmentation due to diverse sizes (large scales) and irregular shapes of different types of polyps and adenomas, noise and artifacts, and illumination variations in colonoscopic video images. This work proposes a new deep learning model called cascade transformer encoded boundary-aware multibranch fusion networks for white-light and narrow-band colorectal lesion segmentation. Specifically, this architecture employs cascade transformers as its encoder to retain both global and local feature representation. It further introduces a boundary-aware multibranch fusion mechanism as a decoder that can enhance blurred lesion edges and extract salient features, and simultaneously suppress image noise and artifacts and illumination changes. Such a new designed encoder-decoder architecture can preserve lesion appearance feature details while aggregating the semantic global cues at several different feature levels. Additionally, a hybrid spatial-frequency loss function is explored to adaptively concentrate on the loss of important frequency components due to the inherent bias of neural networks. We evaluated our method not only on our in-house database with four types of colorectal lesions with different pathological features, but also on four public databases, with the experimental results showing that our method significantly outperforms state-of-the-art network models. In particular, it can improve the average dice similarity coefficient and intersection over union from (84.3%, 78.4%) to (87.0%, 80.5%) on the five databases.
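
For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how the dice similarity coefficient (DSC) and intersection over union (IoU) are commonly computed for one pair of binary masks. The function name `dice_and_iou`, the nested-list mask format, and the convention for two empty masks are assumptions made for illustration; the paper's own evaluation code is not available here.

```python
def dice_and_iou(pred, truth):
    """Return (DSC, IoU) for two equally sized binary masks given as nested lists."""
    flat_pred = [bool(p) for row in pred for p in row]
    flat_truth = [bool(t) for row in truth for t in row]
    if len(flat_pred) != len(flat_truth):
        raise ValueError("masks must have the same number of pixels")

    # Pixels labelled as lesion in both masks.
    intersection = sum(p and t for p, t in zip(flat_pred, flat_truth))
    pred_area = sum(flat_pred)
    truth_area = sum(flat_truth)
    union = pred_area + truth_area - intersection

    # Assumed convention: two empty masks count as a perfect match.
    if union == 0:
        return 1.0, 1.0

    dice = 2.0 * intersection / (pred_area + truth_area)
    iou = intersection / union
    return dice, iou


# Toy 2x3 masks (illustrative only, not data from the paper).
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
dice, iou = dice_and_iou(pred, truth)
```

For a single mask pair the two metrics are tied by DSC = 2·IoU / (1 + IoU), but that identity generally does not hold for values averaged over many images, which is why both are usually reported.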

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43996-4_69

SharedIt: https://rdcu.be/dnwQg

Link to the code repository

N/A

Link to the dataset(s)

N/A

Reviews

Review #3

Please describe the contribution of the paper

This paper presents a new method for automatic segmentation of four intestinal lesions. The authors also provide a new in-house dataset of colonoscopic videos, on which they compare the performance of their model with three other state-of-the-art methods. They report that the proposed method outperforms the three other methods on both their in-house dataset and four other publicly available datasets.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
Strengths:
- Introduction: the presentation of related work is thorough and there is a clear explanation of the problems, motivations and highlights of the work
- There is a lot of really detailed information about the methodology in this text.
- The use of 3 other methods available in the literature clearly supports the claims being made about the current model and makes the paper stronger.
- The new methodology presented is exciting and seems to have solved issues that others before it have struggled with. If given greater context and experiment details, the authors' work would be better supported.
- The authors have presented a large dataset that they intend to make public. This would be a great step towards allowing others to reproduce the results presented in this manuscript and would allow for more research on this subject. Given the review period of this manuscript, the authors should make it clear whether there has been approval to actually release this data, and either place a link in a footnote or give details on how this dataset might be used if this manuscript is accepted.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
Weakness:
- Some figures are not referred to in the text and are therefore not given much context.
- Many of the figures and tables are unlabelled and very confusing, making it difficult to discern the results clearly.
- Although the authors have clearly spent a lot of time and energy sharing their methodology, there is very little detail given on the data flow and experiment details. These are very important details that are required to aid the reproducibility of this study and to help support the validity of the results that are presented.
- Training/Validation/Test set: There is no mention of the data curation done for this experiment. Were the same folds provided for each run? How many cases are in the test set? Were the train/val/test sets divided by patient, by video, or randomly by image? It is important to know this information for reproducibility and to ascertain whether there may have been data leakage. Without it, as a reader we don’t know what the metrics in the results table represent. Explicitly distinguishing test-set metrics from validation- or training-set metrics would help the reader see the generalizability of the suggested model.
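
The data-leakage concern raised in the last point can be illustrated with a minimal sketch of a patient-level split, in which all frames from one patient land in exactly one subset. The record format `(patient_id, frame_name)`, the function name `split_by_patient`, and the 70/10/20 ratios are hypothetical choices for illustration; the paper does not state how its data were actually divided.

```python
import random


def split_by_patient(frames, ratios=(0.7, 0.1, 0.2), seed=0):
    """Split (patient_id, frame_name) records so that no patient spans two subsets."""
    # Shuffle patients, not frames: frames from one video are highly correlated.
    patients = sorted({pid for pid, _ in frames})
    rng = random.Random(seed)  # fixed seed so the same folds are reused on every run
    rng.shuffle(patients)

    n_train = round(len(patients) * ratios[0])
    n_val = round(len(patients) * ratios[1])
    groups = {
        "train": set(patients[:n_train]),
        "val": set(patients[n_train:n_train + n_val]),
        "test": set(patients[n_train + n_val:]),
    }
    return {name: [f for f in frames if f[0] in ids] for name, ids in groups.items()}


# Hypothetical records: 10 patients with 5 frames each.
frames = [(f"patient_{p:02d}", f"frame_{i:03d}.png") for p in range(10) for i in range(5)]
splits = split_by_patient(frames)
```

A random split over individual images would, by contrast, let near-identical neighbouring frames from the same video appear in both training and test sets, which tends to inflate test metrics.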
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

This paper would be highly reproducible if the authors do indeed release their dataset. Their method is well defined and the data would be available. As mentioned in some of my feedback, the main hurdle to reproducibility would be the lack of detail provided regarding the handling and preprocessing of the datasets.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
Specific Comments:
- Abstract : “decoder that can enhance blurred lesion edges and extract salient features, and simultaneously suppress image noise and artifacts and illumination changes.” Lots of “ands”
- Introduction: “Unlike a family of U-Net driven segmentation methods, numerous papers have been worked on boundary constraints to segment colorectal polyps.” This sentence does not make sense
- Figures 1 and 2: Figures should always be referred to in the text, where they should support the work being explained. Figure 1 is clear from its flow, which describes the overarching network, but Figure 2 is not as clear. The figure caption refers to variables B, G and D, which are not given any context in the section and cannot be found in any of Equations 1 through 6. The first place they are referred to is the following section, 2.2. In general, you should reference them in the text so that they are logically placed.
- Section 3: This is an impressive dataset you have collected, and it will be very useful if it is indeed made public. A few details that are important to know: How many patients are part of this dataset? Of the 4 pathological features, how many patients have all 4 in one video, how many have only 1, etc.?
- Section 3: The authors mention that images were resized to 352x352 for training. What was the original size of the images? Were they downsampled or cropped? These are important details for reproducibility.
- Figure 3: This figure looks like it would be visually informative if the reader was given more information. What row or column is associated with each of the four methods? What does a blue outline or green outline refer to? For the public data, which datasets are each image taken from? How many different stills are from the same videos here?
- Figure 4: This figure is well labelled. From what I understand, the top 4 box plots are the DSC scores of the 4 anatomical structures of the private dataset, and the bottom four plots are one for each public dataset (encompassing all 4 structures within the datasets). If this is correct, it should be explicitly stated, either by a title on the left of the rows or in the figure description.
- Figure 5: as with figure 3 - legend needed for blue and green outline on d
- Table 1: Define FPS in the text; all acronyms should always be explicitly defined. Also, the table caption says that the results are on all 5 databases, but there is only one value for the DSC and the other metrics for each method. The text explains that this may be an average value, but it also states: “Evidently, CTBMF generally works better than the compared methods on the in-house database with four types of colorectal lesions.” That seems to suggest that I should be able to see the DSC score on only the in-house data in the table. Please verify and fix.
- Add a conclusion section header for the paragraph that begins with “In summary, this work proposes a new deep learning model of cascade pyramid transformer …”
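
On the Table 1 comment above: a single "average over five databases" can mean at least two different things, which is one reason the table needs an explicit definition. The sketch below contrasts a per-dataset (macro) average with a pooled per-image average. The function names and the toy scores are hypothetical and are not results from the paper.

```python
def macro_average(per_dataset_scores):
    """Mean of per-dataset means: every dataset counts equally."""
    means = [sum(scores) / len(scores) for scores in per_dataset_scores.values()]
    return sum(means) / len(means)


def pooled_average(per_dataset_scores):
    """Mean over all images: larger datasets dominate the result."""
    all_scores = [s for scores in per_dataset_scores.values() for s in scores]
    return sum(all_scores) / len(all_scores)


# Hypothetical toy DSC values, NOT taken from the paper.
toy_scores = {
    "in_house": [0.9, 0.9, 0.9, 0.9],  # many images
    "public_a": [0.5],                  # few images
}
macro = macro_average(toy_scores)    # about 0.70
pooled = pooled_average(toy_scores)  # about 0.82
```

Because the two conventions can differ noticeably when dataset sizes are unbalanced, reporting per-dataset rows alongside any average would remove the ambiguity.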
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

4
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

This paper presents great technical work and a new model that seems to outperform the current state of the art. For all of the detail provided on the technical methodology, the paper lacks a lot of detail regarding dataset handling and preprocessing. These omissions affect the reproducibility of the work and the validity of the results, since we do not know how the data were managed across training, validation and testing sets. Many of the figures and tables need work before the results being presented can be fully understood. However, if the authors present more of this detail and rework the recommended sections, I believe the manuscript could be much improved.
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #5

Please describe the contribution of the paper

This paper aims at automatic segmentation of intestinal lesions in colonoscopy. The authors propose a deep learning-based model, CTBMF, with cascade transformers and multibranch fusion for polyp and adenoma segmentation. In CTBMF, a pyramid transformer-based encoder is used to extract global semantic and subtle boundary features. The authors then design a boundary-aware multibranch fusion decoder to fuse feature maps. A hybrid spatial-frequency loss is proposed to train CTBMF. Quantitative and qualitative experiments are performed to evaluate the proposed method.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

Novelty of Method: The carefully designed encoder-decoder structure can preserve lesion appearance feature details, which is also validated in Fig. 5. As a result, the proposed CTBMF can handle the large scales and irregular shapes of different types of polyps and adenomas, noise and artifacts, and illumination variations in colonoscopy. Further, the hybrid spatial-frequency loss is designed to use features in both the spatial and frequency domains. Novelty of Training Data: A new clinical colonoscopic lesion image dataset will be released. The data were collected from several surgical procedures and contain four types of colorectal lesions with different pathological features.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

Limitation of Experiments: The authors performed an ablation study on modules D1, D2, and D3, residual connections, and frequency loss. However, D1, D2, and D3 are the attention maps described in Eq. (10), so this should be made clearer. Besides, the authors only compare the proposed transformer-based segmentation method with CNN-based methods. There are existing transformer-based segmentation methods, so the authors could compare CTBMF with these methods to validate the effectiveness of the proposed design.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The author will release the code of the method and the new colonoscopic lesion image data.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

The ablation studies should be described more clearly. More comparisons should be provided to verify the efficiency of the designed transformer-based encoder-decoder architecture.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The author proposes a transformers-based network to achieve automatic segmentation of intestinal lesions in colonoscopy. The author will release the clinical colonoscopic lesion image data.
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #4

Please describe the contribution of the paper

In this paper, the authors propose a cascade transformer-based multibranch fusion network for the segmentation of colonoscopic lesions. The experimental results on both the in-house and the publicly available datasets demonstrate the effectiveness of the proposed method.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. The experimental design is appropriate, and the evaluation of the experimental data is satisfactory.
2. The proposed cascade transformer encoded boundary-aware multibranch fusion (CTBMF) network is novel and effective.
3. The paper is also presented in clear English and in a way that is easy for readers to follow.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
1. The cascade transformer is widely used in the machine learning community. However, the authors do not introduce it in detail as background knowledge when reviewing the existing studies.
2. In Table 2, it is very strange that the segmentation accuracy decreases when the D3 module is added.
3. A conclusion section is also required to sum up the main contributions and findings of this study.
Please rate the clarity and organization of this paper

Excellent
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The reproducibility of this paper is fine.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
1. Please introduce cascade transformers and their applications in the introduction section of the revised manuscript.
2. More discussion should be provided on the ablation study results.
3. A conclusion section is also required to sum up the main contributions of this paper.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

7
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper is well structured and the results are convincing. The presentation of the formulas and the algorithm is clear.
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

This paper aims to achieve real-time and accurate segmentation of four intestinal lesions in colonoscopic images. Extensive experiments are conducted to validate the method. All reviewers agree on the rationale of the method and the soundness of the validation. The weaknesses raised in the comments are minor and could easily be addressed in the final version. Therefore, I suggest that this paper be considered for early acceptance.

Author Feedback

N/A


Cascade Transformer Encoded Boundary-Aware Multibranch Fusion Networks for Real-Time and Accurate Colonoscopic Lesion Segmentation