Authors

Nikhil Kumar Tomar, Debesh Jha, Ulas Bagci, Sharib Ali

Abstract

Colonoscopy is a gold standard procedure but is highly operator-dependent. Automated polyp segmentation, a precancerous precursor, can minimize missed rates and timely treatment of colon cancer at an early stage. Even though there are deep learning methods developed for this task, variability in polyp size can impact model training, thereby limiting it to the size attribute of the majority of samples in the training dataset that may provide sub-optimal results to differently sized polyps. In this work, we exploit size-related and polyp number-related features in the form of text attention during training. We introduce an auxiliary classification task to weight the text-based embedding that allows network to learn additional feature representations that can distinctly adapt to differently sized polyps and can adapt to cases with multiple polyps. Our experimental results demonstrate that these added text embeddings improve the overall performance of the model compared to state-of-the-art segmentation methods. We explore four different datasets and provide insights for size-specific improvements. Our proposed text-guided attention network (TGANet) can generalize well to variable-sized polyps in different datasets. Codes are available at https://github.com/nikhilroxtomar/TGANet.

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16437-8_15

SharedIt: https://rdcu.be/cVRs1

Link to the code repository

https://github.com/nikhilroxtomar/TGANet

Link to the dataset(s)

https://datasets.simula.no/kvasir-seg/

https://github.com/dashishi/LDPolypVideo-Benchmark

https://www.kaggle.com/competitions/bkai-igh-neopolyp/overview

Reviews

Review #1

Please describe the contribution of the paper

The authors focus on polyp segmentation and propose TGANet, which uses an auxiliary classification task to improve the final performance. TGANet predicts the number and size of polyps and embeds them as weights in channel attention. Experiments show that the proposed method improves performance on multiple datasets.
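The mechanism summarized above (class predictions turned into channel-attention weights) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 3-class size head, the 2-channel embeddings, and the sigmoid gating are all assumptions chosen only to show the general idea of weighting label embeddings by predicted probabilities and using the result to rescale feature channels.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def label_channel_gate(class_logits, label_embeddings):
    """Blend per-class embeddings by predicted probability, then squash to (0, 1).

    label_embeddings[k] is a hypothetical embedding vector for class k,
    with one entry per feature channel.
    """
    probs = softmax(class_logits)
    n_channels = len(label_embeddings[0])
    blended = [
        sum(p * emb[c] for p, emb in zip(probs, label_embeddings))
        for c in range(n_channels)
    ]
    return [sigmoid(v) for v in blended]

def apply_gate(feature_map, gate):
    """feature_map is [channel][pixel]; scale every channel by its gate value."""
    return [[g * v for v in channel] for channel, g in zip(feature_map, gate)]

# Hypothetical size head (small, medium, large) and made-up 2-channel embeddings.
size_logits = [2.0, 0.1, -1.0]
size_embeddings = [[1.5, -1.5], [0.0, 0.0], [-1.5, 1.5]]
gate = label_channel_gate(size_logits, size_embeddings)
gated = apply_gate([[1.0, 2.0], [1.0, 2.0]], gate)
```

Because the "small" class dominates the prediction here, the first channel is amplified relative to the second; a different predicted class would shift the gate accordingly.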
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. The proposed label attention can further utilize classification information to constrain the feature maps for segmentation.
2. The experimental results demonstrate improved performance using four datasets.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
1. TGANet uses many channel and spatial attention modules in the baseline, such as FEM. From Table 3, FEM appears to be the more important component, improving performance by 2.4%. However, simply adding such channel and spatial attention in FEM is a well-known trick for DNNs and lacks novelty.
2. To better evaluate the effectiveness of label attention, the ablation study should include a comparison that uses only label attention (without FEM and MSFA).
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The key details are sufficient to reproduce the main results.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
1. As noted in the weaknesses, the authors may need to add an ablation study using only the label attention for further discussion.
2. Since channel and spatial attention introduce additional parameters and FLOPs, the authors may need to report the influence on FPS when adding different components in Table 3.
3. The authors may need to show how to determine the label of size classification for the cases with one or more polyps.
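Regarding comment 3, one plausible way to derive the number and size labels automatically from a ground-truth mask is sketched below. The thresholds (`small_max`, `medium_max`), the use of total foreground area for multi-polyp images, and the "one"/"many" naming are assumptions for illustration; the paper may define these labels differently.

```python
from collections import deque

def count_components(mask):
    """Count 4-connected foreground regions in a binary mask (list of rows of 0/1)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for r in range(h):
        for c in range(w):
            if mask[r][c] and not seen[r][c]:
                count += 1
                seen[r][c] = True
                queue = deque([(r, c)])
                # Breadth-first flood fill over the current region.
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count

def polyp_labels(mask, small_max=0.10, medium_max=0.30):
    """Return (number_label, size_label) for a mask with at least one polyp.

    Size is judged from the fraction of the image covered by all polyps
    together; both thresholds are illustrative assumptions.
    """
    h, w = len(mask), len(mask[0])
    area_ratio = sum(map(sum, mask)) / (h * w)
    number_label = "one" if count_components(mask) <= 1 else "many"
    if area_ratio < small_max:
        size_label = "small"
    elif area_ratio < medium_max:
        size_label = "medium"
    else:
        size_label = "large"
    return number_label, size_label
```

Whether size should be computed per polyp or over the whole foreground when several polyps are present is exactly the ambiguity the comment raises; the sketch picks the simpler whole-foreground option.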
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

As shown in the strengths and weaknesses parts, TGANet proposes the label attention which can use the classification results to guide the segmentation features directly by byte-pair encoding and channel attention. Although TGANet contains many existing attention modules, the ablation study shows the necessity of the label attention. Besides, experiments show that the TGANet achieves better performance than other advanced models.
Number of papers in your stack

5
What is the ranking of this paper in your review stack?

1
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #2

Please describe the contribution of the paper

The main contributions of the work include - 1) text guided attention, 2) feature enhancement module, 3) multi-scale feature aggregation, which improve the polyp segmentation performance. The proposed TGANet surpassed the recent related works on four public benchmark datasets.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. The motivation is clear and the idea is reasonable. The network is expected to be aware of the context attributes and the authors realize it in terms of size&number.
2. The text guided attention module is portable for different networks. The text labels of polyp attributes are easy to generate automatically, so that the training does not need extra manual annotations.
3. The experiments are comprehensive. The authors evaluated the proposed TGANet on four publicly available polyp datasets and compared it with five recent medical image segmentation methods.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
1. The most important innovation, text guided attention mechanism, is not fully explained and investigated. The feature enhancement module and multi-scale feature aggregation are just new re-designs but not novel ideas. The authors should emphasize more on text guided attention mechanism. The so-called FEM and MSFA might be useful but not fresh at all.
2. The Label attention module should be introduced in detail, with more sentences. For example, byte-pair encoding (BPE) is proposed in NLP area. The authors should describe why and how BPE is used. Or the readers would be confused.
3. More experiments about text guided attention should be conducted and presented. For example, the classification performance, the visualization&discussion on the generated attentions.
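For readers unfamiliar with the byte-pair encoding mentioned in point 2, the toy sketch below shows the core merge-learning loop on a handful of attribute words. It is a generic textbook-style illustration, not the tokenizer or vocabulary used in the paper; the word list and the tie-breaking rule are assumptions.

```python
from collections import Counter

def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with the fused symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out

def learn_bpe(words, num_merges):
    """Learn merge rules by repeatedly fusing the most frequent adjacent symbol pair."""
    corpus = [list(w) for w in words]  # start from single characters
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols in corpus:
            pairs.update(zip(symbols, symbols[1:]))
        if not pairs:
            break  # every word is already a single symbol
        # Most frequent pair; ties broken lexicographically for determinism.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        corpus = [merge_pair(s, best) for s in corpus]
    return merges, corpus

# Hypothetical attribute vocabulary for polyp size and count.
merges, segmented = learn_bpe(["small", "medium", "large", "one", "many"], 5)
```

Each word in `segmented` is a list of subword symbols that concatenate back to the original word; a model would then map each symbol to an embedding vector.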
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The paper introduces the network architecture clearly. The experiments are conducted on four public datasets. Code is not released.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
1. Fig. 1(C) looks very twisted. The layout and arrow lines can be better designed.
2. There should be a section number “2.6” before “Joint loss optimization”
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

6
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The idea of text-guided attention is innovative and impressive. The experiments are comprehensive. However, the authors did not adequately focus on text-guided attention. Insights into and evidence for text-guided attention are lacking.
Number of papers in your stack

5
What is the ranking of this paper in your review stack?

1
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #4

Please describe the contribution of the paper

The authors propose an auxiliary classification framework to weight the text-based embedding, which allows the network to learn additional feature representations. They also propose some attention modules to further improve the performance.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The authors exploit size-related and polyp number-related features in the form of text attention during training. It helps the network to improve the performance for the ‘small’, ‘medium’ and ‘many cases’. Also, they conduct lots of experiments to show the improvement on four different datasets.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

As for me, I think this paper does not bring many new insights. They add some attention-based modules (“Feature enhancement module”) and multi-scale feature learning modules to improve the performance. However, I think these similar components have been proved many times to be effective for network training in recent works, e.g. CBAM, Non-local, U-Net… Also, the text-based embedding method is more like a multi-task framework. I think the multi-task framework also has been used in many recent works.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The authors plan to release the codes.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
1. More comparisons with recent works. The proposed “Feature enhancement module” should be compared with recent attention modules, e.g., CBAM, Non-local.
2. What is the definition of “small”, “medium”, and “large”? Please show the specific measurement for these different groups.
3. Show the performance when the labeling attention module is removed but the “Num polyps” and “Polyp size” tasks are kept. I’d like to see the improvement from the labeling attention module beyond the improvement from the multi-task learning of “Num polyps” and “Polyp size”.
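The ablation requested in point 3 would keep both auxiliary heads in the training objective while removing only the attention path. A minimal sketch of such a multi-task objective follows, assuming a segmentation term of binary cross-entropy plus Dice and equally weighted cross-entropy terms for the two classification heads; these choices and all function names are illustrative assumptions, not the loss reported in the paper.

```python
import math

def dice_loss(pred, target, eps=1e-7):
    """Soft Dice loss over flattened probabilities and binary targets."""
    inter = sum(p * t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    return 1.0 - (2.0 * inter + eps) / (denom + eps)

def bce_loss(pred, target, eps=1e-7):
    """Mean binary cross-entropy, with probabilities clipped away from 0 and 1."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(pred)

def cross_entropy(logits, target_index):
    """Cross-entropy of one sample from raw logits (log-sum-exp form)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[target_index]

def joint_loss(seg_pred, seg_target, num_logits, num_target, size_logits, size_target):
    """Segmentation loss plus the two auxiliary classification losses (equal weights assumed)."""
    seg = bce_loss(seg_pred, seg_target) + dice_loss(seg_pred, seg_target)
    return seg + cross_entropy(num_logits, num_target) + cross_entropy(size_logits, size_target)
```

Under this formulation the multi-task-only variant and the full model share the same loss, so any difference in segmentation quality between them could be attributed to the attention path rather than to the extra supervision.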
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

They conduct lots of experiments to show the improvement on four different datasets. As for me, I think this paper does not bring many new insights.
Number of papers in your stack

5
What is the ranking of this paper in your review stack?

3
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

This paper proposes to utilize size-related and polyp number-related features in the form of text attention during training. All reviewers consider the technical novelty of the approach sufficient and recommend acceptance unanimously. The final version of the paper should include reviewers’ comments, in particular: to clarify the novelty of the text guided attention mechanism and the Label attention module, and add more comparisons with recent works.
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

4

Author Feedback

N/A


TGANet: Text-guided attention for improved polyp segmentation