
Authors

Yicheng Jiang, Luyue Shi, Wei Qi, Lei Chen, Guanbin Li, Xiaoguang Han, Xiang Wan, Siqi Liu

Abstract

An automated bleeding risk rating system of gastric varices (GV) aims to predict the bleeding risk and severity of GV, in order to assist endoscopists in diagnosis and decrease the mortality rate of patients with liver cirrhosis and portal hypertension. However, due to the lack of commonly accepted quantification standards, the risk rating relies heavily on the endoscopists’ experience and may vary considerably across application scenarios. In this work, we aim to build an automatic GV bleeding risk rating method that can learn from experienced endoscopists and provide stable and accurate predictions. Due to the complexity of GV structures, with large intra-class variations and small inter-class variations, we found that existing models perform poorly on this task and tend to lose focus on the important varices regions. To solve this issue, we constructively introduce the segmentation of GV into the classification framework and propose a region-constraint module and a cross-region attention module for better feature localization and for learning the correlation of context information. We also collect a GV bleeding risk rating dataset (GVbleed) with 1678 gastroscopy images from 411 patients, jointly annotated into three levels of risk by senior clinical endoscopists. The experiments on our collected dataset show that our method can improve the rating accuracy by nearly 5% compared to the baseline. Code and dataset will be available at https://github.com/LuyueShi/gastric-varices.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43904-9_1

SharedIt: https://rdcu.be/dnwGD

Link to the code repository

https://github.com/LuyueShi/gastric-varices

Link to the dataset(s)

N/A


Reviews

Review #2

  • Please describe the contribution of the paper

    In this paper, an automatic GV bleeding risk rating method is proposed, and the GVbleed dataset with 1678 images from 411 patients is collected to evaluate it. The authors argue that the GV mask, together with attention between the masked region and the whole image, can improve classification accuracy. They improved the accuracy by 5% compared to the baseline.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. GVBleed dataset is valuable.
    2. Calculating a loss between the CAM and the ground-truth mask during training is interesting.
    3. Rating of GV bleeding risk has rarely been seen before.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The baseline is not novel. They tried ResNet, DenseNet, and EfficientNet, but never tried transformers or other fine-grained classification methods.
    2. They did not show whether the final accuracy is better than that of endoscopists. Is 70.97% an acceptable accuracy? If yes, please provide relevant references.
    3. They did not validate the effectiveness of the cross-region attention map. What if you compare CRAM with self-attention over the whole image?
    4. The key equations of the RCN module are missing. How is the CAM obtained when training the classification model? Please provide references or a more detailed description of this operation.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Hard to reproduce.

    1. The key equations of the RCN module are missing.
    2. They did not describe the data augmentation.
  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. Evaluate more recently published methods on GVBleed, such as Swin Transformer.
    2. Provide the key equations of the RCN module.
    3. Validate that CRAM is better than self-attention.
    4. Simplify equations (1), (4), and (5), since they are all just the Dice loss.
    5. Add normal gastroendoscopic images to the training and test sets.
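
    Point 4 above observes that the paper's equations (1), (4), and (5) are all instances of the Dice loss. For reference, a minimal soft Dice loss can be sketched as follows; the function name and smoothing term are illustrative, not taken from the paper:

    ```python
    import numpy as np

    def dice_loss(pred, target, eps=1e-6):
        """Soft Dice loss between a predicted map and a binary target mask.

        pred, target: arrays of the same shape with values in [0, 1]
        eps: small smoothing constant (illustrative) to avoid division by zero
        """
        pred = pred.ravel()
        target = target.ravel()
        intersection = (pred * target).sum()
        dice = (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)
        return 1.0 - dice

    # A perfect prediction gives (near-)zero loss; a fully disjoint one gives ~1.
    mask = np.array([[0.0, 1.0], [1.0, 0.0]])
    ```

    Writing all three equations as calls to one such function would address the reviewer's simplification request.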
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    3

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    1. Baseline methods lack novelty, so subsequent improvements are limited.
    2. The key equations of the RCN module are missing.
  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    3

  • [Post rebuttal] Please justify your decision
    1. Reply to Q1 (the novelty of baseline models; R2, AC): I agree with the authors that transformer-based (TF) models require more training data. However, it must be admitted that pre-training on publicly available datasets like COCO may be advantageous before training on medical segmentation datasets. It is a pity that this aspect was neglected in the article. Therefore, I firmly believe that CNN-based baselines are not suitable for this task.

    2. Reply to Q5 (Advantages of Cross-region attention module (CRAM) over self-attention. (R2)): It is worth noting that pre-training can enhance the performance of the self-attention module during training. Therefore, I encourage the authors to provide comprehensive experimental evidence for the necessity of cross-regional attention modules (CRAM).

    3. Reply to Q2 (the key equation for calculating CAM in the RCM; R2, AC): I have previously implemented several CAM models that did not actively participate in the training process, as they were specifically designed for post-training interpretability analysis. To confirm the correctness of the article, the authors must provide a complete theoretical basis or code for utilizing CAM in the training process.



Review #3

  • Please describe the contribution of the paper

    The paper deals with the automatic risk classification of gastric varices in the endoscopy scenario. As stated in the paper, there are three contributions: 1) a novel GV bleeding risk rating framework that constructively introduces segmentation to enhance the robustness of representation learning; 2) a region-constraint module for better feature localization and a cross-region attention module to learn the correlation of target GV with its context; and 3) a GV bleeding risk rating dataset (GVbleed) with high-quality annotations from multiple experienced endoscopists.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    – The novelty of the application. There are many challenges in recognizing gastric varices, such as large intra-class variations and small inter-class variations. The authors specify these challenges and come up with corresponding solutions. The introduced cross-region attention module and region-constraint module are reasonable and adequate to meet the challenges.

    – Clarity. The necessary aspects of the task are all well stated, with clear and reasonable logic. The writing for data curation, workflow, experiments, and results is all clear.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    – Some details/ablations are missing. The segmentation module is critical to the whole framework. The authors only state the use of SwinUNet, but have not provided detailed segmentation performance on their own dataset. Are there any failure scenarios for the segmentation mask generation? What are the side effects in the case of a wrong segmentation mask?

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors curate an in-house dataset and experiment only on this closed-source data.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    The paper is clear and well structured. I believe there are many different ways of utilizing the segmentation information, and the paper presents only one of them. The workflow involves many steps and components. Is there a way to simplify the system?

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    7

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The authors work on a novel application, and identify the challenges. They put forward a sound solution, with reasonable designing logic. The experiments are solid and adequate, comparing different configurations of the architecture. Despite minor weakness, it is good in terms of problem formulation, novelty, clarity, and conclusion.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #4

  • Please describe the contribution of the paper

    The paper presents an approach for automated bleeding risk rating of gastric varices (GV). By incorporating segmentation into the classification framework and introducing region-constraint module for feature localization and cross-region attention module for correlation of target GV with its context, the proposed method improves the accuracy of GV bleeding risk prediction.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The paper introduces a new approach for automated bleeding risk rating of gastric varices. By incorporating segmentation, region-constraint, and cross-region attention modules, the approach enhances feature localization and captures the correlation of context information, resulting in improved accuracy.

    • The results show a significant improvement in rating accuracy compared to the baseline models.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The literature review appears to be somewhat limited in detail.

    • The paper lacks sufficient information regarding the complexity of the proposed architecture.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The paper is clearly written, but releasing the code would help improve its reproducibility. The paper also intends to release the dataset in the future.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    • The literature review could benefit from more detailed information. It would be helpful for the paper to provide a more comprehensive review of the existing literature in the field, discussing relevant studies, methodologies, and findings. This would enhance the overall understanding of the research landscape and demonstrate the novelty or advancements of the proposed approach.

    • The paper lacks information about the complexity of the proposed architecture. It would be beneficial to include details regarding the computational complexity, model size, or any other relevant metrics that provide insights into the resource requirements and efficiency of the proposed method.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Evaluation is reasonable and the paper is well-written. The results show better performance compared to baseline methods.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    Mixed reviews: critical component descriptions are missing, some of the results are not convincing, and the novelty is questionable for some reviewers.




Author Feedback

We thank the reviewers for their constructive comments. The main concerns of the reviewers and AC are addressed as follows.

Q1. Novelty of the baseline models. (R2, AC) We tested both simple models and SOTA transformer-based (TF) models as baselines. However, TF models require more training data, which is not available in our proposed dataset (1.3k training images), and there is no other related public dataset. We tested the SEViT model (customized for medical classification) and the results were far below expectation (59% accuracy). We therefore selected three CNN-based models that achieve much higher accuracy (about 65%) as baselines. On top of them, the proposed framework improves accuracy by more than 5% (to 71%) in general. We will include the current results of the transformer models in the final paper if necessary.

Q2. Missing key equations for calculating CAM in the RCM. (R2, AC) Due to limited space, the equation for CAM was not given, since it is a widely used method to validate the interpretability of a model’s attention. We followed the basic setting of CAM in [19], as mentioned in our paper, which computes the weighted sum of the feature maps from the last convolutional layer using the weights of the FC layer. Details will be added to the revised paper. It is worth mentioning that our framework is not limited to any specific CAM method and can be applied with other CAM techniques as well.

Q3. Reproducibility. (R2, R3, R4) Common data augmentation techniques, rotation and flipping, were adopted. Our framework was tested only on the proposed dataset due to the unavailability of comparable datasets. Our code and dataset will be released upon acceptance.

Q4. Whether the final classification accuracy is acceptable. (R2) Currently there is no standard criterion of usability in real applications. Our model’s performance surpasses that of junior endoscopists and general practitioners (~50% accuracy) by about 20%, and it even gives insights to senior endoscopists in some cases. This finding highlights the potential of our model to positively contribute to the learning and diagnostic capabilities of endoscopists.

Q5. Advantages of the cross-region attention module (CRAM) over self-attention. (R2) We agree that self-attention is a more popular approach and has achieved great success in many tasks. However, in practice, it requires a larger scale of training data to learn effective attention, which is not available in our task. We also observed unsatisfactory performance from transformer-based models with self-attention mechanisms. That is why we proposed CRAM, to encourage the model to correctly focus on the varices regions in a more efficient way.

Q6. Segmentation performance, failure cases, and their side effects. (R3) We identified two primary factors negatively impacting performance: 1) incorrect localization, including false-positive regions such as normal fold areas and false-negative regions such as varices regions that appear flat; 2) ambiguous boundaries, where the model consistently achieves correct localization but exhibits imprecise boundary delineation. Due to the rebuttal policy, we are not allowed to provide additional results. We did find incorrect classifications caused by segmentation errors in experiments. Overall, however, the framework demonstrates improved localization and diagnostic performance.

Q7. Model complexity. (R4) Given an input image resolution of 512x512, our framework has 40.2M parameters, a computational cost of 52.4 GMACs, and a 29 ms inference time for a single image on an RTX 2080 GPU.

Q8. Writing problems. (R2, R4) We acknowledge the feedback regarding the lack of detail in the literature review and the multiple similar Dice loss equations. These aspects will be refined and addressed in the revised version.

Q9. Adding normal images. Since the classification of normal/varices images is simple (95% accuracy in our test), we used only varices images to avoid diverting the model’s attention. We will add normal images to the released dataset if necessary.
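
The CAM computation described in Q2 (a weighted sum of the last convolutional layer's feature maps, using the FC-layer weights of the target class, following the paper's reference [19]) can be sketched as follows. This is a minimal NumPy illustration, not the authors' actual code; array shapes and names are assumptions:

```python
import numpy as np

def class_activation_map(feature_maps, fc_weights, class_idx):
    """CAM as a weighted sum of last-conv feature maps.

    feature_maps: (C, H, W) activations from the last conv layer
    fc_weights:   (num_classes, C) weights of the final FC layer
    class_idx:    target class (e.g. one of the three risk levels)
    """
    w = fc_weights[class_idx]                    # (C,) weights for this class
    cam = np.tensordot(w, feature_maps, axes=1)  # contract channels -> (H, W)
    cam = np.maximum(cam, 0)                     # keep positive evidence only
    if cam.max() > 0:
        cam = cam / cam.max()                    # normalize to [0, 1]
    return cam

# Toy example: 4 channels, an 8x8 spatial map, 3 risk classes.
rng = np.random.default_rng(0)
fmaps = rng.random((4, 8, 8))
fc_w = rng.random((3, 4))
cam = class_activation_map(fmaps, fc_w, class_idx=2)
```

Because the map is differentiable in the feature maps and FC weights, a mask-supervised loss (as in the paper's region-constraint idea) could in principle be applied to it during training, which is what Q2 asserts.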




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Reasonable rebuttal; the main questions were answered. The authors clearly indicate the intuition and rationale behind the comparisons. Given the two positive reviews, and the first reviewer’s potentially strong stack of papers in his/her review list, I consider this paper a borderline accept.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    A mixed assessment from three reviewers (two reviewers had quite positive comments; one still has some concerns about the baseline methods after the rebuttal, but those concerns seem addressable). I am okay with recommending acceptance.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Though the concerns about some technical details are not fully resolved, I think most issues have been addressed during the rebuttal. The authors are encouraged to revise their paper in the final version.


