Authors

Mingze Yuan, Yingda Xia, Xin Chen, Jiawen Yao, Junli Wang, Mingyan Qiu, Hexin Dong, Jingren Zhou, Bin Dong, Le Lu, Li Zhang, Zaiyi Liu, Ling Zhang

Abstract

Gastric cancer is the third leading cause of cancer-related mortality worldwide, but no guideline-recommended screening test exists. Existing methods can be invasive, expensive, and lack sensitivity to identify early-stage gastric cancer. In this study, we explore the feasibility of using a deep learning approach on non-contrast CT scans for gastric cancer detection. We propose a novel cluster-induced Mask Transformer that jointly segments the tumor and classifies abnormality in a multi-task manner. Our model incorporates learnable clusters that encode the texture and shape prototypes of gastric cancer, utilizing self- and cross-attention to interact with convolutional features. In our experiments, the proposed method achieves a sensitivity of 85.0% and specificity of 92.6% for detecting gastric tumors on a hold-out test set consisting of 100 patients with cancer and 148 normal. In comparison, two radiologists have an average sensitivity of 73.5% and specificity of 84.3%. We also obtain a specificity of 97.7% on an external test set with 903 normal cases. Our approach performs comparably to established state-of-the-art gastric cancer screening tools like blood testing and endoscopy, while also being more sensitive in detecting early-stage cancer. This demonstrates the potential of our approach as a novel, non-invasive, low-cost, and accurate method for opportunistic gastric cancer screening.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43904-9_15

SharedIt: https://rdcu.be/dnwGS

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper proposes a cluster-induced Mask Transformer to segment and classify gastric tumors using a multi-task learning framework. The method was evaluated on both internal and external datasets. The results appear better even than those of radiologists.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1) The way of combining the clustered mask transformer with CNN features looks interesting. 2) Evaluating the technique on both internal and external datasets is the major strength.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) The proposed cluster-induced transformer needs to be better justified, e.g., by investigating the network with and without this component and by comparing against networks that handle a single task. 2) The results of the GC segmentation task should be reported with other metrics, e.g., Dice and volume correlation. 3) The statement that AI models surpass experienced radiologists on non-contrast CT needs to be better verified. Is there any patient selection bias? Is the model better on any population?

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Not sure since the data and the code are not publicly available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    More validation is suggested to support the argument as mentioned in the weakness section.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    1) Since this is multi-task learning, the two tasks should be evaluated independently. The segmentation part is not sufficiently evaluated. 2) The paper makes a strong claim, which should be better verified.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    4

  • [Post rebuttal] Please justify your decision

    Even though the authors addressed most of my concerns appropriately, I will not change my initial decision, since the decision should be made based on the initial submission.



Review #3

  • Please describe the contribution of the paper

    Native non-contrast CT is used for a novel gastric cancer screening approach based on deep learning. The results demonstrate a high level of applicability, with sensitivity similar to that of specific blood / endoscopic tests. From a methodological point of view, the heterogeneous nature of gastric cancer is addressed by using derived features prior to clustering techniques to achieve a high level of genericity. This strategy significantly outperforms common encoder/decoder networks such as the U-Net. The utilized mask transformers lead to a good combination of both classification (feature-based) and segmentation (encoder/decoder patterns) aspects.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Nice presentation of the technical foundation and the medical aspects. Very nice figures and a high level of soundness.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    No significant weaknesses. Marginal adaptations regarding layout and references are necessary.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    model parameters are provided. Nevertheless, without the datasets and ground truth labels of the proposed study, neither testing nor training can be reproduced.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    1. End of page 2: use a regular sentence instead of brackets (e.g., irregular gastric wall; liquid and contents in the stomach).
    2. Excessive descriptions in the images should be avoided; they are better placed in the regular text (cf. the 11-line caption of Fig. 1).
    3. At first glance, “hospital I” is hard to read; it is not clear that the hospitals are numbered. Please give a hint.
    4. At first occurrence: “between the YEARS 2018 and 2020”.
    5. Please state the interpolation strategy in “Implementation Details. We resampled each CT volume to the median spacing while normalizing…”, as it has a significant impact on the deep learning process.
    6. “And successful localization of the tumors is considered when the overlap between the segmentation mask generated by the model and the ground truth is greater than 0.01, measured by the Dice score”: this is a sentence fragment and a bit unclear. Is 1% overlap enough? Maybe the normalized surface distance would be a valid error metric.
    7. Table 1: AUC is of course in the range [0, 1]; nevertheless, consider converting it to a percentage so that it better fits the other percentage columns.
    8. All references should at least have a year of publication. References with a URL should have something like a “last visited” date.
    9. References: use either the abbreviation (MICCAI [27]) or the full name [33] consistently; do not mix.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    clear presentation, easy to understand and a high level of novelty

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #4

  • Please describe the contribution of the paper

    The paper proposes a mask transformer based architecture to detect gastric cancers (tumors in the stomach) on non-contrast CT scans.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The paper shows very good performance, even vs radiologists
    2. I appreciate the fact that they stratified their performance by the stage of the tumor.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. It is not clear to me why or how introducing the clustering aspect to the mask transformer is required/motivated. Related to this is the following point:
    2. Missing baseline: the key baseline would have been a vanilla mask transformer, which is missing
    3. Going by the way the train and test sets were selected (based on time of scan), it seems there may be overlap of patients in the train and test sets. This could lead to overfitting.
    4. Missing details: What was the resolution of the non contrast scans?
    5. Missing comparisons: Is the detection performance good or bad compared to previously published results?
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    probably doable (method is based on cvpr and eccv papers)

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. Larger figures: in particular, the CT scans in Fig.2 are very hard to see.
    2. Fill in the missing details if possible, or clarify.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    1. The main factor is the strong performance shown in the paper, even against radiologists.
  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This is a significant study that focuses on segmenting gastric cancer on non-contrast CT, using a cluster-induced Mask Transformer that jointly segments the tumor and classifies abnormality in a multi-task manner. The study has many strengths, including multi-institutional data, a careful comparison with radiologists, evaluation of the approaches based on the tumor grade, an easy-to-read manuscript, and a clear clinical motivation. Some minor weaknesses are noted regarding a baseline using only segmentation with Mask Transformers without classification, evaluation of both the segmentation and classification tasks, and inclusion of metrics such as Dice.




Author Feedback

We thank all the reviewers and the meta-reviewer for their thoughtful comments and constructive suggestions. We first address the shared questions and then the individual ones.

Q1: Ablation experiments of the network (R1) / vanilla mask transformer baseline (R4 & Meta R). R: We compared our approach with a baseline of vanilla Mask Transformer (MT). This provides a form of ablation study for our designed cluster-induced module. The table included below elucidates the consistent superiority of our approach over this baseline. We will add these results to the manuscript.

Method | AUC   | Sens. (%) | Spec. (%) | Ext. Spec. (%)
MT     | 0.929 | 82.0      | 90.5      | 96.4
Ours   | 0.939 | 85.0      | 92.6      | 97.7
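
As a rough consistency check on the reported figures, sensitivity and specificity can be recomputed from confusion-matrix counts. The sketch below is illustrative only: the counts (85 of 100 cancer patients detected, 137 of 148 normal cases correctly cleared) are not stated in the source and are an assumption inferred from the reported percentages and the test-set sizes given in the abstract.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) as fractions in [0, 1]."""
    sensitivity = tp / (tp + fn)  # detected cancers / all cancer patients
    specificity = tn / (tn + fp)  # correctly cleared normals / all normal cases
    return sensitivity, specificity


# Assumed counts (not given in the source): 100 cancer patients, 148 normal cases.
sens, spec = sensitivity_specificity(tp=85, fn=15, tn=137, fp=11)
print(f"sensitivity = {100 * sens:.1f}%, specificity = {100 * spec:.1f}%")
```

Under these assumed counts, the rounded values agree with the "Ours" row of the table.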

Q2: Segmentation evaluation (R1 & Meta R). R: We are appreciative of your suggestion to integrate a segmentation evaluation. The Dice score for our method concerning gastric tumor segmentation stands at 0.425, thereby surpassing nnUNet’s performance at 0.392. This information will be included in the manuscript.
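
For readers unfamiliar with the metric, the Dice score between a predicted and a ground-truth binary mask can be sketched as follows. This is a generic NumPy illustration, not the authors' evaluation code; the toy masks and the convention of returning 1.0 for two empty masks are assumptions made for the example.

```python
import numpy as np


def dice_score(pred, gt):
    """Dice = 2 * |pred AND gt| / (|pred| + |gt|) for binary masks."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    denom = pred.sum() + gt.sum()
    if denom == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0
    return 2.0 * np.logical_and(pred, gt).sum() / denom


# Toy 3D masks (hypothetical): two 2x2x2 cubes shifted by one voxel along the last axis.
pred = np.zeros((4, 4, 4), dtype=bool)
pred[1:3, 1:3, 1:3] = True
gt = np.zeros((4, 4, 4), dtype=bool)
gt[1:3, 1:3, 2:4] = True

score = dice_score(pred, gt)
print(f"Dice = {score:.3f}")
```

The same function could also express the localization criterion quoted by R3 (a case counts as localized when the Dice score exceeds 0.01), for example `dice_score(pred, gt) > 0.01`.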

Q3: Further verification of statement & possible patient selection bias / overlap of training and test sets (R1 & R4). R: Thank you for the suggestions. We will take greater care not to overstate the competencies of AI models and will modify the corresponding text. As part of future work, we are actively planning to expand the reader study by involving more doctors and a broader range of data, both in terms of quantity and diversity. Secondly, our dataset, spanning 2018 to 2020, was sequentially gathered from a single hospital. The most recent patients from the second half of 2020 were selected for the test set, while the remaining patients were assigned to the training set. This ensures no overlap between the training and test sets in our study. This prospective way of selecting the test set helps reduce potential biases.
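
The temporal split described above, together with the patient-overlap check that R4 asked about, might be sketched as follows. The patient IDs, scan dates, and the exact cutoff of 1 July 2020 are hypothetical; the source only says that patients from the second half of 2020 form the test set.

```python
from datetime import date


def temporal_split(scans, cutoff):
    """Split (patient_id, scan_date) records into train (before cutoff) and test (on or after)."""
    train = [s for s in scans if s[1] < cutoff]
    test = [s for s in scans if s[1] >= cutoff]
    return train, test


def overlapping_patients(train, test):
    """Return the set of patient IDs that appear in both splits."""
    return {pid for pid, _ in train} & {pid for pid, _ in test}


# Hypothetical records for illustration only.
scans = [
    ("P001", date(2018, 3, 14)),
    ("P002", date(2019, 11, 2)),
    ("P003", date(2020, 8, 21)),
    ("P004", date(2020, 10, 5)),
]

train, test = temporal_split(scans, cutoff=date(2020, 7, 1))
overlap = overlapping_patients(train, test)
print(len(train), len(test), overlap)
```

A date-based split alone does not rule out a patient scanned in both periods, which is why the explicit ID check is included in the sketch.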

Q4: Interpolation strategy / evaluation for tumor localization / adaptations regarding layout and references (R3). R: In our study, we employ a third-order spline interpolation strategy for resampling, following nnUNet. We agree with the suggestion that the normalized surface distance is more suitable to assess successful tumor localization and will modify our evaluation to reflect this. Moreover, we sincerely value your thorough critique and we will revise the relevant texts as recommended.
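
To illustrate what third-order spline resampling to a target spacing involves, here is a minimal sketch using `scipy.ndimage.zoom` with `order=3`. It is a stand-in for illustration, not nnUNet's resampling code and not the authors' pipeline; the volume shape and the spacing values are hypothetical.

```python
import numpy as np
from scipy.ndimage import zoom


def resample_to_spacing(volume, spacing, target_spacing, order=3):
    """Resample a 3D volume from `spacing` to `target_spacing` (same units, e.g. mm)."""
    # A larger original spacing than the target means more output voxels along that axis.
    factors = [s / t for s, t in zip(spacing, target_spacing)]
    return zoom(volume, zoom=factors, order=order)  # order=3: cubic spline interpolation


# Hypothetical volume: 10 slices at 5.0 mm, 20 x 20 in-plane voxels at 1.0 mm.
rng = np.random.default_rng(0)
volume = rng.normal(size=(10, 20, 20)).astype(np.float32)

resampled = resample_to_spacing(
    volume, spacing=(5.0, 1.0, 1.0), target_spacing=(2.5, 1.0, 1.0)
)
print(volume.shape, "->", resampled.shape)
```

Segmentation labels would normally be resampled with a lower interpolation order (for example nearest neighbour) so that no new label values are introduced.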

Q5: Unclear how introducing the clustering aspect is motivated (R4). R: The introduction of clustering mask transformers is motivated by two factors. Firstly, from an application standpoint, opportunistic screening necessitates joint segmentation and classification tasks. The cluster centers of pixels obtained from the segmentation branch inherently possess discriminative features and capture global content, thereby facilitating the construction of an effective classification model. Secondly, from a methodological perspective, these centers establish a valuable balance between intra-cluster similarity and inter-class discrepancy, which significantly contributes to gastric cancer detection.

Q6: Missing details of resolution (R4). R: The CT scans in our study have a resolution of 512x512 pixels, with an average depth of 107.8. The patch size used during training and inference is (40, 192, 224) voxels. We will add these details to the manuscript.

Q7: Missing comparisons to previously published results (R4). R: We are grateful for your attention to this issue. Our proposed approach for gastric cancer opportunistic screening using non-contrast CT scans is novel and challenging, particularly given the low contrast of such scans. As per our comprehensive search in the existing literature, we found no previously published work using non-contrast CT for direct comparison. Therefore, we have conducted a rough comparison with established gastric screening tools, as presented in Table 3. Our method demonstrates comparable performance to these tools while offering the advantages of being low-cost and non-invasive.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This is a novel and interesting study that is accepted for publication based on the novelty of the approach to a challenging problem, the careful evaluation against radiologists, and the great rebuttal answers.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    While this paper has some considerable strengths, especially in technical aspects, I found that the clinical value of this work may be low. In the Introduction, the authors stated that “whether early detection of gastric cancer using non-contrast CT scans is possible remains unknown” due to technical and clinical challenges. Thus, developing such methods may not be useful because of the fundamental inability of non-contrast CT scans to screen for early-stage gastric tumors. Thus, I am not in favor of accepting the paper.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Proposed a cluster-induced mask transformer network for segmenting/classifying gastric cancer on CT. Strengths include a multi-institutional dataset, comparison to readers, and good presentation. Critiques addressed in the rebuttal include the requested comparison to a vanilla mask transformer (marginal improvements), segmentation performance, clarification of no overlap in training/testing, as well as comparison to previously reported results. Additional motivation for the clustering aspect is provided. Resolution should be reported in mm. Worthy of acceptance.


