
Authors

Xuanye Zhang, Kaige Yin, Siqi Liu, Zhijie Feng, Xiaoguang Han, Guanbin Li, Xiang Wan

Abstract

Gastroscopic Lesion Detection (GLD) plays a key role in computer-assisted diagnostic procedures. However, this task is not well studied in the literature due to the lack of labeled data and applicable methods. Generic detectors perform below expectations on GLD for two reasons: 1) the scale of labeled GLD datasets is far smaller than that of natural-image object detection datasets; 2) gastroscopic lesions differ markedly from objects in natural images, typically exhibiting high global similarity but high local diversity. This characteristic of gastroscopic lesions also degrades the performance of generic self-supervised or semi-supervised methods that attempt to address the labeled-data shortage using massive unlabeled data. In this paper, we propose a Self- and Semi-Supervised Learning (SSL) framework for GLD, tailored to exploit massive unlabeled gastroscopic images to enhance GLD performance. It consists of a Hybrid Self-Supervised Learning (HSL) method for backbone pre-training and a Prototype-based Pseudo-label Generation (PPG) method for semi-supervised detector training. HSL combines patch reconstruction with dense contrastive learning to exploit the complementary strengths of the two in feature learning from massive unlabeled data. PPG generates pseudo-labels for unlabeled data based on similarity to the prototype feature vector, discovering potential lesions while avoiding the introduction of excessive noise. Moreover, we contribute the first Large-scale GLD Dataset (LGLDD), which contains 10,083 gastroscopic images with 12,292 well-annotated bounding boxes covering four lesion categories. Experiments on LGLDD demonstrate that SSL brings significant improvements over baseline methods for GLD.
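
As an illustration of the HSL idea described above, the sketch below combines a masked patch-reconstruction loss with a dense (per-location) contrastive loss. This is a minimal sketch under stated assumptions, not the paper's exact HSL formulation: the class and parameter names are hypothetical, the loss weights are arbitrary, and positives for the dense term are assumed to be spatially corresponding locations, which may differ from the paper's actual matching rule.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class HybridSSLObjective(nn.Module):
        """Illustrative hybrid self-supervised objective (not the paper's exact HSL):
        masked patch reconstruction plus a dense, per-location contrastive term."""

        def __init__(self, lambda_rec=1.0, lambda_dense=1.0, temperature=0.2):
            super().__init__()
            self.lambda_rec = lambda_rec
            self.lambda_dense = lambda_dense
            self.t = temperature

        def forward(self, recon, target, mask, q_dense, k_dense):
            # recon, target:    (B, C, H, W) reconstructed and original images
            # mask:             (B, 1, H, W) binary mask of the reconstructed patches
            # q_dense, k_dense: (B, D, h, w) dense features from two augmented views
            rec = F.l1_loss(recon, target, reduction="none")
            rec_loss = (rec * mask).sum() / mask.sum().clamp(min=1)

            B, D, h, w = q_dense.shape
            q = F.normalize(q_dense.flatten(2), dim=1)  # (B, D, h*w)
            k = F.normalize(k_dense.flatten(2), dim=1)
            logits = torch.einsum("bdi,bdj->bij", q, k) / self.t  # (B, hw, hw)
            # Positives: spatially corresponding locations (an assumption).
            labels = torch.arange(h * w, device=q.device).expand(B, -1)
            dense_loss = F.cross_entropy(
                logits.reshape(B * h * w, h * w), labels.reshape(-1))

            return self.lambda_rec * rec_loss + self.lambda_dense * dense_loss

For example, loss = HybridSSLObjective()(recon, images, mask, feats_view1, feats_view2) would give a single scalar to backpropagate during backbone pre-training.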

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43904-9_9

SharedIt: https://rdcu.be/dnwGM

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    In this paper, the authors tackle the problem of gastroscopic lesion detection (GLD) for computer-assisted diagnosis. The authors explain that this important task is difficult to perform well due to the lack of available labeled data and the distinct characteristics of gastroscopic images compared to other, more widely available images. The authors propose a training pipeline for gastroscopic lesion detection that reaches better results than commonly used methods. In addition, this work contributes a novel, large, partially annotated dataset for gastroscopic lesion detection.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The tackled problem has concrete clinical applicability, seeing as it is a task performed routinely by doctors that could be automated by the use of Machine Learning.

    The experiments in the paper are performed on real medical images and the proposed framework is evaluated with commonly used metrics for the specific task.

    The proposed approach is very well explained and achieves better results than existing work in the presented experiments. The proposed framework combines self- and semi-supervised training strategies. The self-supervised step pre-trains the backbone of the detection model by combining two pre-existing self-supervised training strategies, patch reconstruction and dense contrastive learning. The second step is semi-supervised and trains the detection model using labeled data as well as pseudo-labels for unlabeled data. The pseudo-labels are generated based on previously seen real labels, kept in a memory bank. Models trained following the presented two-step pipeline achieve better results than state-of-the-art frameworks on the proposed GLD dataset, as well as on a publicly available dataset, while requiring a limited amount of annotated data.

    The acquired GLD dataset is a relevant contribution: it contains a large number of real gastroscopic images obtained from various patients. A significant portion of the images is annotated with bounding boxes for four types of lesions.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Some aspects of the training setup and experiments, such as the two detection models used for evaluating the training pipeline, are missing from the main paper (though they are described and explained in the supplementary material).

    The discussion in the experiments section does not clearly highlight the improvements achieved by the proposed approach.

    The novel GLD dataset is presented as a contribution but it is not clear how it will be shared.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors mention the release of a new large dataset with partial annotations verified by experts. However, they do not clarify how and when the dataset will be made available, nor whether the code for the proposed framework will be published along with the paper.

    Implementation details and training parameters are available in the supplementary material.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    A small paragraph presenting the experiments conducted (backbone and detectors used) would improve the clarity of the section.

    The clarity of the table presenting the results (Table 1) could also be improved by highlighting the framework configuration proposed by the authors, as well as the highest score reached for each of the IoU thresholds.

    In Figure 2, the label of each bounding box could be made slightly larger and more visible.

    It would be good to clarify how the contributed dataset will be released to the community.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed approach has a concrete medical application, is evaluated on real data, and reaches better results than existing models on this data. Additionally, the authors release a new, large dataset with partial annotations verified by experts to allow for reproducibility and further research. The experiments section could be made clearer to highlight the improvements achieved by the proposed approach.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    6

  • [Post rebuttal] Please justify your decision

    I appreciate the authors' responses and clarifications. After reading their rebuttal and the other reviews, I find that many of the requested clarifications have been addressed, in particular the confirmation of the dataset release. I therefore keep my initial rating.



Review #2

  • Please describe the contribution of the paper

    This paper develops a novel method for Gastroscopic Lesion Detection (GLD) by adopting the concepts of Siamese networks and self-supervised learning. The main contributions of the proposed work are: a Self- and Semi-Supervised Learning (SSL) framework that leverages massive unlabeled data to enhance GLD performance; a Large-scale Gastroscopic Lesion Detection Dataset (LGLDD); and experiments on LGLDD demonstrating that SSL brings significant improvements over baseline methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The flowchart of the proposed method is clear.
    2. The experiments are sufficient.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The reported performance is still relatively low.
    2. Some paragraphs are too long, which hurts readability.
    3. The quality of the pseudo-labels is not evaluated.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The work appears reproducible, as the method is described clearly.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. The motivation should be presented more concisely and would be better conveyed with an illustrative figure.
    2. The quality of the pseudo-labels is not evaluated. If poor pseudo-labels are incorporated into the detector's training, they will degrade detection performance.
    3. The accuracy and IoU are still not high enough to satisfy real-world requirements. Is there any strategy to improve them?
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The experimental results are sufficient, though the work still has some limitations.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    In this work, the authors propose a new framework for gastroscopic lesion detection with partially labeled data. The main technical contributions of this work are the proposed hybrid self-supervised pre-training method (dense contrastive learning and masked image modeling) and a semi-supervised pseudo-label training process. In addition, the authors will release a large-scale gastroscopic lesion detection dataset (including 10,083 gastroscopic images).

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The task the authors aim to solve has practical value;
    2. The collected dataset is relatively large and valuable;
    3. The high-level idea is interesting and demonstrates some novelty;
    4. The experimental results show noticeable improvements.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The paper is not well written, especially the methodology part.
    2. The notation and the order of the equations in the memory update strategy part are confusing;
    3. Some details are not clearly explained. For example, how the memory representations are initialized is unclear.
  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The reproducibility looks good. The authors will release the dataset and the code. Essential experimental settings are included.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. The authors should improve the language of this paper, especially the methodology part. For example, what does "The global contrastive learning uses the global feature vector f_g as query q and keys K" mean? The reviewer assumes the query is the anchor image's feature, and the keys represent a feature pool containing the transformed anchor image's feature and features extracted from entirely different images (a minimal illustration of this query/key setup follows this list). The authors should improve the language and avoid such ambiguities;
    2. The authors should improve the writing clarity of the memory update strategy part. The last paragraph of P5 says "the proposed strategy first finds the most similar feature vector f_s", but the authors then introduce the equation for f'_s, which is confusing.
    3. The authors should proofread the paper to avoid obvious errors. Some of them are listed below: (1) On P3, (Intuition ->) Intuitively, such a challenge requires… (2) On P4, like DenseCL 10. The …… (3) In the equation for L_l, the denominator might be S^2, not S_2. (4) On P6, Finally, they (annotated) 12,292 lesion bounding boxes. (5) The polyp, ulcer, cancer, and sub-mucosal tumor numbers are 7,779, 2,171, 1,164 and 1,178(,) respectively.
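
    To make the query/key terminology in point 1 concrete, here is a minimal MoCo-style sketch of a global contrastive step, in which the query is the anchor's global feature and the keys are the positive key from the other view plus a queue of features from other images. The function name, the queue, and the temperature value are illustrative assumptions, not the paper's exact formulation.

        import torch
        import torch.nn.functional as F

        def global_contrastive_loss(f_g, f_g_key, queue, temperature=0.2):
            # f_g:      (B, D) query features q from the anchor view
            # f_g_key:  (B, D) positive keys from the other (augmented) view
            # queue:    (K, D) negative keys pooled from other images
            q = F.normalize(f_g, dim=1)
            k_pos = F.normalize(f_g_key, dim=1)
            k_neg = F.normalize(queue, dim=1)

            l_pos = (q * k_pos).sum(dim=1, keepdim=True)  # (B, 1) positive logits
            l_neg = q @ k_neg.t()                         # (B, K) negative logits
            logits = torch.cat([l_pos, l_neg], dim=1) / temperature
            labels = torch.zeros(q.size(0), dtype=torch.long, device=q.device)
            return F.cross_entropy(logits, labels)  # the positive key sits at index 0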
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper aims to solve a practical clinical task with some novel designs. Experimental results also demonstrate the effectiveness of the proposed method. However, the major drawback of this paper is the clarity of the writing (especially for the methodology part). It seems this paper is written in a rush, making it hard to follow some technical designs. As a result, the reviewer would recommend a weak reject for the initial round of review.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The paper received mixed reviews: two of the three reviewers gave an accept recommendation and one gave a weak reject recommendation. The area chairs considered the paper and the reviewers' comments and agreed with the following strengths of the paper: the interesting idea, the practical value, the value of the dataset, and the improvement over the baseline. There are also some concerns: the clarity of the motivation and method description, the lack of evaluation of some aspects, and the availability of the collected dataset. In summary, the area chair decided to invite the authors to provide a rebuttal with respect to these concerns.




Author Feedback

We appreciate the AC and Reviewers for their constructive comments. We are pleased with the positive feedback on our work, which acknowledges its contributions in solving a concrete clinical task (R1, R3), proposing a novel approach (R2, R3), achieving notable improvements (R1, R3), and providing clear explanations (R1) and reproducibility (R2). Moreover, we are grateful for the recognition of the value of the LGLDD by all reviewers. In the following sections, we will address the main concerns raised by the reviewers as summarized by the AC. We also commit to releasing our code and models upon acceptance.

Q1. Datasets Release (R1, R2, R3, AC): We will release the datasets outlined in our plan upon acceptance.

Q2. Writing and Presentations (R1, R2, R3, AC): -Motivation of SSL (R2): Our SSL framework is tailored to address challenges in daily clinical practice. Gastroscopic images are abundant, but those containing lesions are rare, necessitating extensive image review for lesion annotation. Moreover, gastroscopic lesions differ significantly from natural images, and rare lesion appearances are observed in only a few patients, limiting transfer learning performance. To overcome these challenges, we leverage a large volume of unlabeled gastroscopic images using self-supervised learning for improved feature representations and semi-supervised learning to discover and utilize potential lesions, enhancing overall performance.

-Memory Update Strategy (R3): The proposed strategy follows this pipeline: 1) acquire the lesion feature vector f'_c; 2) identify the feature f'_s in the memory that is most similar to f'_c; 3) update the memory by selecting, between f'_s and f'_c, the feature that is more distinct from the class prototype feature vector p_c. In the revised paper we will provide detailed explanations of f'_s and f_s and, given the space limitations, include an additional algorithm figure in the attachment for further clarity.
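
Below is a minimal sketch of one plausible reading of this update rule: the stored feature most similar to f'_c is found, and whichever of f'_c and f'_s is less similar to the class prototype p_c (i.e., more unique) is kept. Variable names follow the rebuttal; the cosine-similarity measure and the prototype refresh are assumptions, not the paper's confirmed procedure.

    import torch
    import torch.nn.functional as F

    def update_class_memory(memory_c, f_c, p_c):
        # memory_c: (N, D) stored lesion features for class c
        # f_c:      (D,)   new lesion feature vector (f'_c in the rebuttal)
        # p_c:      (D,)   class prototype feature vector
        sims = F.cosine_similarity(memory_c, f_c.unsqueeze(0), dim=1)  # (N,)
        idx = sims.argmax()
        f_s = memory_c[idx]  # most similar stored feature (f'_s)

        # Keep whichever of f'_c and f'_s is farther from the prototype,
        # i.e., the more "unique" feature (interpretation, not confirmed).
        if F.cosine_similarity(f_c, p_c, dim=0) < F.cosine_similarity(f_s, p_c, dim=0):
            memory_c[idx] = f_c

        # Refreshing the prototype as the normalized memory mean is an assumption.
        p_c = F.normalize(memory_c.mean(dim=0), dim=0)
        return memory_c, p_c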

-Memory Initialization (R3): To initialize the memories, we randomly select 50 lesions for each class, a size chosen empirically.

-Experiment Presentation (R1): We will emphasize the enhancements and update the figure to include bounding boxes.

-Ambiguity of Sec. 3.1 (R3): We will clarify that the keys (K) are extracted from an alternate view of the query image and from other images within the batch. We will also update certain expressions in this section to align with the conventions of MoCo and DenseCL while enhancing comprehensibility.

-Typos (R3): We will carefully correct the other typos.

Q3. Pseudo-Label Quality Evaluation (R2, AC): We acknowledge the importance of pseudo-label quality in SSL. In our approach, the objectiveness score threshold (Parameter Analysis 2 of Sec. 4) controls the quality of pseudo-labels. We observe the following trade-offs: 1) A low threshold generates noisy pseudo-labels, leading to reduced performance (-0.6/-0.2 AP at thresholds 0.5/0.6). 2) A high threshold produces high-quality pseudo-labels but may miss potential lesions, resulting in only slight performance improvement (+0.3 AP at threshold 0.7). 3) Our PPG approach uses a low threshold (0.5) to identify potential lesions, which are then filtered using prototype feature vectors, resulting in the most significant performance enhancement (+0.9 AP). We will include this analysis in the revised paper.
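
The filtering described in point 3 can be sketched as follows: detections above a low objectness threshold are kept as candidates and then filtered by cosine similarity to their class prototype. The 0.5 threshold comes from the rebuttal; the prototype similarity cutoff, the function name, and the tensor layout are illustrative assumptions rather than the paper's exact settings.

    import torch
    import torch.nn.functional as F

    def generate_pseudo_labels(boxes, scores, labels, feats, prototypes,
                               obj_thresh=0.5, proto_thresh=0.7):
        # boxes:  (N, 4)  predicted boxes on an unlabeled image
        # scores: (N,)    objectness scores; labels: (N,) predicted classes
        # feats:  (N, D)  RoI features; prototypes: (C, D) class prototypes
        keep_obj = scores >= obj_thresh                       # low threshold keeps recall high
        feats_n = F.normalize(feats, dim=1)
        protos_n = F.normalize(prototypes, dim=1)
        proto_sim = (feats_n * protos_n[labels]).sum(dim=1)   # similarity to own class prototype
        keep = keep_obj & (proto_sim >= proto_thresh)         # prototype filter removes noise
        return boxes[keep], labels[keep]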

Q4. Further enhancement (R2): Given the high similarity of gastroscopic images in global features and their high diversity in local features, transformer-based architectures are well-suited for capturing contextual information to address this challenge. Consequently, we are currently investigating the application of SSL to transformer-based detectors, such as DINO, to further improve performance.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The paper focuses on gastroscopic lesion detection by utilizing unlabeled gastroscopic images to learn informative features in a self-supervised manner. The method is generally novel, and the results outperform the SOTA. The rebuttal addresses most of the concerns about the presentation, memory update, and label quality evaluation.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The paper needs to be reorganized to make it easier to understand.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    After integrating all the information and reading the paper myself, I am more inclined toward rejection of the current version.

    1. I agree with the third reviewer that the clarity is not sufficient, especially for the main technical part.

    2. The authors did not make it clear at the initial submission whether the dataset would be publicly available. It covers about 500 patients.


