
Authors

Maosong Cao, Manman Fei, Jiangdong Cai, Luyan Liu, Lichi Zhang, Qian Wang

Abstract

Cervical cancer is a significant health burden worldwide, and computer-aided diagnosis (CAD) pipelines have the potential to improve diagnostic efficiency and treatment outcomes. However, traditional CAD pipelines are limited by their reliance on a detection model trained on a large annotated dataset, which is expensive and time-consuming to build; they also face a clear performance ceiling and low data utilization efficiency. To address these issues, we propose a two-stage detection-free pipeline that maximizes data utilization of whole slide images (WSIs) and leverages only sample-level diagnosis labels for training. The experimental results demonstrate the effectiveness of our approach, with performance scaling up as the amount of data increases. Overall, our novel pipeline has the potential to fully utilize massive data in WSI classification and can significantly improve cancer diagnosis and treatment. By reducing the reliance on expensive data labeling and detection models, our approach could enable more widespread and cost-effective implementation of CAD pipelines in clinical settings.
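
For orientation, here is a minimal sketch of what the two-stage, detection-free inference described above might look like. All names (classify_wsi, coarse_model, fine_model) and the tensor shapes are illustrative assumptions, not the authors' actual implementation:

    import torch

    def classify_wsi(local_images, coarse_model, fine_model, top_k=8):
        # local_images: (N, C, H, W) tiles cut from one whole slide image (assumed input)
        with torch.no_grad():
            attn = coarse_model(local_images)                  # (N,) attention score per tile
            idx = attn.topk(min(top_k, attn.numel())).indices  # attention-guided selection
            suspicious = local_images[idx]                     # keep only high-attention tiles
            logits = fine_model(suspicious)                    # sample-level prediction
        return logits.softmax(dim=-1)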

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43987-2_24

SharedIt: https://rdcu.be/dnwJG

Link to the code repository

https://github.com/thebestannie/Detection-free-MICCAI2023

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper
    • A detection-free framework has been implemented, which can automatically screen high-resolution cervical cancer images based solely on image-level labels.
    • The two-stage, coarse-to-fine method reduces the time cost of inference.
    • Experiments were conducted on a large dataset consisting of 5,384 cervical cell pathology images, demonstrating the effectiveness of the method.
    • The comparison with recent methods shows its competitiveness.
  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Automated screening of high-resolution cervical cancer images using only image-level labels for supervision is a highly valuable task that can greatly reduce the costly and time-consuming process of manual annotation.
    • A coarse-to-fine process significantly reduces the time and redundancy associated with window-based scanning diagnosis.
    • The popular Transformer architecture and MoCo were adopted, and the results suggest their effectiveness for this specific task.
    • The experiments on a large dataset demonstrate the feasibility of application in actual clinical diagnosis, offering the possibility of improved diagnostic efficiency.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Although it is interesting to achieve disease-assisted diagnosis based on image-level labels, multi-instance learning (MIL) is a common approach in weakly supervised processing of whole slide images (WSIs), which weakens the novelty of this article. e.g. 1 - “Multi-Scale Task Multiple Instance Learning for the Classification of Digital Pathology Images with Global Annotations” in MICCAI 2021 Workshop. e.g. 2 - “Multiple Instance Learning for Digital Pathology: A Review on the State-of-the-Art, Limitations & Future Potential”, 2022. e.g. 3 - “Attention-based Multiple Instance Learning with Mixed Supervision on the Camelyon16 Dataset” in MICCAI 2021 workshop. e.g. 4 - “Diagnose Like a Pathologist: Transformer-Enabled Hierarchical Attention-Guided Multiple Instance Learning for Whole Slide Image Classification”, 2023.

    • The coarse-to-fine method has been proven effective; although it may not have been specifically applied to the auxiliary processing of cervical cancer, transferring this strategy is not novel. e.g. 1 - “Cancer metastasis fast location based on coarse-to-fine network”, in CACML 2022. e.g. 2 - “Pancreas Segmentation in Abdominal CT Scan: A Coarse-to-Fine Approach”, 2016. e.g. 3 - “A 3D Coarse-to-Fine Framework for Volumetric Medical Image Segmentation”, in 3DV 2018.

    • The attention guided method is similar to that of “Diagnose Like a Pathologist: Transformer-Enabled Hierarchical Attention-Guided Multiple Instance Learning for Whole Slide Image Classification”.

    • Although a table was provided for accuracy comparison, visualization of the results is lacking. I cannot determine whether the slight improvement in accuracy reflects a significant improvement in a special case or a general improvement in the details.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
    • The article did not provide open-source code, but the relevant methods can refer to the code of other papers, and the details were described clearly, so the methods can be reproduced.
    • The article did not use a public dataset, and the data used will not be made publicly available, so the accuracy of this article cannot be verified through reproduction.
  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    • I greatly appreciate this article’s focus on screening assistance for cervical cancer by reducing the burden of labeling, but more innovative methods are encouraged.
    • In addition to table-based comparisons, visual comparisons should be provided to demonstrate the performance of the method.
    • Since cervical cancer screening is not a new task, publicly available datasets already exist. Experiments on public datasets are therefore encouraged to enable fair comparison with other methods, which would not only prove the advancement of the method but also help other researchers reproduce it.
    • Cross-cohort validation is expected to demonstrate the method’s ability to assist in clinical practice.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    • Sample-level annotation-based multiple instance learning, attention-guided selection, a coarse-to-fine two-stage strategy, and Transformer-based pathological image processing have all been published before, which weakens the novelty of this article.
    • However, I think it is interesting to aggregate these methods to improve the efficiency of cervical cancer screening.
    • A precision comparison alone, without visual analysis, is not convincing. We cannot determine the practical effectiveness of this method from accuracy alone: will the method provide only a sample-level diagnosis, or will it also display suspected lesion regions? I think the latter is more useful for pathologists.
  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    Although the authors emphasize that many of the references I provided concern histopathology, my practical experience shows that those methods are also applicable to this task. In addition, the method proposed in this article was not specifically designed for cervical cytology and is also applicable to other WSIs, so the authors’ response did not address my doubts about its innovation. Furthermore, in clinical applications, providing only a high-resolution image classification result is not convincing on its own: doctors must know where the lesion area is in order to issue a report, so extensive visualization is necessary. Although most of the methods in the article build on prior work, no article has yet applied them to this specific field, so I am willing to accept this article as a reference for future research. I sincerely recommend that the authors provide supplementary materials with visualized results at the time of official publication, demonstrating which lesions the algorithm can capture, as this is the practical significance that doctors truly care about.



Review #2

  • Please describe the contribution of the paper

    This paper introduces a cervical cancer whole slide image classification approach that only needs sample-level diagnosis labels. The idea is a two-stage pipeline: in the coarse-grained stage, local images that are likely to contain abnormal cells are identified; in the fine-grained stage, these local images are integrated to provide a classification for the sample.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The pooling transformer looks like a good design, as does using the affinity propagation algorithm to cluster the inputs into several classes to remove redundant features between transformer layers. The experimental evaluation is extensive, with comparisons against other methods as well as an ablation study.
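
    To make the clustering idea concrete, here is a minimal sketch, under the assumption that tokens are clustered with scikit-learn’s AffinityPropagation and averaged per cluster; this illustrates the general technique, not the paper’s implementation:

        import torch
        from sklearn.cluster import AffinityPropagation

        def pool_tokens(tokens):
            # tokens: (N, D) token embeddings from a transformer layer
            x = tokens.detach().cpu().numpy()
            labels = torch.from_numpy(AffinityPropagation(random_state=0).fit_predict(x))
            # aggregate each cluster into a single mean token, removing redundancy
            return torch.stack([tokens[labels == c].mean(dim=0) for c in labels.unique()])

        pooled = pool_tokens(torch.randn(64, 128))  # fewer, aggregated tokens feed the next layer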

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The abstract could be improved since it does not mention or describe the methodological novelty of the paper.

    Under the subsection “contrastive pre-training of encoder”, it was mentioned that a patch and “its augmented patch” are treated as a positive pair. Please clarify how the “augmented patch” is derived or obtained.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Reproducibility of the paper is reasonably good. The framework is clearly described.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Abstract could be improved to emphasize the contribution/novelty in methodology.

    Captions for Fig. 1 and Fig. 2 could be made more detailed to help readers understand those figures.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper overall is well written. There are several novel ideas that were proven to be useful for classification. The proposed approach only needs sample-level diagnosis labels.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The authors propose a two-stage detection-free classification model for cervical cancer screening of WSIs, which is trained using only WSI-level annotations.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    In the proposed framework, the authors introduce a pooling transformer that clusters and aggregates redundant tokens, which effectively reduces the redundancy and distortion of the input images.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    According to the description, the effectiveness of primary screening cannot currently be confirmed. In the coarse-grained stage, can positive samples really be selected by relying only on the top-8 high-attention samples? Since cell-level labeling is not considered, many cell characteristics are lost when the field of view is reduced by 4×4 times. (Incidentally, the authors should report the magnification of the original WSI scan and the distribution of TBS categories in the dataset.)

    It cannot be determined whether the cells contained in the top-8 images are enough to support WSI classification. In the fine-grained stage, the authors carried out pre-training to strengthen the model’s representation. However, there may be a deviation between the pre-training task design and the actual fine-tuning task, and the extracted features may not necessarily be applicable to the downstream task.

    Due to the weak labels, some of the factors mentioned above may cause the model to overfit and produce unexplainable classification results. The authors should state or give evidence to justify the trained model.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors’ description of the algorithm is reasonable, clear, and reproducible. However, without the relevant experimental data, the effectiveness of the method cannot be directly verified and reproduced.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Refer to the weaknesses.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The authors’ description of the proposed method is clear and logical, and the scheme is reproducible. However, since weakly supervised labels are used and only a limited number of top-k samples are relied upon, there may be some bias in correctly classifying WSIs, and further proof of effectiveness is needed.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    We have received mixed reviewer comments. While the reviewers confirmed the merits of the paper, namely an interesting idea and performance improvement, they also raised major concerns, including incremental methodological novelty, limited differences from existing studies, and a lack of extensive experimental evaluation and analysis. Therefore, a decision of Invite for Rebuttal is recommended so the authors can address the reviewers’ comments.




Author Feedback

Q1: R1 challenged our novelty. A: We agree that multi-instance learning (MIL), the coarse-to-fine strategy, and transformers are now classical ways to analyze WSIs, but our novelty is not diminished by using the above ideas.

  1. R1 mentioned four examples of MIL, most of which are applied to histopathology. However, our problem concerns cytology, which is a different topic in the field of pathology. In particular, cells are centrifuged in cytology, meaning that there is barely any spatial dependency or correlation among abnormal cells in WSIs. This issue becomes more challenging when classifying normal and slightly abnormal subjects (e.g., ASCUS in the clinical protocol of TBS), which is exactly the decision boundary sought by our screening task. The abnormal cells can be very scarce and scattered in isolation in these WSIs, which is why high-performance detection of abnormal cells is often a must in early works. In this work, however, we successfully avoid training and using such detectors, which is major progress for cervical cytology analysis.
  2. R1 also mentioned three examples of the coarse-to-fine strategy, of which the first, on histopathology, is related to our work. Although the coarse-to-fine design is common in medical image analysis, our two stages are connected through the careful engineering of attention-guided selection. Moreover, our two stages share the same network design (i.e., encoder + pooling transformer), whose novelty is acknowledged by the other two reviewers.

Q2: R1 mentioned the lack of visual comparison. A: Due to the page limit, we did not show visualization results such as saliency maps in the paper. The suspicious abnormal cells can be found by our method, highly consistent with expert annotations. We will provide such visual results on our project page after acceptance.

Q3: R2 requested clarification of the augmented patch. A: We follow the typical contrastive learning paradigm to augment the patch, including rotation, flipping, and color jittering. Note that image resizing is not used, since it can cause significant changes to cell morphology.
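
A minimal torchvision sketch of such an augmentation pipeline follows; the specific parameter values are assumptions for illustration, not the authors’ settings:

    import torchvision.transforms as T

    augment = T.Compose([
        T.RandomRotation(degrees=90),
        T.RandomHorizontalFlip(p=0.5),
        T.RandomVerticalFlip(p=0.5),
        T.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4, hue=0.1),
        T.ToTensor(),
        # deliberately no Resize/RandomResizedCrop: resizing distorts cell morphology
    ])

    # two independent augmentations of the same patch form a positive pair:
    # view1, view2 = augment(patch), augment(patch)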

Q4: R3 questioned whether the coarse-grained stage can effectively capture positive samples. A: Yes. First, the number of images selected by the coarse stage is critical. In our experiments, we found that the top-5 images would be enough; selecting more images provides a larger safety margin to ensure that positive samples are identified in the coarse-grained stage. We thus use top-8 in the paper, balancing performance against GPU memory constraints. Second, and more importantly, our coarse stage benefits from big data. We further extended the dataset to ~10,000 samples (twice the size reported in our paper) after the MICCAI submission. Our tentative results now show that, with more data available, the coarse-grained stage alone can achieve ACC=83.5% under the fair comparison reported in Table 1 of our paper. This further demonstrates the potential of our method, and we will release the latest checkpoint if our work is accepted.
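
As an illustration of this safety-margin argument, a hypothetical sanity check could measure how often at least one truly abnormal tile survives top-k selection; attn and abnormal_mask here are assumed inputs, and no such evaluation is claimed in the paper:

    import torch

    def topk_coverage(attn, abnormal_mask, k=8):
        # attn: (B, N) attention per tile; abnormal_mask: (B, N) bool ground truth
        idx = attn.topk(k, dim=1).indices                        # selected tiles per WSI
        hit = abnormal_mask.float().gather(1, idx).sum(dim=1) > 0
        return hit.float().mean().item()                         # fraction of WSIs covered

    attn = torch.rand(4, 100)
    mask = torch.zeros(4, 100, dtype=torch.bool)
    mask[:, :3] = True                                           # 3 abnormal tiles per WSI
    print(topk_coverage(attn, mask, k=8))                        # rises toward 1.0 as k grows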

Q5: R3 requested more data information. A: We use a 20x lens for WSI scanning. The dataset in our paper has 5,384 samples, including 2,853 negative, 962 ASCUS, and 1,569 high-level positive samples.

Q6: R3 questioned whether pre-training in the fine stage is useful, especially as the pre-training task deviates from the actual classification task. A: It is true that the pre-training task differs from the real task in the fine stage, but this strategy is now widely adopted, as in SimCLR. Through pre-training, the encoder increases its capability of representing the input data, as verified by many reports in the literature. In our experiments, we also confirm the performance gain of contrastive pre-training (cf. the last two rows of Table 2).
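
For reference, a generic SimCLR-style InfoNCE objective is sketched below, as an assumption about the pre-training idea cited here rather than the authors’ exact loss:

    import torch
    import torch.nn.functional as F

    def info_nce(z1, z2, temperature=0.1):
        # z1, z2: (B, D) embeddings of two augmented views of the same patches
        z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
        logits = z1 @ z2.t() / temperature       # (B, B) cosine similarities
        targets = torch.arange(z1.size(0))       # positives lie on the diagonal
        return F.cross_entropy(logits, targets)

    loss = info_nce(torch.randn(32, 128), torch.randn(32, 128))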

Q7: Reproducibility of our work (R1/R3). A: We will release our code once the paper is accepted.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors have addressed most of the reviewers’ concerns, although the paper could be further enhanced by adding more experimental results from existing deep learning methods for WSI classification. Given its application to a cytology task, this paper is worth discussing in the MICCAI community.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Very nice study, with interesting results and great comparisons.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    All reviewers agree with the acceptance of this paper, and after reading the rebuttal, I find that most concerns have been resolved. The authors are encouraged to revise their final version according to the reviewers’ minor comments.


