
Authors

Manman Fei, Xin Zhang, Maosong Cao, Zhenrong Shen, Xiangyu Zhao, Zhiyun Song, Qian Wang, Lichi Zhang

Abstract

Automated detection of cervical abnormal cells from ThinPrep cytologic test (TCT) images is essential for efficient cervical abnormality screening by computer-aided diagnosis systems. However, detection performance is affected by noisy samples in the training dataset, mainly due to subjective differences among cytologists in annotating the training samples. Besides, existing detection methods often neglect the visual feature correlations between cells, which can also be utilized to aid the detection model. In this paper, we propose a cervical abnormal cell detection method optimized by a novel distillation strategy based on local-scale consistency refinement. First, we use a vanilla RetinaNet to detect the top-K suspicious cells and extract region-of-interest (ROI) features. Then, a pre-trained Patch Correction Network (PCN) is leveraged to obtain local-scale features and further refine these suspicious cell patches. We design a classification ranking loss that utilizes the refined scores to reduce the effects of noisy labels. Furthermore, the proposed ROI-correlation consistency loss is computed between the extracted ROI features and the local-scale features to exploit correlation information and optimize RetinaNet. Our experiments demonstrate that our distillation method can greatly improve cervical abnormal cell detection without changing the detector’s network structure at the inference stage.
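
For concreteness, the two distillation losses summarized in the abstract might look roughly like the following PyTorch sketch. The hinge formulation, the cosine-similarity matrices, and all tensor shapes are assumptions informed by the reviews and rebuttal below, not the paper's exact equations:

```python
import torch.nn.functional as F

def ranking_loss(det_scores, pcn_scores, margin=0.05):
    # Hinge-style ranking loss (assumed form of the paper's Eq. (2)):
    # penalize the detector when its confidence for a suspicious cell
    # falls below the frozen PCN's refined score by more than `margin`
    # (0.05 according to the rebuttal).
    return F.relu(pcn_scores.detach() - det_scores + margin).mean()

def rcc_loss(roi_feats, patch_feats):
    # Assumed form of the ROI-correlation consistency loss: align the
    # pairwise similarity structure of the detector's ROI features
    # (e.g., N x C x 7 x 7) with that of the PCN's local-scale features
    # (e.g., N x C x 56 x 56); only the detector receives gradients.
    r = F.normalize(roi_feats.flatten(1), dim=1)
    p = F.normalize(patch_feats.flatten(1), dim=1)
    return F.mse_loss(r @ r.t(), (p @ p.t()).detach())
```

In this reading, the PCN acts as a teacher: its refined scores shape the detector's classification head via the ranking loss, and its feature correlations shape the detector's ROI features via the consistency loss, which is why nothing changes at inference time.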

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43987-2_63

SharedIt: https://rdcu.be/dnwKj

Link to the code repository

https://github.com/feimanman/Cervical-Abnormal-Cell-Detection

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper presents a cervical abnormal cell detection method optimized by a distillation strategy based on local-scale consistency refinement. The proposed distillation method is shown to greatly improve the performance of cervical abnormal cell detection without changing the detector’s network structure at the inference stage.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The authors augment RetinaNet with the PCN module, which provides refined scores and local-scale features for the extracted patches. Specifically, they propose a ranking loss that utilizes the refined scores to optimize the RetinaNet proposal classifier, reducing the impact of noisy labels.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    For the evaluation, please also provide sensitivity and specificity results.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Code is not available now.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    For the evaluation, please also provide sensitivity and specificity results.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper presents a cervical abnormal cell detection method optimized by a distillation strategy based on local-scale consistency refinement. The proposed distillation method is shown to greatly improve the performance of cervical abnormal cell detection without changing the detector’s network structure at the inference stage.

    The authors augment RetinaNet with the PCN module, which provides refined scores and local-scale features for the extracted patches. Specifically, they propose a ranking loss that utilizes the refined scores to optimize the RetinaNet proposal classifier, reducing the impact of noisy labels.

    For the evaluation, please also provide sensitivity and specificity results. In addition, a runtime analysis is also recommended.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    This paper presents a novel method for cervical abnormal cell detection in ThinPrep cytologic test images. The proposed method aims to address the high false-positive rate of existing object detection methods when applied to this problem. In the proposed approach, a RetinaNet is used as the detector and an extra classification network is used for cell-patch classification. The two contributions are: i) a ranking loss computed between the detection score from the detector and the classification score from the cell-patch classifier, which enforces the detector to output a confidence higher than that of the classifier; ii) an object-context relationship learning loss, which enforces the detector to learn a better representation of an object by taking its surrounding contextual information into account. The proposed method outperforms a few existing SOTA approaches, e.g., RetinaNet and YOLOv8.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The proposed method bears some novelty. The proposed cell-patch-to-ROI relationship learning loss is novel. This loss employs the feature similarity obtained in a classification network to improve the robustness of the feature representation in the detection network.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Some implementation details are not given. For example, what values are chosen for K, H, and W? Are H and W the same for both the ROIs and the cell patches in Eq. (7)? They do not have to be the same, but what are these numbers?
    2. It is not clear how the K proposals of each cell patch in a mini-batch are determined as input to the PCN. Since the inputs to the PCN are cell patches, the top-K proposals must also be cell patches, and they come from the same cell patch. Then Eq. (6) would be calculating the self-similarity for each cell patch and the cross-similarity within each mini-batch.
    3. A concern from the reviewer is that the motivation of the ranking loss is unclear. The inputs to the classification network are local cell patches without contextual information, while the bounding boxes predicted by the detection network are actually based on both local and contextual information. This means the problem for the detection network is easier and the problem for the classification network is harder, which implies the classification network could tend to output lower confidence values than the detection network. If so, what contribution can this ranking loss make to the training? What was observed for the confidence scores of the same positive and negative cells when the two networks, the detection network and the classification network, were trained separately?
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Given the current status, it is impossible to reproduce the results: many details are not given. For example, what values are chosen for K, H, and W? Are H and W the same for both the ROIs and the cell patches in Eq. (7)? In addition, there is no description of how the SOTA methods are trained. Are they also heavily tuned on this dataset, or are the hyper-parameters kept the same as in the original papers, where a different dataset was used?

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. Some implementation details are not given. For example, what values are chosen for K, H, and W? Are H and W the same for both the ROIs and the cell patches in Eq. (7)? They do not have to be the same, but what are these numbers?
    2. It is not clear how the K proposals of each cell patch in a mini-batch are determined as input to the PCN. Since the inputs to the PCN are cell patches, the top-K proposals must also be cell patches, and they come from the same cell patch. Then Eq. (6) would be calculating the self-similarity for each cell patch and the cross-similarity within each mini-batch (see the sketch after this list).
    3. A concern from the reviewer is that the motivation of the ranking loss is unclear. The inputs to the classification network are local cell patches without contextual information, while the bounding boxes predicted by the detection network are actually based on both local and contextual information. This means the problem for the detection network is easier and the problem for the classification network is harder, which implies the classification network could tend to output lower confidence values than the detection network. If so, what contribution can this ranking loss make to the training? What was observed for the confidence scores of the same positive and negative cells when the two networks, the detection network and the classification network, were trained separately?
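
    The block structure behind question 2 can be made concrete with a small sketch; the names, shapes, and cosine-similarity choice here are assumptions about Eq. (6), not necessarily the paper's actual definition:

    ```python
    import torch
    import torch.nn.functional as F

    # With B samples per mini-batch and K ROIs per sample, a (B*K) x (B*K)
    # similarity matrix mixes within-sample ("self") similarity in its
    # diagonal blocks and cross-sample similarity in its off-diagonal blocks.
    B, K, D = 4, 10, 256                               # illustrative sizes
    feats = F.normalize(torch.randn(B * K, D), dim=1)  # top-K ROI features
    sim = feats @ feats.t()                            # (B*K, B*K) cosine similarities
    blocks = sim.view(B, K, B, K)                      # blocks[i, :, j, :] = sample i vs. j
    self_sim = torch.stack([blocks[i, :, i, :] for i in range(B)])  # within-sample blocks
    ```
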
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    1. It is unclear how the K proposals of each cell patch in a mini-batch are determined as input to the PCN. Since the inputs to the PCN are cell patches, the top-K proposals must also be cell patches, and they come from the same cell patch. Then is Eq. (6) calculating the self-similarity for each cell patch and the cross-similarity within each mini-batch?
    2. A concern from the reviewer is that the motivation of the ranking loss is unclear. The inputs to the classification network are local cell patches without contextual information, while the bounding boxes predicted by the detection network are actually based on both local and contextual information. This means the problem for the detection network is easier and the problem for the classification network is harder, which implies the classification network could tend to output lower confidence scores. If so, what contribution can this ranking loss make to the training? Which network has higher confidence for the same cells when trained separately?
  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    This paper bears some novelty, but some details of the design are still counterintuitive; e.g., the authors claim that the confidence of the detection network is lower than that of the classification network, even though the classification network has observed less contextual information. The authors did not address the reviewer’s concerns directly, so the original rating is maintained: weak accept.



Review #3

  • Please describe the contribution of the paper

    This study proposes a framework for the automated detection of cervical abnormal cells from ThinPrep cytologic test (TCT) images. The framework integrates RetinaNet for initial prediction, a Patch Correction Network (PCN) for further refinement of the predictions, and ROI-Correlation Consistency (RCC) learning to further regularize the training procedure. The framework shows superior performance on cervical abnormal cell detection compared to other state-of-the-art techniques.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The proposed framework combining RetinaNet, PCN, and RCC for cervical abnormal cell detection is novel. The clinical implications of the model are also relatively impactful. Technical details of each component in the framework are overall clearly described, enabling a high reproducibility potential.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The writing and organization of the paper make the method section less easy to follow. For example, the term “distillation” is mentioned in the introduction and conclusion but not explained or described in the method section. The terms “confidence score” and “detection score” are used interchangeably throughout the paper, creating some unnecessary confusion. I would also suggest integrating Section 2.1 with Section 2.2 to make the PCN more comprehensible for readers. The collection process and quality control of the ground-truth data are not described. This information could be important, given that one of the major issues mentioned in the paper is noisy labels (ground truth) due to the subjectivity of visual inspection by cytopathologists. Some important technical details are missing: what is the value of K (the top-K patches are selected and fed into the PCN)? How is this number controlled so that a good balance between false positives and false negatives is achieved? How is the value of the margin in Eq. (2) set?

  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Note that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Relatively good, given that the technical details are provided.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    In Section 2.1, “We adopt SE-ResNext-50 [8] as the RCN”: is RCN a typo for PCN? Also, please provide the number of WSIs in the experimental results section. Figure 1 can be further improved to help the reader better understand the method. For example, the panels are not properly labeled, and the figure does not show that the top-K patches are selected based on RetinaNet. Currently, the loss components (e.g., RCC, Lrank) in the final loss function seem disconnected; using arrows to show that all these components are used together as the loss that regulates the training process might be helpful.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The model in the study is novel, but the writing and organization of the paper are relatively less satisfactory.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The paper presents a method for detecting abnormal cells from cervical images. The method is mainly designed to address noisy labels. Evaluation is conducted on a private dataset. The reviewers recognize the method’s novelty, but also raised various questions, particularly about method and dataset details; these are important to ensure the validity of the experimental results. In addition, the method’s performance has only been compared with standard object detection models; more domain-specific state-of-the-art methods should be included in the comparison.




Author Feedback

We thank all reviewers and the AC for the insightful comments. Our responses to the major concerns are itemized as follows; we will also revise the final version accordingly.

1. What values are chosen for K, H, and W? Are H and W the same for both the ROIs and the cell patches in Eq. (7)? (R2, R3) We clarify that H and W are not the same for the ROIs and the cell patches in Eq. (7): we set H=56, W=56 for the cell-patch features and H=7, W=7 for the ROIs. As for K, our experiments showed that detection performance is best when K=10, which strikes a good balance between false positives and false negatives. Additionally, we set the value of the margin in Eq. (2) to 0.05 (as raised by R3).

2. How are the K proposals of each cell patch in a mini-batch determined as input to the PCN? (R2) Given an input mini-batch with B samples, where B denotes the batch size, each sample undergoes the ROI Align layer to obtain the top-K ROIs (3rd paragraph, Page 5). The top-K proposals come from the same cell patches.

3. How are the SOTA methods trained? (R2) We ran the SOTA methods on our own dataset, with their hyper-parameters tuned to obtain their best performance.

4. What is the motivation of the ranking loss? (R2) R2 wondered whether the classification network would output lower confidence values than the detection network, which would undermine the design of the ranking loss, i.e., guiding the detection network by the classification network. However, due to the intrinsic architectural limitations of the detector, its classifier is often less discriminative. From our observations, the classification network outputs higher confidence values than the detection network when the two are trained separately. We therefore pre-trained the classification network to classify positive and negative cell patches. During training, only the weights of the detection network are optimized, while the classification network remains frozen. The ranking loss pushes the detection network to generate more confident predictions according to the confidence of the classification network, thereby suppressing false positives and enabling the detection network to better distinguish between positive and negative cells.

5. The collection process and quality control of the ground-truth data are not described. (R3, AC) The collection and quality control of our private dataset follow a standard protocol involving three pathologists, A, B, and C, for manual annotation. A had 33 years of experience in reading cervical cytology images, while B and C each had 10 years of experience. First, the images were randomly assigned to either B or C for initial labeling and then reviewed by the other reader for verification. After that, A checked their discrepancies and re-labeled such images. However, some noisy labels may still remain due to false-positive/negative errors made by both B and C.

6. More domain-specific state-of-the-art methods should be included in the comparison. (AC) As suggested by the AC, we compared our proposed method with the one proposed by Liang et al. [1] in the field of cervical abnormal cell detection, running their method on our own dataset. The resulting average precision (AP), AP.5, AP.75, and average recall (AR) values were 44.6, 77.5, 47.7, and 60.0, respectively, all inferior to our proposed method.

[1] Liang Y, Feng S, Liu Q, et al. Exploring contextual relationships for cervical abnormal cell detection. IEEE Journal of Biomedical and Health Informatics, 2023.

7. Typos/punctuation errors (R1, R2, R3). We have proofread the text and will correct these in the final version. (1) We will explain “distillation” in Section 2 and provide the number of WSIs in Section 3. (2) In Section 2.1, “We adopt SE-ResNext-50 [8] as the RCN”, RCN is a typo for PCN. (3) We will modify Figure 1 according to R3’s suggestions.
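
For reference, a minimal sketch of the training setup described in responses 1, 2, and 4 (pre-trained and frozen PCN, top K=10 proposals, margin 0.05) might look as follows, reusing the loss sketches given after the abstract. The detector's return signature and the `crop_patches` helper are illustrative assumptions, not the authors' released code:

```python
import torch

# Freeze the pre-trained Patch Correction Network; only the detector
# (RetinaNet) is optimized, as stated in response 4.
pcn.eval()
for p in pcn.parameters():
    p.requires_grad_(False)

for images, targets in loader:
    # Hypothetical detector interface: detection loss, top-K=10 proposal
    # scores, pooled ROI features (7x7), and proposal boxes.
    det_loss, det_scores, roi_feats, boxes = detector(images, targets, topk=10)
    patches = crop_patches(images, boxes)       # hypothetical helper
    with torch.no_grad():
        pcn_scores, patch_feats = pcn(patches)  # refined scores, 56x56 features
    loss = (det_loss
            + ranking_loss(det_scores, pcn_scores, margin=0.05)
            + rcc_loss(roi_feats, patch_feats))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Because both distillation losses act only on the detector's existing heads and features, the PCN can be discarded after training, consistent with the paper's claim that the inference-time network structure is unchanged.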




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The rebuttal has provided relatively satisfactory responses. The final version should be revised to include a more detailed dataset description, clearer design motivations, and more experimental results.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This is a borderline case among my rebuttal papers. With one accept and two weak accepts, I would recommend accepting this paper.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors present a cell detection method that relies on an object detection network and the reduction of noisy labels. The authors addressed most of the major concerns raised by the reviewers. However, the paper can be further improved by clarifying the motivation of the design choices.


