
Authors

Chen Shen, Jun Zhang, Xinggong Liang, Zeyi Hao, Kehan Li, Fan Wang, Zhenyuan Wang, Chunfeng Lian

Abstract

Forensic pathology is critical for analyzing the manner and time of death from the microscopic aspect, assisting in the establishment of reliable factual bases for criminal investigation. In practice, even the manual differentiation between different postmortem organ tissues is challenging and relies on expertise, considering that changes like putrefaction and autolysis can significantly alter typical histopathological appearance. Developing AI-based computational pathology techniques to assist forensic pathologists is practically meaningful, which requires reliable discriminative representation learning to capture tissues’ fine-grained postmortem patterns. To this end, we propose a framework called FPath, in which a dedicated self-supervised contrastive learning strategy and a context-aware multiple-instance learning (MIL) block are designed to learn discriminative representations from postmortem histopathological images acquired at varying magnification scales. Our self-supervised learning step leverages multiple complementary contrastive losses and regularization terms to train a double-tier backbone for fine-grained and informative patch/instance embedding. Thereafter, the context-aware MIL adaptively distills from the local instances a holistic bag/image-level representation for the recognition task. On a large-scale database of 19,607 experimental rat postmortem images and 3,378 real-world human decedent images, our FPath led to state-of-the-art accuracy and reliable cross-domain generalization in recognizing seven different postmortem tissues.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43987-2_51

SharedIt: https://rdcu.be/dnwJ6

Link to the code repository

https://github.com/ladderlab-xjtu/forensic_pathology

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    Forensic pathology plays a crucial role in determining the cause and time of death by analyzing microscopic aspects, which can provide reliable factual bases for criminal investigations. In this paper, the authors introduce a novel framework called FPath, which combines a dedicated self-supervised contrastive learning strategy with a context-aware multiple-instance learning (MIL) block to learn discriminative representations from postmortem histopathological images acquired at different magnification scales. The effectiveness and generalization of FPath are evaluated on two collected datasets.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. This paper is well-organized and easy to follow.
    2. It represents the first attempt to apply advanced AI techniques to forensic pathology with promising results.
    3. The study includes a sufficient ablation analysis and visualizations.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1. What are the technical difficulties in forensic pathology recognition compared with conventional pathology? The authors claim that postmortem changes make the manual differentiation between tissues of different organs very difficult; however, references and visual comparisons are lacking. Furthermore, the experimental results suggest this is a relatively simple classification task: the conventional pathological analysis method (ABMIL, 2018) also shows promising results.
    2. The motivation for extracting patch feature representations via self-supervised learning is unclear. How does it compare with an ImageNet-pretrained ResNet50?
    3. The authors compared against self-supervised methods designed for natural images to show the superiority of their strategy, which lacks persuasiveness. Comparison with self-supervised models from the pathology domain is needed, such as HIPT [1], RetCCL [2], etc.
    4. The margin between the proposed MIL and ABMIL (2018) is low, so significance tests are required. Moreover, there is a lack of comparison with advanced MIL methods, such as DSMIL [3], TransMIL [4], DTFD-MIL [5], etc.

    References:
    [1] Chen R J, Chen C, Li Y, et al. Scaling vision transformers to gigapixel images via hierarchical self-supervised learning[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022: 16144-16155.
    [2] Wang X, Du Y, Yang S, et al. RetCCL: Clustering-guided contrastive learning for whole-slide image retrieval[J]. Medical Image Analysis, 2023, 83: 102645.
    [3] Li B, Li Y, Eliceiri K W. Dual-stream multiple instance learning network for whole slide image classification with self-supervised contrastive learning[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021: 14318-14328.
    [4] Shao Z, Bian H, Chen Y, et al. TransMIL: Transformer based correlated multiple instance learning for whole slide image classification[J]. Advances in Neural Information Processing Systems, 2021, 34: 2136-2147.
    [5] Zhang H, Meng Y, Zhao Y, et al. DTFD-MIL: Double-tier feature distillation multiple instance learning for histopathology whole slide image classification[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022: 18802-18812.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The reproducibility of the paper is fine.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Please refer to the strengths and weaknesses above.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The novelty of the method and the experiments.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    This paper proposes a deep computational pathology framework (dubbed as FPath) for forensic histopathological analysis. The paper established a relatively large-scale multi-domain database consisting of an experimental rat postmortem dataset and a real-world human decedent dataset. The result shows FPath achieves state-of-the-art accuracy in recognizing seven different postmortem organs.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The motivation of this paper is clear.
    2. A relatively large-scale multi-domain database is constructed.
    3. A thorough experimental comparison with related work approaches. The approach is shown to outperform others.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Lack of clarity in the description of Eqs. 1 and 2. Why do they use the same integer (i.e., 2) in both? Why not 1, 3, or a non-integer? Could they be different hyperparameters?
    2. The image categories of the human forensic histopathology dataset are not provided. Are they the same as the seven categories of the rat postmortem histopathology dataset? This should be stated explicitly rather than left for the reader to guess.
    3. Lack of repeated linear-classification experiments to account for the variance introduced by random parameter initialization.
    4. This paper designs two modules with additional parameters, i.e., a self-supervised contrastive learning strategy and a context-aware multiple-instance learning (MIL) block. It is unclear how much of the gain actually comes from the novel designs versus other factors (variability/data splits, increased number of parameters).
    5. The rats and humans in the datasets are deceased, so the clinical significance of forensic histopathological recognition is unclear. While the method yields SOTA performance, it does not contribute to disease treatment or rehabilitation.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    While the authors claim the source code will be publicly released on GitHub, the dataset is not provided. The reproducibility of the paper is therefore limited.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Please read the weaknesses section.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper proposes a deep computational pathology framework (dubbed as FPath) for forensic histopathological analysis. A thorough experimental comparison shows the approach outperforms others. But the clinical significance is weak. I will adequately consider other reviewers’ opinions to fully evaluate the innovativeness. Thus, I recommend weak accept.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The contributions include: (1) a double-tier backbone combining ResNet50 and Swin Transformer, applied for patch-level feature embedding; (2) self-supervised contrastive learning, employed to train the double-tier backbone; (3) a context-aware MIL approach, designed to recognize postmortem organ tissues.
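For intuition, the core of an attention-based MIL block such as component (3) can be sketched as follows. This is a generic, framework-free illustration of attention pooling over patch embeddings, not the paper's exact context-aware design; the embeddings and scoring vector below are made-up values:

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of attention logits."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(instances, w):
    """Aggregate patch/instance embeddings into one bag-level embedding.

    instances: list of d-dim feature vectors (one per patch).
    w: d-dim scoring vector producing one attention logit per instance.
    """
    logits = [sum(wi * xi for wi, xi in zip(w, x)) for x in instances]
    attn = softmax(logits)
    d = len(instances[0])
    # Weighted sum of instance features -> one d-dim bag representation.
    return [sum(a * x[j] for a, x in zip(attn, instances)) for j in range(d)]

# Three hypothetical 4-dim patch embeddings and a hypothetical scoring vector.
patches = [[0.1, 0.9, 0.0, 0.2],
           [0.8, 0.1, 0.3, 0.0],
           [0.2, 0.2, 0.2, 0.2]]
w = [1.0, -0.5, 0.3, 0.0]
bag = attention_pool(patches, w)
print(len(bag))  # the pooled embedding keeps the instance dimensionality
```

In practice, the scoring vector is learned, and methods in this family refine the instance features (e.g., with self-attention) before pooling; the sketch only shows the aggregation step.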

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    There are a couple of strengths in this paper. First, the authors leveraged self-supervised contrastive learning to establish the double-tier feature-extraction backbone, where several loss and regularization terms are adopted in training the model. Second, the context-aware MIL, implemented based on a multi-head self-attention mechanism, is adopted to perform postmortem tissue recognition.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    There are several weaknesses. (1) Section 2.2 on the context-aware MIL is not well explained; it is hard to get the detailed operations of the method from this section and the corresponding Fig. 1. (2) The dataset is not explained clearly. The authors have not provided the dimensions of the histological image patches used in the study. It is also unclear why patches at different magnifications, e.g., 5x, 10x, 20x, and 40x, are mixed together for classification; histological patches at different magnifications contain different levels of information, which may need to be considered separately.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The reproducibility is intermediate.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    (1) The description of the context-aware MIL is not clear; more details should be provided. (2) The dataset should be explained more clearly, e.g., the dimensions of the image patches. Why are patches of multiple magnifications mixed together for classification? (3) What is the full name of the evaluation metric MCC?

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The experimental results include thorough comparisons, and the proposed method looks novel to some degree. However, the dataset used is not well described, so it is hard to judge how convincing it is.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper presents a classification method for postmortem histopathology images based mainly on self-supervised learning. Two datasets, from rats and humans, are collected for experimental evaluation. Results show relatively good performance. There is, however, a lack of comparison with state-of-the-art methods designed specifically for histopathology images, and the design motivation needs justification. The dataset description needs to be much more detailed.




Author Feedback

We thank the reviewers for their thorough reviews and appreciate their affirmation of our contributions. The main concerns are addressed below:

*[AC & R1] Comparison to histopathological SOTAs: 1) In Tab. 1, we did compare our SSL with multiple pathology-specific SOTAs, e.g., [19, 20]. As requested, we have further included HIPT and RetCCL for comparison. On the rat dataset, the ACCs obtained by them are 0.7281 and 0.9794, respectively, both lower than ours in Tab. 1 (0.9831). 2) We have also further compared our context-aware MIL with more SOTAs (DSMIL & TransMIL). On the challenging human dataset, their ACCs are 0.9176 & 0.8824, respectively, both lower than ours in Tab. 3 (0.9229). These results further support the efficacy of our method.

*[R1] Our SSL vs. pretrained ResNet50: Accordingly, on the human dataset, we have further compared our SSL+AB with ResNet50+AB. The ACC of the latter approach only reaches 0.7718, which is lower than ours in Tab. 3 (0.9011). These results help justify the significance of our SSL.

*[R1] The margin between ABMIL & our MIL: 1) Notably, the good performance of ABMIL in Tab. 3 is mainly driven by our SSL-based instance embedding. As discussed above, when coupled with a pretrained ResNet50, ABMIL drops significantly. 2) We have accordingly conducted a paired t-test. Results show that, based on our SSL, the difference between our MIL and ABMIL is statistically significant (p-values < 0.01 for all metrics).

*[AC, R2, & R1] Motivation & significance of our method designs:

  1. Practical significance: As discussed, rather than clinical applications, we focus on forensic pathology, a practically significant task in criminal investigations. We believe it is as meaningful as clinical pathology, yet it has received far fewer studies.
  2. Motivation of our SSL: Our SSL is dedicated to capturing fine-grained discriminative patterns from forensic pathological images for accurate postmortem recognition. Its efficacy & advantages are justified by the detailed comparisons & ablations presented in the paper & the above responses. References, e.g., (Wu et al., SAA 2022), do exist to demonstrate the challenges in forensic histopathology due to postmortem changes. We will accordingly include more references & visualizations for better explanation.
  3. Method gains vs. other factors: Notably, our SSL did not introduce additional parameters but trained a more powerful patch-embedding backbone. Also, the number of parameters & FLOPs of our MIL is relatively small (0.59 M & 14.21 M), considering that a microscopic image has far fewer patches than a WSI. The improvements from our designs are justified by the experiments & ablations in the paper & the above responses.

*[AC, R3, & R2] More method, implementation, & data details: We will update the paper to describe the details below more clearly:
  1. Context-aware MIL: Given the patch embeddings, our MIL part contains two main steps, i.e., a multi-head self-attention (MSA) step to refine each patch's features and an adaptive pooling step to aggregate information across all patches.
  2. Implementation: 1) The integer 2 in Eqs. 1 & 2 simply normalizes their values between 0 and 1; the weights of these two terms were empirically set to 1. 2) To ensure reproducible comparisons, we generally used the same random seed in all experiments. We have further conducted 5 runs with random initializations, and our results are stable (ACC = 0.922 ± 0.002). 3) The metric MCC denotes the Matthews Correlation Coefficient.
  3. Dataset: 1) The rat and human datasets share the same seven categories. 2) The patch dimension in our implementation was 224×224. 3) Learning with multi-scale patches is driven by the practical needs of microscopy analyses. In forensic practice, microscopes are used much more commonly than WSI scanners; a practitioner may evaluate a target with varying fields of view, which calls for an efficient & accurate recognition model that generalizes across scales. 4) We will release the rat dataset to reproduce all results, while releasing the human dataset needs more work.
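For concreteness, the MCC metric and the paired t-test mentioned above can be sketched in plain Python. This is a minimal illustration of the standard definitions only; all numeric inputs below are hypothetical and are not the paper's results:

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient for a binary confusion matrix.

    A seven-class setting would use the multiclass generalization,
    but the binary form shows the idea.
    """
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0

def paired_t_statistic(a, b):
    """Paired t-test statistic for two matched samples, e.g., per-split
    accuracies of two methods evaluated on identical data splits."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # Bessel-corrected
    return mean / math.sqrt(var / n)

# Hypothetical confusion-matrix counts: 90 TP, 85 TN, 5 FP, 10 FN.
print(round(mcc(90, 85, 5, 10), 4))

# Hypothetical per-split accuracies for two methods on the same splits.
ours = [0.923, 0.921, 0.924, 0.920, 0.922]
baseline = [0.915, 0.913, 0.917, 0.912, 0.914]
t = paired_t_statistic(ours, baseline)
# With n - 1 = 4 degrees of freedom, |t| > 4.604 implies p < 0.01 (two-sided).
print(abs(t) > 4.604)
```

In practice one would look up (or compute) the exact p-value from the t-distribution with n - 1 degrees of freedom rather than hard-coding a critical value.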




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The rebuttal has provided more experimental results. Overall, this paper presents a good contribution, although the writing and presentation need to be improved. The authors should revise the paper following the reviewers’ comments.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper introduces a novel framework called FPath, which combines a dedicated self-supervised contrastive learning strategy with context-aware multiple-instance learning.

    During the first round of review, the reviewers appreciated the well-organized, high-quality writing, the use of a large-scale dataset that includes both human and rat data, and the proposed methodology. However, they also raised concerns about the lack of comparison with state-of-the-art methods specifically designed for histopathology images and the incremental performance gain achieved. The authors responded with a comprehensive rebuttal, summarizing and addressing these concerns. As a result, the paper received two positive reviews and one negative review.

    I believe the paper demonstrates high-quality writing and presents a rigorous evaluation. However, my main concern is the lack of clear clinical motivation in the paper. It is not evident when the tissue classification task proposed in the paper would be necessary for either human or mouse cases. The tissue classification task seems to be pursued solely for the purpose of utilizing as much data as possible, rather than being driven by a clinically intrinsic application. A more clinically relevant application would greatly enhance the value of the work. In clinical practice, when performing invasive biopsies, we typically already know the tissue type.

    Considering this, my recommendation leans towards rejection.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    There was no further discussion among the reviewers after the authors’ responses. In my opinion, techniques that use contrastive learning followed by MIL (multiple instance learning) have been employed previously, so there seems to be a lack of novelty. Additionally, the performance does not differ significantly from existing methods, although the authors diligently presented results in the rebuttal addressing the issues pointed out by R1. The technique appears to be a general MIL method that does not necessarily require application in forensic pathology. To assess whether the proposed technique is superior to state-of-the-art methods, experiments on open datasets are needed; comparisons with existing techniques are challenging in this paper because both datasets used are private.


