
Authors

Adrian Galdran, Katherine J. Hewitt, Narmin Ghaffari Laleh, Jakob N. Kather, Gustavo Carneiro, Miguel A. González Ballester

Abstract

Tissue typology annotation in Whole Slide histological images is a complex and tedious, yet necessary, task for the development of computational pathology models. We propose to address this problem by applying Open Set Recognition techniques to the task of jointly classifying tissue that belongs to a set of annotated classes, e.g. clinically relevant tissue categories, while rejecting at test time Open Set samples, i.e. images that belong to categories not present in the training set. To this end, we introduce a new approach for Open Set histopathological image recognition based on training a model to accurately identify image categories and simultaneously predict which data augmentation transform has been applied. At test time, we measure model confidence in predicting this transform, which we expect to be lower for images in the Open Set. We carry out comprehensive experiments in the context of colorectal cancer assessment from histological images, which provide evidence of the strengths of our approach to automatically identifying samples from unknown categories. Code is released at https://github.com/——–/t3po.
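
A minimal sketch of the test-time idea follows, assuming a PyTorch model with two output heads (tissue class and transform id); `open_set_score` and `IDENTITY_IDX` are illustrative names, not the authors' actual API.

```python
# Minimal sketch (illustrative, not the authors' code): use the
# transform-prediction head's confidence as an open-set score.
import torch
import torch.nn.functional as F

IDENTITY_IDX = 0  # hypothetical index of the identity transform


@torch.no_grad()
def open_set_score(model, image):
    """Score one untransformed test image; lower = more likely Open Set.

    `model` is assumed to return (class_logits, transform_logits).
    Since no augmentation is applied at test time, a confident model
    should assign high probability to the identity transform.
    """
    _, transform_logits = model(image.unsqueeze(0))
    transform_probs = F.softmax(transform_logits, dim=1)
    return transform_probs[0, IDENTITY_IDX].item()
```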

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16434-7_26

SharedIt: https://rdcu.be/cVRrI

Link to the code repository

https://github.com/agaldran/t3po.git

Link to the dataset(s)

https://zenodo.org/record/53169#.Yr1vGNLMLMU

https://zenodo.org/record/1214456#.Yr1vTNLMLMU


Reviews

Review #1

  • Please describe the contribution of the paper

    The focus of this paper is to develop a method that can identify clinically relevant patches present in the training set (closed set) while rejecting irrelevant ones not present in the training set (open set). To this end, a model is trained on two tasks: one to predict the class and the other to predict the colour transform applied to the input image. During inference, the input patch is processed through the model without any transform. The confidence score for predicting the transform is used to distinguish between the open and closed sets, since it tends to be lower for the open set. The proposed model is evaluated on two colorectal datasets and is shown to perform better than other techniques.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Proposed a simple yet novel self-supervised way of filtering out the irrelevant patches.
    • The source code will be made public with the publication.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • There are some points that need clarification.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
    • The source code will be made public with the publication.
    • The authors have used public datasets for their experiments.
  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    • Ablation study: train a simple model with the same data split as in the paper. The training data would comprise all classes in the closed set, in addition to the open set as a single class. Comparison with this experiment would show the benefit of the proposed approach.
    • Tables 1 and 2: Add columns for average ACC and AUC over all three splits.
    • It is not clear which loss function was used for classification and transform prediction.
    • What is the form of ground truth labels and expected network output for the transform prediction task?
    • Was thresholding applied to softmax probability to decide whether the input image belongs to an open set or closed set? If so, what was the value and how was it obtained?
    • Do you think the trained model will generalize well to the unseen dataset from another domain?
    • What do values in parentheses mean in Tables 1 and 2?
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I would recommend accepting this paper if an ablation study is provided to justify the benefit of using the color transform.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    3

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered



Review #2

  • Please describe the contribution of the paper

    In this study, the authors proposed a new approach for Open Set histopathological image recognition based on training a model to accurately identify image categories and simultaneously predict which decoupled color-appearance data augmentation has been applied.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The authors proposed a decoupled color-appearance data augmentation strategy and a test-time transform prediction model.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    This work lacks an investigation of the performance of the appearance transformation stage.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The reproducibility of this work seems reasonable, but should be further validated on more datasets.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    1. The Abstract should address more of the novelties of the proposed model and give detailed results, to better attract readers.
    2. With respect to the transform space in T3PO, seven appearance transforms are used (i.e., Identity, Brightness, Contrast, Saturation, Hue, Gamma, Sharpness). How do these different appearance transforms affect the performance of T3PO? Would generating more appearance transforms improve its performance? (An illustrative sketch of this transform space is given after this list.)
    3. The citation format of Reference 3 should delete “conference Name: IEEE Transactions on Pattern Analysis and Machine Intelligence”.
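
A minimal sketch of the seven-way appearance transform space referenced above, using torchvision functional ops; the fixed magnitudes are placeholders for illustration, not the values sampled in the paper.

```python
# Illustrative sketch: the seven appearance transforms and the
# self-supervised label consumed by the transform-prediction head.
# Magnitudes are placeholders, not the paper's sampled values.
import random
import torchvision.transforms.functional as TF

APPEARANCE_TRANSFORMS = [
    ("identity",   lambda img: img),
    ("brightness", lambda img: TF.adjust_brightness(img, 1.5)),
    ("contrast",   lambda img: TF.adjust_contrast(img, 1.5)),
    ("saturation", lambda img: TF.adjust_saturation(img, 1.5)),
    ("hue",        lambda img: TF.adjust_hue(img, 0.1)),
    ("gamma",      lambda img: TF.adjust_gamma(img, 0.8)),
    ("sharpness",  lambda img: TF.adjust_sharpness(img, 2.0)),
]

def augment_with_transform_label(img):
    """Apply one randomly chosen appearance transform and return the
    transformed image plus the transform index (the prediction target)."""
    idx = random.randrange(len(APPEARANCE_TRANSFORMS))
    return APPEARANCE_TRANSFORMS[idx][1](img), idx
```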

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This research work seems reasonable, but should be further validated by additional experiments.

  • Number of papers in your stack

    4

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    6

  • [Post rebuttal] Please justify your decision

    The authors have addressed the raised comments.



Review #4

  • Please describe the contribution of the paper

    The paper presents a methodology to detect Out-of-Distribution data in addition to classifying “known” regions of histopathology images.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper is clearly written and easy to follow. It offers a good depth of background information, rationale for approach, and contains valuable ablation studies and comparisons.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    While the paper is meant to provide output of detailed classification of known regions, this specific performance is not well analyzed. In addition, the performance improvements with respect to the other evaluated methods seem minimal. Therefore, the added value of the approach (in technical terms) is unclear.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    I feel the authors present enough information on the datasets and algorithm to allow for reproducibility of the paper. If they indeed release working software, the impact of the paper will increase.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    I would have liked to see a more detailed analysis of the classification outputs for the “known” image regions, especially in comparison with the state of the art. The reported metrics are a good aggregate measure of performance, but in detailed images such as histopathology ones, it is usually the rare classes / limited pixels that contribute the most to semantic understanding.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Please see strengths and weaknesses above.

  • Number of papers in your stack

    3

  • What is the ranking of this paper in your review stack?

    2

  • Reviewer confidence

    Somewhat Confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    Having read the reviews and the response of the authors I maintain my borderline acceptance. The suggestion of R1 to provide an upper bound for the performance is a sensible one and I cannot see why it cannot be fully reported.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper introduces a novel approach to analyze Open Set histopathological images and categorize them.

    The paper is well written. The authors perform studies on two histological datasets that corroborate their claims. In its current state the paper is borderline.

    The reviewers made good suggestions regarding methods that could improve the paper. In particular, see the reviewers’ questions on the choices made in the algorithm design, and the proposal to perform an ablation study to corroborate those choices.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    4




Author Feedback

We are pleased that all three reviewers found our technique interesting and potentially worthy of presentation at MICCAI. The main critiques we received were about the experiments, where further evidence of the advantage of our approach was requested by R1, and some clarifications on performance analysis in the closed set were mentioned by R4. R2 also asked about our selection of colour transforms and its possible impact on our method. We address these and other comments point by point below.

  • The experiment proposed by R1 consists of labelling all examples in the open set with a new category (O), and training a model on fully-labelled data from classes {K1,…,Kn, O}. Note, however, that this defeats the purpose of Open Set Recognition, where we attempt to tell apart images from O without having to label them and use them for training. As such, rather than an ablation experiment, this model would represent an ideal upper bound on the maximum Closed/Open Set separability we could achieve for each split if we were to invest effort in annotating the Open Set. We carried out the experiments and found a high AUC for this problem (in the high 90s), but again, the purpose of our work is to avoid labelling categories other than the ones of interest, so we believe this experiment is somewhat out of the scope of the paper.

  • The main concern of R2 was about our choice of colour/appearance transforms. We adopted a recently proposed [17] data augmentation pipeline that is simple but extremely powerful. The central finding of that work is that such a small set of transforms is enough to reach top performance in image classification, and we stick to it for that reason, without further experimentation on expanding the transform set. Regarding using fewer transforms, we would expect that reducing the number of colour transforms would increase the risk of overfitting the model on the closed set, since it would result in less diverse data augmentation during training.

  • R4 asks about the accuracy on the “known” Closed Set, particularly for the rare classes. In this paper we deal with datasets in which class frequencies are balanced, and for this reason accuracy is a meaningful metric. Although it must be admitted that this is a slightly artificial setup (no rare classes), we believe it serves as a reasonable proof of concept of our technique. In addition, expert pathologists designed our closed/open splits in such a way that interesting classes (e.g. tumoral tissue) are always in the closed set, whereas less relevant classes belong to the open set. This speaks to our models’ ability to perform well at identifying rare and relevant classes.

  • Other comments:
  • Add columns for average ACC and AUC -> these will be added.
  • It is not clear which loss function was used for classification and transform prediction. What is the form of ground truth labels and expected network output for the transform prediction task? -> During training, aside from learning tissue-type classification, we also train a multi-class classifier to predict which image transform was applied to each training image. In both cases we minimise the standard cross-entropy loss (a minimal sketch of this joint objective follows this list).
  • Was thresholding applied to softmax probability to decide whether the input image belongs to an open set or closed set? -> No, it wasn’t; it is standard practice in the OSR literature to report only AUC, although we agree with R1 that it would be useful to use a hard threshold to compute open-set classification accuracy. We leave this for future work.
  • Do you think the trained model will generalise well to the unseen dataset from another domain? -> We believe that it should generalise well to other open set images. However, generalisation in the closed set would be a domain adaptation problem, which depends a lot on the gap between domains.
  • The Abstract should […] detailed results for attracting the readers -> the abstract will be improved based on the reviewer’s suggestions.
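
For concreteness, here is a minimal sketch of the joint objective described above, assuming a two-head network; the unweighted sum and all names are illustrative, not the authors' exact formulation.

```python
# Minimal sketch of the joint training objective: standard cross-entropy
# on both heads. An unweighted sum is assumed here for illustration.
import torch.nn.functional as F

def joint_loss(class_logits, transform_logits, class_targets, transform_targets):
    loss_cls = F.cross_entropy(class_logits, class_targets)         # tissue type
    loss_tf = F.cross_entropy(transform_logits, transform_targets)  # transform id
    return loss_cls + loss_tf
```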




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors addressed most of the reviewers’ concerns well. This is a good paper, and after incorporation of the reviewers’ comments it can become an excellent one.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    3



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The paper proposes a simple, novel, self-supervised way of filtering out irrelevant data. All reviewers except one recommend acceptance. The one reviewer who recommended borderline reject did not respond to the rebuttal.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    7



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper proposes a method based on “open sets”, used to filter out irrelevant data when training a classifier. The reviewers remain mixed on this paper after the rebuttal, and it comes out as rather borderline. Compared to the remaining papers in my batch, I tend toward rejection.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Reject

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    7


