Authors

Yiqing Shen, Yulin Luo, Dinggang Shen, Jing Ke

Abstract

Stain variations often decrease the generalization ability of deep learning based approaches in digital histopathology analysis. Two separate proposals, namely stain normalization (SN) and stain augmentation (SA), have been spotlighted to reduce the generalization error, where the former alleviates the stain shift across different medical centers using a template image and the latter enriches the accessible stain styles by simulating more stain variations. However, their applications are bounded by the selection of template images and the construction of unrealistic styles. To address these problems, we unify SN and SA with a novel RandStainNA scheme, which constrains variable stain styles to a practicable range to train a stain-agnostic deep learning model. RandStainNA is applicable to stain normalization in a collection of color spaces, i.e., HED, HSV, and LAB. Additionally, we propose a random color space selection scheme to gain extra performance improvement. We evaluate our method on two diagnostic tasks, i.e., tissue subtype classification and nuclei segmentation, with various network backbones. The performance superiority over both SA and SN shows that the proposed RandStainNA can consistently improve the generalization ability, so that our models can cope with more incoming clinical datasets with unpredicted stain styles. The code is available at https://github.com/yiqings/RandStainNA.
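As a rough illustration of the idea in the abstract, the sketch below draws a random "virtual template" (a target mean and standard deviation per color channel) and shifts one image channel toward it. It is not the authors' implementation: the function names, the dictionary keys, the use of plain normal distributions for the target statistics, and the `abs()` guard on the sampled standard deviation are all assumptions made for this example. The image is assumed to be already converted to the chosen color space and flattened to one list of floats per channel.

```python
import random
import statistics


def sample_virtual_template(style_stats, rng):
    """Draw one virtual template: a (target_mean, target_std) pair per channel.

    style_stats maps a channel name to pre-estimated dataset statistics with
    the hypothetical keys 'mean_of_means', 'std_of_means', 'mean_of_stds'
    and 'std_of_stds'. Three channels give six sampled parameters in total.
    """
    template = {}
    for channel, s in style_stats.items():
        target_mean = rng.gauss(s["mean_of_means"], s["std_of_means"])
        # abs() keeps the sampled standard deviation non-negative (assumption).
        target_std = abs(rng.gauss(s["mean_of_stds"], s["std_of_stds"]))
        template[channel] = (target_mean, target_std)
    return template


def match_channel(values, target_mean, target_std, eps=1e-6):
    """Shift and rescale one channel so its mean/std match the target."""
    src_mean = statistics.fmean(values)
    src_std = statistics.pstdev(values)
    scale = target_std / (src_std + eps)  # eps avoids division by zero
    return [(v - src_mean) * scale + target_mean for v in values]
```

Calling `sample_virtual_template` once per training image (or per epoch) and applying `match_channel` to each channel would give every image a different but bounded stain style, which is the behavior the abstract describes at a high level.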

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16434-7_21

SharedIt: https://rdcu.be/cVRrD

Link to the code repository

https://github.com/yiqings/RandStainNA

Link to the dataset(s)

N/A


Reviews

Review #2

  • Please describe the contribution of the paper

    The authors propose a pre-processing method to perform joint stain augmentation and stain normalization (SA & SN) in computational pathology. The SN process generates color templates using the averages and standard deviations of LAB-space intensities. The SA generates synthetic images only within the ranges of the generated SN templates. The approach is evaluated on the downstream tasks of colorectal cancer image classification and nuclei segmentation, outperforming (outdated) SA and SN approaches applied separately.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Combination of SA and SN is a great way to augment the dataset size and train more robust deep learning networks in computational pathology. RandStainNA has a simple yet effective manner to combine both.

    • The evaluation is done in two standard open-access computational pathology datasets for colorectal cancer classification and nuclei segmentation.

    • The paper is well written and easy to follow.

    • The proposed method is evaluated with many recent backbone DL architectures which highlights the superior performance of RandStainNA to the baselines.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The combination of SA and SN is not novel; it has already been presented in the past in combination with more recent techniques (domain adversarial learning) and more pathology-informed color space extraction (stain absorption matrices), see for example [1,2,3].

    • The method is evaluated at the patch level, and thus computing average and standard deviation is fast, but I don’t see this method scaling well at the whole-slide image level.

    • The method was not compared with state-of-the-art methods such as CycleGANs (that keep morphological information) or domain adversarial learning, not even with other combinations of SA & SN.

    • The main weakness of the paper is that there is no statistical analysis over several runs of the methods. Given the stochastic nature of the generation of the templates and augmentations, the average (and std) over several runs of the method and baselines should have been reported to obtain a more robust estimate of the real performance.

    [1] Van Eycke, Yves-Rémi, et al. “Image processing in digital pathology: an opportunity to solve inter-batch variability of immunohistochemical staining.” Scientific Reports 7.1 (2017): 1-15.
    [2] Marini, Niccolo, et al. “H&E-adversarial network: a convolutional neural network to learn stain-invariant features through Hematoxylin & Eosin regression.” Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021.
    [3] Otálora, Sebastian, et al. “Staining invariant features for improving generalization of deep convolutional neural networks in computational pathology.” Frontiers in Bioengineering and Biotechnology (2019): 198.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The datasets are available, but the code has not been released. The method is simple so it should be easy to reproduce. There is no information about hyperparameters such as learning rate used in the experiments or how many epochs were used, limiting the reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    • Providing computation times for the whole pipeline with and without preprocessing would have been useful.

    • This might be a great contribution to the computational pathology community if it is included in a library and distributed to researchers, or if the method is included in actively developed computational pathology libraries such as [4,5].

    • I think the method is indeed useful, but its lack of statistical analysis and of comparison with more recent methods leads me to a borderline accept decision.

    [4] https://github.com/TissueImageAnalytics/tiatoolbox
    [5] https://histolab.readthedocs.io/en/latest/index.html

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Lack of statistical analysis given the stochastic nature of the method. Lack of comparison with state-of-the-art methods

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    3

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    6

  • [Post rebuttal] Please justify your decision

    The authors have addressed my doubts regarding scalability and performed several runs to robustly assess the performance of the method. The authors have now uploaded the code to a repository, and the implementation seems clean and easy to use. Two main points are still missing from the paper: 1) a comparison with a GAN-like method, since there are plenty of state-of-the-art implementations for this problem; 2) a statistical significance test (at the results level), which should have been performed. If the paper gets accepted, it would be welcome to have these in the final version. If not, consider including them in a future submission. I’ve updated my vote from weak accept to accept.



Review #3

  • Please describe the contribution of the paper

    The article proposes an image augmentation method that combines stain normalization and stain augmentation. It first uses the Lab color space and then increases the number of color spaces used during processing to three. The method seems to work regardless of the task (segmentation/classification).

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • the idea of combining SN&SA is interesting
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • a lot of description is unclear
    • it is difficult to understand even for an experienced reader
    • lack of cross-validation of the results
  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    it seems that method is reproducible

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    Comments:

    • The description in the subsection ‘Virtual stain normalization template’ is not clear; please rephrase.
    • sigma_j is not defined (p.4)
    • (p.4) “The empirical results suggest […]” - meaning that there is no proof; please expand on that
    • (p.5) please provide a proper reference for the MoNuSeg dataset
    • (p.5) “we generate different virtual templates for images that vary at every epoch during the training” - this is not described in enough detail; please expand
    • on p.7 please double-check whether it should be SN1 & SN2 or SA1 & SA2 instead
    • the proposed ablation study is questionable - please reconsider with proper testing.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The overall idea is quite interesting but with the unclear description and no cross-validation of the results it seems that the reliability of this article is questionable.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    4

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered



Review #4

  • Please describe the contribution of the paper

    This paper proposed RandStainNA which unifies stain normalization (SN) and stain augmentation (SA) for histology image analysis. Specifically, randomness is introduced in the conventional SN process to generate more realistic stain variations, i.e., random virtual templates from pre-estimated stain style distributions are generated and incorporated into the SN process. Additionally, random color space selection scheme is also introduced in the framework.
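The random color space selection mentioned above can be pictured with a few lines of Python. This is only a sketch under assumptions: the function name and the optional `weights` argument are hypothetical, and the uniform default mirrors the equal-probability choice among LAB, HSV and HED discussed in this review thread rather than any confirmed detail of the released code.

```python
import random

# The three color spaces named in the paper's abstract.
COLOR_SPACES = ("LAB", "HSV", "HED")


def pick_color_space(rng, weights=None):
    """Pick the color space for one augmentation call.

    weights=None means a uniform choice; passing three non-negative
    weights biases the selection (hypothetical tuning knob).
    """
    return rng.choices(COLOR_SPACES, weights=weights, k=1)[0]
```

A training loop would call `pick_color_space(rng)` once per image and then run the normalization step in the returned color space.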

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The attempt to combine SN and SA into one unified framework is interesting, and few previous works address this topic. Overall, this topic has some degree of novelty.
    2. The experimental design is relatively complete. It considers two different tasks (classification and segmentation), different baseline CNN architectures, three color spaces (LAB, HSV, HED).
    3. The organization of this paper is clear and easy to follow.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. In my opinion, the idea of combining SN and SA is straightforward and trivial to realize. For example, a simple solution is a naive serial combination of the two components. The authors should state more clearly the advantages of their work over such simple combinations of the two components.
    2. From a methodological perspective, the overall pipeline of the proposed framework lacks depth. For example, the random color space selection is just a random choice among three color spaces with equal probability.
    3. Some experimental settings are unclear. Since the proposed method belongs to data pre-processing and augmentation, were other data augmentation methods (geometric transforms, noise, rotation, contrast, cutout, mixup, and so on) applied when running the baseline models? In other words, although the proposed method shows large performance improvements over the baseline, I am concerned about whether other augmentation methods can also achieve such improvements and whether the proposed method consistently gains performance on top of these augmentations. Also, why were all models run for just 50 epochs? Did all the models converge?
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Good.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    Please refer to the weakness section.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Although the combination of stain normalization and augmentation is interesting and has some degree of novelty, the overall novelty of this paper is limited and the proposed framework lacks depth from a methodological perspective.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    3

  • Reviewer confidence

    Somewhat Confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    Since most of my concerns from the first round have been addressed by the authors’ response, and although this method is relatively simple, it shows great performance gains for nuclei analysis, so I’d like to give it a weak accept.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The augmentation method proposed here is relatively straightforward and the results are impressive. The idea of combining stain normalization and augmentation is, however, not new, so the authors should place their work in context and more clearly explain the improvements proposed in this paper. Some discussion of how the method scales to WSIs is needed, as there were concerns over the time needed for processing. A more in-depth analysis of the results is needed, including some assessment of statistical significance.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    6




Author Feedback

1. Motivation (All) Our method integrates two independent approaches to handling non-IID histology images, i.e., augmentation (SA) and normalization (SN), into one process with a single mathematical formulation. It outperforms state-of-the-art SA and SN approaches in terms of efficiency and generalizes across CNNs. It avoids complicated training and, importantly, achieves high reliability. The prevalent GAN-based SA and SN methods, although successful, can cause unpredictable mis-transformation of phenotypes, which reduces reliability in downstream tasks, e.g., tumor segmentation. By contrast, our method improves diagnostic performance in a more understandable and interpretable manner without phenotype mis-transformation, which is of clinical significance.

2. Code & Reproducibility (Rw2) Our code is accessible at https://anonymous.4open.science/r/RandStainNA/, with hyperparameters already provided in Tab. 2 (Supplementary File). PRs will be submitted to the suggested libraries.

3. WSI Training time & scale (Rw2, Meta) Our approach consumes less training time than SA: for example, one epoch (~72K images) with ResNet18 on NCT-CRC takes 51s (baseline), 63s (SA), and 61s (ours), and comparable results are obtained with other CNNs. In addition, our implementation supports patch-level parallelization, so processing WSIs for clinical diagnosis is fast.

4. Statistical analysis (Rw2, Meta) The results are stable and repeatable: empirically, the test STD is <= 0.24 over 3 runs (seed = 97, 77, 47), far too small to offset the considerable performance improvement of 1.00~4.00. For brevity, results with a fixed seed (97) were reported.

5. Comparison with CycleGANs (Rw2) Compared with CycleGAN, StainGAN, or other GANs, we provide a straightforward, fast, and effective training-free approach with better predictability and interpretability. Current GANs are not our rivals: methods such as StainGAN can be combined with our RandStainNA to achieve extra performance gain.

6. Comparison with a naïve serial combination of SN & SA (Rw2 & 4, Meta) Tab. 1 in [ref.22] shows that a serial combination is worse than SA alone. Our approach unifies SA and SN in a one-shot process with one equation, and this synergy produces realistic augmentations (Fig. 3) with a hierarchical normal distribution. Neither a serial combination of SN and SA nor either method alone can overcome the crucial phenotype mis-transformation problem, whereas our approach does.

7. No cross-validation (Rw3) We follow the benchmarks’ original train/test data partition for a fair comparison, so cross-validation is skipped. Instead, we perform runs with different random seeds (Response 4) to assess reliability.

8. Writing (Rw3) ‘Virtual stain normalization template’ refers to a randomly generated stain template with 6 parameters used to perform RandStainNA. SN1 & SN2 on p. 7 should be SA1 & SA2; thanks for pointing out the typo! We will correct it.

9. Simplicity vs. Novelty (Rw4) Yes, our method pursues both simplicity and novelty, much as CutMix is a synergy of CutOut and MixUp that achieves a performance boost. Existing augmentations are discipline-agnostic and hence lack specificity for histology; we address this issue through stain-style-agnostic regularization with the proposed RandStainNA.

10. Why color space with equal probability (Rw4) Tuning the probability yielded only a minor improvement (<0.3), so we skipped it to avoid the manual tuning effort.

11. Settings (Rw4) Convergence is reached within 50 epochs for all backbone CNNs; more epochs do not increase performance. The baselines, SA, SN, and ours all apply random flipping, brightness, contrast + Gaussian noise & blur to align with [ref.22], without additional preprocessing.

12. Comparison with other augmentations (Rw4) In terms of performance percentage, MixUp improves ResNet18 by 4.75, ours by 12.83, MixUp+Ours by 13.28, RandomErase by 2.59, and RandomErase+Ours by 12.90. Existing augmentations achieve good performance on arbitrary images, but we target histology images specifically and obtain additional gains when the two are combined.
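The figures quoted in responses 3 and 12 can be put side by side with a little arithmetic. The snippet below is a reader's sketch, not part of the rebuttal: the dictionary and function names are made up for this example, and the only inputs are the per-epoch times and improvement values stated above.

```python
# Seconds per epoch (~72K images, ResNet18, NCT-CRC) as quoted in response 3.
epoch_seconds = {"baseline": 51, "SA": 63, "RandStainNA": 61}

# Improvements over the ResNet18 baseline as quoted in response 12.
gains = {
    "MixUp": 4.75,
    "RandStainNA": 12.83,
    "MixUp+RandStainNA": 13.28,
    "RandomErase": 2.59,
    "RandomErase+RandStainNA": 12.90,
}


def overhead_percent(method, reference="baseline"):
    """Relative increase in epoch time of `method` over `reference`, in percent."""
    ref = epoch_seconds[reference]
    return 100.0 * (epoch_seconds[method] - ref) / ref


def extra_gain(combined, alone="RandStainNA"):
    """Additional improvement of a combined setting over RandStainNA alone."""
    return round(gains[combined] - gains[alone], 2)
```

With these numbers, SA adds about 23.5% to the epoch time and RandStainNA about 19.6%, while adding MixUp or RandomErase on top of RandStainNA contributes a further 0.45 and 0.07, respectively.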




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    After rebuttal two of the reviewers improved their scores. Some of the concerns about scaling have been addressed. The simplicity of the method, the good performance and the general applicability of the method all make it potentially useful.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    5



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors propose a single scheme to combine stain augmentation and normalization, to learn stain-agnostic features for histopathological images. The method is simple yet effective, as demonstrated in the quantitative comparison results. Most of the reviewers’ concerns about technical details and presentation have been addressed in the rebuttal, and I therefore recommend acceptance of the paper.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    7



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper combines stain normalization and augmentation in a single framework to learn stain-agnostic feature representations for histopathological image analysis. The method is simple and the experimental results are impressive compared with the baseline. The rebuttal has addressed most of the reviewers’ concerns, e.g., the contribution, the scalability of the method, and the experimental setting.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    4


