
Authors

Joseph Boyd, Irène Villa, Marie-Christine Mathieu, Eric Deutsch, Nikos Paragios, Maria Vakalopoulou, Stergios Christodoulidis

Abstract

In whole slide imaging, commonly used staining techniques based on hematoxylin and eosin (H&E) and immunohistochemistry (IHC) stains accentuate different aspects of the tissue landscape. In the case of detecting metastases, IHC provides a distinct readout that is readily interpretable by pathologists. IHC, however, is a more expensive approach and not available at all medical centers. Virtually generating IHC images from H&E using deep neural networks thus becomes an attractive alternative. Deep generative models such as CycleGANs learn a semantically-consistent mapping between two image domains, while emulating the textural properties of each domain. They are therefore a suitable choice for stain transfer applications. However, they remain fully unsupervised, and possess no mechanism for enforcing biological consistency in stain transfer. In this paper, we propose an extension to CycleGANs in the form of a region of interest discriminator. This allows the CycleGAN to learn from unpaired datasets where, in addition, there is a partial annotation of objects for which one wishes to enforce consistency. We present a use case on whole slide images, where an IHC stain provides an experimentally generated signal for metastatic cells. We demonstrate the superiority of our approach over prior art in stain transfer on histopathology tiles over two datasets. Our code and model are available at https://github.com/jcboyd/miccai2022-roigan.

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16434-7_35

SharedIt: https://rdcu.be/cVRrS

Link to the code repository

https://github.com/jcboyd/miccai2022-roigan

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper introduces a GAN-based stain style transfer method for WSI. The method adopts CycleGAN as a baseline and introduces an ROI-based discriminator.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper proposes an interesting application in WSI analysis. The manuscript is well-written and easy to follow.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The generation of the bounding boxes for region-guided discriminator needs manual operations. That is, a series of carefully-designed image processing techniques is required. This weakens its feasibility in practice.
    • The experimental setting is problematic. The proposed method centres on an ROI-based discriminator that can take patches of various sizes as input. However, in the experiments the authors fix the bounding box size at 48x48, which conflicts with the motivation for introducing an ROI-based discriminator: the original patch discriminator would also work if the bounding box size were fixed in advance.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors claim the reproducibility of the results reported in the paper.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    Please refer to my comments in the weaknesses section. In addition, (1) the authors claim that the original CycleGAN fails for the task. Is there any reason for the failure? Is the failure attributable to the discriminator? This discussion would help to motivate the work. (2) I suggest including a discussion of the DAB mislocalization by the proposed method in Figure 2.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    My two concerns about the bounding boxes (how they are generated and their fixed size) weaken the paper’s contribution. I therefore lean towards rejection before reading the rebuttal.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    5

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    The authors answered my questions and I would like to go with a weak accept.



Review #2

  • Please describe the contribution of the paper

    The authors extend CycleGAN with the proposed “region of interest discriminator”, naming the result Region-guided CycleGAN. The proposed discriminator performs a soft segmentation on the generated stain-transferred image, which improves stain localization. Results are validated on one public dataset (Camelyon16) and one private dataset, on which the method outperforms existing methods. The qualitative results are presented on the private dataset, while qualitative and quantitative results are provided for both datasets. The binary mask generated from the synthesized DAB stain is compared with the GT mask for quantitative performance evaluation.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Strengths: Methodology: Introducing region guidance in the discriminator through segmentation is interesting. The proposed discriminator removes the fixed-size constraint of the PatchGAN and can be applied to regions of varying size. As a natural consequence of this modification, the authors have used this discriminator for the detection task. This task also introduces supervision into the GAN, which leads to better stain localization than CycleGAN, which lacks any supervision. As the quality of the stain transfer depends on the stain localization, the proposed modifications lead to better quantitative and qualitative results (Fig-2 and Table-1) than CycleGAN (without any supervision). The work also motivates introducing some form of supervision in related applications.

    Analysis: A detailed analysis is presented on two datasets in terms of qualitative and quantitative performance. The analysis highlights the effect of the supervision introduced through the proposed region-guided discriminator in the CycleGAN. Qualitative results (Fig-2) show the superior stain transfer quality obtained through region guidance, and hence supervision, over the vanilla CycleGAN. Similarly, Table-1 shows the better performance of the proposed method compared with CycleGAN trained under different settings. Although the analysis is limited to one application, it validates the proposed approach.

    Performance: The quantitative results show significant improvement over the existing methods (Table-1). The qualitative results also show good performance of the proposed region-guided CycleGAN.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Annotation efforts: The method requires additional annotations to train the region-guided discriminator. The authors have used an image processing-based approach for generating the library for training the discriminator. This implies that the quality of the annotations will be a key factor in the performance of the discriminator.

    Limited application: The usage of the proposed discriminator is provided for a specific application of stain transfer. Can there be other potential applications of this approach?

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The paper is consistent with the reproducibility response.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    I would recommend the following for future work:

    1. Extension to other applications
    2. Is it possible to make the approach less dependent on the annotations?
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    1. The proposed modifications improve the baseline as validated by the qualitative and quantitative results on two datasets.
    2. A bottleneck in model performance is the annotation quality. However, the annotation pipeline used is based on image processing and does not require manual effort for GT preparation (to train a segmentation model). Even with this annotation pipeline, the performance is better than that of the compared methods. Nevertheless, the effect of this factor should be explored in future work.
    3. Although the analysis is limited to one application, it is sufficient to validate the proposed approach.
  • Number of papers in your stack

    6

  • What is the ranking of this paper in your review stack?

    2

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Based on the author’s feedback, the original decision is retained.



Review #3

  • Please describe the contribution of the paper

    This paper proposes a method to transform H&E stains to IHC stains for histopathology images under an unpaired setting. The idea is based on CycleGAN, but instead of using a patchGAN discriminator (from the original CycleGAN approach), it proposes a region-based discriminator. The discriminator takes cell bounding boxes as additional inputs so that the GAN loss is computed on the RoIs, and thus better generation could potentially be achieved.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The proposed method is sound. When extra knowledge about the image is available, it is a good idea to find ways to incorporate such knowledge during training. This paper achieves this via a region-based discriminator, and the knowledge is leveraged in the form of bounding boxes. The use of RoIAlign properly consumes the bounding box inputs and induces the generation of more biologically meaningful outputs.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The clinical effectiveness of this approach is not clear. What is the actual clinical application? In other words, who will be the eventual consumer of the synthesized IHC images? If it is medical professionals, then an evaluation of how this method may help them identify metastatic cells is missing. In particular, how often will they find that the transferred stains make it easier to spot metastatic cells, and how often might the stains compromise the reading of images? If the synthesized IHC images are to be used to train models that automatically detect metastatic cells, then a comparison with methods that use H&E stains directly is missing.

    • An ablation study on the effectiveness of library generation is missing. The validity of the proposed method relies largely on the correctness of the cell boxes. From Section 3.2, the generation process appears quite heuristic, so the generated boxes may not be accurate. What is the actual accuracy of the library generation, and how do poor versus good cell boxes affect the performance of stain transfer?

    • Comparisons with baseline ideas are missing. While the proposed method is reasonable, it is unclear how it compares with straightforward alternatives (if it is not better, why apply this idea?). Two possible ideas: 1) Applying loss attention masks to the PatchGAN map. The loss attention mask is generated from the cell boxes, i.e., attention values are high in box regions and low elsewhere. 2) Cropping the input image to the cell box regions (negative samples will also be needed, of course) and using the cropped samples to train CycleGAN models. During testing, larger inputs could simply be used thanks to the translation invariance of CNNs.
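
The first alternative above, an attention mask applied to the PatchGAN loss map, could be sketched as follows. This is an illustration only and is not taken from the paper, its repository, or the reviewer: the function names `box_attention_mask` and `weighted_patch_loss`, the weights `low` and `high`, and the box convention (coordinates in patch-map cells, maximum exclusive) are all assumptions.

```python
def box_attention_mask(height, width, boxes, low=0.1, high=1.0):
    """Build a weight map that is `high` inside any box and `low` elsewhere.

    Each box is (y_min, x_min, y_max, x_max) in patch-map cells, with the
    maximum coordinates exclusive. `low` and `high` are hypothetical values.
    """
    mask = [[low] * width for _ in range(height)]
    for y_min, x_min, y_max, x_max in boxes:
        # Clip the box to the map so out-of-range boxes are handled safely.
        for y in range(max(y_min, 0), min(y_max, height)):
            for x in range(max(x_min, 0), min(x_max, width)):
                mask[y][x] = high
    return mask


def weighted_patch_loss(loss_map, mask):
    """Return the attention-weighted mean of a per-patch loss map."""
    total = sum(l * m
                for loss_row, mask_row in zip(loss_map, mask)
                for l, m in zip(loss_row, mask_row))
    weight = sum(m for mask_row in mask for m in mask_row)
    return total / weight
```

With `low=0.0` the loss depends only on patches that overlap a cell box; a small positive `low` keeps some gradient signal from the rest of the tile.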

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The core idea of the proposed method is straightforward and can be easily implemented. The implementation details of the CycleGAN are sufficiently discussed. The missing part is the datasets. Although this paper uses a public dataset, training the model still requires the private IHC images.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    I made some suggestions to help improve the experiments of this paper. Please see the Weaknesses section.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I think this paper is good in terms of methodology. My concern is the experimental results, which I think should be improved.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    3

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    4

  • [Post rebuttal] Please justify your decision

    Regarding clinical value, what I would like to see is clinical evidence of the potential use of this method (as well as its evaluation). This is not discussed in the rebuttal. I would also like to see a comparison with some baselines, which is likewise not discussed. Therefore, my rating remains the same.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The manuscript introduces an improved CycleGAN model for stain transfer in histopathological images. While the reviewers appreciated the idea of using a region-based discriminator to enhance image generation, they also raised some concerns. R1 and R2 commented that additional effort is needed to build a bounding box library for model training, and that this may weaken the feasibility of the method in practice. R1 also questioned the experimental setting, which might be problematic. R3 pointed out that the clinical effectiveness of the approach is not clear, that the effectiveness of library generation is not well verified, and that the lack of comparisons with (straightforward) baseline methods weakens the experiments. Please consider addressing the reviewers’ concerns/comments in the rebuttal.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    7




Author Feedback

We would like to thank the reviewers and the area chair for their constructive criticism and positive evaluation of our proposed methodology. Their main concerns can be summarized as i) the clinical application of our method, ii) the manual overhead required to build the library and iii) aspects of the experimental setting. In the following, we address these major comments.

  • R3 raised concerns about the clinical application of our method, although it was appreciated by R1. To our knowledge, this work is the first to provide robust models for WSI-level stain transfer (supp. Fig 3), producing artificially stained images. The artificially stained WSIs could eventually be used as a clinical tool (visualization) and in soft segmentation pipelines (diagnosis, quantification), while also enabling cross-stain applications (e.g. registration pipelines).

  • R1 identifies the library as a manual overhead. However, the proposed pipeline is a semi-automatic means for extracting additional supervision “for free”, greatly improving the unsupervised baseline. For IHC tiles, the annotation is provided directly by the experimentally-generated DAB stain. For H&E tiles, we indeed rely on an expert annotation. However, our experiments revealed exciting possibilities for further automation: a) due to a parsimonious design (only assumptions about cell size and clustering are made), the pipeline was readily applicable to both datasets and would likely generalize to others (R2); b) datasets can be combined (e.g. Table 1, right-hand) and supp. Fig 3 shows qualitatively that a model trained on CAMELYON H&E transfers well to the private dataset. In the latter case, only annotations from CAMELYON have been used, implying reusability of a library once computed on CAMELYON (a free resource).

  • R2 considers the library may be unreliable and R3, although judging it to be sound, would like to know the effect of noise on D_ROI. Note that, by design, the library prioritizes precision over recall, as D_ROI does not require a full bounding box annotation to train. Cells that are hard to isolate are thus excluded from the library, first by incorporating the ground truth annotation, and also by expanding the exclusion mask (the masked regions of supp. Fig 2e) for detecting cancer cells (lowering recall), increasing the confidence of true positives (increasing precision) found outside the mask. This yields a partial annotation, but with limited noise.

  • R1 is skeptical whether the RoI concept has been faithfully demonstrated in the experiments. The proposed D_ROI is a mechanism for injecting additional supervision into training the otherwise unsupervised CycleGAN, achieving the correct localisation of DAB in the tumoral regions (Fig 2 and Table 1). Although a fixed 48x48 bounding box is used, this is for convenience and is not a limitation of our method (RoIAlign will interpolate to a fixed output whatever the input size). In our experiments, D_ROI is still “zooming in” on the cell regions, performing discrimination directly centered on cells. A typical PatchGAN cannot do this, as it divides the image indiscriminately into a grid of overlapping regions. It is for these reasons we hypothesize our method succeeds over the standard, unsupervised CycleGAN.

  • R3 points to potentially simpler architectures for incorporating the library. We very much appreciate the idea of using attention to reweight the PatchGAN loss. However, an advantage of using RoIAlign is that the region annotation can be sparse, with only a few cells considered from the population per mini-batch (we sample 8 cells per tile; see Section 3.3). This is well adapted to the library, which is non-exhaustive to maintain precision. Also, although the use of RoIAlign means D_ROI treats the problem essentially as object-level discrimination, it is also able to enforce a consistent receptive field around cells by taking as input the full tile.
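
The two properties discussed above, that RoIAlign resamples a box of any size to a fixed output and that only a few boxes need to be sampled per tile, can be illustrated with a small sketch. This is a simplified, pure-Python illustration and not the code in the repository: it takes a single bilinear sample at the centre of each output bin and treats pixel indices as coordinates, and the names `bilinear`, `roi_align`, and `sample_boxes` are hypothetical.

```python
import random


def bilinear(img, y, x):
    """Bilinearly interpolate a 2D list `img` at real-valued (y, x)."""
    h, w = len(img), len(img[0])
    # Clamp the sample point to the valid pixel range.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = img[y0][x0] * (1 - dx) + img[y0][x1] * dx
    bottom = img[y1][x0] * (1 - dx) + img[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def roi_align(img, box, out_size):
    """Resample box = (y_min, x_min, y_max, x_max) to out_size x out_size.

    Simplification: one sample at the centre of each output bin.
    """
    y_min, x_min, y_max, x_max = box
    bin_h = (y_max - y_min) / out_size
    bin_w = (x_max - x_min) / out_size
    return [[bilinear(img, y_min + (i + 0.5) * bin_h, x_min + (j + 0.5) * bin_w)
             for j in range(out_size)]
            for i in range(out_size)]


def sample_boxes(boxes, k, rng):
    """Pick at most k boxes from a tile's (possibly sparse) annotation."""
    return rng.sample(boxes, min(k, len(boxes)))


# Toy tile and boxes of different sizes; every crop comes out 2 x 2.
rng = random.Random(0)
tile = [[float(10 * r + c) for c in range(10)] for r in range(10)]
boxes = [(0, 0, 4, 4), (2, 2, 8, 8), (1, 3, 6, 9)]
crops = [roi_align(tile, b, out_size=2) for b in sample_boxes(boxes, 8, rng)]
```

Because every crop has the same shape regardless of the box size, a discriminator head can be applied to a batch of sampled cells without requiring an exhaustive annotation of the tile.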




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper introduces an improved CycleGAN model with a region-based discriminator for stain transfer in histopathological images, and the model performs better than the original CycleGAN. The rebuttal has addressed most of the reviewers’ concerns, including the manual effort for library generation and the experimental setting. The paper can be improved by clearly explaining the clinical value of the proposed method.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    7



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper introduces a GAN-based stain style transfer method for WSI and introduces an ROI-based discriminator. The paper is well motivated and the rebuttal has sufficiently addressed most of the reviewers’ concerns.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    5



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The manuscript presents an improved CycleGAN model with region guidance for stain transfer in histopathological images. The proposed method shows clear improvement over baseline methods. The rebuttal has addressed most of the reviewers’ concerns. Thus I vote for acceptance. However, a discussion of the clinical usage and a comparison with SOTA cross-modality staining methods should be provided in the final version.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    9


