
Authors

Lei Fan, Arcot Sowmya, Erik Meijering, Yang Song

Abstract

Formalin-Fixed Paraffin-Embedded (FFPE) and Fresh Frozen (FF) are two major types of histopathological Whole Slide Images (WSIs). FFPE provides high-quality images, however the acquisition process usually takes 12 to 48 hours, while FF with relatively low-quality images takes less than 15 minutes to acquire. In this work, we focus on the task of translating FF to FFPE style (FF2FFPE), to synthesize FFPE-style images from FF images. However, WSIs with giga-pixels impose heavy constraints on computation and time resources. To address these issues, we propose the fastFF2FFPE for translating FF into FFPE-style efficiently. Specifically, we decompose FF images into low- and high-frequency components based on the Laplacian Pyramid, wherein the low-frequency component at low resolution is transformed into FFPE-style with low computational cost, and the high-frequency component is used for providing details. We further employ contrastive learning to encourage similarities between original and output patches. We conduct FF2FFPE translation experiments on The Cancer Genome Atlas (TCGA) Glioblastoma Multiforme (GBM) and Lung Squamous Cell Carcinoma (LUSC) datasets, and verify the efficacy of our model on Microsatellite Instability prediction in gastrointestinal cancer. The code and models are released at https://github.com/hellodfan/fastFF2FFPE.
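
The frequency decomposition described in the abstract can be illustrated with a minimal Laplacian Pyramid sketch. This is not the authors' implementation (see their repository for that). The 2x2 average pooling, the nearest-neighbour upsampling, the function names, and the random placeholder patch are all assumptions chosen to keep the example dependency-light. The input is assumed to be an H x W x C array with H and W divisible by 2 at every level.

```python
import numpy as np

def downsample(img):
    """Halve height and width by averaging each 2x2 block (stand-in for blur + subsample)."""
    h, w = img.shape[:2]
    return img.reshape(h // 2, 2, w // 2, 2, -1).mean(axis=(1, 3))

def upsample(img):
    """Double height and width by nearest-neighbour repetition."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def build_pyramid(img, levels):
    """Return (highs, low): one high-frequency residual per level plus the coarsest image."""
    highs, current = [], img
    for _ in range(levels):
        low = downsample(current)
        highs.append(current - upsample(low))  # detail lost by downsampling
        current = low
    return highs, current

def reconstruct(highs, low):
    """Invert build_pyramid: upsample and add residuals from coarse to fine."""
    current = low
    for high in reversed(highs):
        current = upsample(current) + high
    return current

rng = np.random.default_rng(0)
patch = rng.random((512, 512, 3))  # placeholder standing in for an FF patch
highs, low = build_pyramid(patch, levels=2)
restored = reconstruct(highs, low)
```

With two levels, the low-frequency component has 1/16 of the pixels of the input, which is where a translation network would save computation. The residuals in `highs` carry the detail needed to return to full resolution.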

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16434-7_40

SharedIt: https://rdcu.be/cVRrX

Link to the code repository

https://github.com/hellodfan/fastFF2FFPE

Link to the dataset(s)

https://portal.gdc.cancer.gov/

https://zenodo.org/record/2530835

https://zenodo.org/record/2532612


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper presents a GAN-based model to synthesize FFPE images from FF samples. It proposes using Laplacian Pyramids to improve computational efficiency.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The paper is well written and it is easy to follow.
    • The method is clearly explained
    • The idea of using Laplacian Pyramids, even though already used in natural images, has not been exploited in Digital Pathology.
    • Good ablation study
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    • The paper focuses on the increase in computational performance and does not devote any time to analyzing the network. It would be interesting to see what the intermediate results of the Generator look like, for example the masks.
    • It was not clear in the MSI prediction task whether the gain comes from the combination of FF->FFPE and data augmentation or only from data augmentation.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
    • The method is well described and uses public data, and the authors commit to sharing the code upon acceptance. The reproducibility seems good.
  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    The paper is well written and the idea is interesting because it has not been used in digital pathology before. The quality of the paper would increase if a deeper analysis of the LP method is included e.g. show what the internal representations in the generator look like, the masks, etc. If space is an issue, this can go in the Supplemental material.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is well written and it is easy to follow. The method however seems like a small incremental modification of the now commonly used CycleGAN.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    2

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered



Review #2

  • Please describe the contribution of the paper

    The authors proposed a GAN-based model using Laplacian Pyramid frequency decomposition and Contrastive Learning via a memory bank to translate low-resolution FF into high-resolution FFPE-style slides.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The motivation for the work looks interesting. Collecting FFPE takes tremendous time when comparing it with acquiring FF.

    The work focused on providing a framework with efficient computation resource usage, i.e., training time and memory usage.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Using an unpaired dataset for training makes sense. However, for evaluation, since the motivation of this work is to translate FF to FFPE, I strongly recommend that the data be paired, or at least that a domain expert be asked to grade the synthesis results, i.e., whether the synthesized images are clinically useful.

    Because of the FID performance, the results shown in Table 1 do not convince me to use fastFF2FFPE (vs. AI-FFPE), even though its training time, memory usage, and inference throughput are better than those of the other two baselines. The authors might need a sensitivity analysis to show results with different hyper-parameters.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The author claimed that the data, code, and models would be released if accepted. The paper should be reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    T_m should be defined similarly to c_h.

    In the memory bank section, how was the size for the FF patches (65535) decided?

    In the dataset section, what is the relationship between images at 512x512, 1024x1024, and 2048x2048? Downsampling or independent? If the images in 1024x1024 are downsampled from 2048x2048, why are there only 4K images in 2048? Please clarify.

    What is the batch size when training/testing fastFF2FFPE, AI-FFPE, and vFFPE?

    Fig2. could add a zoom-in view to show how fastFF2FFPE overcame the artifacts.

    I assume the results are validated by the statistical tests. Please clarify.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The testing data and evaluation part is confusing.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    4

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    The authors clarified most of my concerns with a detailed explanation and discussion of the current work’s limitations. Since the paired data evaluation is not feasible, validating other downstream clinical analyses should help to better understand the usefulness of the synthesized images. The current results only show performance comparable to other SOTA methods, while utilizing fewer computational resources.



Review #3

  • Please describe the contribution of the paper

    This manuscript describes a methodology that translates one type of histological staining process (frozen, FF) to another (paraffin, FFPE), as the frozen one is much quicker to acquire but has lower resolution and quality than the FFPE. The rationale behind this is good, as the FF can be acquired quickly and then translated to a higher quality for further investigations.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper is generally well written, with a clear rationale, and proposes a methodology that revisits the oldie-but-goodie technique of the Laplacian Pyramid. The methodology allows the use of larger patches (from 512x to 1024x and 2048x), which indicates a more compact use of memory. The processing is also much quicker than the alternatives.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Whilst the paper has many merits, there are some weaknesses.

    • Since the methodology is naturally an off-line process in which the FF is post-processed, the speed of processing is less important than the quality. After all, it is possible to process the tissue with FFPE and get good images anyway. Thus the focus should have been the quality.
    • The quality obtained does not seem to be that good! The best case on the GBM data was 46.85, but that is not comparable with other methodologies. At the same resolution, the results of the proposed method are 49.67 against 46.89. For the LUSC set, the best of the proposed method is 43.64 whilst the alternative was 34.81.
    • The paper is fairly well written but at times is confusing; more details follow.
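
The scores compared above are FID values, where lower means the synthesized and real image distributions are closer. As a hedged illustration of what is being compared, the sketch below computes the Fréchet distance between two Gaussians fitted to feature sets. Standard FID applies this formula to Inception-v3 features; here the features are random placeholders, and the function name and sample sizes are assumptions, so the numbers it produces have no relation to those in the paper.

```python
import numpy as np

def frechet_distance(feats_a, feats_b):
    """Fréchet distance between Gaussians fitted to two feature sets (rows = samples)."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the eigenvalues
    # of cov_a @ cov_b, which are real and non-negative for PSD covariances.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(2000, 8))   # placeholder "real FFPE" features
close = rng.normal(0.1, 1.0, size=(2000, 8))  # distribution with a small mean shift
far = rng.normal(1.0, 1.0, size=(2000, 8))    # distribution with a large mean shift

d_same = frechet_distance(real, real)
d_close = frechet_distance(real, close)
d_far = frechet_distance(real, far)
```

The distance is essentially zero for identical feature sets and grows with the shift between distributions, which is why a lower score is read as better translation quality.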
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Seems to follow all the requirements on reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    The paper makes several assumptions about the reader, who may or may not be familiar with these topics. As such, it would be good if the details were clarified and not left to educated guesses or to the reader having to search for them. Examples:

    “steps, e.g., dehydrated, saturated with formalin and stained with dyes, whereas FF slides are produced in ultra- low temperature freezers with liquid nitrogen. …” There are well defined protocols to obtain FF and FFPE, please add references for these.

    “However, there are many artifacts in FF slides and variations between FF and FFPE slides (see Fig. 1.a).” Which artifacts exactly? Please add a list and if these are visible, add arrows to illustrate these in the figure.

    In Figure 1a

    What is the green line around the samples? I make an educated guess that the background has been removed, but readers should not be making educated guesses.

    Fig 1b uses FF patches as input and Isynth as output, but Fig 1c uses Iin and Isynth. There should be consistency or, if the terminology is correct, the caption should explain the differences.

    Fig 1b uses Lrec, Lcl and Ladv, which are explained many pages afterwards. Same with M, N.

    Fig 2: The visualisation of the results is rather useless if I do not know what I am looking for. I do not even know if the proposed method is supposed to be better. Why not add the accuracy metric for each case so that the reader can see whether the proposed method is better?

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper is interesting, but the quality, not the speed, should be the focus.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    3

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The paper presents a GAN-based model to synthesize FFPE images from FF samples using Laplacian Pyramid frequency decomposition and Contrastive Learning. The reviewers agreed on the motivation behind the work and the adopted approach. The method is also very efficient compared with other methods. However, the reviewers also raised a few concerns that should be addressed: 1) deeper analysis of the model; 2) paired data set; 3) reader evaluation of the reconstructed results.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    6




Author Feedback

We sincerely thank all reviewers and ACs for their time and constructive feedback. We are encouraged that reviewers R1, R2 and R3 found our motivation and idea to be interesting. R1, R2 and R3 recognize that this paper is well written and organized. R1 and R3 recognize that our method is well described. R1, R2 and R3 recognize that our framework is highly efficient and compact.

R1-5.1 and R1-8.1, deeper analysis of model: We have shown internal mask results in Fig. 1c (they look like black pictures, but more is visible when zooming in), and we will add more discussion of the results and enhance the visualization. R1-5.2, MSI task prediction: Sorry for the confusion. In Tab. 3, e.g. for R50, baseline+colorjit=66.84%, and baseline+FFPE-style+colorjit=67.13% when adding synthesized FFPE-style data. We will clarify and reorganize the description of the MSI experiment results. R1-10.1, incremental modification: Compared to CycleGAN, which uses two pairs of generators and discriminators and hence requires substantial computational resources, we adopt a vanilla-GAN architecture and introduce LP to replace the traditional down- and up-sampling operations, so the model architecture is much more efficient.

R2-5.1, paired data experiments: We agree that inviting domain experts is ideal for validating the FF-to-FFPE performance. However, due to COVID-19, we have limited resources, which is also the main reason why we introduced the MSI prediction, since this task can be done without the aid of experts. We will conduct such an evaluation for an extended journal version. R2-5.2, sensitivity analysis of hyper-parameters: Since we do not have space for an evaluation of hyper-parameters, we will release detailed results with different hyper-parameters on GitHub together with our code. R2-8.1 and R2-8.5, definition of T_m and zoom-in view visualization: We will redefine T_m in line with c_h, and add a zoom-in visualization in the supplementary material. R2-8.2, the size (N) of the memory bank: Since the memory bank records the extracted features of all samples, and the maximum number of samples in our datasets is 20k (512x512 images), N is set to 20k. We limit the maximum of N to 65535 because we use an index (a PyTorch uint16 variable) for N. R2-8.3, data: Images with different resolutions are sampled from raw WSIs (at 20x magnification) independently. R2-8.4, training batch size (BS): For the 512x512 setting, we trained vFFPE and AI-FFPE with BS=8 (BS=1 for each GPU), and trained our fastFF2FFPE with BS=32 (BS=4 for each GPU). R2-8.6, statistical tests: Since our evaluation mainly focuses on FID, statistical tests are usually not conducted in such studies. For MSI prediction, we are working on conducting t-tests. Due to the rebuttal time limitation, we split the data with k-fold=10, train baseline models (58.6%±0.55), and train baseline+FFPE-style models (59.6%±0.7). Finally, we calculate the p-value=0.0032. We will add all p-values in the updated version.

R3-5.1 and R3-10.1, speed vs. quality: We agree that the FF-to-FFPE translation quality is more important than the translation speed, especially for clinical purposes. In this paper, we aim to contribute a method from another direction, one that achieves comparable FID performance (which can be quantified) with very cost-efficient and compact models. Building on this, we will go further to improve the FF-to-FFPE translation quality while keeping the translation speed high under limited computational resources. R3-8.1, WSI knowledge: We will add references and also describe the artifacts in more detail. R3-8.2, Fig. 1a: The green lines indicate the tissue regions, without removing backgrounds, after performing our data pre-processing procedure. We will clarify this. R3-8.3 and R3-8.4, inconsistency in Fig. 1b and c and notations in Fig. 1b: Thanks for pointing this out. We will redraw the figure, keep it consistent and minimize unnecessary notations. R3-8.5, Fig. 2: Thanks; we will follow your suggestion and add accuracy metrics.




Post-rebuttal Meta-Reviews

Meta-review #1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The rebuttal has sufficiently addressed most of the reviewers’ concerns, and all reviewers now have positive ratings.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    2



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    All reviewers vote to accept the paper. Using LPs for DP is interesting, and it does seem to reduce complexity.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    4



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The paper proposes a GAN-based approach to translate the frozen (FF) histological staining process to paraffin (FFPE). Frozen stained tissue is much quicker to acquire but has lower resolution and quality than FFPE. The proposed approach is interesting. After the rebuttal, all the reviewers moved their decision to accept, and I concur with this decision.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    4


