
Authors

Jianan Cui, Yutong Xie, Anand A. Joshi, Kuang Gong, Kyungsang Kim, Young-Don Son, Jong-Hoon Kim, Richard Leahy, Huafeng Liu, Quanzheng Li

Abstract

Deep learning-based methods have shown superior performance in medical imaging, but their clinical application is still rare. One reason may be their uncertainty. As data-driven models, deep learning-based methods are sensitive to imperfect data. Thus, it is important to quantify the uncertainty, especially for PET denoising tasks where the noise is very similar to small tumors. In this paper, we proposed a Nouveau variational autoencoder (NVAE) based model using quantile regression loss for simultaneous PET image denoising and uncertainty estimation. Quantile regression loss was used as the reconstruction loss to avoid the variance shrinkage problem caused by the traditional reconstruction probability loss. The variance and mean can be directly calculated from the estimated quantiles under the logistic assumption, which is more efficient than Monte Carlo sampling. Experiments on a real 11C-DASB dataset verified that the denoised PET images of the proposed method have a higher mean(±SD) peak signal-to-noise ratio (PSNR) (40.64±5.71) and structural similarity index measure (SSIM) (0.9807±0.0063) than Unet-based denoising (PSNR, 36.18±5.55; SSIM, 0.9614±0.0121) and the NVAE model using Monte Carlo sampling (PSNR, 37.00±5.35; SSIM, 0.9671±0.0095).
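
For readers unfamiliar with the quantile regression loss mentioned in the abstract, the following is a minimal sketch of the standard pinball loss for a single quantile level. It is not the authors' implementation: the function name `pinball_loss`, the flat-list inputs, and the choice of quantile levels are illustrative assumptions, since this page does not state which quantiles the paper estimates.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean pinball (quantile regression) loss for quantile level tau.

    y_true: reference values (e.g. high-quality voxel intensities).
    y_pred: predicted tau-quantile for each value.
    tau:    quantile level, strictly between 0 and 1 (illustrative choice).
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction (diff >= 0) is weighted by tau,
        # over-prediction (diff < 0) by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)
```

With `tau = 0.5` the loss reduces to half the mean absolute error, so the predicted quantile is the median; levels away from 0.5 penalise over- and under-prediction asymmetrically, which is what lets a network output the tails of the distribution.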

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16440-8_17

SharedIt: https://rdcu.be/cVRvI

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    The authors present an approach to denoise PET images that relies on a VAE using quantile regression loss, which enables uncertainty estimation.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Enabling uncertainty estimation when processing images is important to increase trust in computational tools, especially in a clinical context.
    • Thanks to the quantile regression loss, uncertainty is estimated in a more computationally efficient way than in previous works.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The novelty compared with a previously published approach also based on the NVAE and enabling uncertainty estimation seems very limited. The gain in synthesis accuracy is very small and the improved computational efficiency is not quantified.
    • The usefulness of estimating uncertainty is not really demonstrated.
    • The data set used is small (20 subjects for training, 3 for validation, and 3 for testing), meaning that no strong conclusion can be drawn.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors ticked “Yes” for most items of the reproducibility checklist, which is sometimes in contradiction with what is provided in the paper (e.g. range of hyper-parameters considered). It seems that the code will be made available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    • The assumption that the output distribution follows the logistic distribution should be justified.
    • The authors mention supplementary material but none was available.
    • From the authors’ point of view it is a good thing that the proposed approach has a larger variance but would the clinicians, i.e. the users, think the same way?
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    3

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The idea is interesting and valuable but the novelty seems limited and the results not strong enough.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    5

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    The authors have addressed my main concerns regarding the novelty and strength of their contribution. I am not sure if the new results presented in the rebuttal will fit in the 8-page paper.



Review #2

  • Please describe the contribution of the paper

    This manuscript, “PET denoising and uncertainty estimation based on NVAE model using quantile regression loss”, reports an improved PET denoising method that applies an NVAE model with a quantile regression loss (NVAE-QR), compared to a Unet-based model and another NVAE model using Monte Carlo sampling (NVAE-MC). The proposed framework evolves from the NVAE model by minimizing the quantile regression loss and a Kullback-Leibler (KL) divergence term. The proposed model was tested using the real 11C-DASB dataset. The authors first generated a low-quality PET image by down-sampling the full list-mode data by a quarter, and 20 subjects were used for training, 3 for validation, and 3 for testing. By comparing the original image to a denoised PET image from each model, the authors evaluated the performance based on the peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM).

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The major strengths of the work are reflected in 1) improved image denoising based on the PSNR and SSIM metrics, 2) NVAE-QR model avoids the variance shrinkage issue, and 3) shorter processing time compared to the other two methods.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Considering that the low-quality PET images are generated by down-sampling the list mode data, it is also important to consider other noise factors related to PET, such as noise distribution, motion, imperfect attenuation correction, etc that differentially contribute to the noise in the PET images. To bring this algorithm to a real-world application, have the authors investigated combinations of other noises to improve using the deep learning approach? Do the authors then expect that the NVAE-QR would still perform better than the other methods?

    In addition, the methodology, while useful and applicable to an important problem in PET imaging, is not that novel.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors did not provide their statement on code availability. This would be highly desirable.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    1. It would really benefit the paper to present example cases or datasets to highlight the performance of the model in a research/clinical context (in an ROI or lesion-based approach), as well as other voxel-based metrics to indicate performance. While PSNR or SSIM are important metrics, a smoothed denoised image might provide high quantitative values on these metrics while missing important high-frequency information or small lesions for example.
    2. Page 1, Abstract: Not all abbreviations were first defined in the abstract section. For example, PSNR and SSIM.
    3. Page 2, Introduction 2nd paragraph: While the authors have included the description and extension of VAE and NVAE to provide a better understanding of the improvement, the Unet-based model literature was relatively lacking. It would be equally important to provide a bit more context rather than briefly mentioning Bayesian neural networks and suDNN.
    4. Page 2, Introduction 2nd paragraph: The authors used tumor identification as an example to describe the urgent need for a novel technique to improve PET denoising and uncertainty estimation. However, the dataset used to test the model in the study was brain PET imaging data.
    5. Page 5, Dataset: It would be great to include more detailed information on the PET data acquisition such as the dynamic framing, different types of corrections (transmission, scatter, random) applied, etc.
    6. Page 5, Data analysis: Based on the manuscript, the authors have reported using 200 training epochs and a 0.01 learning rate. How do the PSNR and SSIM change over different training epochs and different learning rates? Do the authors expect any further improvement?
    7. Discussion: Would this model still perform better than the other models for different PET tracers?
    8. Figures with a color scale bar should indicate the unit.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The topic of this study carries a great interest in the research field of using PET imaging to improve the image signal-to-noise ratio. Moreover, the application of the deep learning approach has gained great support to improve PET image signals and other medical imaging. I would accept this paper to be presented at the conference.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    2

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered



Review #3

  • Please describe the contribution of the paper

    The paper proposes a deep learning model for simultaneous PET image denoising and uncertainty estimation. It uses the NVAE Variational Autoencoder which is trained with the quantile regression loss.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Using the quantile regression loss avoids variance shrinking and allows estimating the variance directly from the quantiles, which is faster than using Monte Carlo sampling.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    There is no reference to prior work where the quantile loss has been used. The dataset is very small; only 3 subjects each were used for validation and testing. The English is not very good.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    ok

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    Explain what is meant by “group of low-quality and high-quality training pairs”. Include a reference or an expression for the structural similarity index measure (SSIM). Section 3.2 is confusing because it describes the usual VAE loss but refers to it as the loss used in this work. Was there a reduction in variance shrinking?

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The use of the quantile loss is interesting.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    3

  • Reviewer confidence

    Somewhat Confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    No change.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The reviewers appreciate the idea of performing uncertainty estimation for image denoising. There are questions about the positioning with respect to similar published methods, and about the experimental evaluation. In particular, the dataset size is very small, and further evidence may be required to appreciate the contribution of this work.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    5




Author Feedback

We thank all reviewers for the detailed feedback and answer the comments below:
R1-3.1&R2-3.2&R3-6.4: Novelty and contributions Our work (NVAE-QR) proposed to use quantile regression loss for PET denoising and uncertainty estimation. Compared to the similar method that used Monte Carlo sampling (NVAE-MC), the primary contribution is the computational efficiency. NVAE-MC needed 35 min to generate the result of one subject in the brain dataset due to the repetitive sampling process, while NVAE-QR only needed 4 s. In addition, using QR loss can avoid the variance shrinkage problem that degrades sample quality. The mean value of the variance map of NVAE-QR is 0.00142, which is 1.58 times that of NVAE-MC (0.000898), showing that NVAE-QR does avoid variance shrinkage.
R1-3.3&R3-3.2: Concerns about small datasets We agree the datasets in the original paper are small; therefore, we evaluated the proposed method on a bigger dataset in the rebuttal: a whole-body 18F-FDG PET dataset containing 55 subjects. We used 40 for training, 5 for validation and 10 for testing. The size of each PET image is 256×256×256. As the network was trained on 3×256×256 patches, we have 10240 patches for training, 1280 for validation and 2560 for testing. The quantitative results show that the mean(SD) PSNR and SSIM of NVAE-QR (PSNR: 21.89±2.05, SSIM: 0.878±0.024) are higher than those of Unet (PSNR: 21.01±1.88, SSIM: 0.86±0.027) and NVAE-MC (PSNR: 21.26±2.18, SSIM: 0.875±0.028).
R2-6.1&R2-6.4: Lesion-based evaluation We conducted a lesion-based evaluation on the whole-body FDG dataset (refer to R1-3.3&R3-3.2). ROIs were drawn on the tumor (target, Rt) and liver (background, Rb). Then we calculated the contrast-to-noise ratio (CNR), CNR = (mean Rt - mean Rb) / (std Rb). The CNR improvement ratio = (CNR_denoised - CNR_noisyPET) / CNR_noisyPET × 100%. NVAE-QR achieved a higher mean(SD) CNR improvement ratio (288%±103%) than Unet (148%±51%) and NVAE-MC (151%±74%).
R1-3.2&R1-6.3: The usefulness and size of uncertainty Uncertainty has been well studied in the classic PET literature, particularly after Jeffrey Fessler’s seminal work in 1996. The uncertainty estimated by traditional methods has been used for lesion detection, early treatment response evaluation, and so on. However, traditional methods are not accurate and fast enough. In this work, we proposed a deep learning based method to efficiently and accurately compute the uncertainty and will apply it in clinical applications in the future. For question R1-6.3, our goal is not to pursue large variance but accurate variance. As NVAE is trained to learn the distribution of the denoised image, the variance shrinkage problem will lead to an artificially narrow output distribution. This will limit the model’s generalization ability and reduce sample diversity. Our method can fill in the gap between the true variance and the artificially narrow variance, and thus has a larger variance than NVAE-MC.
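
The lesion-based metric defined in the rebuttal above can be sketched as follows. This is an illustration of the stated formulas only, not the authors' code: the function names are hypothetical, ROIs are assumed to be flat lists of voxel values, and the use of the population standard deviation is an assumption because the rebuttal does not say which estimator was used.

```python
import statistics


def cnr(target_roi, background_roi):
    """Contrast-to-noise ratio: (mean Rt - mean Rb) / std Rb.

    target_roi:     voxel values in the tumor ROI (Rt).
    background_roi: voxel values in the liver ROI (Rb).
    Population standard deviation is an assumption of this sketch.
    """
    contrast = statistics.mean(target_roi) - statistics.mean(background_roi)
    return contrast / statistics.pstdev(background_roi)


def cnr_improvement_percent(cnr_denoised, cnr_noisy):
    """CNR improvement ratio relative to the noisy PET image, in percent."""
    return (cnr_denoised - cnr_noisy) / cnr_noisy * 100.0
```

Because the noisy-image CNR appears in the denominator, the improvement ratio is sensitive to small baseline CNR values, which may partly explain the wide standard deviations reported.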
R1-6.1&R2-3.1&R2-6.7: Output distribution and generalization ability We evaluated the performance of different output distribution models (Fig 5, Table 2). We found that the normal distribution cannot fit PET data well (it usually has low SSIM). Both the logistic and mix_logistic distributions work well. However, the mean and variance of the mix_logistic distribution are hard to calculate, as the mixture parameters are learned by the network. Thus, we assumed that the output follows the logistic distribution. The results on the brain dataset and the whole-body dataset (refer to R1-3.3&R3-3.2) show our model works well under this assumption. The whole-body dataset was acquired by a different scanner (GE DMI PET/CT system) using a different tracer from the brain dataset. In addition, we tested our model, trained on the 18F-FDG dataset, on 3 18F-Fluciclovine subjects. Our method (PSNR: 16.30±1.79; SSIM: 0.864±0.031) outperformed Unet (PSNR: 15.02±1.62; SSIM: 0.839±0.038) and NVAE-MC (PSNR: 15.70±1.88; SSIM: 0.859±0.027). These results indicate our method has good generalization ability.
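
To illustrate why the logistic assumption makes the mean and variance cheap to obtain, the sketch below recovers them in closed form from two estimated quantiles. It relies only on textbook properties of the logistic distribution (quantile function Q(p) = mu + s·ln(p/(1−p)), mean mu, variance s²π²/3). The function names and the idea of using exactly two quantile levels are assumptions of this example; the page does not specify which quantiles the paper estimates.

```python
import math


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def logistic_mean_var(q_lo, q_hi, p_lo, p_hi):
    """Mean and variance of a logistic distribution from two quantiles.

    q_lo, q_hi: estimated quantile values at levels p_lo < p_hi.
    The levels are hypothetical inputs chosen by the caller.
    """
    # Q(p) = mu + s * logit(p), so two quantiles determine the scale s ...
    s = (q_hi - q_lo) / (logit(p_hi) - logit(p_lo))
    # ... and then the location mu (also the mean and the median).
    mu = q_lo - s * logit(p_lo)
    variance = (s * math.pi) ** 2 / 3.0
    return mu, variance
```

A single forward pass that outputs the quantiles is therefore enough to produce both a denoised estimate and a variance map, whereas Monte Carlo sampling needs many passes per image; this is consistent with the 4 s versus 35 min runtimes quoted in the rebuttal.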




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors’ rebuttal was found convincing, and the additional evidence was found useful to clarify the questions of the reviewers. As pointed out by R1, the manuscript should be reshaped to properly present the new experimental material, which was decisive in recommending acceptance of the paper.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    4



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
    • The method uses quantile regression to estimate denoising uncertainty, which is faster than sampling-based methods.
    • In the rebuttal, the authors applied the method to a larger sample and provided some justification for their choice of likelihood.
    • I am inclined to accept the paper, but the authors should fix the writing issues. The paper is written for an audience and must be accessible to readers.
  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    na



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The manuscript presents a PET denoising and uncertainty estimation algorithm based on a variant of a variational auto-encoder. Authors demonstrated their method is able to improve PSNR and SSIM metrics with shorter processing times than comparison methods.

    Although there were initially concerns related to the small size of the dataset evaluated and the suitability of comparison methods, after rebuttal all reviewers felt the paper has sufficient merit for acceptance in MICCAI (although with two “weak accepts” due to the incremental improvements in the measures reported). The major weaknesses of the original manuscript were limited novelty, because the NVAE is an established network architecture, and a limited experimental setup, because of the dataset size and the use of only global validation metrics (SSIM and PSNR) rather than, for instance, ROI-based comparisons.

    The major strengths were the use of a quantile regression loss to train the network, and a demonstration of a deep learning approach for PET denoising that improves computational efficiency.

    After the rebuttal phase all reviewers agreed that the major concerns were addressed and believe the paper has merit to appear in MICCAI.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    1


