Authors
Caiwen Jiang, Yongsheng Pan, Mianxin Liu, Lei Ma, Xiao Zhang, Jiameng Liu, Xiaosong Xiong, Dinggang Shen
Abstract
Positron emission tomography (PET) is an advanced nuclear imaging technique with an irreplaceable role in neurology and oncology studies, but its accessibility is often limited by the radiation hazards inherent in imaging. To address this dilemma, PET enhancement methods have been developed by improving the quality of low-dose PET (LPET) images to standard-dose PET (SPET) images. However, previous PET enhancement methods rely heavily on the paired LPET and SPET data which are rare in clinic. Thus, in this paper, we propose an unsupervised PET enhancement (uPETe) framework based on the latent diffusion model, which can be trained only on SPET data. Specifically, our SPET-only uPETe consists of an encoder to compress the input SPET/LPET images into latent representations, a latent diffusion model to learn/estimate the distribution of SPET latent representations, and a decoder to recover the latent representations into SPET images. Moreover, from the theory of actual PET imaging, we improve the latent diffusion model of uPETe by 1) adopting PET image compression for reducing the computational cost of diffusion model, 2) using Poisson diffusion to replace Gaussian diffusion for making the perturbed samples closer to the actual noisy PET, and 3) designing CT-guided cross-attention for incorporating additional CT images into the inverse process to aid the recovery of structural details in PET. With extensive experimental validation, our uPETe can achieve superior performance over state-of-the-art methods, and shows stronger generalizability to the dose changes of PET imaging.
Link to paper
DOI: https://doi.org/10.1007/978-3-031-43907-0_1
SharedIt: https://rdcu.be/dnv85
Link to the code repository
N/A
Link to the dataset(s)
N/A
Reviews
Review #1
- Please describe the contribution of the paper
This work presents an unsupervised PET enhancement (uPETe) framework based on the latent diffusion model, trained using standard-dose PET data only. The dataset consists of 100 SPET images for training and 30 paired LPET and SPET images for testing. Results show the proposed method outperforming the state of the art.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
A few main contributions: 1) adopting PET image compression for reducing the computational cost of diffusion model, 2) using Poisson diffusion to replace Gaussian diffusion for making the perturbed samples closer to the actual noisy PET, and 3) designing CT-guided cross-attention for incorporating additional CT images into the inverse process to aid the recovery of structural details in PET.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
Overall, the paper lacks novelty since it basically applies latent diffusion to the low-dose PET denoising problem.
The use of SSIM in denoising evaluation is questionable since noise is random in nature.
The generalization study is very limited, covering only simulated noise.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
Reproducibility is very poor: neither the datasets nor the model are published. The authors’ reproducibility checklist is inconsistent with the actual paper.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
I recommend that the authors improve the novelty of their methods and evaluate them with more comprehensive metrics apart from SSIM. The authors are also encouraged to improve the reproducibility aspects.
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
4
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Weaknesses in novelty, experiments, and reproducibility.
- Reviewer confidence
Very confident
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
N/A
- [Post rebuttal] Please justify your decision
N/A
Review #2
- Please describe the contribution of the paper
The paper proposes a denoising diffusion process for low-dose (LD) PET image denoising. The authors’ contributions are:
- performing diffusion not in the image domain but on a convolutional latent code to decrease computational complexity
- use of only standard dose (SD) PET in the training
- use of a Poisson noising model instead of Gaussian model
- CT guided cross attention by injecting the latent code of the CT image during denoising
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The ablation study demonstrates the advantage of the contributions with respect to the metrics considered. This is especially true regarding the CT cross-attention mechanism, which seems to improve results by a substantial margin over the state of the art in unsupervised denoising for the cases considered. Visual results are also convincing given the region considered.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- The field of view (FOV) of the PET images considered is rather large (torso) and exhibits a number of structural features common to the CT image. This is in my opinion the most reasonable explanation as to why most gains are achieved with the CT attention mechanism. The paper does not mention the inherent limitations of structural-imaging-based image restoration for functional image denoising. However interesting the approach is in terms of methodology, due to this large FOV and the metrics considered (PSNR/SSIM), it does not provide insights in terms of functional volume restoration, the analysis of which is the sole objective of PET imaging.
- The Poisson noising model is less convincing. Naturally, the authors justify the choice based on the Poissonian nature of the photon noise in the detectors. However, whether the noise follows Poisson statistics in the image domain is debatable (to say the least). The slight performance improvements may be due to the low counts in LDPET, but the authors should mention this limitation.
- Some minor English errors.
- Please rate the clarity and organization of this paper
Very Good
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
OK
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
This is a very interesting paper that demonstrates several novel mechanisms for PET image denoising with denoising diffusion models using coregistered CT information. Extensive comparisons with the state of the art demonstrate the advantage of the choices considered at the global image level using PSNR/SSIM. However, PET is a functional imaging modality, so the approach would benefit from further validation on more local functional regions, and the inherent limitations of such a cross-modal attention mechanism should be discussed further. Moreover, the Poisson noising model is less principled, as PET images generally do not exhibit Poisson noise in the image domain. Conclusions related to this particular contribution should perhaps be softened. Limitations should be discussed more in general, especially those related to validation at the functional region level.
- Please provide image results without interpolation to show the denoising effect more faithfully.
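As an illustration of what validation at the functional region level could look like, the following is a minimal sketch of a region-of-interest (ROI) mean-uptake bias. It is not taken from the paper or the reviews: the function names, the flat-list image representation, and the toy lesion mask are all assumptions made for the example.

```python
from statistics import fmean


def roi_mean(image, mask):
    """Mean intensity over the pixels where mask is truthy (flat sequences)."""
    values = [v for v, m in zip(image, mask) if m]
    if not values:
        raise ValueError("ROI mask selects no pixels")
    return fmean(values)


def roi_relative_bias(enhanced, reference, mask):
    """Signed relative error of the ROI mean: (enhanced - reference) / reference."""
    ref = roi_mean(reference, mask)
    return (roi_mean(enhanced, mask) - ref) / ref


# Toy 1-D "images" (hypothetical values): a hot lesion on a flat background.
reference = [1.0, 1.0, 8.0, 8.0, 1.0, 1.0]
enhanced = [1.0, 1.0, 6.0, 6.0, 1.0, 1.0]    # lesion uptake is underestimated
lesion = [0, 0, 1, 1, 0, 0]                   # hypothetical lesion ROI
background = [1, 1, 0, 0, 1, 1]               # complement of the lesion ROI

lesion_bias = roi_relative_bias(enhanced, reference, lesion)
background_bias = roi_relative_bias(enhanced, reference, background)
```

In this toy case the background is reproduced exactly while the lesion mean is underestimated, which is the kind of local error that a whole-image PSNR or SSIM can understate.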
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
6
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
A very good paper worthy of communication at a venue such as MICCAI with several important methodological contributions to the field of cross modal image denoising. The clinical validation is much less convincing with only global metrics PSNR and SSIM.
- Reviewer confidence
Very confident
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
N/A
- [Post rebuttal] Please justify your decision
N/A
Review #3
- Please describe the contribution of the paper
This paper proposes an algorithm to map low-dose positron emission tomography (LPET) images to standard-dose PET (SPET) images using a Poisson latent diffusion model in an unsupervised manner. Although there is no theoretical study of the Poisson diffusion model, the use of diffusion on latent variables combined with Poisson diffusion offers good results compared to other methods and also decreases the computational cost relative to standard diffusion models.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
Novelty: Poisson diffusion. Although it has not been the subject of theoretical analysis, Poisson diffusion is a novelty for diffusion models which, until now, have only used Gaussian noise.
Unsupervised: The learning requires only unlabeled data of SPET images and not pairs of SPET/LPET.
Robustness: The proposed method is robust to the change of dose in the measurement process and offers good results compared to SOTA algorithms, even supervised ones.
Computational cost: Using diffusion models on latent variables decreases the computational cost compared to diffusion models on PET images.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
As stated by the authors, the paper lacks theoretical study on the Poisson diffusion. The final sample of a Poisson diffusion might not be close to pure noise as stated by the authors in Section 2.3.
PET images and latent variables from those images do not follow a Poisson distribution, and there is no theoretical evidence of any improvement made by a Poisson diffusion over Gaussian diffusion.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
The architecture and dataset are described clearly enough, and the CT cross-attention is explained well enough, to replicate the proposed method. However, as far as I know, the Poisson diffusion (Equation 1) is neither explicit nor referenced. I assume that the training of both autoencoders is done separately and prior to the training of the diffusion model.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
Equation 1 does not provide sufficient information for the Poisson diffusion, and the given reference [20] does not explain what the perturb function does. Even if the measurements are subject to Poisson noise, the image itself is not, and the same may be true for the latent variables. The comparison between LDM and LDM-P in Table 1 could only suggest (instead of proving) that Poisson diffusion is better than Gaussian diffusion for PET images. The CT cross attention is well explained and referenced. For comparison study, a “CT only” experiment would help to understand the impact of the latent Poisson diffusion model in the reconstructed image.
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
5
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The Poisson diffusion is applied to latent variables which do not follow a Poisson distribution. However, the results presented in this paper are competitive with state-of-the-art methods while using an unsupervised method. This result is promising and may lead to future work on the study of Poisson diffusion for inverse problems.
- Reviewer confidence
Confident but not absolutely certain
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
N/A
- [Post rebuttal] Please justify your decision
N/A
Primary Meta-Review
- Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.
The three reviewers acknowledge that this work has merit, but they point to several issues that should be addressed before it can be presented at MICCAI. Two important issues are about reproducibility of the paper, as well as insights into the method and why it should work. Another critical issue is regarding theoretical justification for the Poisson distribution. As PSNR and SSIM do not completely capture the performance, the reviewers are also requesting results on additional metrics.
Author Feedback
Q1: Novelty & reproducibility (R1): First, the introduction of diffusion models to PET enhancement tasks reflects our insight into the current dilemma (lack of paired data) and the future trend (unsupervised learning) in the field. Second, simply applying the original latent diffusion model to PET images cannot obtain satisfactory results. Thus, based on PET imaging theory, we designed targeted improvements, including Poisson diffusion, CT cross-attention, etc. We have made every effort to provide a clear description of the model architecture, training hyperparameters, data preprocessing, and other factors in the manuscript. Additionally, we will further improve these descriptions and include the link to the released code and dataset to ensure consistency with the reproducibility checklist in the final paper.
Q2: Poisson diffusion (R2, R3)
Effectiveness of Poisson diffusion over Gaussian diffusion (R2, R3): In PET imaging, the sinogram (raw data) is affected by Poisson noise due to the instability of signal photons. When the dose is low, the instability of photons increases, resulting in higher Poisson noise in the LPET sinogram than in the SPET sinogram. Because a sinogram and its image are related by the Radon transform, it is common to simulate the corresponding LPET image by applying Poisson noise to an SPET sinogram and then transforming it to an image [20]. We adopt a similar operation in Eq. 1 to produce perturbed samples, where our Poisson noise imposition is actually performed in the projection domain, not the image domain. Therefore, the perturbed samples produced by Poisson diffusion can mimic LPETs derived from SPET better than those produced by Gaussian diffusion. This enables self-supervision using only SPET and enhances the model’s robustness to input LPETs with varying doses (noise levels), as different LPETs have been simulated during the Poisson diffusion. Please also note that under this principle, the effectiveness of Poisson diffusion relies only on the Poisson distribution of the noise in the projection domain, and places no requirements on the image or its latent variables.
Final perturbed sample not close to pure noise (R3): Actually, the final perturbed sample can be close to pure noise when using a sufficient number of diffusion steps. But our goal is to generate SPETs from given LPETs rather than from noise, so we do not need the final perturbed samples to become pure noise. In Section 2.3, we stated this fact and used fewer diffusion steps to accelerate model training. Due to the MICCAI length limit, we did not provide a detailed proof for the mathematical interpretability of Poisson diffusion in the manuscript. Therefore, we mentioned the lack of “theoretical support” for Poisson diffusion in Section 4. We will further improve and supplement the description of Poisson diffusion in the final paper to facilitate better understanding.
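The low-dose simulation described under Q2 (Poisson noise applied to projection-domain counts) can be sketched roughly as follows. This is an illustrative sketch only, not the paper's Eq. 1: the sinogram is assumed to be given as a flat list of expected counts, the Radon transform and reconstruction steps are omitted, and the names `sample_poisson`, `simulate_low_dose`, and `rel_rmse` are hypothetical.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) sample with Knuth's method (suitable for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_low_dose(sinogram, dose_fraction, rng):
    """Scale expected counts by dose_fraction, add Poisson noise, rescale back."""
    return [sample_poisson(c * dose_fraction, rng) / dose_fraction for c in sinogram]


def rel_rmse(noisy, clean):
    """Root of the summed squared error, relative to the clean signal energy."""
    num = sum((n - c) ** 2 for n, c in zip(noisy, clean))
    den = sum(c ** 2 for c in clean)
    return math.sqrt(num / den)


rng = random.Random(0)                   # fixed seed for repeatability
clean_sinogram = [50.0] * 400            # hypothetical expected counts per bin
half_dose = simulate_low_dose(clean_sinogram, 0.5, rng)
tenth_dose = simulate_low_dose(clean_sinogram, 0.1, rng)

# For Poisson counts the relative error scales roughly as 1 / sqrt(mean counts),
# so the lower dose should give the noisier sinogram.
err_half = rel_rmse(half_dose, clean_sinogram)
err_tenth = rel_rmse(tenth_dose, clean_sinogram)
```

Varying `dose_fraction` yields a family of noise levels from a single standard-dose sinogram, which is the property the authors appeal to when arguing for robustness to dose changes.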
Q3: Evaluation Metrics (R1): PSNR and SSIM are commonly used metrics for measuring PET quality in PET enhancement tasks [3] [9] [13] [14] [17] [21] [23]. Thus, using PSNR and SSIM to quantify the results is appropriate. Additionally, our experimental findings are also supported by two other metrics (MSR and rRMSE) in our extra experiments, which will be included in Tables I and II of the final version.
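For readers unfamiliar with the metrics named in Q3, here is a minimal sketch of PSNR and rRMSE on flattened images. The rebuttal does not give the exact definitions used, so two conventions are assumed here: the PSNR peak is the maximum of the reference image, and rRMSE is the RMSE divided by the root-mean-square of the reference. MSR is left out because its definition is not stated.

```python
import math


def mse(pred, ref):
    """Mean squared error between two equal-length flat images."""
    return sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref)


def psnr(pred, ref):
    """PSNR in dB, using the reference maximum as the peak (an assumption)."""
    err = mse(pred, ref)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max(ref) ** 2 / err)


def rrmse(pred, ref):
    """RMSE normalized by the RMS of the reference (one common convention)."""
    ref_rms = math.sqrt(sum(r * r for r in ref) / len(ref))
    return math.sqrt(mse(pred, ref)) / ref_rms


ref = [0.0, 1.0, 2.0, 4.0]     # toy reference (SPET-like) values
pred = [0.0, 1.0, 2.0, 2.0]    # toy enhanced values, one pixel off by 2

score_psnr = psnr(pred, ref)    # 10 * log10(16 / 1)
score_rrmse = rrmse(pred, ref)  # 1 / sqrt(5.25)
```

Both quantities are global averages over the image, which is why they cannot by themselves answer the region-level concerns raised by Reviewer #2.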
Q4: CT assistance to PET (R2): As pointed out, there is a potential issue in conventional CT/MRI-aided PET generation: the overall PET image quality improves while the local target regions may not. To address this, we designed the CT cross-attention to selectively utilize CT information by automatically weighting it in different regions. Direct evaluation of the target region further demonstrates the effectiveness of CT cross-attention in enhancing both overall quality and local region quality. These results and discussions will be included in the final paper.
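The region-dependent weighting of CT information described in Q4 can be illustrated with generic scaled dot-product cross-attention, in which PET latent tokens act as queries and CT latent tokens act as keys and values. This is a simplified sketch under stated assumptions, not the authors' architecture: the learned query/key/value projections are omitted, keys and values are taken to be the same CT tokens, and the token values are invented for the example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(pet_tokens, ct_tokens):
    """Each PET token (query) takes a weighted average of CT tokens (keys = values)."""
    d = len(ct_tokens[0])
    outputs, weights = [], []
    for q in pet_tokens:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in ct_tokens]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(wj * v[i] for wj, v in zip(w, ct_tokens))
                        for i in range(d)])
    return outputs, weights


pet = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical PET latent tokens
ct = [[4.0, 0.0], [0.0, 4.0]]    # hypothetical CT latent tokens
out, attn = cross_attention(pet, ct)
```

Each row of `attn` sums to one, so every PET location decides how much of each CT location to draw on; in this toy case each PET token attends mostly to the CT token it aligns with.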
Post-rebuttal Meta-Reviews
Meta-review # 1 (Primary)
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
The authors have addressed the issues about novelty and reproducibility, as well as the issue regarding the Poisson diffusion assumption. Other issues have also been addressed. I recommend accepting this paper.
Meta-review #2
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
The rebuttal has justified the use of the Poisson diffusion model and addressed other concerns. This is an interesting paper that employs a diffusion model to improve PET image quality while accounting for the specific noise model of the imaging modality.
Meta-review #3
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
Based on a thorough evaluation of the reviews and the feedback from the authors, the Meta Reviewer has carefully considered the comments and recommendations. The majority of the reviewers have expressed a positive inclination towards accepting the paper. Additionally, after the authors provided their rebuttal, two reviewers maintained their positive scores, indicating their satisfaction with the authors’ responses.
However, one reviewer remains adamant about rejecting the paper, primarily citing concerns about the use of SSIM, which is widely employed in the literature, and generalisation, which follows a similar protocol found in other works.
Taking all of this into account, and considering the strong support from the majority of reviewers, the Meta Reviewer suggests accepting the paper. While the concerns raised by the dissenting reviewer are valid, they appear to be addressed by the use of established methodologies and common practices in the field. Thus, the paper can contribute to the existing body of knowledge and merits acceptance for publication.