Authors

Jun Ma, Yuanzhi Zhu, Chenyu You, Bo Wang

Abstract

Deep learning-based medical image enhancement methods (e.g., denoising and super-resolution) mainly rely on paired data and correspondingly the well-trained models can only handle one type of task. In this paper, we address the limitation with a diffusion model-based framework that mitigates the requirement of paired data and can simultaneously handle multiple enhancement tasks by one pre-trained diffusion model without fine-tuning. Experiments on low-dose CT and heart MR datasets demonstrate that the proposed method is versatile and robust for image denoising and super-resolution. We believe our work constitutes a practical and versatile solution to scalable and generalizable image enhancement.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43898-1_1

SharedIt: https://rdcu.be/dnwAy

Link to the code repository

https://github.com/bowang-lab/DPM-MedImgEnhance

Link to the dataset(s)

CT: https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=52758026

ACDC: https://www.creatis.insa-lyon.fr/Challenge/acdc/

M&Ms-1: https://www.ub.edu/mnms/

M&Ms-2: https://www.ub.edu/mnms-2/

CMRxMotion: http://cmr.miccai.cloud/

Reviews

Review #1

Please describe the contribution of the paper

This paper presents a method for medical image enhancement based on diffusion models. The method does not require paired data for training and only trains a diffusion model with high-quality images. To perform image enhancement, the method combines the original sampling in the diffusion model with a data consistency step that projects the denoised image onto the subspace of the observed data.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. The method does not require paired data for training and the trained diffusion model can be used for different degradation models.
2. The proposed method shows significant improvement over the other methods.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
1. The novelty of the method is limited. The essence of the proposed method is very similar to DPS, which also combines a diffusion model with data consistency for inverse problems. DPS uses a gradient step for data consistency, while the proposed method uses a projection step.
2. It is unclear how algorithm 1 can perform denoising. (see detailed comments)
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The datasets used in this paper are public, and the authors plan to open-source their implementation after review.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
1. Could the authors discuss how the proposed method differs from other diffusion models for inverse problems (except for the use of medical image data)?
2. Could the authors explain in more detail how Algorithm 1 works for the denoising task? The authors mention that the degradation operator for the denoising task is the identity matrix I. In this case, line 4 of Algorithm 1 becomes \hat{x}_{0|t} = x_{0|t} - H^\dag(H x_{0|t} - y) = x_{0|t} - (x_{0|t} - y) = y, which is constant. As a result, the diffusion model \epsilon_\theta in line 3 is effectively not used. The authors should provide more details on how denoising is performed.
3. For the baseline diffusion models, did the authors use the pretrained models or re-train the model on the new datasets?
4. In Fig. 3, the SR results of the proposed method look smoother than the results of IVLR and DPS. Could the authors comment on this?
5. There is a typo in Algorithm 1, line 5. Parentheses are missing from the second term.
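The concern in comment 2 can be checked numerically. The following is a minimal sketch, not the authors' implementation: the function name `project`, the vector sizes, and the toy averaging operator `H_avg` are illustrative assumptions. It applies the data-consistency step once with H = I and once with a simple downsampling operator.

```python
import numpy as np

def project(x0_t, y, H):
    """Data-consistency step: x0_t - H^+ (H x0_t - y), with H^+ the pseudo-inverse."""
    H_pinv = np.linalg.pinv(H)
    return x0_t - H_pinv @ (H @ x0_t - y)

rng = np.random.default_rng(0)
n = 6
y = rng.normal(size=n)       # noisy observation (denoising case)
x0_t = rng.normal(size=n)    # stand-in for the diffusion model's clean estimate

# Denoising with H = I: the projection discards x0_t and returns y.
x_hat_identity = project(x0_t, y, np.eye(n))

# Toy downsampling H (hypothetical): average each pair of neighbouring entries.
H_avg = np.kron(np.eye(3), np.full((1, 2), 0.5))   # shape (3, 6)
y_lr = H_avg @ rng.normal(size=n)
x_hat_sr = project(x0_t, y_lr, H_avg)
```

With the identity operator the output equals y regardless of `x0_t`, which is the reviewer's point. With the averaging operator the output matches `y_lr` after downsampling but still keeps the part of `x0_t` that H cannot see.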
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

3
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper presents a diffusion model-based method for medical image enhancement. The method is flexible in that it doesn’t require paired data for training. However, the novelty of the proposed method is limited, and the efficacy of Algorithm 1 for denoising needs to be justified.
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

5
[Post rebuttal] Please justify your decision

The authors have properly addressed the concern raised by reviewers.

Review #2

Please describe the contribution of the paper

The authors present a diffusion model-based framework without the requirement of paired data that can deal with multiple image enhancement tasks by one pre-trained diffusion model. This framework is implemented in comparison with the state-of-the-art methods on low-dose CT and heart MR datasets, indicating its versatility and robustness for image denoising and super-resolution.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

1) An unsupervised plug-and-play framework is constructed by integrating diffusion model and general image restoration. 2) The proposed framework eliminates the requirement of paired data. 3) The pre-trained diffusion model can simultaneously deal with multiple tasks of medical image enhancement.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

1) The authors need to explain how to obtain the degradation operator H for medical image enhancement (e.g., super-resolution), which ensures that it represents the modeling of observed images accurately in practice. 2) The authors need to provide the detailed explanation on how to produce the low-resolution images and the ground-truth images, as well as avoid dataset shift between training data and testing data. 3) The authors need to provide parameter settings of medical image acquisition, and give in-depth analysis on method robustness and computational complexity with respect to different imaging modality, parameters and sites.
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The results of this paper may be reproducible.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

1) The authors need to provide the convergence condition of the proposed algorithm. 2) The authors need to explain how the proposed model deals with the diversity, relevance and complementarity between multiple image enhancement tasks. 3) The authors should discuss the generalization ability of the proposed method with respect to different imaging parameters, scanners and sites. 4) The authors should provide subsequent image analysis (e.g., segmentation and classification) in the Experiments section to further verify the performance of the proposed image enhancement method. 5) The computational complexity of the proposed method should be provided in comparison with baseline methods. 6) There are some typos and grammar errors, such as “ levels(i.e., ” and “opanai’s”.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

4
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The authors did not provide clear descriptions on the proposed method and enough experimental data for method validation.
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

5
[Post rebuttal] Please justify your decision

The authors have largely answered the questions raised by the reviewer.

Review #3

Please describe the contribution of the paper

This paper presents an innovative approach to medical image enhancement by utilizing a pre-trained Denoising Diffusion Probabilistic Model (DDPM) within a plug-and-play (PnP) framework. The proposed method treats the DDPM model as a denoiser or refiner and seamlessly integrates the refined intermediate results into the PnP framework. This approach represents a promising attempt to apply the DDPM model for medical image enhancement without the need for paired data, which is often difficult to obtain in the medical field.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

1 This paper introduces the Denoising Diffusion Probabilistic Model (DDPM), a state-of-the-art method in image processing. 2 The proposed approach treats the DDPM model as a refiner and integrates it with the plug-and-play (PnP) framework, enabling image enhancement without the need for paired data.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

1 The experiments conducted in this study utilized CT data with a size of 256x256, which may be considered less attractive due to its relatively small size. 2 The comparison methods in the study did not include other state-of-the-art techniques, such as GAN-based methods, which could have provided a more comprehensive evaluation.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Highly reproducible
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
This paper presents an innovative approach to medical image enhancement by utilizing a pre-trained Denoising Diffusion Probabilistic Model (DDPM) within a plug-and-play (PnP) framework. The proposed method treats the DDPM model as a denoiser or refiner and seamlessly integrates the refined intermediate results into the PnP framework. This approach represents a promising attempt to apply the DDPM model for medical image enhancement without the need for paired data, which is often difficult to obtain in the medical field. The following are my concerns for improving this paper. 1 The experiments conducted in this study utilized CT data with a size of 256x256, which may be considered less attractive due to its relatively small size. 2 The comparison methods in the study did not include other state-of-the-art techniques, such as GAN-based methods, which could have provided a more comprehensive evaluation.
1. There is an inconsistency between Algorithm 1 and the text, as the text mentions that sampling is implemented with the Denoising Diffusion Implicit Model (DDIM). Additionally, there seems to be an error regarding the timesteps; setting them to 100 does not imply that the range of t should be set to [1, 100].
2. There should be a brief description of the comparison methods IVLR and DPS.
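The timestep remark in comment 1 can be illustrated with a short sketch. It assumes a model trained with 1000 diffusion steps and a uniform stride, neither of which is stated in this review; `ddim_timesteps` is a hypothetical helper, and the training length is assumed to be an integer multiple of the number of sampling steps.

```python
import numpy as np

def ddim_timesteps(num_train_steps, num_sample_steps):
    """Pick an evenly spaced subsequence of the training timesteps, in descending order."""
    stride = num_train_steps // num_sample_steps
    return np.arange(0, num_train_steps, stride)[::-1]

# 100 sampling steps drawn from an assumed 1000-step training schedule.
steps = ddim_timesteps(1000, 100)
```

Under these assumptions the 100 sampling steps span the whole training range rather than the interval [1, 100].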
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

6
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

This paper leverages the recent state-of-the-art (SOTA) method and incorporates it with the traditional plug-and-play (PnP) framework, addressing the clinical challenge of acquiring paired data.
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.
The paper presents a Denoising Diffusion Probabilistic Model (DDPM) based medical image enhancement method that utilizes a pre-trained diffusion model without the need for paired data. The proposed framework combines the original sampling in the diffusion model with a data consistency step to project the denoised image to the observed data subspace. The method can handle various image enhancement tasks by a single pre-trained diffusion model and show its versatility and robustness for image denoising and super-resolution.
While this paper has certain merits, several important weaknesses dampen the enthusiasm, including:
1. The comparison methods did not include other state-of-the-art techniques, such as GAN-based methods.
2. Inconsistency between Algorithm 1 and the text, and error in setting timesteps.
3. Lack of explanation on how to obtain degradation operator H for medical image enhancement, and detailed explanation on producing low-resolution and ground-truth images.
Overall, the proposed method’s efficacy for denoising needs to be justified, and the authors should provide more details on how denoising is performed. The generalization ability of the proposed method should be discussed with respect to different imaging parameters, scanners, and sites, and subsequent image analysis should be provided to verify the method’s performance. Finally, the computational complexity of the proposed method should be provided in comparison with baseline methods, and typos and grammar errors should be corrected.

Author Feedback

We sincerely appreciate the valuable feedback provided by the reviewers. We would like to address their concerns as follows.

Q1: Comparison to DPS. (R1&Meta-R) Our motivation differs from that of DPS. Specifically, we focus on leveraging the denoising capabilities of pre-trained diffusion models and incorporating them into the classical denoise-and-projection method. DPS, on the other hand, aims to approximate the intractable log-likelihood. We have included a comprehensive comparison between our method and DPS. The results demonstrate that our method outperforms DPS across all the conducted experiments. Furthermore, our method is approximately 10 times faster than DPS, because DPS requires the computation of the gradient of the log-likelihood, which can be a time-consuming step.

Q2: Handling denoising tasks. (R1-2&Meta-R) Diffusion models are trained to predict the noise between clean and noisy versions of images, making them effective as denoisers. For tasks that necessitate noise removal, we followed the practice in DDNM [27] and scaled the projection difference A^{\dag}(Ax_{0|t}-y) to balance the information from the measurement y and the denoising output x_{0|t}, thereby avoiding the situation mentioned by R1. The generalized algorithm is updated accordingly.
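The effect of scaling the projection difference can be sketched for the denoising case, where the degradation operator is the identity. This is an illustration only: the rebuttal does not give the scaling schedule, so the fixed scalar `scale` and the function name `scaled_projection` are assumptions rather than the authors' or DDNM's actual rule.

```python
import numpy as np

def scaled_projection(x0_t, y, scale):
    """Denoising case (identity operator): x0_t - scale * (x0_t - y)."""
    # scale = 1 reproduces the unscaled projection and returns y;
    # scale = 0 ignores the measurement and returns the model output x0_t.
    return x0_t - scale * (x0_t - y)

rng = np.random.default_rng(0)
y = rng.normal(size=4)       # noisy measurement
x0_t = rng.normal(size=4)    # stand-in for the diffusion model's clean estimate

x_full = scaled_projection(x0_t, y, 1.0)
x_half = scaled_projection(x0_t, y, 0.5)
x_none = scaled_projection(x0_t, y, 0.0)
```

Any scale strictly between 0 and 1 yields a blend of the measurement and the denoised estimate, so the network output is no longer discarded.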

Q3: Experiments. (R1-3&Meta-R) For all experiments, we re-trained the diffusion models on the medical image datasets. These models will be shared with the community and can be used for a range of applications, including image generation, denoising, and super-resolution. We acknowledge that conducting experiments on 512x512 images would be more attractive, but training diffusion models at such a high resolution is still challenging. Although one can use a latent diffusion model to generate high-resolution images, the original imaging model y=Hx+n is not applicable in latent space. Our purpose is to develop a versatile framework that can use one pre-trained diffusion model to handle multiple medical image degradation tasks, while GAN-based methods cannot handle the pure denoising task. Thus, we didn’t include them in the comparison. DPS could obtain crisper boundaries because it computes the gradient of the log-likelihood, while our method better preserves the anatomical structure and is 10x faster. In super-resolution tasks, H is the downsampling operator, which was implemented with torch.nn.AdaptiveAvgPool2d. We applied this operator to the original images to obtain low-resolution images.
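The downsampling operator and its use in the projection step can be sketched as follows. This is a NumPy stand-in rather than the authors' PyTorch code: it assumes the image side is an integer multiple of the output size, in which case adaptive average pooling reduces to non-overlapping block averaging. The helper names `avg_pool_down` and `replicate_up`, the 8x8 image size, and the factor 2 are illustrative assumptions.

```python
import numpy as np

def avg_pool_down(x, s):
    """H: average non-overlapping s-by-s blocks (image sides must be divisible by s)."""
    h, w = x.shape
    return x.reshape(h // s, s, w // s, s).mean(axis=(1, 3))

def replicate_up(y, s):
    """Pseudo-inverse of block averaging: copy each low-res pixel into an s-by-s block."""
    return np.repeat(np.repeat(y, s, axis=0), s, axis=1)

rng = np.random.default_rng(0)
x_hr = rng.random((8, 8))        # stand-in for an original high-resolution image
y_lr = avg_pool_down(x_hr, 2)    # simulated low-resolution observation y = Hx

x0_t = rng.random((8, 8))        # stand-in for the diffusion model's clean estimate
# Projection step: x0_t - H^+ (H x0_t - y)
x_hat = x0_t - replicate_up(avg_pool_down(x0_t, 2) - y_lr, 2)
```

After the projection, downsampling `x_hat` reproduces `y_lr`, while the detail within each block still comes from `x0_t`.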

Q4: Computational complexity. (R1-2&Meta-R) Diffusion models are slower than DIP because they need to evaluate the model multiple times during sampling. Nevertheless, our algorithm is faster than other diffusion model-based methods, e.g., DPS.

Q5: Generalization ability. (R2&Meta-R) We agree that medical images are very diverse because of different imaging protocols. To mimic the real-world setting, the diffusion models were trained on a diverse dataset, including images from different centers and scanners. The testing set (e.g., MR images) is from a new medical center that has not appeared in the training set. Experiments show that our model can generalize to these unseen images.

Q6: Downstream evaluation. (R2) We acknowledge that subsequent image analysis (e.g., segmentation and classification) can further enhance the evaluation. Since our primary goal is to validate the versatility of pre-trained diffusion models for various medical image enhancement tasks, we leave the downstream evaluations as near-future work.

Q7: Convergence condition. (R2) Our algorithm follows the classical plug-and-play image enhancement framework. However, the rigorous proof of its global convergence with arbitrary denoisers is still an open question. Nevertheless, our empirical experiments show that the algorithm is very stable for different image modalities and enhancement tasks.

Q8: Writing. We have added the description of the compared methods and all the typos have been fixed.

Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

This paper presents a “Plug-and-Play” medical image enhancement method based on the Denoising Diffusion Probabilistic Model (DDPM), utilizing a pre-trained diffusion model. The proposed method has certain merits, including its versatility and robustness for image denoising and super-resolution tasks.

The rebuttal has addressed the majority of the reviewers’ concerns and clarified the methodology and experimental details. The novelty is still limited, but the “plug-and-play” idea has merit.

Meta-review #2

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

Based on the authors’ feedback, all the major concerns are addressed in the rebuttal. I recommend acceptance. The paper presents an interesting idea.

Meta-review #3

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The rebuttal didn’t address the reviewers’ concerns. The majority of reviewers decided to reject this paper.

Pre-trained Diffusion Models for Plug-and-Play Medical Image Enhancement