Authors

Batu Ozturkler, Chao Liu, Benjamin Eckart, Morteza Mardani, Jiaming Song, Jan Kautz

Abstract

Diffusion models have recently gained popularity for accelerated MRI reconstruction due to their high sample quality. They can effectively serve as rich data priors while incorporating the forward model flexibly at inference time, and they have been shown to be more robust than unrolled methods under distribution shifts. However, diffusion models require careful tuning of inference hyperparameters on a validation set and are still sensitive to distribution shifts during testing. To address these challenges, we introduce SURE-based MRI Reconstruction with Diffusion models (SMRD), a method that performs test-time hyperparameter tuning to enhance robustness during testing. SMRD uses Stein’s Unbiased Risk Estimator (SURE) to estimate the mean squared error of the reconstruction during testing. SURE is then used to automatically tune the inference hyperparameters and to set an early stopping criterion without the need for validation tuning. To the best of our knowledge, SMRD is the first to incorporate SURE into the sampling stage of diffusion models for automatic hyperparameter selection. SMRD outperforms diffusion model baselines on various measurement noise levels, acceleration factors, and anatomies, achieving a PSNR improvement of up to 6 dB under measurement noise. The code will be made publicly available.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43898-1_20

SharedIt: https://rdcu.be/dnwAR

Link to the code repository

N/A

Link to the dataset(s)

mridata.org

fastmri.org

Reviews

Review #1

Please describe the contribution of the paper

This paper introduces an automatic hyper-parameter tuning algorithm for accelerated MRI reconstruction with diffusion (score-based) generative models. The method uses an approximation of Stein’s unbiased risk estimate (SURE) to obtain an estimate of the test-time reconstruction loss, without requiring access to the ground truth fully sampled MRI scan. This estimate is used in an algorithm to automate the stopping criterion and step size in alternating minimization Langevin dynamics.
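For readers unfamiliar with the estimator, the following is a minimal sketch of Monte Carlo SURE in its textbook form: additive i.i.d. Gaussian noise of known standard deviation, with the divergence (trace of the Jacobian) estimated from a single random probe. It is not the paper's Eq. (4) or (5). The names `mc_sure` and `demo`, the linear shrinkage "denoiser", and the synthetic sinusoid are all illustrative assumptions.

```python
import numpy as np

def mc_sure(denoiser, y, sigma, eps=1e-3, rng=None):
    """Per-element SURE for y = x + n, n ~ N(0, sigma^2 I).

    SURE = ||f(y) - y||^2 / N - sigma^2 + (2 sigma^2 / N) * div f(y),
    where div f(y) is approximated by mu^T (f(y + eps*mu) - f(y)) / eps
    for a random probe mu ~ N(0, I).
    """
    rng = np.random.default_rng() if rng is None else rng
    n = y.size
    fy = denoiser(y)
    mu = rng.standard_normal(y.shape)
    div = np.sum(mu * (denoiser(y + eps * mu) - fy)) / eps
    return np.sum((fy - y) ** 2) / n - sigma ** 2 + 2 * sigma ** 2 * div / n

def demo(n=100_000, sigma=0.5, shrink=0.8, seed=0):
    """Compare SURE (no ground truth used) with the true MSE on synthetic data."""
    rng = np.random.default_rng(seed)
    x = np.sin(np.linspace(0, 20 * np.pi, n))      # hypothetical clean signal
    y = x + sigma * rng.standard_normal(n)         # noisy observation
    denoiser = lambda v: shrink * v                # toy linear shrinkage
    sure = mc_sure(denoiser, y, sigma, rng=rng)
    mse = np.mean((denoiser(y) - x) ** 2)          # needs x; SURE does not
    return sure, mse

if __name__ == "__main__":
    sure, mse = demo()
    print(f"SURE estimate: {sure:.4f}, true MSE: {mse:.4f}")
```

The estimate tracks the true MSE only when the noise model holds, which is why the Gaussian assumption matters for the method.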
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The idea of combining SURE with hyper-parameter choice for annealed LD in accelerated MRI reconstruction is novel and of high practical interest.
- The experimental results on retrospective MRI reconstruction are convincing, given that the authors use a pretrained diffusion model from a previous publication, and additionally compare against multiple baselines.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- The writing of the paper does not clearly convey the justification or setup for using SURE in the first place, especially the approximation of the residual error as Gaussian, and the need to add artificial Gaussian noise during the test stage.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The paper uses pretrained diffusion model checkpoints and datasets that are publicly available, and the proposed algorithm is clearly explained in pseudo-code to ensure easy reproducibility. The numerical values used are presented in the results section.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
- Section 3.2 could benefit from a major revision and clearer formalism of the SURE setup. For example, it is not clearly stated what the function and input to the “denoiser” are. In regular SURE formalism (e.g., [1]), it is common to explicitly write the denoising function as f(x_noisy). In this case, judging by the way the trace of the Jacobian is computed in the third term of (4), it would seem that the input to the denoiser is x_zf? This does not seem consistent with the formalism in Eq. (10), where the input to h is x_t, not x_zf.
- It is not clear why the approximation in (5) is justified or holds at all. The authors state that (5) is obtained from (4) by assuming that “the reconstruction error is not large”, however the approximation of \sigma reveals that this refers to the reconstruction error between the output of the denoiser and the zero-filled MRI scan. It is apparent that at high acceleration factors this approximation does not hold at all, e.g., as can be seen from Figure 1 itself. Furthermore, it is also clear that this residual is not i.i.d. Gaussian, hence questioning the entire use of SURE from a theoretical point of view. If the authors’ claim about using density compensation is true, it would help to see how it actually impacts results.
- Section 4 states that measurement noise was simulated and added to the MRI data. Why is this necessary? This is highly unusual in accelerated MRI reconstruction, where even fully-sampled scans already have noise in them. A much more convincing result could have been to use publicly available MRI data which have an inherently low SNR (e.g., fat suppressed knees in the fastMRI dataset).
- The authors could consider adding a statistical significance test for their results, given that many open-source libraries for computing confidence intervals are readily available. This would be a simple and computationally cheap way of increasing the quality of the experimental evaluation.
References [1] Metzler, Christopher A., et al. “Unsupervised learning with Stein’s unbiased risk estimator.” arXiv preprint arXiv:1805.10531 (2018).
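As an illustration of the confidence-interval suggestion above, a paired percentile bootstrap over per-slice metric differences is one inexpensive option. The sketch below is an assumption about how such a test could be set up, not something taken from the paper; `paired_bootstrap_ci` is a hypothetical helper and the PSNR values are made up.

```python
import numpy as np

def paired_bootstrap_ci(scores_a, scores_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean paired difference (a - b)."""
    diffs = np.asarray(scores_a, dtype=float) - np.asarray(scores_b, dtype=float)
    rng = np.random.default_rng(seed)
    # Resample slices with replacement; each row is one bootstrap replicate.
    idx = rng.integers(0, diffs.size, size=(n_boot, diffs.size))
    boot_means = diffs[idx].mean(axis=1)
    lo, hi = np.quantile(boot_means, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)

if __name__ == "__main__":
    # Hypothetical per-slice PSNR values (dB) for two methods on the same slices.
    method_a = [33.1, 34.0, 32.5, 35.2, 33.8, 34.4]
    method_b = [31.9, 33.2, 32.0, 33.9, 33.1, 33.0]
    print(paired_bootstrap_ci(method_a, method_b))
```

If the interval excludes zero, the difference is unlikely to be a resampling artifact at the chosen level.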
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

While the paper is interesting and the quantitative and qualitative results are solid, the paper would need a writing revision to make the assumptions and setup more convincing, especially the SURE formalism and the simulation setup.
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #3

Please describe the contribution of the paper

The work proposes the integration of SURE into a diffusion model for MR image reconstruction. The aim is to provide a reconstruction network that is robust under distribution shifts during testing. Hyperparameter fine-tuning during testing is achieved via SURE. Investigations were carried out on multi-coil brain data from fastMRI and multi-coil knee data from mridata.org.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- A very comprehensively written manuscript that details all involved steps
- Addresses an important scientific question for diffusion models
- Compelling results
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- Minor style corrections required
- No comparison to unrolled networks
- Code access information not provided
Please rate the clarity and organization of this paper

Excellent
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

All relevant details and parameters are defined in the manuscript. However, code access and/or documentation about used datasets (e.g. in code repository) are not provided. Reproducibility checklist is correctly answered.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
1. Did the authors investigate the performance under distribution shift for the multi-coil knee data from fastMRI? These observations could be set into comparison to investigations of domain shifts for unrolled networks (DOI: https://doi.org/10.1002/mrm.28827).
2. The results of SMRD are smoother and blurrier than the comparative works. Where does this originate from? Is this from the proposed test-time adapted tuning parameter?
3. Did the authors perform any ablation studies on the settings of window size w, learning rate alpha, etc.?
4. Details about the network size (e.g. number of trainable parameters), computational demand, and training/test time could have been provided.
5. Please include information on how the code can be accessed, and document the datasets used there for reproducibility.
6. Please set Table captions above the Table.
7. Please print the references in the order of their appearance.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

7
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
1. Did the authors investigate the performance under distribution shift for the multi-coil knee data from fastMRI? These observations could be set into comparison to investigations of domain shifts for unrolled networks (DOI: https://doi.org/10.1002/mrm.28827).
2. The results of SMRD are smoother and blurrier than the comparative works. Where does this originate from? Is this from the proposed test-time adapted tuning parameter?
3. Did the authors perform any ablation studies on the settings of window size w, learning rate alpha, etc.?
4. Details about the network size (e.g. number of trainable parameters), computational demand, and training/test time could have been provided.
5. Please include information on how the code can be accessed, and document the datasets used there for reproducibility.
6. Please set Table captions above the Table.
7. Please print the references in the order of their appearance.
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #2

Please describe the contribution of the paper

A test-time hyperparameter tuning algorithm is proposed. It uses SURE as a surrogate loss function for the MSE and incorporates it into the sampling stage. SMRD achieves state-of-the-art performance across different noise levels, acceleration rates, and anatomies.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

As a novel formulation, SMRD does not require ground truth data from the target distribution for tuning as it uses SURE for estimating the true MSE. The proposed method is robust to distribution shift. Strong evaluation is provided to validate the proposed method.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

High-frequency signals are lost in Fig. 2. Image details are blurred by the proposed method.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Reproducibility is acceptable.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

It would be better to provide difference maps in the results so that a clear comparison can be made.
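A difference map of this kind can be as simple as the absolute error between a reconstruction and the fully sampled reference, reported alongside PSNR. The sketch below assumes magnitude images and takes the peak value from the reference; PSNR conventions differ between papers, so this is not necessarily the definition used by the authors. The helper names are hypothetical.

```python
import numpy as np

def difference_map(recon, reference):
    """Pixel-wise absolute error between magnitude images."""
    return np.abs(np.abs(recon) - np.abs(reference))

def psnr(recon, reference):
    """PSNR in dB, with the peak taken as the maximum of the reference magnitude."""
    ref_mag = np.abs(reference)
    mse = np.mean((np.abs(recon) - ref_mag) ** 2)
    return float(10.0 * np.log10(ref_mag.max() ** 2 / mse))

if __name__ == "__main__":
    reference = np.array([[0.0, 1.0], [1.0, 0.0]])
    recon = reference + 0.1          # uniform error of 0.1 for illustration
    print(difference_map(recon, reference))
    print(psnr(recon, reference))
```

Scaling the difference map by a fixed factor before display is a common way to make small residuals visible.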
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

6
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The major factors include the image quality of the proposed method and the novelty of the paper.
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

This paper introduces a novel algorithm for automatic hyper-parameter tuning in accelerated MRI reconstruction using diffusion generative models. The algorithm utilizes an approximation of Stein’s unbiased risk estimate (SURE) to estimate the reconstruction loss during test-time, eliminating the need for access to the fully sampled ground truth MRI scan. By leveraging this estimate, the algorithm automates the determination of the stopping criterion and step size in the alternating minimization Langevin dynamics, enhancing the efficiency and effectiveness of the reconstruction process.
All reviewers agreed on an overall high ranking for the paper. Yet, some raised concerns about the over-smoothness of the reconstructed images and the evaluation with very high undersampling factors. In addition, assessing the clinical impact on pathological regions using the fastMRI++ dataset is crucial to determine the actual value of the method. Please address the comments by the reviewers in the final submission.

Author Feedback

We thank the reviewers for their detailed reviews, the positive comments and the feedback, as well as valuable suggestions. Our responses are detailed below:

SURE formalism: Thank you for the detailed comments. Eq. 4 indicates the SURE formalism for a general “denoiser”, where the input is x_zf. In Eq. 10, the input of the “denoiser” for SURE calculation is x_zf as in Eq. 4. However, as Eq. 10 shows the calculation at time step t, the connection to x_zf is less apparent. In the final version, we will revise and clarify the SURE setup to match regular SURE formalism.

Comparison with unrolled networks: In the paper, we have compared our method with different diffusion model-based approaches. Although we have not directly compared to an unrolled network, previous studies indicate that diffusion model-based approaches including the csgm-langevin baseline used in the paper are more robust under domain shifts compared to unrolled networks [1,2].

Network size and computation details: In SMRD, we used a pre-trained score-based network at inference time from [1]. As a result, the network did not have any trainable parameters at inference time. The computational demand at inference, in terms of inference time and memory, is provided in Table 2 and Table 3 in the supplementary material.

High undersampling factors: We thank the reviewers for the comment. In our experiments, we aimed to use a wide range of acceleration rates with R = {4,8} for fastMRI, and R = {12,16} for Mridata. In doing so, we aimed to use acceleration rates applied by previous methods using the fastMRI and Mridata datasets [3,4].

Simulated noise: Thank you for the valuable suggestion. Exploring real, clinical settings that have inherently low SNR such as fat suppressed knees is an important experiment that will be investigated as part of future work. The aim of applying simulated measurement noise in our experiment is to have the ability to test reconstruction methods at multiple known noise levels as done in previous studies investigating robustness to acquisition-driven perturbations [5].
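One way to simulate measurement noise at a known level is to add complex Gaussian noise to the acquired k-space samples. The sketch below is an assumption about such a setup, not the authors' code: the function name is hypothetical, and the convention that `sigma` is the standard deviation of the complex noise (so each of the real and imaginary parts has variance `sigma**2 / 2`) is chosen only for illustration.

```python
import numpy as np

def add_kspace_noise(kspace, sigma, mask=None, seed=0):
    """Add complex Gaussian noise with E|n|^2 = sigma^2 to k-space samples.

    If a sampling mask is given, unsampled locations are zeroed afterwards,
    so noise is only present where data were acquired.
    """
    rng = np.random.default_rng(seed)
    noise = (sigma / np.sqrt(2)) * (
        rng.standard_normal(kspace.shape) + 1j * rng.standard_normal(kspace.shape)
    )
    noisy = kspace + noise
    if mask is not None:
        noisy = noisy * mask
    return noisy

if __name__ == "__main__":
    kspace = np.zeros((64, 64), dtype=complex)   # placeholder k-space
    mask = np.zeros((64, 64))
    mask[:, ::4] = 1                             # keep every 4th column
    noisy = add_kspace_noise(kspace, sigma=0.5, mask=mask)
    print(np.mean(np.abs(noisy[:, ::4]) ** 2))   # close to sigma^2 = 0.25
```

Sweeping `sigma` over several values gives the kind of controlled robustness test described in the response above.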

Over-smoothness of reconstructed images and hyperparameter ablation: The smoothing of reconstructed images might be attributed to the hyperparameter selection for SMRD. As the window size increases, the early stopping point is expected to shift to stop at a later iteration. This would result in improved image fidelity and less smoothing, with a higher risk of observing artifacts such as noise or hallucinations.
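The effect of the window size on the stopping point can be illustrated with a simple patience-style rule: stop once the monitored SURE value has not improved for `window` consecutive iterations. This rule is a stand-in chosen for illustration and is not claimed to be the exact criterion used in SMRD; the function name and the synthetic loss trace are assumptions.

```python
def early_stop_iteration(sure_trace, window):
    """Return the iteration at which sampling would stop.

    Stops at the first iteration that is `window` steps past the best
    (lowest) value seen so far; otherwise runs to the end of the trace.
    """
    best_value = float("inf")
    best_iter = -1
    for t, value in enumerate(sure_trace):
        if value < best_value:
            best_value, best_iter = value, t
        elif t - best_iter >= window:
            return t
    return len(sure_trace) - 1

if __name__ == "__main__":
    # Synthetic trace that decreases, reaches its minimum at t = 10, then rises.
    trace = [(t - 10) ** 2 for t in range(40)]
    print(early_stop_iteration(trace, window=3))   # 13
    print(early_stop_iteration(trace, window=8))   # 18
```

On this trace the larger window stops five iterations later, consistent with the trade-off described above between reduced smoothing and a higher risk of artifacts.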

Style corrections: Thank you for the comment, we will include style corrections in the final version.

[1] Jalal, A., Arvinte, M., Daras, G., Price, E., Dimakis, A.G., Tamir, J.I.: Robust compressed sensing MRI with deep generative priors. Advances in Neural Information Processing Systems (2021)

[2] Chung, H., Ye, J.C.: Score-based diffusion models for accelerated MRI. Medical Image Analysis 80, 102479 (2022)

[3] Zbontar, J., Knoll, F., Sriram, A., Murrell, T., Huang, Z., Muckley, M.J., Defazio, A., Stern, R., Johnson, P., Bruno, M., et al.: fastMRI: An open dataset and benchmarks for accelerated MRI. arXiv preprint arXiv:1811.08839 (2018)

[4] Gunel, B., Sahiner, A., Desai, A.D., Chaudhari, A.S., Vasanawala, S., Pilanci, M., Pauly, J.: Scale-equivariant unrolled neural networks for data-efficient accelerated MRI reconstruction. In: Medical Image Computing and Computer Assisted Intervention–MICCAI 2022: 25th International Conference, Singapore, September 18–22, 2022, Proceedings, Part VI, pp. 737-747 (2022)

[5] Desai, A.D., Gunel, B., Ozturkler, B.M., Beg, H., Vasanawala, S., Hargreaves, B., Ré, C., Pauly, J.M., Chaudhari, A.S.: Vortex: Physics-driven data augmentations for consistency training for robust accelerated MRI reconstruction. arXiv (2021)


SMRD: SURE-based Robust MRI Reconstruction with Diffusion Models