Paper Info

Authors

Ye Mao, Lan Jiang, Xi Chen, Chao Li

Abstract

Multi-contrast magnetic resonance imaging (MRI) is the most common management tool used to characterize neurological disorders based on brain tissue contrasts. However, acquiring high-resolution MRI scans is time-consuming and infeasible under specific conditions. Hence, multi-contrast super-resolution methods have been developed to improve the quality of low-resolution contrasts by leveraging complementary information from multi-contrast MRI. Current deep learning-based super-resolution methods have limitations in estimating restoration uncertainty and avoiding mode collapse. Although the diffusion model has emerged as a promising approach for image enhancement, capturing complex interactions between multiple conditions introduced by multi-contrast MRI super-resolution remains a challenge for clinical applications. In this paper, we propose a disentangled conditional diffusion model, DisC-Diff, for multi-contrast brain MRI super-resolution. It utilizes the sampling-based generation and simple objective function of diffusion models to estimate uncertainty in restorations effectively and ensure a stable optimization process. Moreover, DisC-Diff leverages a disentangled multi-stream network to fully exploit complementary information from multi-contrast MRI, improving model interpretation under multiple conditions of multi-contrast inputs. We validated the effectiveness of DisC-Diff on two datasets: the IXI dataset, which contains 578 normal brains, and a clinical dataset with 316 pathological brains. Our experimental results demonstrate that DisC-Diff outperforms other state-of-the-art methods both quantitatively and visually.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43999-5_37

SharedIt: https://rdcu.be/dnwwQ

Link to the code repository

https://github.com/Yebulabula/DisC-Diff

Link to the dataset(s)

https://drive.google.com/drive/folders/1i2nj-xnv0zBRC-jOtu079Owav12WIpDE

Reviews

Review #2

Please describe the contribution of the paper

This paper proposes a disentangled conditional diffusion model, DisC-Diff, for multi-contrast brain MRI super-resolution. It utilizes the sampling-based generation and simple objective function of diffusion models to estimate uncertainty in restorations effectively and ensure a stable optimization process. The experimental results also show the effectiveness of the proposed method.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The authors adopt a U-shaped multi-stream network composed of multiple encoders, enhanced by disentangled representation learning across MRI contrasts, for reconstructing SR images.

This is the first work to introduce an entropy-inspired curriculum learning strategy for training diffusion models, which significantly reduces the impact of varied anatomical complexity on model convergence.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

My most critical concern is the review of related work. The paper omits much of the existing MR imaging literature; I suggest that the authors review some of it in the introduction. My second critical concern is the baselines. The authors could include more related baselines, e.g., other multi-contrast MRI super-resolution methods.
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The authors do not describe how the LR images are obtained. In my opinion, LR MRI images are very different from LR natural images; please see "Multi-contrast MRI super-resolution via a multi-stage integration network", MICCAI 2021.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

The authors could include more related baselines, e.g., the multi-contrast MRI super-resolution method of Feng C. M., Yan Y., et al., "Exploring Separable Attention for Multi-Contrast MR Image Super-Resolution", arXiv preprint arXiv:2109.01664, 2021.

The authors should use the largest public dataset, fastMRI, to demonstrate the effectiveness of the proposed method.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The authors design an effective method based on the diffusion model that improves super-resolution performance.
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #1

Please describe the contribution of the paper

In this work, the authors aim to perform 2D super-resolution of in-plane magnetic resonance (MR) slices. The proposed method is a conditional denoising diffusion probabilistic model (DDPM) with a few novelties: 1) disentangling with multi-contrast; 2) a novel disentanglement loss function; 3) the application of the Charbonnier loss; and 4) a curriculum learning strategy to ease learning. The authors train two independent models on two simulated LR-HR paired datasets: IXI and an in-house clinical set. In both cases, since the model is 2D, only mid-axial slices are selected for training, validation, and testing, and the simulation of LR data is by k-space truncation, which is an appropriate forward model for in-plane MR. The work is novel and the results are convincing, so this paper merits an accept.
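The k-space truncation mentioned in this summary can be illustrated with a minimal NumPy sketch. It is not the authors' pipeline: the function name `kspace_truncate`, the choice to crop (rather than zero-fill) the outer frequencies, and the intensity rescaling are all assumptions made for illustration.

```python
import numpy as np

def kspace_truncate(hr_slice, scale=2):
    """Simulate an LR slice by keeping only the central (low-frequency) k-space block.

    Assumption: the image height and width are divisible by `scale`.
    """
    h, w = hr_slice.shape
    # Centred 2D Fourier transform of the HR slice.
    k = np.fft.fftshift(np.fft.fft2(hr_slice))
    # Keep the central (h/scale) x (w/scale) block of low frequencies.
    kh, kw = h // scale, w // scale
    top, left = (h - kh) // 2, (w - kw) // 2
    k_low = k[top:top + kh, left:left + kw]
    # Inverse transform on the smaller grid; dividing by scale**2 keeps
    # the mean intensity of the LR slice equal to that of the HR slice.
    lr = np.fft.ifft2(np.fft.ifftshift(k_low)) / (scale ** 2)
    return np.abs(lr)

# Usage: a 224x224 slice becomes a 56x56 slice at 4x downsampling.
lr_example = kspace_truncate(np.random.default_rng(0).random((224, 224)), scale=4)
```

Unlike bicubic downsampling, this discards high spatial frequencies directly in the acquisition domain, which is why the reviewer calls it an appropriate forward model for in-plane MR.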
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

This paper’s main strength is the quality of its results. The paper is also well-motivated and well-written.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
Major concerns:
- None
Minor concerns:
- Claim “compressed sensing and sparse regularization cannot restore high-freq details” – citation? In the theory, if a signal is K-sparse, then compressed sensing recovers it exactly. Making this bold of a claim weakens the otherwise strong work.
- One model must be trained for each input domain; the authors provide a separate model for IXI and for the clinical data. How well do these models generalize? Does the IXI model generalize to the clinical data or vice-versa? To apply the model in real scenarios, how much paired data would be required? Furthermore, must a new model be trained for each resolution scale?
- Motivation for improving in-plane resolution is weak. In practice, when are higher in-plane resolutions desired?
- The authors make multiple claims of “uncertainty estimation”. However, visualizing the mean and variance (since the process is stochastic) is not formally uncertainty estimation. Either citations showing that DDPMs formally enable uncertainty estimation are needed, or further evidence is needed, or the phrasing should be changed.
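For context on the last concern, the mean and variance visualization the reviewer describes amounts to drawing several stochastic outputs for the same input and taking per-pixel statistics. The sketch below assumes a hypothetical `sampler(lr_input, rng)` callable; `toy_sampler` is only a stand-in that adds Gaussian noise and is not a diffusion model.

```python
import numpy as np

def sample_statistics(sampler, lr_input, n_samples=8, seed=0):
    """Per-pixel mean and variance over repeated stochastic restorations."""
    rng = np.random.default_rng(seed)
    samples = np.stack([sampler(lr_input, rng) for _ in range(n_samples)])
    return samples.mean(axis=0), samples.var(axis=0)

def toy_sampler(lr_input, rng):
    # Hypothetical stand-in for a reverse-diffusion sampler: input plus noise.
    return lr_input + 0.05 * rng.standard_normal(lr_input.shape)

mean_img, var_img = sample_statistics(toy_sampler, np.zeros((4, 4)))
```

As the reviewer notes, the resulting variance map describes the spread of the sampler's outputs; whether it is a calibrated uncertainty estimate is a separate question that this computation alone does not answer.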
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

This work is partially reproducible. One experiment is on private clinical data, and no code is available, yet the method is described well.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

Please see the minor concerns listed above.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

6
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Please see summary and minor concerns listed above.
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #3

Please describe the contribution of the paper

This paper presents a diffusion model for multi-contrast brain MRI super-resolution. The authors employ an existing conditional diffusion model for the diffusion process and develop a U-Net-based architecture to learn the reverse distribution. The U-Net features three branches to separate representations from different contrasts. The method assumes that various MRI contrasts contain both shared and independent information. The network divides the representation into shared and independent components. A novel disentanglement loss encourages shared representations to be similar and independent representations to be dissimilar. Shared representations from each modality are merged using the Squeeze-and-Excitation module to obtain a single shared feature. Finally, all independent features and the single shared feature are concatenated and forwarded to the decoder module. The model is trained using reconstruction and disentanglement loss within a diffusion model framework.
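The idea of pulling shared representations together while pushing independent ones apart can be sketched as follows. This is an illustrative assumption, not the loss defined in the paper: the cosine-similarity form, the function names, and the equal weighting of the two terms are all hypothetical.

```python
import numpy as np

def cosine(a, b, eps=1e-8):
    """Cosine similarity between two feature arrays, flattened to vectors."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def disentangle_penalty(shared, independent):
    """shared, independent: lists of feature arrays, one entry per contrast."""
    n = len(shared)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    # Shared parts should agree across contrasts: penalise low similarity.
    pull = sum(1.0 - cosine(shared[i], shared[j]) for i, j in pairs)
    # Independent parts should differ: penalise any non-zero similarity.
    push = sum(abs(cosine(independent[i], independent[j])) for i, j in pairs)
    return (pull + push) / len(pairs)
```

Under this sketch the penalty approaches zero when the shared features coincide and the independent features are orthogonal, which matches the behaviour described in the review.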
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

1) The paper is well-structured and easy to follow.

2) The proposed model outperforms the existing methods on the benchmark datasets.

3) The ablation study demonstrates that the proposed architecture components and loss functions contribute positively to the quality of the generated images.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- The proposed method has the potential to generate slightly varying outputs through sampling, which may not be optimal for clinical settings. The authors attempt to utilize this for uncertainty estimation, which could serve as a valuable indicator for error estimation. However, this aspect should be incorporated into the discussion.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The authors shared the source code as supplementary material.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

1) How can we ensure that the generated high-resolution MRI maintains a similar structure to the ground truth, given that the diffusion process’s stochastic nature can introduce new details or remove existing structures?

2) What is the runtime for processing a single MRI scan, and how does the speed of the model compare to other methods?

3) The paper lacks details about data pre-processing and intensity normalization. Have any standardization techniques been applied to the scans before training?
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

6
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The proposed method incorporates innovative elements that enhance the quality of the generated images. The study provides empirical evidence demonstrating the positive impact of the proposed architecture on the super-resolution task.
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

This work presents a conditional denoising diffusion probabilistic model (DDPM) for 2D super-resolution of in-plane magnetic resonance (MR) slices. It introduces several novelties, including disentangling with multi-contrast, a novel disentanglement loss function, the application of the Charbonnier loss, and a curriculum learning strategy. The results are convincing, and the paper is well-motivated and well-written. Some minor concerns were raised, such as the generalization of models, the motivation for improving in-plane resolution, and the uncertainty estimation. All reviewers were positive about the paper. Hence, the decision is to recommend the paper for acceptance. The authors are encouraged to make the necessary changes to the paper to the best of their ability following the reviewers’ comments.
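For readers unfamiliar with the Charbonnier loss mentioned above, its standard form is a smooth approximation of the L1 loss, sqrt((x - y)^2 + eps^2) averaged over pixels. The sketch below uses NumPy; the value of `eps` is an assumption, as the reviews do not state the one used in the paper.

```python
import numpy as np

def charbonnier_loss(pred, target, eps=1e-3):
    """Mean Charbonnier penalty: behaves like |x - y| for large errors
    and stays differentiable near zero thanks to eps."""
    diff = pred - target
    return float(np.mean(np.sqrt(diff ** 2 + eps ** 2)))
```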

Author Feedback

We sincerely thank the reviewers for their high-quality reviews and constructive feedback. We provide a point-by-point response here and will make changes to the final version.

R1[Q1] “Compressed sensing and sparse regularization cannot restore high-freq details” weakens the otherwise strong work. [A1] Thank you for pointing out this misleading sentence. In the final version, we will narrow the claim to specific approaches, i.e., iterative deblurring algorithms and dictionary learning-based methods, which are reported in [4,26] to struggle with restoring high-frequency details and sharp edges.

R1[Q2] Model generalisation; paired data requirement; resolution scale. [A2] The domain discrepancy between datasets remains a challenge for most SR models. In our experiments, we observed performance degradation caused by anatomical differences between healthy and lesioned brains in the two datasets. To mitigate this, DisC-Diff employs a disentanglement loss as a regulariser, which improves model generalisation. In real scenarios, fine-tuning the model may address the generalisation challenge on another clinical dataset. The current version of DisC-Diff requires training a new model for each resolution scale, similar to prior work. Our future work aims to propose a unified DisC-Diff framework that supports multiscale super-resolution.

R2[Q1] Lack of review of MR imaging works. [A1] Due to the page limit, we focused on MR imaging rather than natural imaging in the second and third paragraphs of the introduction. However, we will expand the related work as far as the page limit allows.

R2[Q2] Baseline selection. [A2] We evaluated MINet [4], a SOTA multi-contrast super-resolution (MCSR) method with open-source code. As far as we know, other MCSR methods [14,20,26] did not release training code. Re-implementing these models may lead to performance drops, making the comparison unfair. Therefore, we prioritised SOTA methods with open-source code.

R2[Q3] Details of the LR images. [A3] In Section 3, we mentioned that the LR images were obtained by k-space truncation [2] rather than the bicubic interpolation used in natural image SR.

R2[Q4] More related baselines. [A4] We thank the reviewer for this comment. Due to the page limit and time constraints, we have implemented MINet in this version. Our extended work aims to expand the comparison experiments (e.g., with SANet).

R3[Q1] Varying outputs through sampling. [A1] Alternative methods may produce consistent results, but they normally cannot offer guidance to clinicians regarding the reliability of their reconstructions. We believe that model confidence is vital in clinical decision-making as a critical component of trustworthy AI.

R3[Q2] How can we ensure that the generated high-resolution MRI maintains a similar structure to the ground truth? [A2] Due to their stochastic nature, diffusion models cannot generate a unique mapping from LR to HR. However, DisC-Diff conditions on the LR image and the other modalities to enhance structural consistency with the ground truth. Therefore, the HR images generated by our model typically demonstrate a similar structure to the ground truth, though minor variations might still exist.

R3[Q3] Runtime for processing a single MRI scan. [A3] Sampling speed remains a limitation of existing diffusion models. DisC-Diff can process a single 3D MRI scan in less than a minute. Our future work will integrate a more advanced acceleration strategy (e.g., DPM-Solver++) to further reduce processing time.

R3[Q4] Details about data pre-processing. [A4] In DisC-Diff, preprocessing includes centre-cropping each slice to 224x224 and applying min-max normalization. Data augmentation methods have not been applied. We will include these details in our final version.
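The two preprocessing steps named in the last answer (centre-cropping to 224x224 and min-max normalization) can be sketched in NumPy as below. The function names, the [0, 1] target range, and the small `eps` guard against constant slices are assumptions; the rebuttal does not specify them.

```python
import numpy as np

def centre_crop(img, size=224):
    """Crop a 2D slice to size x size around its centre.

    Assumption: both image dimensions are at least `size`.
    """
    h, w = img.shape
    top, left = (h - size) // 2, (w - size) // 2
    return img[top:top + size, left:left + size]

def min_max_normalise(img, eps=1e-8):
    """Rescale intensities to approximately [0, 1] per slice."""
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo + eps)

# Usage on a synthetic 256x256 slice.
slice_2d = np.arange(256 * 256, dtype=float).reshape(256, 256)
prepared = min_max_normalise(centre_crop(slice_2d))
```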


DisC-Diff: Disentangled Conditional Diffusion Model for Multi-Contrast MRI Super-Resolution