Authors

Eric Z. Chen, Chi Zhang, Xiao Chen, Yikang Liu, Terrence Chen, Shanhui Sun

Abstract

Compared with 2D MRI, 3D MRI provides superior volumetric spatial resolution and signal-to-noise ratio. However, it is more challenging to reconstruct 3D MRI images. Current methods are mainly based on convolutional neural networks (CNN) with small kernels, which are difficult to scale up to have sufficient fitting power for 3D MRI reconstruction due to the large image size and GPU memory constraint. Furthermore, MRI reconstruction is a deconvolution problem, which demands long-distance information that is difficult to capture by CNNs with small convolution kernels. The multi-layer perceptron (MLP) can model such long-distance information, but it requires a fixed input size. In this paper, we proposed Recon3DMLP, a hybrid of CNN modules with small kernels for low-frequency reconstruction and adaptive MLP (dMLP) modules with large kernels to boost the high-frequency reconstruction, for 3D MRI reconstruction. We further utilized the circular shift operation based on MRI physics such that dMLP accepts arbitrary image size and can extract global information from the entire FOV. We also propose a GPU memory efficient data fidelity module that can reduce >50% memory. We compared Recon3DMLP with other CNN-based models on a high-resolution (HR) 3D MRI dataset. Recon3DMLP improves HR 3D reconstruction and outperforms several existing CNN-based models under similar GPU memory consumption, which demonstrates that Recon3DMLP is a practical solution for HR 3D MRI reconstruction.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43999-5_19

SharedIt: https://rdcu.be/dnwwx

Link to the code repository

N/A

Link to the dataset(s)

N/A

Reviews

Review #2

Please describe the contribution of the paper

In this paper, a reconstruction method is proposed for 3D MRI reconstruction to improve memory efficiency. The method, termed Recon3DMLP, uses CNN modules with small kernels along with adaptive MLP modules. Several other techniques such as efficient data fidelity are used to boost memory efficiency. Results on a high-resolution 3D MRI dataset show the effectiveness of the proposed method.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

Novelty:

I believe the main contribution is the novel architecture, which combines a CNN with an arbitrary-size adaptive MLP in the context of an unrolled (cascaded) network. Compared to Recon3DCNN, Recon3DMLP allows for many more learnable parameters. The increase in learnable parameters translates to improved fitting capacity (Fig. 3) and better performance (Table 1) with lower memory usage.

One claimed benefit of the method is enabling arbitrary image resolution. Although this is also achieved by other MLP-based vision models such as MAXIM [1], to the best of my knowledge it is the first time an MLP-based unrolled model for MRI reconstruction has this ability. Thus, this can be considered a strength of the method.

Evaluation:

The dataset that is used is very high-dimensional, and is a realistic and challenging setting to apply deep learning. This makes the experimental results significant.

The ablations are extensive and clear, facilitating understanding of the contribution of each component of the method.

[1] Tu, Z., Talebi, H., Zhang, H., Yang, F., Milanfar, P., Bovik, A., Li, Y.: Maxim: Multi-axis mlp for image processing. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 5769–5780 (2022)
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The main technical novelty is the architecture change. However, as there is not a direct comparison with an existing MLP based method, it is difficult to measure how this method ranks among MLP-based methods. (This is not to say the method does not have merit, which is proven by ablations compared to the CNN-based architecture.)

Even though the learnable parameters are much larger, the performance improvement can be considered marginal. For instance, Recon3DCNN (e=24) has 960K parameters, 15.2GB memory, and SSIM/PSNR is 0.9649/41.1503, where best performing Recon3DMLP has 11,264K parameters, 11.5GB memory, and SSIM/PSNR of 0.9637/41.1953.

Coil-by-coil processing of the data-fidelity term is commonly used in multi-coil MRI reconstruction. Furthermore, as Table 1 shows, efficient data-fidelity directly trades off memory with time (time: 1.17s vs 3.04s, memory: 35.5GB vs 11.5GB). Therefore, I do not believe that efficient data-fidelity is a contribution of this paper, nor that it significantly improves the memory-computation tradeoff.
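For readers unfamiliar with the tradeoff described above, the following is a minimal NumPy sketch of a generic multi-coil data-fidelity gradient computed two ways: with all coils at once, and accumulated coil by coil. It is not the paper's eDF module. The function names, the 2D Cartesian setting, the random sensitivity maps, and the sampling mask are all assumptions made for illustration.

```python
import numpy as np

def dc_gradient_batched(x, sens, mask, y):
    """A^H (A x - y) with every coil's k-space held in memory at once."""
    k = np.fft.fft2(sens * x, norm="ortho")           # shape (C, H, W)
    resid = mask * k - y
    return np.sum(np.conj(sens) * np.fft.ifft2(resid, norm="ortho"), axis=0)

def dc_gradient_coilwise(x, sens, mask, y):
    """Same quantity, accumulated one coil at a time.

    Intermediates are only (H, W), so peak memory is lower,
    at the cost of a sequential loop over coils.
    """
    grad = np.zeros_like(x)
    for s_c, y_c in zip(sens, y):
        resid = mask * np.fft.fft2(s_c * x, norm="ortho") - y_c
        grad += np.conj(s_c) * np.fft.ifft2(resid, norm="ortho")
    return grad

# Hypothetical toy problem: 4 coils, 16 x 16 image, ~30% sampled k-space.
rng = np.random.default_rng(0)
n_coils, h, w = 4, 16, 16

def crandn(*shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

x_true = crandn(h, w)
sens = crandn(n_coils, h, w)
mask = (rng.random((h, w)) < 0.3).astype(float)
y = mask * np.fft.fft2(sens * x_true, norm="ortho")   # undersampled measurements
x = np.zeros((h, w), dtype=complex)                   # current image estimate

assert np.allclose(dc_gradient_batched(x, sens, mask, y),
                   dc_gradient_coilwise(x, sens, mask, y))
```

Both functions return the same gradient. The coil-wise version only changes when the per-coil intermediates exist in memory, which is why it exchanges speed for a smaller footprint.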
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The paper's results are reproducible.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
Typographical comments:
- Section 2.1: Rencon3DMLP -> Recon3DMLP
- What is meant by “ReconFormer and Img2ImgMixer models failed to train on datasets with various sizes, indicating the limitation of these methods.”? Does this mean that the training of these models did not converge, or the models did not fit in the memory? If so, could the number of parameters be reduced to fit these methods into memory, and what would the number of parameters be in that case?
- For training and inference, was the batch size set to 1 for all methods?
- Is checkpointing used for all baselines, and what is the number of checkpoints? How much does it reduce memory?
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Overall, the proposed method Recon3DMLP is a novel method for 3D MRI reconstruction. The combination of a CNN with an adaptive MLP in an unrolled network is new, and enables arbitrary input resolution.

I have a number of concerns on the amount of novelty, and the strength of the evaluation, as listed above. However, I believe strengths outweigh weaknesses, and its publication can advance the research topic of memory-efficient reconstruction.
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #1

Please describe the contribution of the paper

This work proposed a memory-efficient MLP strategy for 3D MR image reconstruction with arbitrary image size. Comprehensive ablation and comparison showed the superior performance of the proposed module.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. The proposed adaptive MLP is novel and accepts arbitrary image size.
2. Memory efficient learning/inference was studied, which is important for clinical practice.
3. Comprehensive ablation study was performed.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
1. It’s not quite clear how the 1D FC operation was performed after image patching, and why this process was able to handle arbitrary input size. Equations with symbols would be helpful for readers to understand the idea.
2. In Figure 5, the authors compared the k-space difference between Recon3DMLP with and without dMLP to visualize high-frequency k-space recovery. A better visualization would compare the difference between the ground truth k-space and Recon3DMLP with or without dMLP.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Proper data statement was provided.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

The authors should consider formalizing the proposed dMLP module as equations for better understanding. Also, a visualization comparing against the ground truth k-space data should be provided.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

7
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The dMLP idea is novel. Memory usage and variable input 3D image size are two major issues when deploying DL recon into clinical practice. The work proposed their approach to these two issues and the results were promising.
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #3

Please describe the contribution of the paper

The major contribution of the paper is a new method to accelerate 3D MRI reconstruction, which is costly in both computation time and memory.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The authors are the first to present a practical solution utilizing the proposed dMLP and eDF to overcome the computational constraint for HR 3D MRI reconstruction with various sizes. It is novel to apply dMLP and eDF to accelerate 3D MRI recon. It is also meaningful to accelerate 3D MRI reconstruction, which is time-consuming.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

Reconstruction results are shown for only one dataset. Precision quantization is used in the paper, but it is not stated clearly which factors enhance the computational efficiency. The proposed method is not compared to other existing 3D MRI recon methods. For example, recon quality and computational costs are not evaluated and compared.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The reproducibility of the paper is good.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

It would be better to provide a supplemental document to compare the proposed method with other existing 3D MRI recon methods such as Ke Wang, Jonathan I. Tamir, Alfredo De Goyeneche, Uri Wollner, Rafi Brada, Stella X. Yu, Michael Lustig, “High fidelity deep learning-based MRI reconstruction with instance-wise discriminative feature matching loss”, Magnetic Resonance in Medicine, volume 88, pages 476-491, 2022.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The major factor that led me to my overall score for this paper is the novelty of the approach to accelerating 3D MRI recon: dMLP and eDF are new to 3D MRI reconstruction. Another factor is the limited evaluation of results.
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

All reviewers appreciate the novelty and significance of the proposed Recon3DMLP method for memory-efficient 3D MRI reconstruction that can handle arbitrary image sizes. The strengths highlighted by the reviewers include the novelty of the adaptive MLP, memory-efficient learning/inference, comprehensive ablation study, and applicability to high-dimensional and realistic datasets. However, some concerns have been raised about the lack of comparison with other existing 3D MRI reconstruction methods, the need for better visualization of ground truth k-space data, and the marginal improvement in performance.

Considering the overall positive feedback from the reviewers, and the potential impact of the proposed method in the field of 3D MRI reconstruction, I recommend accepting the paper. However, it is suggested that the authors address the concerns raised by the reviewers, particularly providing more comparisons with existing 3D MRI reconstruction methods and improving the visualization of k-space data.

Author Feedback

We thank all reviewers for their positive comments and constructive feedback. We clarify a few points raised by the reviewers as follows.

Comparison with other existing 3D MRI reconstruction methods (R3)

The goal of our paper is to develop a practical memory-efficient framework for 3D MRI reconstruction in clinical use. As acknowledged by R1, we provided comprehensive comparisons with various memory-efficient approaches such as 3D convolution decomposition, FP16 inference, model reparameterization, depth separable convolution and eDF. We also provided comparisons for CNN models with different widths and depths as well as experiments for various ablated Recon3DMLP models.

Due to the large 3D image size and computation constraint, the SOTA methods for 2D MRI reconstruction are not directly transferable to 3D MRI reconstruction. To the best of our knowledge, there is hardly any existing work proposing deep learning based architectures specifically for 3D MRI reconstruction. The paper pointed out by R3 (Wang et al.) proposed a new loss function rather than a network architecture. Although evaluated on 3D MRI data, their method utilizes a previously published 2D model (MoDL) and treats 3D data as 2D (i.e., extract 2D slices from the 3D volume and train/evaluate on each 2D slice independently). Such a method ignores the information across slices and reconstructs each slice one by one, which is slow and may have limited practical value. Therefore, the method in Wang et al. is a 2D MRI reconstruction framework, while our method is a 3D based method that directly reconstructs 3D MRI data and utilizes the across slice information.

Marginal improvement in performance (R2)

All baseline models have relatively good performance on the low-frequency reconstruction (indicated by relatively large PSNR/SSIM values). The improvement of the proposed method is mainly from the better high-frequency reconstruction (Fig.4 and 5), which may not be effectively measured by PSNR/SSIM. Nevertheless, the p values show that the improvement of the proposed method over baseline methods is statistically significant.

Better visualization of k-space difference (R1)

We will provide a supplementary figure comparing the k-space difference between ground truth and Recon3DMLP with and without dMLP.

More model and experiment details (R1,R2)

After image patching, the 1D FC layers are applied over each patch dimension. The FC layers are shared and thus can be applied to an arbitrary number of patches, which corresponds to the arbitrary image size. We omit equations due to the space limit and use figures to demonstrate the dMLP module.
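As a rough illustration of the explanation above, the following NumPy sketch applies shared 1D FC weights inside each patch of a 2D image, with an optional circular shift so that patch boundaries move between layers. It is a simplified assumption-based sketch, not the paper's dMLP: the 2D setting, the single patch size, the requirement that image sizes be divisible by the patch size, and the names `patch_fc` and `shifted_patch_fc` are all hypothetical.

```python
import numpy as np

def patch_fc(image, w_rows, w_cols):
    """Apply shared 1D FC layers inside each p x p patch of a 2D image."""
    p = w_rows.shape[0]
    h, w = image.shape
    assert h % p == 0 and w % p == 0, "sketch assumes sizes divisible by p"
    # (H/p, p, W/p, p): axes 1 and 3 index positions inside a patch.
    patches = image.reshape(h // p, p, w // p, p)
    # The same p x p weights are reused for every patch, so the number of
    # patches (and hence the image size) is not fixed by the weights.
    patches = np.einsum("ij,ajbk->aibk", w_rows, patches)
    patches = np.einsum("kl,aibl->aibk", w_cols, patches)
    return patches.reshape(h, w)

def shifted_patch_fc(image, w_rows, w_cols, shift):
    """Circularly shift, apply the patch FC, then undo the shift."""
    rolled = np.roll(image, shift=(shift, shift), axis=(0, 1))
    out = patch_fc(rolled, w_rows, w_cols)
    return np.roll(out, shift=(-shift, -shift), axis=(0, 1))

# One set of weights, two different image sizes.
rng = np.random.default_rng(0)
p = 4
w_rows = rng.standard_normal((p, p))
w_cols = rng.standard_normal((p, p))
for shape in [(8, 8), (12, 20)]:
    out = shifted_patch_fc(rng.standard_normal(shape), w_rows, w_cols, shift=2)
    assert out.shape == shape
```

Because the weight matrices depend only on the patch size, the same parameters run unchanged on both the 8 x 8 and the 12 x 20 input. Handling sizes that are not multiples of the patch size would need padding or another strategy, which this sketch leaves out.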

We used batch size = 1 for all models in training and inference. Since all models have cascade architectures, we wrapped each cascade module in its own gradient checkpointing call (5 in total). As an example, Recon3DCNN (e=24, RO=32) takes 9.3GB and 12.9GB of training GPU memory with and without gradient checkpointing, respectively. Both ReconFormer and Img2ImgMixer require a fixed input image size to be specified when constructing the model and throw runtime errors on data with sizes different from the specified value.

Computationally Efficient 3D MRI Reconstruction with Adaptive MLP