
Authors

Shoujin Huang, Jingyu Li, Lifeng Mei, Tan Zhang, Ziran Chen, Yu Dong, Linzheng Dong, Shaojun Liu, Mengye Lyu

Abstract

Magnetic Resonance Imaging (MRI) is a critical imaging tool in clinical diagnosis, but obtaining high-resolution MRI images can be challenging due to hardware and scan time limitations. Recent studies have shown that using reference images from multi-contrast MRI data could improve super-resolution quality. However, the commonly employed strategies, e.g., channel concatenation or hard-attention based texture transfer, may not be optimal given the visual differences between multi-contrast MRI images. To address these limitations, we propose a new Dual Cross-Attention Multi-contrast Super Resolution (DCAMSR) framework. This approach introduces a dual cross-attention transformer architecture, where the features of the reference image and the up-sampled input image are extracted and promoted with both spatial and channel attention in multiple resolutions. Unlike existing hard-attention based methods where only the most correlated features are sought via the highly down-sampled reference images, the proposed architecture is better able to capture and fuse the shareable information between the multi-contrast images. Extensive experiments are conducted on fastMRI knee data at high field and more challenging brain data at low field, demonstrating that DCAMSR can substantially outperform the state-of-the-art single-image and multi-contrast MRI super-resolution methods, and even remains robust in a self-referenced manner. The code for DCAMSR is available at https://github.com/Solor-pikachu/DCAMSR.
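The abstract contrasts hard attention, where each position copies only its most correlated reference feature, with soft cross-attention, where each position receives a weighted mix of all reference features. The toy sketch below illustrates that general distinction only. It is not the DCAMSR implementation: the function names, the feature values, and the single-head scaled dot-product form are all assumptions made for illustration, and the channel-attention branch is not covered.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_cross_attention(queries, keys, values):
    """Each query receives a softmax-weighted mix of all reference values."""
    scale = math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        mixed = [
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ]
        out.append(mixed)
    return out


def hard_cross_attention(queries, keys, values):
    """Each query copies only the single most correlated reference value."""
    out = []
    for q in queries:
        scores = [dot(q, k) for k in keys]
        best = scores.index(max(scores))  # first maximum wins on ties
        out.append(list(values[best]))
    return out


# Hypothetical toy features: 2 low-resolution positions (queries),
# 3 reference positions (keys/values), 2 channels each.
lr_feats = [[1.0, 0.0], [0.0, 1.0]]
ref_keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ref_vals = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]

soft = soft_cross_attention(lr_feats, ref_keys, ref_vals)
hard = hard_cross_attention(lr_feats, ref_keys, ref_vals)
print("soft:", soft)
print("hard:", hard)
```

In this toy setting the hard variant discards every reference position except one per query, while the soft variant lets partially correlated positions contribute as well.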

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43999-5_30

SharedIt: https://rdcu.be/dnwwJ

Link to the code repository

https://github.com/Solor-pikachu/DCAMSR

Link to the dataset(s)

https://github.com/facebookresearch/fastMRI

https://github.com/mylyu/M4Raw


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper presents a transformer network to improve the resolution of MR images using multi-contrast MR data. To this end, a fully sampled image is used as a reference to guide the recovery of high-resolution images of another contrast from low-resolution inputs. In particular, the authors use a dual cross-attention transformer that extracts both spatial and channel information from these anatomically aligned multi-contrast images during the feature extraction and fusion process.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The paper has presented a variant of transformer based cross attention approach to combine channel attention and spatial attention. This enables using the full-resolution reference image to extract more detailed texture information and therefore to improve the resolution of the image of interest.

    2. The paper presents a very comprehensive literature review, clearly outlines the current progress and positions the proposed method in the context.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The contribution of the paper is limited to combining existing transformer-based tools for the super-resolution application. For example, both the spatial cross-attention mechanism with a full-resolution reference image and the incorporated channel attention are taken from existing works [1,2].

    [1] Jaegle, A., Borgeaud, S., Alayrac, J.B., Doersch, C., Ionescu, C., Ding, D., Koppula, S., Zoran, D., Brock, A., Shelhamer, E., et al.: Perceiver IO: A general architecture for structured inputs & outputs. arXiv preprint arXiv:2107.14795 (2021)
    [2] Shaker, A., Maaz, M., Rasheed, H., Khan, S., Yang, M.H., Khan, F.S.: UNETR++: Delving into efficient and accurate 3D medical image segmentation. arXiv preprint arXiv:2212.04497 (2022)
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The paper provides sufficient information for reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. The authors haven’t discussed any limitations of the proposed method. The paper utilizes multi-contrast MR images. In particular, a fully sampled reference image of one contrast is used to improve the resolution of a low-resolution image of another contrast. The two MR images have to be anatomically aligned. In the presence of motion between the two MR images (reference MR and low-resolution MR), how will the alignment be recovered? How would misalignment between these two images affect the reconstruction performance?

    2. Is it clinically feasible to acquire the high-res reference images for all patients?

    3. For the performance evaluation, authors have indicated a single value for PSNR and SSIM. What are the ranges of these performance metrics? Please provide the mean and standard deviation values. Also, discuss the statistical significance of the performance metric values obtained using different methods.
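As a rough illustration of the kind of reporting requested in point 3, the sketch below computes per-slice PSNR and summarizes it as mean, standard deviation, and range. The toy pixel values and the helper names `psnr` and `summarize` are hypothetical and are not taken from the paper or its code; images are assumed to be flattened lists of intensities scaled to [0, 1].

```python
import math
import statistics


def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized flat images."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(data_range ** 2 / mse)


def summarize(values):
    """Return (mean, sample standard deviation, minimum, maximum)."""
    return statistics.mean(values), statistics.stdev(values), min(values), max(values)


# Hypothetical toy data: three "test slices", each a flattened 4-pixel image.
ground_truth = [[0.0, 0.5, 1.0, 0.5], [0.2, 0.4, 0.6, 0.8], [1.0, 0.0, 1.0, 0.0]]
restored = [[0.1, 0.5, 0.9, 0.5], [0.2, 0.5, 0.6, 0.7], [0.9, 0.1, 0.9, 0.2]]

scores = [psnr(gt, sr) for gt, sr in zip(ground_truth, restored)]
mean, std, low, high = summarize(scores)
print(f"PSNR: {mean:.2f} ± {std:.2f} dB (range {low:.2f} to {high:.2f})")
```

The same summary applies to SSIM once a per-slice SSIM value is available; keeping the per-slice scores also makes a later paired significance test straightforward.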

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is well-organized and well-written. The contribution of the paper is limited to combining existing methods for the particular application of improving the resolution of MRI. The authors also did not mention any limitations or possible ways to mitigate them.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    A novel approach for MRI super-resolution is provided. A dual attention network is utilized to learn fine details from the reference multi-contrast/resolution MRI sequences. The authors claim that in this way, the varying features from the MRI stack are better transformed w.r.t. resolution and channel/spatial information. In contrast to other deep learning based MRI multi-resolution approaches, the multi-contrast nature of the MRI data is utilized for the transformers. As these multi-resolution aspects are present anyway in conventional MRI sequences, the acquisition protocol can stay unchanged in medical practice.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Excellent introduction / state of the art section. The authors clearly demonstrate a perfect overview regarding both, the imaging modalities as well as state of the art strategies in image super-resolution.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The paper assumes very, very deep understanding of the described multi-resolution topic. This is probably also due to the fact that the paper tries to include a lot of information and content for the length of a regular MICCAI publication (3 complex research questions get addressed within this paper).

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Common MRI datasets are utilized for training and testing. Furthermore, the authors state that “The code for DCAMSR will be available at GitHub after the blind review” – so probably a high level of reproducibility will be ensured if this draft is accepted for publication at the MICCAI.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    If supplementary material is provided, it should follow a format similar to that of the paper.

    Sentence “the texture features is aggregated”: check singular/plural. In section “Encoder” on page 3, there is no need to state all of the image dimensions; they can be given via a generic formula to save space in the paper (and to avoid mismatched line spacing). Caption of “Fig. 2.”: check the spacing, as there are mismatches, e.g., “(d)Details of” has no space after the label. “Finally, Xspatial and Xchanel”: should read “Xchannel”.

    “F_LR^(H/8 × W/8)” ==> F_LR_8, i.e., simpler syntax and the equation stated only once.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The structure and presentation of the paper are good. The evaluations show that the method is highly applicable in the area of image super-resolution. Nevertheless, too much content is packed into a single paper, making it hard to follow.

  • Reviewer confidence

    Somewhat confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #4

  • Please describe the contribution of the paper

    This paper proposes a Dual Cross-Attention Multi-contrast Super Resolution (DCAMSR) framework that constructs high-resolution T2 data from low-resolution T2 data and a reference T1 image.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The proposed Dual Cross-Attention Multi-contrast Super Resolution (DCAMSR) framework addresses the limitations of existing super-resolution methods for multi-contrast MRI data. The use of a dual cross-attention transformer architecture with both spatial and channel attention allows for the extraction and promotion of features from both the reference image and the up-sampled input image in multiple resolutions.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The paper lacks statistical analysis or statistical distributions to support the experimental results, which is a significant drawback. Additionally, the authors did not compare the computational complexity of their approach against other methods, which could have provided valuable insights into the efficiency of the proposed solution.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors completed the reproducibility checklist without fully committing to its requirements. For instance, they listed “an analysis of statistical significance of reported differences in performance between methods” on the checklist, but the paper does not provide any statistical analysis for performance comparison. This lack of adherence to the reproducibility checklist raises significant concerns about the credibility of the self-claimed reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Please refer to the weaknesses section for my comments.

    Furthermore, to enhance the overall clarity of this paper, I suggest the following experimental improvements:

    1. The author should consider providing the reference image used in Fig. 4 and discuss the potential impact on performance if the reference image is not perfectly aligned with the LR image.

    2. The proposed algorithm was only applied to 2D images instead of whole volume 3D images. It would be interesting to investigate whether the HR images maintain spatial consistency when treated as separate slices in the super resolution network. This could be discussed in future work.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Overall, while the DCAMSR framework shows promise as a contribution to the field of MRI super-resolution, it would benefit from further validation and refinement. The fact that the proposed method was evaluated solely on 2D images raises concerns about its potential performance on 3D networks, which are more commonly used in MRI applications.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The paper introduces a variant of a transformer-based cross-attention approach for image super-resolution, specifically for multi-contrast MRI data. The proposed Dual Cross-Attention Multi-contrast Super Resolution (DCAMSR) framework combines spatial and channel attention mechanisms to utilize detailed texture information from a full-resolution reference image, improving the resolution of the target image. The paper includes an excellent literature review and contextualizes the proposed method within the current state of the art. However, some limitations and areas for improvement are identified. The paper relies heavily on existing tools and lacks original contributions in terms of combining these tools for super-resolution. Statistical analysis and comparisons of computational complexity with other methods are missing. The limitations of the proposed method, such as the impact of misalignment between reference and low-res images and the feasibility of acquiring high-res reference images for all patients, should be addressed. Furthermore, providing statistical details, discussing the significance of performance metrics, and improving writing clarity are suggested. Additional experimental improvements include discussing the potential impact of misalignment in Fig. 4 and exploring the spatial consistency of HR images in 3D volume super-resolution as future work.

    The work might be accepted if the modifications suggested above can be made.




Author Feedback

We would like to thank the reviewers for their valuable comments. Our response is as follows.

Novelty: While we duly acknowledge that channel attention is not a novel concept developed by our study, our work goes well beyond a mere application of channel attention and existing transformer-based technologies. Our unique contributions are summarized in the final paragraph of the Introduction section. The proposed Dual Cross-Attention Transformer, illustrated in Fig. 1, diverges substantially from previous hard-attention-based transformer methodologies and is particularly suitable for the MRI super-resolution task. The impressive performance of our model cannot be attributed to merely “combining existing transformer-based tools”.

Impact of image misalignment: We are grateful to the reviewers for drawing attention to this matter. Misalignment is indeed a common issue for all multi-contrast methods. Unlike several prior studies, we refrained from employing image co-registration, yet still accomplished high-quality super-resolution. The reported PSNR and SSIM values on M4Raw have considered real cases that were slightly misaligned due to inter-scan motion (please refer to Ref 18 for details). Despite these cases, the standard deviations of PSNR and SSIM presented in the updated supplementary materials indicate our method’s robustness. We have included a brief discussion on this matter in the revised main text.

Clinical feasibility of acquiring reference images: In a clinical context, high-resolution reference images are typically readily available, as MRI examinations routinely acquire multiple contrasts like T1, T2, and FLAIR. Particularly, high-resolution T1-weighted images, which were used as references in our study, can be efficiently acquired due to their short repetition time (TR). A more exhaustive discussion on this, albeit not included in this paper due to page limitations, can be found in various multi-contrast reconstruction studies.

Statistical analysis and computational complexity: We concur with the reviewers on the need for this information, and we apologize for its omission in our initial submission. We have now updated the supplementary materials to include statistical analysis and computational complexity. Our method significantly outperforms other tested methods (with a p-value < 0.001 according to paired t-tests) and demonstrates reasonable model size and FLOPs.
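For readers unfamiliar with the paired t-test mentioned above, the following minimal sketch computes the test statistic from per-case scores of two methods evaluated on the same cases. The PSNR values and the function name `paired_t_statistic` are hypothetical and do not come from the paper or its supplementary materials. Converting the statistic to a p-value requires the t distribution, for example via `scipy.stats.ttest_rel`, which is not used here so that the sketch stays dependency-free.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for a paired t-test on per-case scores."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equally long lists with at least two cases")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    n = len(diffs)
    # Note: sd_diff is zero when all differences are identical; the ratio
    # below is then undefined and raises ZeroDivisionError.
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical per-slice PSNR values (dB) for two methods on the same five slices.
method_a = [32.1, 30.8, 33.4, 31.9, 32.6]
method_b = [31.5, 30.1, 32.9, 31.2, 32.0]

t_stat, dof = paired_t_statistic(method_a, method_b)
print(f"t = {t_stat:.2f} with {dof} degrees of freedom")
```

The test is paired because both methods are scored on the same slices, so only the per-slice differences enter the statistic.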

Topic depth: We agree with the reviewer’s observation regarding the denseness of information within the page limit. We hope our GitHub source code will aid readers in comprehending our paper more efficiently.

Formatting and misspelling issues: We deeply appreciate the reviewers’ feedback regarding these matters and have accordingly made the suggested corrections.

Extension to 3D in future work: The potential for exploring 3D super-resolution in future work is indeed intriguing. We briefly discuss the feasibility of this extension in the revised main text.


