
Authors

Jiazhen Pan, Suprosanna Shit, Özgün Turgut, Wenqi Huang, Hongwei Bran Li, Nil Stolt-Ansó, Thomas Küstner, Kerstin Hammernik, Daniel Rueckert

Abstract

In dynamic Magnetic Resonance (MR) imaging, k-space is typically undersampled due to limited scan time, resulting in aliasing artifacts in the image domain. Hence, dynamic MR reconstruction requires not only modeling spatial frequency components in the x and y directions of k-space but also considering temporal redundancy. Most previous works rely on image-domain regularizers (priors) to conduct MR reconstruction. In contrast, we focus on interpolating the undersampled k-space before obtaining images with Fourier transform. In this work, we connect masked-image-modeling with k-space interpolation and propose a novel Transformer-based k-space Global Interpolation Network, termed k-GIN. Our k-GIN learns global dependencies among low- and high-frequency components of 2D+t k-space and uses it to interpolate unsampled data. Further, we propose a novel k-space Iterative Refinement Module (k-IRM) to enhance the high-frequency components learning. We evaluate our approach on 92 in-house 2D+t cardiac MR subjects and compare it to MR reconstruction methods with image-domain regularizers. Experiments show that our proposed k-space interpolation method quantitatively and qualitatively outperforms baseline methods. Importantly, the proposed approach achieves substantially higher robustness and generalizability in cases of highly-undersampled MR data.
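The link between undersampled k-space and image-domain aliasing stated in the abstract can be illustrated with a minimal sketch. It is not the paper's pipeline: it uses a single synthetic 2D frame rather than 2D+t cardiac data, and the function names, the choice of rows as the phase-encoding direction, the regular sampling pattern, and the number of fully sampled centre lines are all assumptions made for illustration.

```python
import numpy as np

def undersample_kspace(image, accel=4, num_center_lines=4):
    """Keep every `accel`-th k-space row plus a few centre rows; zero the rest.

    Rows are treated as the phase-encoding direction (an assumption).
    """
    ny, _ = image.shape
    kspace = np.fft.fftshift(np.fft.fft2(image))  # centred k-space
    mask = np.zeros(ny, dtype=bool)
    mask[::accel] = True                          # regular undersampling
    c = ny // 2
    mask[c - num_center_lines // 2 : c + num_center_lines // 2] = True
    return kspace * mask[:, None], mask

def zero_filled_recon(kspace):
    """Inverse FFT of centred k-space; unsampled entries stay zero."""
    return np.fft.ifft2(np.fft.ifftshift(kspace))

# Hypothetical phantom: a bright square on a dark background.
image = np.zeros((64, 64))
image[24:40, 24:40] = 1.0

kspace_us, mask = undersample_kspace(image)
aliased = zero_filled_recon(kspace_us)
full_recon = zero_filled_recon(np.fft.fftshift(np.fft.fft2(image))).real

# Relative L2 error of the zero-filled reconstruction.
rel_error = np.linalg.norm(aliased - image) / np.linalg.norm(image)
print(f"sampled rows: {mask.sum()}/{mask.size}, relative error: {rel_error:.3f}")
```

In this toy setting the fully sampled round trip recovers the phantom, while the zero-filled reconstruction shows shifted copies of the square along the undersampled direction. A k-space interpolation method aims to fill the zeroed rows before the inverse transform.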

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43999-5_22

SharedIt: https://rdcu.be/dnwwA

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    The authors propose a transformer method for interpolating k-space data.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Applying transformers for k-space interpolation is new

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The existing literature is not described accurately which obscures the context of the contribution. The practical relevance of the approach is unclear.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Very difficult to reproduce.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. The literature review is incomplete and includes inaccurate statements:

    (a.) Low-rank models are not inherently image-domain models. Some of the earliest low-rank modeling papers use formulations in (k,t)-space, and can be viewed as (k,t)-space interpolation.

    Z.-P. Liang, “Spatiotemporal imaging with partially separable functions,” in Proc. IEEE ISBI, 2007, pp. 988–991.

    J. P. Haldar and Z.-P. Liang, “Spatiotemporal imaging with partially separable functions: A matrix recovery approach,” in Proc. IEEE ISBI, 2010, pp. 716–719.

    (b.) There were many k-space interpolation methods developed in the last 20 years, so focusing on references 8,15,12 for k-space interpolation severely understates the progress that has been made in this area. There are calibration-based methods like SPIRiT and PRUNO, there are calibrationless structured low-rank k-space interpolation methods like SAKE, LORAKS, and ALOHA, and there are convolutional methods like RAKI and LORAKI. A recent review article related to k-space interpolation is

    J. P. Haldar and K. Setsompop, “Linear Predictability in Magnetic Resonance Imaging Reconstruction: Leveraging Shift-Invariant Fourier Structure for Faster and Better Imaging.” IEEE Signal Processing Magazine 37:69-82, 2020.

    (c.) There are also methods like k-t GRAPPA that perform interpolation in (k,t)-space.

    2. Modern MRI uses multi-channel data acquisition, but the paper simulates single coil. The reason for this disconnection from the practical scenario is unclear; multi-channel data contains more information and should produce better results.

    3. The paper uses a lot of jargon without providing sufficient explanations or citations. The paper seems to assume that all readers will have intimate familiarity with transformers, which is not a good assumption.

    4. The paper describes cardiac MRI without being clear about the sampling assumptions. Are these prospectively gated scans or retrospectively gated? The k-space sampling patterns for retrospective gating will be less controllable than for prospective gating, and it is not clear if the simulated k-space patterns are realistic.

    5. There is no value in showing both PSNR and NMSE, which are both obtained by transforming the same L2-norm error value. Only one of them needs to be shown. There is likely something wrong in the calculation of one of these metrics in Table 1, because NMSE and PSNR should always produce the same ranking of methods.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    While the method is interesting, papers that fail to accurately describe the current state of the field often cause problems for the research community.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    In this paper, a purely k-space based approach to dynamic image reconstruction is proposed, using a transformer architecture to learn global k-t space signal dependencies. This is combined with an additional refinement module (also transformer based) that operates with a logarithmic loss function providing higher weighting to high spatial frequency information. The approach is evaluated on in-house cardiac CINE MRI data acquired on a 1.5T scanner. The method is shown to outperform comparison methods (total variation CS, low-rank plus sparse, and image-based CNN prior) and to perform significantly better at unseen acceleration factors, owing to the k-space based approach.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper has several strengths. The use of a logarithmic HDR loss function in the refinement module is a promising approach to ensuring preservation of high-frequency information. The robustness to acceleration factors not seen during training is a highly significant result, and not something that typically results from image-based regularization.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    This paper also has several weaknesses. One, mentioned by the authors, is the application of the technique to single-channel MR data, and not taking advantage of multi-channel information, particularly for the baseline comparisons. It’s also unclear whether the authors used the Swin-transformer approach for hierarchical attention modelling, and if not, whether the developed approach would scale well to the multi-coil case. Furthermore, the baseline image-based CNN prior is the DcCNN method, which is fairly old now (ca. 2018), and not necessarily representative of the current state-of-the-art in image-based regularizers.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    A clear high level description of the method is provided, although no link to a code repository or mention of the actual ML framework used is provided (as far as I can tell)

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    The paper would be significantly improved if methods were evaluated in a multi-coil setting, given the known benefit to the compared baseline methods with multi-coil information, compared to a single-coil context. Although the proposed approach does outperform baseline methods in the single-coil case, it’s not necessarily clear that the proposed k-GIN method would outperform TV or Low-rank methods when multi-channel information is used. Furthermore, the paper would also be improved if the comparison with the image-based prior used a more modern approach, such as “Deep low-rank plus sparse network for dynamic MR imaging” MEDIA 2021, or “Accelerating cardiac cine MRI using a deep learning‐based ESPIRiT reconstruction” Magn Res Med 2021.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper shows good results, and has several novel contributions (k-space transformers, HDR loss). However, in my opinion the lack of multi-coil integration and the weak comparison to an older image-based prior method leave it with moderate weaknesses, although it should certainly be accepted.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This work suggests a k-space reconstruction network to accelerate cardiac MR imaging. It proposes Transformer-based k-space interpolation for 2D+t reconstruction, and introduces an HDR loss which puts more weight on low-signal-intensity k-space (preferably outer k-space containing high-frequency components).

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Using a transformer-like approach for k-space interpolation was interesting.
    2. The authors used two networks: the Global Interpolation Network and the k-space Iterative Refinement Module.
    3. They did not use an image-domain network, yet show good results, which I think is interesting and promising.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. I am wondering about the so-called “Iterative Refinement Module”. Looking at the method, it uses three transformer blocks sequentially, and I cannot find any “iterative” module.
    2. The authors present only the final outcome and quantitative assessment. It would have been more interesting if there were a figure dedicated to showing the intermediate stages (k-space after the k-GIN process and after refinement).
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The paper is easy to follow, and if the code is released I think this will be reproducible

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    I liked the idea of using transformer in k-space and liked the iterative refinement module. The paper would have been better if the comparison study contained various SOTA methods, especially those on using CNN network for K-space domain (i.e., KIKI-Net). Moreover, the study didn’t show k-space, so it was hard to conceive the idea.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I think lack of comparison study was the major factor. The author should have tested more recent reconstruction methods (learning data-consistency network, L+S deep learning, to name a few). It is hard to call this method SOTA.

  • Reviewer confidence

    Somewhat confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    6

  • [Post rebuttal] Please justify your decision

    I think the authors made an effort to perform the additional comparisons suggested by the reviewers. Also, they carefully answered most of the questions, both major and minor.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper proposes a k-space interpolation method for cine MR reconstruction using a transformer-based approach. The idea has been found to be attractive by all three reviewers. Despite this, several concerns were raised regarding the lack of references to existing literature and the persuasiveness of the comparison results. Furthermore, a significant issue is that all experiments were conducted in a single-coil setting, which raises the question of whether this method can be applied to multi-coil settings. To address this, it is important to identify the factors that currently prevent the use of this approach in multi-coil applications.




Author Feedback

We thank the reviewers for the thoughtful feedback. We appreciate that all reviewers found novel our idea of using masked-image-modeling with Transformers to interpolate k-space globally for cine MR reconstruction. We are encouraged by their positive feedback on superior results without using any image domain operation (R3), increased robustness to acceleration rates unseen during training which image domain prior-based methods cannot offer (R2) and novel HDR loss function to enhance high-frequency signals in k-space (R2).
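The reviews characterise the HDR loss as logarithmic (R2) and as putting more weight on low-intensity k-space entries (R3). The sketch below illustrates only that general weighting idea. It is not the paper's HDR loss, whose exact form is not given on this page; the function names, the `eps` value, and the two-sample toy data are hypothetical. Note that a log-magnitude error as written here ignores phase, which a real k-space loss would need to handle.

```python
import numpy as np

def l2_loss(pred, ref):
    """Plain mean squared error on complex k-space samples."""
    return np.mean(np.abs(pred - ref) ** 2)

def log_magnitude_loss(pred, ref, eps=1e-6):
    """Squared error between log-magnitudes (illustrative, phase-blind)."""
    diff = np.log(np.abs(pred) + eps) - np.log(np.abs(ref) + eps)
    return np.mean(diff ** 2)

# Toy k-space with one strong centre sample and one weak peripheral sample.
ref = np.array([1000.0 + 0j, 1.0 + 0j])

# The same 10% relative error, placed on the centre or on the periphery.
pred_centre_off = ref * np.array([1.1, 1.0])
pred_periph_off = ref * np.array([1.0, 1.1])

l2_centre = l2_loss(pred_centre_off, ref)
l2_periph = l2_loss(pred_periph_off, ref)
log_centre = log_magnitude_loss(pred_centre_off, ref)
log_periph = log_magnitude_loss(pred_periph_off, ref)

print(f"L2  loss: centre error {l2_centre:.4g}, periphery error {l2_periph:.4g}")
print(f"log loss: centre error {log_centre:.4g}, periphery error {log_periph:.4g}")
```

Under plain L2, the peripheral error is almost invisible next to the centre error, whereas the log-magnitude loss penalises both about equally. This is the sense in which a logarithmic loss gives relatively more weight to low-magnitude, typically high-frequency, samples.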

Major:

  1. Lack of related work (R1,M2) The current state of the field usually interpolates k-space using local operators e.g. SPIRiT, SAKE, LORAKI as mentioned by R1. Therefore we mentioned representative methods from this category e.g. GRAPPA, RAKI and emphasized how our method differs from them: it can also conduct global operations, which are crucial for reconstruction. We thank the reviewers for the constructive comments and will include all references of R1. Further, we have improved Sec. 2.2 to describe the state of the field more accurately.

  2. Persuasiveness of the baselines (R2,R3,M2) For a fair comparison, we chose three representative works that are single-coil based and refrain from comparing with multi-coil methods because groundtruth k-space is different in single-coil and multi-coil settings. Nevertheless, we have now implemented L+S-Net [1] as suggested by R2 and R3 with the single-coil setting. We apply five L+S blocks in L+S-Net and it performs on par with DcCNN and our method at acceleration rate 4 (R4). PSNR R4: L+S-Net: 40.53, DcCNN: 40.29, Ours: 40.36

However, it performs substantially worse than our proposed method on R8 data, which is unseen during training. PSNR R8: L+S-Net: 30.36, DcCNN: 29.77, Ours: 35.67. This observation reaffirms the superior robustness to unseen acceleration rates of our proposed method. In contrast, methods using image domain priors are susceptible to variability in artifact types arising from different undersampling rates.

  3. Potential multi-coil extension (R1,R2,M2) Since we are one of the first to explore the feasibility of Transformers on k-space interpolation, single-coil settings serve as a touchstone for us before approaching multi-coil. Our current implementation on single-coil cine data (2D+t) requires 10GB of GPU memory for training. A direct extrapolation to multi-coil cine data would require more than 60GB of memory, which exceeds our current GPU capability. However, there are some solutions to address memory issues. For example, using memory-efficient Transformer backbones e.g. [2,3] instead of the currently used Vision Transformer, could reduce the GPU memory by a factor of 4. We are keen on implementing this in future work. We will clarify this limitation in the final version as suggested by M2. Additionally, our method can be applied to static multi-coil data e.g. brain or knee by replacing the time dimension with the coil dimension. Future work will investigate this in more detail.
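To put the PSNR figures from the baseline comparison in the Major points above into perspective, a PSNR gap in dB can be converted into a ratio of mean squared errors, since PSNR = 10 log10(MAX² / MSE). The sketch below is a rough reading aid, assuming all methods share the same peak value and that each reported number behaves like a single PSNR rather than an average over subjects; neither assumption is confirmed by the rebuttal. The function name is hypothetical.

```python
def mse_ratio_from_psnr(psnr_a, psnr_b):
    """Return MSE_b / MSE_a implied by two PSNR values with the same peak."""
    return 10 ** ((psnr_a - psnr_b) / 10)

# PSNR values (dB) as quoted in the rebuttal.
psnr_r4 = {"L+S-Net": 40.53, "DcCNN": 40.29, "Ours": 40.36}
psnr_r8 = {"L+S-Net": 30.36, "DcCNN": 29.77, "Ours": 35.67}

ratios_r8 = {}
for name in ("L+S-Net", "DcCNN"):
    gap = psnr_r8["Ours"] - psnr_r8[name]
    ratios_r8[name] = mse_ratio_from_psnr(psnr_r8["Ours"], psnr_r8[name])
    print(f"R8, {name}: gap {gap:.2f} dB, implied MSE ratio {ratios_r8[name]:.2f}x")

ratio_r4_ls = mse_ratio_from_psnr(psnr_r4["Ours"], psnr_r4["L+S-Net"])
print(f"R4, L+S-Net: implied MSE ratio {ratio_r4_ls:.2f}x")
```

Under these assumptions, the R8 gaps of roughly 5 to 6 dB correspond to the baselines having several times the mean squared error of the proposed method, while the R4 differences of a fraction of a dB correspond to nearly equal error.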

Minor:

  • Transformers jargon (R1): We will add more details for jargon i.e. position embedding, Multi-head Attention and improve readability.

  • Inconsistent rank using NMSE and PSNR (R1): The PSNR was calculated using 2D+t cardiac sequence while the NMSE was calculated on every single 2D cardiac frame, which caused this discrepancy. We fixed this and now use both metrics on the 2D+t cardiac sequence. The NMSE at R4 is 0.088 for both DcCNN and our method, thus it doesn’t change our experiments’ conclusion.

  • k-space visualization and gating (R1,R3): The raw data is collected with prospective gating. We simulated the single-coil data as suggested by DcCNN and CRNN [4]. We will add the groundtruth and estimated k-space images in our final version.
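The NMSE/PSNR discrepancy explained in the Minor points above can be reproduced on toy data. R1's observation holds when both metrics are computed over the same set of samples, because each is then a monotone function of the same squared error. Averaging NMSE over individual frames changes the normalisation per frame and can reorder methods. The sketch below is illustrative only: the array values, shapes, and function names are hypothetical, and taking the peak for PSNR as the maximum reference magnitude is an assumption, since the convention used in the paper is not stated here.

```python
import numpy as np

def psnr_sequence(pred, ref):
    """PSNR over the whole (t, y, x) sequence; peak = max |ref| (assumed)."""
    mse = np.mean(np.abs(pred - ref) ** 2)
    return 10 * np.log10(np.max(np.abs(ref)) ** 2 / mse)

def nmse_sequence(pred, ref):
    """NMSE over the whole sequence."""
    return np.sum(np.abs(pred - ref) ** 2) / np.sum(np.abs(ref) ** 2)

def nmse_per_frame(pred, ref):
    """Mean of NMSE values computed separately on each 2D frame."""
    return np.mean([nmse_sequence(p, r) for p, r in zip(pred, ref)])

# Two-frame toy sequence: one bright frame and one dim frame.
ref = np.stack([10.0 * np.ones((4, 4)), 1.0 * np.ones((4, 4))])

# Method A errs only on the dim frame, method B only on the bright frame.
pred_a = ref.copy()
pred_a[1] += 0.5
pred_b = ref.copy()
pred_b[0] += 1.0

results = {}
for name, pred in (("A", pred_a), ("B", pred_b)):
    results[name] = (
        psnr_sequence(pred, ref),
        nmse_sequence(pred, ref),
        nmse_per_frame(pred, ref),
    )
    psnr, nmse_seq, nmse_frame = results[name]
    print(f"{name}: PSNR {psnr:.2f} dB, sequence NMSE {nmse_seq:.4f}, "
          f"per-frame NMSE {nmse_frame:.4f}")
```

In this example, sequence-level PSNR and sequence-level NMSE both favour method A, while per-frame NMSE favours method B, because the small absolute error on the dim frame is large relative to that frame's energy.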

Refs: [1] Huang et al., Deep low-rank … MR imaging, MedIA 2021 [2] Dao et al., Flashattention … io-awareness, NeurIPS 2022 [3] Xiong et al., Nyströmformer … self-attention, AAAI 2021 [4] Qin et al., Convolutional … reconstruction, TMI 2018




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper proposes a k-space interpolation method for cine MR reconstruction using a transformer-based approach. The authors have clearly identified the questions from reviewers and attempted an effective response. It’s a bit above borderline.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The paper introduces a method for dynamic cardiac data interpolation in single coil k-space using vision transformers. While the main contribution lies in the utilization of vision transformers, it is important to acknowledge that both the concept of k-space interpolation and the use of vision transformers are well-established techniques. Therefore, the paper lacks novelty in this regard. To compensate for this limitation, a more comprehensive evaluation in a real-world setting is expected. Despite these concerns, the reviewers expressed positive feedback regarding the paper.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors provided their rebuttal to address the reviewers’ concerns in the initial review phase. I looked at the paper and rebuttal. I agreed with the reviewers on the issues of the methods, such as insufficient comparisons to more SOTA methods (R3), simulations of multi/single coil (R1 and R2), sampling strategy (R1), and assessment criteria (R1). Unfortunately, the authors did not clear those issues in their rebuttal. I have to recommend rejecting the paper at the current stage.


