
Authors

Zhiyun Song, Xin Wang, Xiangyu Zhao, Sheng Wang, Zhenrong Shen, Zixu Zhuang, Mengjun Liu, Qian Wang, Lichi Zhang

Abstract

Cross-modality synthesis (CMS) and super-resolution (SR) have both been extensively studied with learning-based methods, which aim to synthesize desired modality images and reduce slice thickness for magnetic resonance imaging (MRI), respectively. It is also desirable to build a network for simultaneous cross-modality and super-resolution (CMSR) so as to further bridge the gap between clinical scenarios and research studies. However, these works are limited to specific fields. None of them can flexibly adapt to various combinations of resolution and modality, and perform CMS, SR, and CMSR with a single network. Moreover, alias frequencies are often treated carelessly in these works, leading to inferior detail-restoration ability. In this paper, we propose Alias-Free Co-Modulated network (AFCM) to accomplish all the tasks with a single network design. To this end, we propose to perform CMS and SR consistently with co-modulation, which also provides the flexibility to reduce slice thickness to various, non-integer values for SR. Furthermore, the network is redesigned to be alias-free under the Shannon-Nyquist signal processing framework, ensuring efficient suppression of alias frequencies. Experiments on three datasets demonstrate that AFCM outperforms the alternatives in CMS, SR, and CMSR of MR images. Our codes are available at https://github.com/zhiyuns/AFCM.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43999-5_7

SharedIt: https://rdcu.be/dnwjh

Link to the code repository

https://github.com/zhiyuns/AFCM

Link to the dataset(s)

https://brain-development.org/ixi-dataset/

https://adni.loni.usc.edu/


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper proposes a method to perform cross-modality synthesis (CMS) and super-resolution (SR) consistently using co-modulation, which allows slice thickness to be reduced to non-integer values. The network is designed to be alias-free under the Shannon-Nyquist signal processing framework, enabling efficient suppression of alias frequencies.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The motivation of the work is well-defined: (1) inconsistent 3D reconstruction across slices is a notorious issue; (2) existing synthesis models can only generate a fixed slice thickness; (3) high-frequency details with alias frequencies are often ignored.

    Open datasets were included for validation.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The experimental results show that the proposed method outperforms the baselines; however, the improvement still seems to be moderate. A significance test is missing.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    It seems to be reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    • ‘Table 1 … only one slice is taken as the input’: what is the slice selection standard?
    • What are the sample selection criteria in Figure 2?
    • I am curious whether the proposed results bring significant benefits for downstream analysis.
    • Recently, diffusion models have shown promising results for synthesis (e.g., super-resolution); please add a relevant discussion.
    • The modality labels are missing in the left part of Fig. 3.
    • For non-integer slice thickness, how do the results differ from directly resampling an integer slice thickness to the target thickness?
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The motivation of the paper.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The authors propose Alias-Free Co-Modulated network (AFCM) to accomplish cross-modality synthesis (CMS) and super-resolution (SR) with a single network design through co-modulation, which provides the flexibility to reduce slice thickness to various, non-integer values for SR. To ensure alias-free generation, the authors redesign the resampling filter and nonlinearity in each layer of the decoder to suppress the corresponding alias frequencies.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The paper is well-written, easy to follow and addresses an important issue.
    2. The results (PSNR and SSIM) look promising.
    3. An adequate literature review is provided and an adequate number of experiments have been done.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. A description of clinical significance is lacking. The submission could be made stronger with input from clinicians on the generated images, e.g., by conducting a user study to see whether the generated images look realistic, or by using the generated images for a downstream task such as segmentation of blood clots for the CSDH dataset or detection/diagnosis of Alzheimer’s.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Implementation details are somewhat provided.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Please see weakness section.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper lacks a description of clinical significance. The submission could be made stronger with input from clinicians on the generated images, e.g., by conducting a user study to see whether the generated images look realistic, or by using the generated images for a downstream task such as segmentation of blood clots for the CSDH dataset or detection/diagnosis of Alzheimer’s. I have concerns about the reproducibility of the results: the implementation details are only partly present, and the details of the in-house dataset are missing.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This manuscript is on cross-modality synthesis (CMS), super-resolution (SR), and simultaneous cross-modality and super-resolution (CMSR) of MRI data. It addresses these problems jointly. It suppresses aliasing artifacts through a proper design of the generator with a co-modulated network. The reconstruction is done slice-wise for the images, and the method can resample the slices continuously to a non-integer target thickness. The method is called Alias-Free Co-Modulated network (AFCM).

    It uses deep learning with a GAN net. The implementation uses an NVIDIA GPU.

    It is evaluated on three open datasets. The validation measures are PSNR and SSIM. It is compared with other tools based on deep learning. These are ResViT, DeepResolve, and SynthSR. The proposed methodology improves comparative performance.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    It addresses a valid problem for medical imaging that is dealing with multimodal and multi-contrast data at various resolutions. It uses cross-modality synthesis and super-resolution. The validation shows its efficiency.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    In the comparison section, the EDSR method is mentioned and cited in Table 3 but not described in the manuscript.

    There are multiple experiments on super-resolution and cross-modality synthesis. They are not sufficiently explained. The improvement in the performance of the method is marginal, and sometimes it does worse, for example compared with SynthSR [10]. The images in most cases do not show any visible improvement compared with some of the other methods. In any case, the tables and figures with the results are very small and difficult to see and read. The time and space requirements of the method are not described.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    No information is provided in the text. But, the authors state that they will make the code available upon acceptance.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Improve the focus of the paper. A very large number of experiments are reported. However, they are not sufficiently described or distinguished from one another. The presentation of the results can also be improved. A reasonable number of well-explained experiments with demonstrated efficiency would be preferable. Figures and tables with sufficiently large images and text would help.

    It would be better if the manuscript was self-explanatory without the need to refer to references [15,16].

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Addresses a valid problem, namely of cross modality synthesis and super-resolution. But, a large number of experimental results are presented without being sufficiently described and the results are not clearly presented.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The motivation of the work is well-defined, addressing the issues of 3D inconsistent reconstruction, fixed thickness in synthesis models, and the neglect of high-frequency details with alias frequencies. Open datasets were included for validation.

    The paper is well-written, easy to follow, and addresses an important issue. The results in terms of PSNR and SSIM look promising, and an adequate amount of literature review and experiments have been conducted.

    The proposed method deals with multimodal and multi-contrast medical imaging data at various resolutions, using cross-modality synthesis and super-resolution. The validation demonstrates its efficiency.

    However, the improvement achieved by the proposed method is considered moderate, and the significance test is missing. The paper lacks a description of the clinical significance, which could be strengthened by involving clinicians for evaluation or using the generated images for downstream tasks. The EDSR method is mentioned and cited but not included in the manuscript.

    There are multiple experiments on super-resolution and cross-modality synthesis that are not sufficiently explained. The improvement in performance is marginal, and the images often do not show visible improvements compared to other methods. The presentation of tables and figures with results is criticized for being small and difficult to read. The time and space requirements of the method are not described.

    Specific questions are raised regarding the slice selection standard in Table 1, the sample selection criteria in Figure 2, the benefits of the proposed results for downstream analysis, the discussion of diffusion models for synthesis, the missing modality labels in Figure 3, and, for non-integer thickness, the comparison with directly resampling an integer slice thickness to the target thickness.

    The focus of the paper should be improved, reducing the number of experiments and providing better descriptions and discrimination between them. The presentation of results and figures can be improved with larger images and text. It is suggested to make the manuscript self-explanatory without relying heavily on external references.

    The work might be accepted via some modifications suggested above.




Author Feedback

Thank you for your detailed and insightful feedback. We have thoroughly addressed the comments made by the reviewers and AC, and provide point-by-point responses to the major concerns as follows:

  1. The improvement achieved by the proposed method is considered moderate. (R1, R3) The significance test indicates that our method significantly outperforms the baselines for CMS. For SR and CMSR, our 2D-based model also yields superior results compared with other 3D-based models. The experiments demonstrate our main contribution: to establish a single network for multiple tasks. Qualitative analysis in Fig. 2, Fig. S1, and Fig. S2 shows that our method restores details more precisely and accurately, which is a significant advantage over the alternatives.

  2. The significance test is missing. (R1, R3) We have conducted an independent two-sample t-test, which shows that the generation quality of our method is significantly higher than that of the baselines in CMS and CMSR (SSIM) (p<0.05 in both cases).

  3. The paper lacks a description of the clinical significance, which could be strengthened by involving clinicians for evaluation or using the generated images for downstream tasks. (R1, R2) We would like to clarify that our paper does provide a description of the clinical significance of our work, particularly in terms of downstream tasks. As we mentioned on Page 8, we used the synthesized images for the clot segmentation task on the CSDH dataset and reported the Dice coefficient. In our revised manuscript, we will further emphasize this by adding more analysis and discussion of the clinical significance, as follows: “We also evaluate whether the synthesized high-resolution images can be used for downstream tasks. As observed from Fig. 3, when using a pre-trained segmentation model to segment the liquefied blood clots in the reconstructed images, AFCMsr can produce the most reliable results, which also indicates the superiority of our method for clinical applications.”

  4. What is the slice selection standard in Table 1? What are the sample selection criteria in Figure 2? The modality labels are missing in the left part of Fig. 3. (R1) The slice selection standard in Table 1 is described in the last paragraph of Sect. 2.1. Specifically, for one-to-one translation, only one slice is used as input for predicting the corresponding output slice; for many-to-one translation, 3 adjacent slices are also used as the input. In Fig. 2, we randomly select a sample for visualization. We will add the modality labels in the caption of Fig. 3.

  5. For non-integer slice thickness, how do the results differ from directly resampling an integer slice thickness to the target thickness? (R1) Due to the page limitation, we only qualitatively present the CMSR results with non-integer scale and target thickness. Also note that the direct resampling method has not been designed for our task of CMSR; therefore, it is not feasible to compare it here. As our SR experiment has demonstrated our method’s superiority over the alternatives, it can also demonstrate its effectiveness over direct resampling.

  6. There are multiple experiments that are not sufficiently explained. (R3) We will add statistical analysis, more comprehensive explanations, as well as a description of baselines (such as EDSR) in our revised manuscript.

  7. The presentation of tables and figures with results is criticized for being small and difficult to read. (R3) We made a compromise between the completeness of our work and the space requirement for the manuscript. We will try to reorganize the tables and figures so that they are more readable.

  8. The time and space requirements of the method are not described. (R3) The descriptions about the time and space requirements will be added as follows: “AFCM is trained with batch size 16 on an NVIDIA GeForce RTX 3090 GPU with 24G memory for 100 epochs, which takes about 36 GPU hours for training and 5 seconds per case for inference.”
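
As an illustration of the kind of test named in point 2, the following is a minimal sketch of an independent two-sample t-test with pooled variance, written with the Python standard library only. It is not the authors' analysis code. The function names `two_sided_p` and `two_sample_t_test` and the SSIM values are hypothetical and serve only to show the computation; in practice `scipy.stats.ttest_ind` is the usual choice.

```python
import math
from statistics import mean, variance


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t: float, df: float, steps: int = 2000) -> float:
    """Two-sided p-value for a t statistic, via Simpson's rule on [0, |t|]."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)


def two_sample_t_test(a, b):
    """Student's independent two-sample t-test assuming equal variances."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, two_sided_p(t, df)


# Hypothetical per-case SSIM scores, NOT values from the paper.
ours = [0.912, 0.905, 0.921, 0.917, 0.909, 0.915]
baseline = [0.897, 0.889, 0.903, 0.899, 0.893, 0.901]

t_stat, p_value = two_sample_t_test(ours, baseline)
print(f"t = {t_stat:.3f}, p = {p_value:.5f}")
```

Because both methods are typically evaluated on the same test cases, a paired test may also be worth considering; the sketch follows the independent-samples form stated in the response.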


