
Authors

Wen Li, Saikit Lam, Tian Li, Andy Lai-Yin Cheung, Haonan Xiao, Chenyang Liu, Jiang Zhang, Xinzhi Teng, Shaohua Zhi, Ge Ren, Francis Kar-ho Lee, Kwok-hung Au, Victor Ho-fun Lee, Amy Tien Yee Chang, Jing Cai

Abstract

The purpose of this study is to investigate model generalizability using multi-institutional data for virtual contrast-enhanced MRI (VCE-MRI) synthesis. This study presented a retrospective analysis of contrast-free T1-weighted (T1w), T2-weighted (T2w), and gadolinium-based contrast-enhanced T1w MRI (CE-MRI) images of 231 nasopharyngeal carcinoma (NPC) patients enrolled from four institutions. Data from three participating institutions were employed to generate a training set and an internal testing set, while data from the remaining institution were employed as an independent external testing set. The multi-institutional data were trained separately (single-institutional models) and jointly (joint-institutional model) and tested using the internal and external sets. The synthetic VCE-MRI was quantitatively evaluated using MAE and SSIM. In addition, a visual qualitative evaluation was performed to assess the quality of the synthetic VCE-MRI compared to the ground-truth CE-MRI. Quantitative analyses showed that the joint-institutional model outperformed the single-institutional models on both internal and external testing sets and demonstrated high model generalizability, yielding top-ranked MAE and SSIM of 71.69 ± 21.09 and 0.81 ± 0.04, respectively, on the external testing set. The qualitative evaluation indicated that the joint-institutional model gave a closer visual approximation between the synthetic VCE-MRI and the ground-truth CE-MRI on the external testing set, compared with the single-institutional models. Model generalizability for VCE-MRI synthesis was enhanced, both quantitatively and qualitatively, when data from more institutions were involved during model development.
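For reference, the two evaluation metrics named above can be sketched in Python/NumPy. This is an illustrative implementation, not the paper's code (which is not released): the function names are ours, and the global-statistics SSIM shown here is a simplification of the sliding-window SSIM typically used in published evaluations.

```python
import numpy as np

def mae(pred, target):
    """Mean absolute error between a synthetic and a ground-truth image."""
    diff = pred.astype(np.float64) - target.astype(np.float64)
    return float(np.mean(np.abs(diff)))

def ssim_global(pred, target, data_range=255.0):
    """Single-window (global) SSIM with the standard constants
    C1 = (0.01 L)^2, C2 = (0.03 L)^2. Practical evaluations usually use a
    sliding-window SSIM (e.g. skimage.metrics.structural_similarity);
    this global variant keeps the sketch dependency-free."""
    x = pred.astype(np.float64)
    y = target.astype(np.float64)
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

An identical image pair yields MAE 0 and SSIM 1; cohort-level figures such as those reported above would be the mean ± standard deviation of per-image scores over the testing set.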

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16449-1_73

SharedIt: https://rdcu.be/cVRXL

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #2

  • Please describe the contribution of the paper

This paper proposes a deep learning framework and investigates its generalizability by comparing MRI contrast synthesis results from single-institutional models and a joint-institutional model. According to the results, the joint-institutional model outperformed the single-institutional models on both internal and external testing sets.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The paper is well organized and presents enough figures and tables to support the authors’ ideas.

    It is novel to investigate the generalizability of the proposed method.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The authors should provide a detailed data description. Are all T1w, T2w, and CE-MRI images from the same institute obtained using the same MRI scanner with the same pulse sequence? Do all institutes use MRI scanners from the same manufacturer? These descriptions would help readers better understand the differences in MR images from different institutes.

    Another suggestion is to consider adding a reader study to evaluate the quality of synthesized images.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Limited reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

This paper proposes a deep learning framework and investigates its generalizability by comparing MRI contrast synthesis results from single-institutional models and a joint-institutional model. According to the results, the joint-institutional model outperformed the single-institutional models on both internal and external testing sets. The paper is well organized and presents enough figures and tables to support the authors’ ideas. However, there are some major concerns.

The authors should provide a detailed data description. Are all T1w, T2w, and CE-MRI images from the same institute obtained using the same MRI scanner with the same pulse sequence? Do all institutes use MRI scanners from the same manufacturer? These descriptions would help readers better understand the differences in MR images from different institutes.

    Another suggestion is to consider adding a reader study to evaluate the quality of synthesized images.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Novelty, experimental design, result presentation.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    3

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

In this study, the authors evaluated the performance of a deep learning method (multimodality-guided synergistic neural network) for generating virtual contrast-enhanced MRIs from T1 and T2 MRIs. They explored the generalizability of the proposed method using multi-centric brain MR images. The images were acquired from patients undergoing radiotherapy. The authors found that the model trained with multi-centric MRIs had better generalizability and accuracy than those trained with images from a single clinical center.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The problem addressed by the paper is clinically relevant. Generation of virtual contrast-enhanced MRIs from T1 and T2 MRIs may help for improving patient care and decreasing the acquisition time of MRI protocols. One of the strengths of the paper is the multi-centric dataset the authors gathered for studying the generalizability of their model. Studying the generalizability of deep learning models may facilitate their integration in clinical practice. The paper is globally well written.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The strength of the paper rests mostly on the multi-centric cohort the authors gathered. No technical developments for improving the generalization power of the model were presented in this study. The clinical application of the study is important but does not seem novel. The proposed method was not compared to other state-of-the-art methods. The method evaluation lacks precise statistical analysis.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Please add the batch size used for training the networks. Please provide the computational time (at training and testing) of the networks. Additional statistical tests are needed to evaluate the significance of the results. Please use statistical tests to compare the performance of the model trained with multi-centric MRIs against those trained with MRIs from a single center. Were data augmentations performed to improve the method's generalization? Please describe the data: What MRI scanners were used for the data acquisition? How many Teslas? What were the MRI parameters (TE, TR, flip angle, acquisition time, etc.)? Did this study consider whole MRI volumes or only 2D MRI slices? It would be great to consider the whole MRI volumes during the method evaluation (this is more relevant for clinical practice). The JIM model was trained with MRIs from institutions 1, 2, and 3 and evaluated on MRIs from institution 4. That does not seem sufficient to conclude that the model trained with multi-centric MRIs has better generalizability and accuracy than those trained with images from a single institution. I think it would be interesting to train and evaluate the JIM model with MRIs from distinct combinations of institutions (e.g., train the JIM with MRIs from institutions 2, 3, and 4 and evaluate it on MRIs from institution 1) and compare the obtained results (evaluation metrics + statistical tests).

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

Please discuss the limitations of the study. Please discuss the performance of all methods on challenging subjects (worst cases). Are the pathologies preserved after generating the virtual contrast-enhanced MRIs?

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper is clinically relevant and may interest a lot of scientists. Studying the generalization of deep learning methods is important for facilitating their integration into clinical practice.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    3

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #5

  • Please describe the contribution of the paper

The purpose of this study is to investigate model generalizability using multi-institutional data for virtual contrast-enhanced MRI (VCE-MRI) synthesis. This study presented a retrospective analysis of contrast-free T1-weighted (T1w), T2-weighted (T2w), and gadolinium-based contrast-enhanced T1w MRI (CE-MRI) images of 231 NPC patients enrolled from four institutions.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

1) Well-organized and well-written paper; easily readable, with high reproducibility. 2) Well-planned study that clearly answers the initial hypothesis. 3) Clinical contribution and conclusions about generalization.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

1) There is no statistical analysis of the significance of the results on the testing cohorts; such analysis would strengthen the final conclusion about training on different cohorts and generalization. 2) More details are needed about the institutes and the differences in the cohorts’ modalities and resolution quality. 3) Minor typos (“conclusion” needs a capital C; a reference or equation for the L1 loss is needed).

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

High reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

A very nice study; just some minor concerns: 1) A statistical analysis of the significance of the results on the testing cohorts is needed so the authors can strengthen the final conclusion about training on different cohorts and generalization. 2) More details are needed about the institutes and the differences in the cohorts’ modalities and resolution quality. 3) Minor typos (“conclusion” needs a capital C; a reference or equation for the L1 loss is needed).

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

1) Well-organized and well-written paper; easily readable, with high reproducibility. 2) Well-planned study that clearly answers the initial hypothesis. 3) Clinical contribution and conclusions about generalization. A very nice study; just some minor concerns: 1) A statistical analysis of the significance of the results on the testing cohorts is needed so the authors can strengthen the final conclusion about training on different cohorts and generalization. 2) More details are needed about the institutes and the differences in the cohorts’ modalities and resolution quality. 3) Minor typos (“conclusion” needs a capital C; a reference or equation for the L1 loss is needed).

  • Number of papers in your stack

    8

  • What is the ranking of this paper in your review stack?

    3

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

This paper proposes a deep learning framework and investigates its generalizability by comparing MRI contrast synthesis results from single-institutional models and a joint-institutional model. The reviewers agreed that the clinical value of the proposed approach is high. Overall, the paper is well written. However, a few concerns were raised with regard to the experimental design, which can be addressed in the final submission.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    5




Author Feedback

We would like to thank all the reviewers for their invaluable comments. This rebuttal begins by listing the major comments (C) and providing our answers (A); please find our responses below:

C1: The authors should provide a detailed data description. Are all T1w, T2w, and CE-MRI images from the same institute obtained using the same MRI scanner with the same pulse sequence? Do all institutes use MRI scanners from the same manufacturer? These descriptions would help readers better understand the differences in MR images from different institutes.

A1: Thank you for your suggestion. We will clarify this in the manuscript.

C2: Another suggestion is to consider adding a reader study to evaluate the quality of synthesized images.

A2: Thank you for your suggestion. We conducted a reader study in a previous work [1]. In this follow-up work, we mainly focused on the generalizability investigation and used qualitative and quantitative metrics to evaluate the quality of the synthetic images. Another reason we did not include a reader study is that the visual quality of the VCE-MRI generated by different models varies significantly (as shown in Fig. 4), so we did not consider a reader study necessary in this work.

C3: The batch size should be added; the detailed data description should be added.

A3: Thank you for your comments. We will add this information in the manuscript.

C4: Please provide the computational time for model training and testing.

A4: Thank you for the comment. We agree that the time for network training and testing is an important consideration during model development. However, in this work we applied a previously developed model rather than focusing on model development.

C5: Were data augmentations performed to improve the method's generalization?

A5: Thank you for your question. No data augmentation was performed in this study.

C6: Please use statistical tests to compare the performance of the models.

A6: Thank you for your comment. The statistical tests were provided in the text instead of in a table due to page limitations. Please refer to page 6, Quantitative results.
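[Editor's note: for readers wishing to reproduce such a paired comparison of models evaluated on the same patients, one dependency-free option is an exact sign-flip permutation test on per-patient differences (a scipy-free alternative to the Wilcoxon signed-rank test the reviewers likely had in mind). The per-patient MAE values below are hypothetical, not the paper's data.]

```python
import numpy as np

def paired_signflip_pvalue(diff):
    """Exact one-sided sign-flip permutation test for paired differences.

    Tests H0 (no systematic difference) against H1 (mean difference > 0)
    by enumerating all 2^n sign assignments; exact and feasible for small n."""
    n = len(diff)
    observed = diff.mean()
    count = 0
    for mask in range(1 << n):
        signs = np.where([(mask >> i) & 1 for i in range(n)], 1.0, -1.0)
        if (signs * diff).mean() >= observed:
            count += 1
    return count / (1 << n)

# Hypothetical per-patient MAE values for two models on the same test set:
mae_joint  = np.array([70.1, 68.4, 73.2, 65.9, 71.8, 69.0, 74.5, 66.3, 70.7, 72.0])
mae_single = np.array([74.3, 73.5, 77.0, 71.9, 76.7, 74.5, 77.8, 71.0, 75.9, 76.1])

# All 10 paired differences are positive, so p == 1/1024 (about 0.001).
p = paired_signflip_pvalue(mae_single - mae_joint)
```

With larger cohorts one would switch to a Monte Carlo version (sampling sign assignments) or to `scipy.stats.wilcoxon`, but the paired structure of the test is the same.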

C7: It would be interesting to train and evaluate the JIM model with MRIs from distinct combinations of institutions (e.g., train the JIM with MRIs from institutions 2, 3, and 4 and evaluate it on MRIs from institution 1) and compare the obtained results.

A7: Thank you for your comment. In our dataset, we have only 18 patients from institution 4, while institutions 1, 2, and 3 each had 53 patients. We think such unbalanced combinations are not appropriate for this comparison.

C8: Please discuss the limitations of the study and the performance of the models on challenging subjects. Are the pathologies preserved after generating the VCE-MRIs?

A8: Thank you for your comments and questions. We will clarify the limitations in the manuscript. We agree that assessing model performance on challenging subjects is an important way to comprehensively evaluate the trained models. However, due to the limited number of subjects tested, we did not focus on challenging subjects in this work. Yes, the pathologies were preserved.

Reference: [1] Virtual Contrast-Enhanced Magnetic Resonance Images Synthesis for Patients With Nasopharyngeal Carcinoma Using Multimodality-Guided Synergistic Neural Network.


