Authors

Prasun C. Tripathi, Mohammod N. I. Suvon, Lawrence Schobs, Shuo Zhou, Samer Alabed, Andrew J. Swift, Haiping Lu

Abstract

Heart failure is a severe and life-threatening condition that can lead to elevated pressure in the left ventricle. Pulmonary Arterial Wedge Pressure (PAWP) is an important surrogate marker indicating high pressure in the left ventricle. PAWP is determined by Right Heart Catheterization (RHC) but it is an invasive procedure. A non-invasive method is useful in quickly identifying high-risk patients from a large population. In this work, we develop a tensor learning-based pipeline for identifying PAWP from multimodal cardiac Magnetic Resonance Imaging (MRI). This pipeline extracts spatial and temporal features from high-dimensional scans. For quality control, we incorporate an uncertainty-based binning strategy to identify poor-quality training samples. We leverage complementary information by integrating features from multimodal data: cardiac MRI with short-axis and four-chamber views, and cardiac measurements. The experimental analysis on a large cohort of 1346 subjects who underwent the RHC procedure for PAWP estimation indicates that the proposed pipeline has a diagnostic value and can produce promising performance with significant improvement over the baseline in clinical practice. The decision curve analysis further confirms the clinical utility of our method. The source code can be found at: https://github.com/prasunc/PAWP.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43990-2_20

SharedIt: https://rdcu.be/dnwLv

Link to the code repository

https://github.com/prasunc/PAWP

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper presents a pipeline for identifying Pulmonary Arterial Wedge Pressure (PAWP), a surrogate marker for high pressure in the left ventricle, using multimodal cardiac Magnetic Resonance Imaging (MRI) and Electronic Health Records (EHR). The proposed pipeline uses tensor learning to extract spatial and temporal features from high-dimensional scans and incorporates an epistemic uncertainty-based binning strategy for quality control. The authors also integrate features from multimodal data to improve performance. The experimental analysis on a large cohort of 1346 subjects who underwent the RHC procedure for PAWP estimation shows that the proposed pipeline has diagnostic value and can produce promising performance with significant improvement over the baseline in clinical practice. The decision curve analysis further confirms the clinical utility of the method.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The study presents a pipeline for clinical use, and the evaluation experiments appear acceptable.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The methodology uses the MPCA method to predict PAWP, which amounts to combining existing methods.

    2. The authors did not compare their approach to any other methods, which is a limitation.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
    1. The prediction method used in the study is SVM, which requires training for different features and is time-consuming. Have the authors considered using deep learning-based methods, such as fully connected (FC) layers?
    2. The discussion of the results in the study is limited.
  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    The authors did not make the code openly available.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The innovation of the method

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    In this paper, the authors propose a fully automatic pipeline for predicting pulmonary arterial wedge pressure (PAWP) using cardiac MRI and electronic health records. PAWP is an important indicator of heart failure severity and is usually measured by invasive and expensive right heart catheterization. The proposed pipeline utilises multilinear principal component analysis to reduce feature dimensions while preserving spatial and temporal information in cardiac MRI. The authors also leverage automatic landmarks with uncertainty quantification to tackle the challenge of manual landmark labelling. Furthermore, they extract complementary information from multimodal data, including short-axis, four-chamber, and electronic health record features. The effectiveness of the proposed pipeline was validated on cardiac MRI scans of 1346 patients with various heart diseases, showing a significant improvement over the current clinical baseline. Decision curve analysis indicates its diagnostic value.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The authors developed a fully automatic pipeline for PAWP prediction using cardiac MRI and electronic health records, which includes automatic landmark detection with uncertainty quantification, an uncertainty-based binning strategy for training sample selection, tensor feature learning, and multimodal feature integration. The use of tensor-based features for analysing cardiac MRI scans is a novel and effective approach. The multilinear PCA method utilised in this paper is well-suited for analysing high-dimensional spatial and temporal features generated throughout the cardiac cycle. The authors integrated features extracted from electronic health records identified in baseline work for PAWP prediction, which enhanced the diagnostic power of the pipeline. This demonstrates the potential of multimodal learning for solving healthcare problems. 

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    I understand the motivation behind the uncertainty-based binning and the improvement in AUC when using 4Ch and SA features, but excluding more than 10 percent of the data is a large portion of data on which this framework may not work well. I would have preferred if the authors had kept the data and compared performances with and without uncertainty estimation for all modalities.
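To make the comparison requested above concrete, here is a minimal sketch of one way an uncertainty-based binning step could be applied before training. It is not the authors' implementation: the function names, the use of equal-sized quantile bins, and the choice of five bins with one dropped are all assumptions made for illustration.

```python
def quantile_bins(uncertainties, n_bins=5):
    """Assign each sample a bin index in 0..n_bins-1 by uncertainty rank.

    Bin 0 holds the most certain samples; the last bin holds the least certain.
    (Hypothetical helper; equal-sized quantile bins are an assumption.)
    """
    n = len(uncertainties)
    order = sorted(range(n), key=lambda i: uncertainties[i])
    bins = [0] * n
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // n
    return bins


def keep_confident(uncertainties, n_bins=5, drop_bins=1):
    """Return indices of samples outside the `drop_bins` most uncertain bins."""
    bins = quantile_bins(uncertainties, n_bins)
    return [i for i, b in enumerate(bins) if b < n_bins - drop_bins]


# Made-up uncertainty scores for ten training samples.
scores = [0.1, 0.9, 0.3, 0.5, 0.2, 0.8, 0.4, 0.7, 0.6, 0.05]
kept = keep_confident(scores)   # training set "with binning"
everything = list(range(len(scores)))  # training set "without binning"
```

Training the same classifier once on `kept` and once on `everything`, for each modality, would give the with/without comparison asked for here.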

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors state they will make the code and the data set publicly available if accepted.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    I would be interested in seeing which features contributed the most to the MPCA. For example, how important are certain elements of the EHR to the model (left atrial volume, etc.)?

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The study has a convincing clinical focus, and the authors used a large data set to test the methods. They demonstrate clearly that the combination of all three modes, especially the inclusion of the EHR, leads to better accuracy, AUC, and MCC. However, I think the inclusion of EHR is a fairly common approach, and the authors may overestimate the novelty of using this additional information.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The authors built a pipeline for Pulmonary Arterial Wedge Pressure (PAWP) classification from cardiac MRI and heart measurements. Firstly, the authors integrated their previously developed techniques to build this pipeline, including landmark detection with uncertainty for registration and multilinear principal component analysis (MPCA) for dimension reduction. Then the features are fused with two measurements of the heart for prediction.

    The experimental results show the effectiveness of their fusion method on a dataset with over 1300 patients.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The authors built an automated pipeline for PAWP classification. The authors show that fusing cardiac MRI with different views can improve classification performance. The authors show that the clinical variables, the measurements of the hearts, associated with cardiac MRI images can improve the prediction performance.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Wording: The use of the term ‘EHR’ is very confusing. a. The authors mentioned EHR features several times but did not give a clear description until the experiment section. b. Moreover, the ‘EHR’ used in this paper was not the actual electronic health record, such as admissions, lab events, and prescriptions. In their experiments, the authors used left ventricle mass and left atrial volume, which are calculated from imaging.
    2. No comparison with deep neural network-based classification methods. a. In recent years, deep learning has become the most popular approach to natural and medical image classification. When solving similar problems, using a deep learning-based method or comparing with one is almost unavoidable. b. The PAWP classification can be considered as a cardiac MRI classification problem [R1, R2]. c. The authors used CNN methods to localize landmarks but did not compare against any CNN-based classification method and gave no explanation for this, which is not convincing.

    [R1] Fries, Jason A., et al. “Weakly supervised classification of aortic valve malformations using unlabeled cardiac MRI sequences.” Nature communications 10.1 (2019): 3111.
    [R2] Clough, James R., et al. “Global and local interpretability for cardiac MRI classification.” Medical Image Computing and Computer Assisted Intervention–MICCAI 2019: 22nd International Conference, Shenzhen, China, October 13–17, 2019, Proceedings, Part IV 22. Springer International Publishing, 2019.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    According to the reproducibility checklist, the authors will open-source their work.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Please replace the word EHR with a more precise word.

    If possible, train a deep neural network for classification. It will help other researchers know the boundaries of deep learning-based methods for this task.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The authors built an effective pipeline for the PAWP classification problem.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    The authors answered my questions on comparing deep learning-based methods.

    However, I still have concerns with the word ‘EHR’ in this manuscript. The authors selected LA volume and LV mass as the ‘EHR feature’, which are far away from typical electronic health records, like demographics, vital signs, visiting events, lab tests, prescriptions, procedures, condition history, etc.

    It would be more accurate to call these two features ‘cardiac indices’ or ‘cardiac measurements’ rather than ‘electronic health records’. The term ‘EHR’ covers much more than is described in this manuscript.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The paper presents a pipeline for identifying Pulmonary Arterial Wedge Pressure (PAWP), a surrogate marker for high pressure in the left ventricle, using multimodal cardiac Magnetic Resonance Imaging (MRI) and Electronic Health Records (EHR). The proposed pipeline utilises multilinear principal component analysis to reduce feature dimensions while preserving spatial and temporal information in cardiac MRI. The authors also leverage automatic landmarks with uncertainty quantification to tackle the challenge of manual landmark labelling. Furthermore, they extract complementary information from multimodal data, including short-axis, four-chamber, and electronic health record features. The effectiveness of the proposed pipeline was validated on cardiac MRI scans of 1346 patients with various heart diseases, showing a significant improvement over the current clinical baseline. Decision curve analysis indicates its diagnostic value.

    Strengths of the paper:

    • The study presents a pipeline for clinical use, and the evaluation experiments appear acceptable.
    • Use of multimodal data for PAWP, including cardiac MRI and electronic health records, which exemplifies the potential of multimodal learning for solving healthcare problems.
    • The framework includes automatic landmark detection with uncertainty quantification, an uncertainty-based binning strategy for training sample selection, tensor feature learning, and multimodal feature integration. The use of tensor-based features for analysing cardiac MRI scans is a novel and effective approach.
    • The authors integrated features extracted from electronic health records identified in baseline work for PAWP prediction, which enhanced the diagnostic power of the pipeline.
    • The authors built an automated pipeline for PAWP classification.
    • The authors show that fusing cardiac MRI with different views can improve classification performance.
    • The authors show that the clinical variables, the measurements of the hearts, associated with cardiac MRI images can improve the prediction performance.

    Weaknesses of the paper:

    • The methodology uses the MPCA method to predict PAWP, which amounts to combining existing methods.
    • The authors did not compare their approach to any other methods.
    • The discussion of the results in the study is limited; it would be advisable to extend it.
    • It would be good to better introduce the EHR features in the introduction section, as they only become clear in the experiment section.
    • The PAWP classification can be considered as a cardiac MRI classification problem, and the authors could acknowledge this.

    Recommendation: The paper proposed an interesting pipeline for identifying Pulmonary Arterial Wedge Pressure, but there are some points that will need to be revised or included for acceptance.




Author Feedback

Q1: The methodology involves the use of the MPCA method to predict PAWP which is combining other methods together. (R1: Weakness 1).

A1: The primary objective of our work was to construct an interpretable, simple, and efficient pipeline for PAWP classification that can solve a practical clinical problem. By utilizing MPCA, we can correlate predictions with the clinical image features associated with PAWP, a factor that is pivotal in establishing the clinical reliability of our pipeline for clinical decision-making. We strategically employed various components based on specific requirements and deep insights. We consider this integration effort as a valuable contribution.
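For readers unfamiliar with MPCA, the following is a minimal sketch of multilinear PCA on a stack of tensor samples, written with NumPy. It is a simplified illustration rather than the pipeline's actual code (which is in the linked repository): the function names, the identity initialisation, the fixed number of sweeps, and the random toy data are all assumptions.

```python
import numpy as np


def unfold(t, mode):
    """Mode-n unfolding: bring axis `mode` to the front and flatten the rest."""
    return np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)


def project(x, factors, skip=None):
    """Multiply one sample tensor by U^T along every mode except `skip`."""
    for mode, u in enumerate(factors):
        if mode == skip:
            continue
        # Contract axis `mode` of x with u, then restore the axis order.
        x = np.moveaxis(np.tensordot(u.T, x, axes=(1, mode)), 0, mode)
    return x


def mpca_fit(samples, ranks, n_iter=3):
    """Fit one projection matrix per mode.

    samples: array of shape (M, I1, ..., IN); ranks: (P1, ..., PN).
    """
    mean = samples.mean(axis=0)
    centered = samples - mean
    # Simplification: start from identity matrices and refine by alternating sweeps.
    factors = [np.eye(dim) for dim in centered.shape[1:]]
    for _ in range(n_iter):
        for mode, rank in enumerate(ranks):
            dim = centered.shape[mode + 1]
            scatter = np.zeros((dim, dim))
            for x in centered:
                y = unfold(project(x, factors, skip=mode), mode)
                scatter += y @ y.T
            # eigh returns eigenvalues in ascending order; keep the top `rank` vectors.
            _, eigvecs = np.linalg.eigh(scatter)
            factors[mode] = eigvecs[:, ::-1][:, :rank]
    return mean, factors


def mpca_transform(samples, mean, factors):
    """Project every centered sample onto the learned low-dimensional tensor space."""
    return np.stack([project(x - mean, factors) for x in samples])


# Toy data: 20 random "scans" of size 8 x 8 with 5 time frames.
rng = np.random.default_rng(0)
scans = rng.normal(size=(20, 8, 8, 5))
mean, factors = mpca_fit(scans, ranks=(3, 3, 2))
features = mpca_transform(scans, mean, factors)
```

Because each mode keeps its own projection matrix, the reduced features retain separate spatial and temporal axes, which is what allows learned weights to be mapped back to image regions.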

Q2: No comparison with any other methods (R1: Weakness 2).

A2: Our method is compared with two baselines in Table 2 (Rows 2 to 4 on Page 6). Row 2: Garg et al. (2022) [6] (unimodal-EHR) is the state-of-the-art clinical approach for this problem. Rows 3 and 4: Swift et al. (2021) [20] is the unimodal baseline on cardiac MRI. We will add reference [20] to rows 3 and 4 of Table 2 in the revised version.

Q3: No comparison with deep learning methods (R1 and R3 Weakness 2. a and 2. c).

A3: We agree that deep learning methods have many strengths. However, they are not fully interpretable and transparent, which is highly desirable and important in clinical decision-making (e.g., diagnosis). In contrast, image processing, e.g., landmark localization, has a lower requirement on interpretability, where we have chosen to leverage the power of deep learning.

Our classification pipeline is linear and therefore the contribution/importance of input features is interpretable (see an example interpretation in A7, our answer to R2’s question), which makes clinical decision-making more transparent and trustworthy.

Q4: SVM is time-consuming (R1)

A4: We chose SVM by following [18,19,20] and the average time cost of training an SVM classifier is less than one minute in our experiments on a standard CPU machine.

Q5: Code Availability (R1)

A5: The link to the source code will be included in the final version, following the guidelines of the reproducibility checklist.

Q6: Compare performances with and without uncertainty-based binning on all modalities (R2 Weakness).

A6: For the best-performing model in Table 2, the results obtained without uncertainty estimation-based data exclusion are: AUC = 0.8036, Accuracy = 0.7820, MCC = 0.4779. These are decreases of 0.0291, 0.0218, and 0.032, respectively, relative to the results with exclusion. We will add these results as an additional row in Table 2. Due to the space limit, we will not be able to add such results for all settings.
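The figures in A6 imply the corresponding scores with uncertainty-based exclusion. The short calculation below only adds each stated decrease back onto the score without exclusion. The resulting values are derived arithmetic, not numbers quoted from Table 2, and the variable names are illustrative.

```python
# Scores without uncertainty-based exclusion, as stated in A6.
without_exclusion = {"AUC": 0.8036, "Accuracy": 0.7820, "MCC": 0.4779}
# Decrease of each score relative to the model trained with exclusion, as stated in A6.
stated_decrease = {"AUC": 0.0291, "Accuracy": 0.0218, "MCC": 0.032}

# Implied scores with exclusion: score without exclusion plus the stated decrease.
implied_with_exclusion = {
    name: round(without_exclusion[name] + stated_decrease[name], 4)
    for name in without_exclusion
}

for name, value in implied_with_exclusion.items():
    print(f"{name}: {value:.4f}")
```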

Q7: Feature contribution to the models (R2).

A7: For cardiac MRI scans, the highly-weighted features were detected in the left ventricle and interventricular septum. For EHR features, left atrial volume (0.778 out of 1) contributed more than left ventricular mass (0.222 out of 1) to the prediction. We will add a short text description of important features before the Conclusion section on Page 8 in the revised version.
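The rebuttal does not state how the "out of 1" contributions were computed. One common way to obtain such shares from a linear classifier is to normalise the absolute feature weights so they sum to one. The sketch below shows that calculation under this assumption; the weight values are hypothetical and were chosen only so that the shares match the reported 0.778 and 0.222.

```python
def relative_contributions(weights):
    """Normalise absolute linear-model weights so that they sum to 1."""
    total = sum(abs(w) for w in weights.values())
    return {name: abs(w) / total for name, w in weights.items()}


# Hypothetical weights of a linear classifier on standardised features.
example_weights = {"left_atrial_volume": 3.5, "left_ventricular_mass": -1.0}
shares = relative_contributions(example_weights)

for name, share in shares.items():
    print(f"{name}: {share:.3f}")
```

Taking absolute values means a feature with a large negative weight still counts as influential; the sign only indicates the direction of its effect.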

Q8: Wording: The use of the term “EHR” is very confusing, as the EHR features only become clear in the experiment section. (R3 weakness 1).

A8: Following [6], we use only two EHR features: left ventricle mass and left atrial volume. We explained our choice in the second-to-last paragraph of the Introduction (Page 2, above our main contributions). We will clarify the EHR features used early in the revised version.

Q9: PAWP classification as cardiac MRI classification in two suggested references (R3 Weakness 2. b).

A9: It could be unfair to compare the suggested works (for CMRI only) with our proposed multimodal pipeline directly. However, it would be interesting to include the two suggested papers as two unimodal baselines in our longer journal version. Due to page and time limitations, we were not able to perform this comparison in the submitted manuscript and this rebuttal.




Post-rebuttal Meta-Reviews

Meta-review #1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Although the methodological novelty is weak because the work combines components from existing studies, the motivation is driven by observations from medical tasks, and the experimental results demonstrated better performance than other methods. Thus, I am inclined to accept.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper proposes an interesting pipeline for pulmonary arterial wedge pressure (PAWP) classification. However, my main concern is regarding its methodological contribution. It seems to be a standard pipeline that extracts multimodal features and feeds them into an SVM for classification. There is a lack of method comparison to convince me why this would be the best pipeline for this task.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Using the MPCA method on tensors to predict PAWP lacks interpretability. The fact that this work did not compare with other imaging+EHR-based methods (the listed ones are unimodal methods) further reduces the validity of this paper.


