
Authors

Rohan Dhamdhere, Gourav Modanwal, Mohamed H. E. Makhlouf, Neda Shafiabadi Hassani, Satvika Bharadwaj, Pingfu Fu, Ioannis Milioglou, Mahboob Rahman, Sadeer Al-Kindi, Anant Madabhushi

Abstract

Chronic Kidney Disease (CKD) patients are at higher risk of Major Adverse Cardiovascular Events (MACE). Echocardiography evaluates left ventricle (LV) function and heart abnormalities. LV Wall (LVW) pathophysiology and systolic/diastolic dysfunction are linked to MACE outcomes (O- and O+) in CKD patients. However, traditional LV volume-based measurements like ejection fraction (EF) offer limited predictive value as they rely only on end-phase frames. We hypothesize that analyzing LVW morphology over time, through spatiotemporal analysis, can predict MACE risk in CKD patients. However, accurately delineating and analyzing LVW at every frame is challenging due to noise, poor resolution, and the need for manual intervention. Our contribution includes (a) developing an automated pipeline for identifying and standardizing heart-beat cycles and segmenting the LVW, (b) introducing a novel computational biomarker, STAR-Echo, which combines spatiotemporal risk from radiomic (MR) and deep learning (MT) models to predict MACE prognosis in CKD patients, and (c) demonstrating the superior prognostic performance of STAR-Echo compared to MR, MT, as well as clinical biomarkers (EF, BNP, and NT-proBNP) for characterizing cardiac dysfunction. STAR-Echo captured the gray level intensity distribution, perimeter and sphericity of the LVW that change differently over time in individuals who encounter MACE outcomes. STAR-Echo achieved an AUC of 0.71 [0.53–0.89] for MACE outcome classification and also demonstrated prognostic ability in Kaplan-Meier survival analysis on a holdout cohort (Sv = 44) of CKD patients (N = 150). It achieved superior MACE prognostication (p-value = 0.037 (log-rank test)), compared to MR (p-value = 0.042), MT (p-value = 0.069), and the clinical biomarkers EF, BNP, and NT-proBNP (p-value > 0.05).
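
As an illustration of the kind of AUC-with-interval figure reported in the abstract, the following minimal Python sketch computes an AUC as the fraction of correctly ranked (O+, O-) pairs, together with a percentile bootstrap confidence interval. The abstract does not state how the interval was obtained, so the bootstrap, the function names `auc` and `bootstrap_ci`, and the toy labels and scores are all assumptions made for illustration; they are not the authors' procedure or data.

```python
import random


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    labels: 1 for MACE (O+), 0 for no MACE (O-); ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only one class: AUC is undefined there.
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi


# Toy example (hypothetical values, not study data).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auc(toy_labels, toy_scores))  # 0.75
print(bootstrap_ci(toy_labels, toy_scores, n_boot=200))
```

With a holdout set as small as 44 patients, an interval of this kind is expected to be wide, which is consistent with the reported range of 0.53 to 0.89.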

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43987-2_28

SharedIt: https://rdcu.be/dnwJK

Link to the code repository

https://github.com/rohand24/STAR_Echo

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper proposes STAR-echo, a computational biomarker automatically extracted from a full heartbeat cycle echocardiography video, to predict Major Adverse Cardiac Event (MACE) occurrence in Chronic Kidney Disease (CKD) patients. STAR-echo uses texture and shape features of Left Ventricular Wall (LVW) to compute spatiotemporal risk from radiomic (M_R) and deep learning (M_T) models. On a dataset of 105 CKD patients (101 training; 44 held-out evaluation set), the results show that the biomarker provides improved prognosis prediction and better separation of high vs low risk group patients.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Building upon the existing literature suggesting that using spatiotemporal LVW morphology and motion features instead of a single EF value as a biomarker might be beneficial, this work provides a good pipeline and an early-stage, small validation showing the potential benefits for survival and hazard analysis in CKD patients with respect to CVDs and MACE.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Although the paper is relatively well-written and easy to understand in most parts, there are some areas where a general reader can get confused due to a lack of details or presentation arrangement (see detailed comments).

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The LVW segmentations for most images are predicted by a deep learning model. The authors mention that these are manually checked and corrected. Is there a specific protocol that was used? How many images needed correction, and are the corrected masks available? The dataset split details do not seem to be available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Figure 2 requires a lot of effort to understand for a general reader not familiar with KM curves. Improved caption and providing references to relevant sections or citations can be helpful. It took me a while to understand that the High Risk and Low risk numbers in different colors are the number of patients out of 44 from the held out set. Briefly describing how the low & high risk are computed in KM curves would be helpful.

    Is the single number for each year in the graph an average survival probability for each of the two groups, or something else?

    The figure seems to provide an idea of how good the high vs low risk group separation is and whether it is statistically significant, but what is the ground truth used to evaluate whether the high vs low stratification is correct? In other words, how do we know if the model is predicting two highly distinct groups but getting them wrong (confidently)? Or, how is risk severity defined and the ground truth calculated?

    The Star-echo and M_R have exactly the same sensitivity & specificity; any comment on that?

    Sec 4.1: O- (no occurrence of MACE) has a lower median survival rate than O+ (occurrence of MACE). Does this mean that these patients died of other causes before MACE could happen? Those who died early without MACE would have had MACE if they hadn’t died early; isn’t this a confounding factor?

    Typos: Sec 4.1 myocardical -> myocardial; Sec 4.2 Brouta [?]; Sec 4.3 clnical -> clinical

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The idea of using a spatiotemporal image-based biomarker with some interpretable features is interesting, and the early results look promising. The pipeline could potentially be useful for other diseases where cardiovascular morphology and motion undergo changes or can serve as a biomarker.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    A spatio-temporal analysis for prognosis of MACE in chronic kidney disease patients from echocardiographic images based on transformer-based radiomics models.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The main contribution is a spatio-temporal analysis of the evolution of sphericity and perimeter. This can be important for detecting small changes in the cardiac cycle, which can be associated with a disease.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Radiomics is a challenging task in echocardiographic images due to the high variability in image quality. This can considerably affect the method’s precision. How could this limitation be mitigated? There are a lot of typos in the manuscript, and some references are not correct.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    There is no information on sensitivity to parameter changes, nor on the exact number of training and evaluation runs.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    A comparison with a baseline or state-of-the-art methods is necessary, as is an ablation study to evaluate the effect of quality variability on the extraction of the radiomics features.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    3

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    A comparison with a baseline or state-of-the-art methods is necessary, as is an ablation study to evaluate the effect of quality variability on the extraction of the radiomics features.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    In the paper, the authors address the limited prognostic value of traditional echo measurements in CKD patients, who are at higher risk of developing CVD. They propose a novel biomarker, STAR-Echo, which combines radiomics and video transformer-based descriptors to evaluate spatiotemporal changes in LVW morphology. By identifying features based on longitudinal changes in LVW shape and texture, STAR-Echo is able to predict CVD risk with greater accuracy than individual spatiotemporal models or clinical biomarkers. The authors demonstrate the superiority of STAR-Echo in predicting CVD risk in CKD patients and present an end-to-end automated pipeline for echo videos that can identify heartbeat cycles, segment the LVW, and predict a prognostic risk score.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper describes the use of echocardiography videos to investigate the spatiotemporal changes in the morphology of the left ventricular wall (LVW) to predict the risk of cardiovascular disease (CVD) in patients with chronic kidney disease (CKD). In CKD patients, traditional echo measurements based on static LV volume and morphology, such as ejection fraction, have limited prognostic value beyond baseline clinical characteristics. The authors introduce STAR-Echo, a biomarker combining radiomics and video transformer-based descriptors to evaluate spatiotemporal changes in LVW morphology. STAR-Echo identifies features based on longitudinal changes in LVW shape and texture, which are prognostic for CVD risk in CKD patients. The authors demonstrate the superiority of STAR-Echo in the prognosis of CVD in CKD, compared to individual spatiotemporal models and clinical biomarkers and present an end-to-end automated pipeline for echo videos that can identify heartbeat cycles, segment the LVW, and predict a prognostic risk score for CVD in CKD patients.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The paper lacks a clear description of the patient population used in the study, including demographic information, comorbidities, and CKD stage. It would have been interesting to see if a combination of EF, BNP, and NT-proBNP could have achieved better results. Even though the authors state that the separation between CVD+ and CVD- outcomes improves with STAR-Echo, it seems that the separation is very poor up to 2 years. Is this separation sufficient for clinical use?

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors state they would make the code publicly available. The data set (Chronic Renal Insufficiency cohort) may be open to collaboration.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    I do not understand where the texture feature variation comes from. It would be very interesting to see a further discussion of the most important features that are thought to contribute to or are connected to CVD. Please spend some time editing your text for grammar and spelling mistakes. There are too many problems with missing or additional spaces and small spelling mistakes to list here, but they are very distracting.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper addresses an important problem, namely the poor prognostic value of standard echo measurements for CVD risk assessment in CKD patients, and provides a comprehensive review of prior work in the field, demonstrating the novelty and contribution of the proposed method. The authors test their method on a relatively large dataset (n=150) and are able to show that STAR-Echo improves CVD prognosis in CKD patients.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The paper proposes STAR-echo, a computational biomarker automatically extracted from a full heartbeat cycle echocardiography video, to predict Major Adverse Cardiac Event (MACE) occurrence in Chronic Kidney Disease (CKD) patients. STAR-echo combines radiomics and video transformer-based descriptors to evaluate spatiotemporal changes in LVW morphology. The authors used a database of 105 CKD patients to demonstrate the superiority of STAR-Echo in predicting CVD risk in CKD patients. In addition, STAR-echo is an end-to-end automated pipeline for echo videos that can identify heartbeat cycles, segment the LVW, and predict a prognostic risk score.

    Strengths of the paper:

    • The spatio-temporal analysis of the evolution of sphericity and perimeter
    • Use of radiomics and video transformer-based descriptors
    • STAR-Echo identifies features based on longitudinal changes in LVW shape and texture, which are prognostic for CVD risk in CKD patients
    • Proposed method has superiority in the prognosis of CVD in CKD, compared to individual spatiotemporal models and clinical biomarkers
    • End-to-end automated pipeline for echo videos that can identify heartbeat cycles, segment the LVW, and predict a prognostic risk score for CVD in CKD patients.
    • The pipeline could be useful for potentially other diseases where cardiovascular morphology and motion undergo changes or can be a potential biomarker.

    Weaknesses of the paper:

    • The paper could benefit from adding further details or rearranging the presentation.
    • It would be interesting to add a paragraph to the discussion or results on the effect of image quality on the radiomics extraction and the method’s precision.
    • No comparison with a baseline or state-of-the-art method
    • No ablation study to evaluate the quality variability in the extraction of the radiomics features
    • Please add a clear description of the patient population used in the study
    • The paper could be improved by adding a discussion on the clinical use of the proposed method

    Recommendation: The paper proposed an interesting approach to predict Major Adverse Cardiac Event, but there are some points that will need to be revised or included for acceptance.




Author Feedback

Our paper received scores of R1:6, R2:3, R3:6. R1 and R3 were bullish about the paper. R2 raised concerns about radiomics precision due to echo image quality variability, training details, baseline and state-of-the-art (SOTA) comparison, and ablation analysis. All these concerns were adequately addressed in the paper, and our rebuttal points to the relevant data, alleviating R2’s concerns.

  1. Variability of radiomic features not addressed (R2): We disagree with the reviewer’s comment. As described in Sections 3.4 and 4.2, several steps were taken to address this issue, including:
     • ROI-based feature extraction focusing on the LV wall (LVW) to exclude irrelevant areas and reduce the impact of noise. (Sec. 3.4, para. 2)
     • Optimal parameters (e.g., binwidth=5) to effectively capture texture features and minimize noise, followed by the Boruta feature selection method to select robust features less sensitive to noise and outliers. (Sec. 4.2, para. 1)
     • A random forest (RF) model was used to reduce variance by training multiple decision trees on 100 different subsets and averaging their predictions, ensuring the model’s insensitivity to feature variation due to echo quality. (Sec. 4.2, para. 1)
  2. Training and evaluation runs not provided (R2): R2’s comment is incorrect. We clearly provided these details:
     • The radiomic model (M_R) was trained (N=101) using 5-fold cross-validation for 100 runs (Sec. 4.2, para. 1).
     • The transformer model (M_T) was trained for 50 epochs, and its hyperparameters are provided in Supp. Table S3.
     • A single-run evaluation was done on the holdout set (Sv = 44).
  3. Comparison with baseline/SOTA methods is necessary (R2): Yes, we agree, and it’s already provided.
     • A comprehensive comparison with the baseline clinical biomarkers (EF, BNP, NT-proBNP) is shown (Fig. 2, Table 1, Supp. Fig. S2 and Table S2).
     • SOTA models are based on the EchoNet dataset, predicting ejection fraction (EF) and then predicting MACE from EF (Sec. 4.3, para. 1).
     • EF is known to perform poorly in CKD patients (Fitzpatrick et al.), which is also observed in our baseline comparison (Fig. 2 and Fig. S2). Thus, a SOTA comparison is unwarranted (Sec. 4.3, para. 1).
  4. No ablation analysis for radiomic feature quality (R2):
     • Extensive ablation analysis is provided in Supp. Fig. S2 and Table S2 w.r.t. combinations of the clinical biomarkers (R3) and single-frame (current clinical workflow) radiomics models.
     • Although an ablation of radiomic feature quality is not provided, the strategies to address it are mentioned above in point 1.
  5. Strength partially acknowledged by R2: We agree with R2 regarding the changes in sphericity and perimeter. But we’re disappointed that they missed the spatio-temporal texture evolution (Fig. 3), an important strength of STAR-Echo emphasized by R1 and R3. This and the above comments suggest either R2’s misapprehension of our work or a lack of careful review by R2.
  6. Explain KM curves? Is poor separation (up to 2 years) clinically useful? (R1, R3):
     • KM analysis uses time-to-event follow-up of MACE as the ground truth. We used a median threshold of the STAR-Echo risk score to categorize patients into low- and high-risk groups, which revealed a significant difference (p<0.05) in patient survival rates.
     • STAR-Echo’s clinical usefulness is its interpretable MACE prognosis using echo, which surpasses the limitations of clinical biomarkers for CKD patients (Fig. 2 and Table 1). The initial poor separation is actually a strength of STAR-Echo. It shows early risk identification, facilitating timely preventive interventions and patient management.
  7. Connecting important features with CVD (R3):
     • Texture features capture LVW tissue alterations due to fibrosis and ischemia. (Sec. 4.3, para. 3)
     • Changes in LVW sphericity signal LV hypertrophy (LVH), which increases MACE risk. (Sec. 4.3, para. 3)
  8. Description of patient population (R3): Dataset details are provided in Sec. 4.1, para. 1. The study is on individuals with mild to moderate CKD, not on dialysis, with diabetes as the main comorbidity. They were followed for CVD.
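
The median-threshold stratification described in point 6 can be sketched as follows. This is a minimal illustration, assuming a plain product-limit (Kaplan-Meier) estimator written from scratch; the function names `median_split` and `kaplan_meier` and the toy follow-up data are hypothetical and are not taken from the STAR_Echo repository or the study cohort.

```python
from statistics import median


def median_split(scores):
    """Label a patient 'high' if the risk score exceeds the cohort median."""
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]


def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time per patient
    events: 1 if MACE was observed at that time, 0 if the patient was censored
    Returns a list of (time, survival probability) at each event time.
    """
    order = sorted(zip(times, events))
    at_risk = len(order)
    surv = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = order[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(order) and order[i][0] == t:
            n_events += order[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            surv *= 1 - n_events / at_risk
            curve.append((t, surv))
        at_risk -= n_leaving
    return curve


# Toy cohort (hypothetical values): risk score, follow-up in years, MACE flag.
scores = [0.2, 0.8, 0.5, 0.9, 0.1, 0.7]
times = [5.0, 1.5, 4.0, 2.0, 6.0, 3.0]
events = [0, 1, 0, 1, 0, 1]

groups = median_split(scores)
for name in ("low", "high"):
    t = [ti for ti, g in zip(times, groups) if g == name]
    e = [ei for ei, g in zip(events, groups) if g == name]
    print(name, kaplan_meier(t, e))
```

The two resulting curves are what a log-rank test would then compare. Note that the ground truth enters only through `times` and `events`; the risk score is used solely to assign the group.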




Post-rebuttal Meta-Reviews

Meta-review #1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors have addressed the concerns about the variability of radiomic features and the use of ablation analysis for radiomic feature quality. The reviewers’ suggestion to accept the paper is reinforced after the rebuttal. Therefore, I recommend acceptance of the paper.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Radiomics features for US might not be as reliable as the authors have argued. Overall, the proposed work lacks novelty.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Many important points were missing from the original submission, as indicated by the MR and other reviewers (patient population, valuable discussions, comparison with other SOTA, etc.). Any information provided in the supplementary data is optional for the reviewer. The main manuscript should contain ALL necessary information, and the supplement should be extra. The train/validation/test split is not the optimal experimental design. The authors should use cross-validation, as was done for the RF. Table 1 is for the ablation experiment, and no comparison with other methods is given. The authors did not respond to the critical comment by R1 about manual correction of the LVW segmentation. There are question marks in some places (e.g., Brouta [?]).


