
Authors

Viktor van der Valk, Douwe Atsma, Roderick Scherptong, Marius Staring

Abstract

Electrocardiography is the most common method to investigate the condition of the heart through the observation of cardiac rhythm and electrical activity, for both diagnosis and monitoring purposes. Analysis of electrocardiograms (ECGs) is commonly performed through the investigation of specific patterns, which are visually recognizable by trained physicians and are known to reflect cardiac (dis)function. In this work we study the use of β-variational autoencoders (VAEs) as an explainable feature extractor, and improve on its predictive capacities by jointly optimizing signal reconstruction and cardiac function prediction. The extracted features are then used for cardiac function prediction using logistic regression. The method is trained and tested on data from 7255 patients, who were treated for acute coronary syndrome at the Leiden University Medical Center between 2010 and 2021. The results show that our method significantly improved prediction and explainability compared to a vanilla β-VAE, while still yielding similar reconstruction performance.
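The joint optimization described in the abstract can be written, as a rough sketch, in the form below. The exact terms and weights are not given on this page, so the squared-error reconstruction term, the binary cross-entropy prediction term, and the weight $\lambda$ are assumptions for illustration; only the β-weighted KL term follows directly from the definition of a β-VAE.

```latex
\mathcal{L}(x, y) =
  \underbrace{\lVert x - \hat{x} \rVert_2^2}_{\text{reconstruction}}
  + \beta \, D_{\mathrm{KL}}\!\left( q_\phi(z \mid x) \,\Vert\, \mathcal{N}(0, I) \right)
  + \lambda \, \underbrace{\mathrm{BCE}(y, \hat{y})}_{\text{cardiac function prediction}}
```

Here $x$ is the input ECG, $\hat{x}$ its reconstruction, $z$ the latent code, $y$ the cardiac function label, and $\hat{y}$ the predicted probability.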

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43895-0_52

SharedIt: https://rdcu.be/dnwzk

Link to the code repository

https://github.com/ViktorvdValk/Task-Specific-VAE

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    By using β-variational autoencoders (VAEs) with a prediction module, the authors aim to achieve both good explainability and good LVF classification.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1- The use of a β-VAE.

    2- The combination with a prediction module, which makes the β-VAE retain more disease-related information and leads to task-related explanations.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1- Why is LVF special? The method seems applicable to other heart diseases, and the authors could show such results. In other words, the generality of this pipeline for other heart diseases remains unclear, although it appears to be applicable to them.

    2- The prediction module uses not only the latent codes but also other information, such as the mean RR interval. The prediction results may therefore be reasonable even if the disease-related information is not in the latent codes. For example, if the disease can be detected from the RR interval alone, the latent codes may learn nothing about pathological signals. Some discussion of this could be included.

    3- I think a direct discussion of the explanation for LVF should be given.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    I believe the authors will make the code available if the paper is accepted.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    The numbers should use a comma as the thousands separator: 119.886 -> 119,886 and 33.610 -> 33,610.

    More details in 6.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper is not technically novel, but it may be acceptable if this paper is clinically useful.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The authors propose a β-VAE approach for extracting task-specific features from ECG signals to predict the left ventricular function (LVF) of the heart. The authors prefer a β-VAE to a vanilla VAE for improving the explainability of the model and controlling the feature representations in the latent space. The proposed method is compared to a PCA-based baseline method and task-naive VAEs. The results show that the proposed method outperforms the baseline and task-naive methods in terms of LVF prediction. The authors also perform experiments to optimize the hyperparameters and evaluate the influence of the latent space size on the explainability, reconstruction, and prediction quality of the model.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1) The use of β-VAE to improve interpretability and explainability in this problem domain is an interesting and valuable contribution, as it enhances the model’s transparency and allows for better understanding of the extracted features.

    2) The authors conducted a thorough evaluation of their proposed method, optimizing hyperparameters and exploring the influence of latent space size on model explainability, reconstruction, and prediction quality. This provides valuable insight into the advantages of a carefully designed β-VAE over a vanilla VAE.

    3) The proposed method outperformed baseline and task-naive methods in terms of LVF prediction, demonstrating its effectiveness and potential for clinical applications.

    4) The paper provides a detailed explanation of the dataset.

    5) The paper is well-written, with clear explanations and a logical flow of ideas, making it easy to follow and understand.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) Limited dataset: The study is based on a single dataset (trained and tested on one patient cohort). While the authors have taken care to explain the characteristics of the dataset, the generalizability of the proposed method to other datasets or patient populations is unclear.

    2) Lack of comparison to other state-of-the-art methods: While the proposed method is compared to a PCA-based baseline and task-naive VAEs, no comparison is made to other state-of-the-art methods for LVF prediction using ECG signals. This limits the ability to assess the true effectiveness of the proposed method relative to other approaches in the literature. How does it perform compared to other DL-based approaches?

    3) Lack of clinical validation: While the proposed method shows promising results in predicting LVF, the authors do not provide any clinical validation of the model’s predictions. It is therefore unclear whether the model’s predictions align with clinical reality or could potentially lead to incorrect diagnoses or treatment decisions.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Authors will make their code publicly available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Please check the above.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The carefully designed experiments provide good insight into training a simple yet explainable model with good performance.

  • Reviewer confidence

    Somewhat confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #4

  • Please describe the contribution of the paper

    The paper explores the use of β-variational autoencoders as explainable feature extractors and improves their predictive capacity by jointly optimizing signal reconstruction and cardiac function prediction. The aim of the paper is to further improve the latent features in terms of both explainability and prediction performance.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The paper is overall well-written and organized.

    2. Using a VAE to extract features from ECG signals is an interesting application.

    3. The quality of experiments is good. Results show that the proposed method significantly improved prediction and explainability compared to a vanilla β-VAE and other baselines, while still yielding similar reconstruction performance.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The paper lacks significant novelty in terms of methodological contributions. The idea of adding a task-specific layer on top of the latent space for a downstream application is not new. This has been explored before for different applications (for example, see https://link.springer.com/chapter/10.1007/978-3-030-32251-9_23).

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
    1. The authors mentioned that the implementation code will be made publicly available in GitHub if accepted.

    2. It is not clear if the datasets used in the paper are publicly available or only restricted to institution access.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. The authors have mentioned that previous studies have also explored the use of VAEs for explainable feature extraction from ECG signals. It is not clear how the application in this paper is novel compared to those studies. Is this the first work on jointly optimizing classification and reconstruction in a VAE for ECG signals? If so, that should be explicitly highlighted.

    2. I would like the authors to explain the intuition behind task-naive VAE and why that is used. Since the main goal of the paper is joint optimization of reconstruction and classification (from latent space), it is not clear why the classification layers need to be trained after freezing the encoder and VAE.

    3. From Figure 2, it seems like PCA is performing at par for higher dimensions (MSE, AUROC) compared to the VAE methods. I would like the authors to provide an intuition of why PCA works better than deep learning for this application.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper lacks significant novelty in terms of methodological contributions. The idea of adding a task-specific layer on top of the latent space for a downstream application is not new. This is an interesting application paper with some minor flaws in the results. The paper should also explicitly highlight its novelty w.r.t. previous works addressing the same problem.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    I am satisfied with the author responses and I am changing my rating from weak reject to weak accept.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The method uses a β-VAE to improve interpretability and explainability for ECG analysis, which is an interesting and valuable contribution. Experiments show that the method can improve performance. However, the reviewers have concerns about the limited dataset and the lack of comparison with other methods. Clinical validation is lacking, and technical novelty is limited. It would be great to address the concerns of the reviewers and discuss the clinical usefulness of the proposed method.




Author Feedback

We thank all (meta-)reviewers for their positive assessment and for suggesting improvements to the paper.

In response to the technical novelty remarks by R3 and R4: The task-naive VAE is commonly used as a feature extractor and is therefore used as a baseline. The proposed task-specific VAE has indeed been used in other modalities; however, its use on ECG data is new. The proposed split task-specific VAE method, in which only a small subset of the latent space is optimized for LVF prediction in order to enhance interpretability, is novel, to our understanding. We further demonstrate that our method leads to visual separation of the ECGs w.r.t. left ventricular function, as we focus on an interpretable method that aids cardiologists’ understanding. These clarifications will be added to the introduction.
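To make the "split" idea concrete, the following is a minimal NumPy sketch of a loss in which the KL term covers the whole latent space while the prediction term only sees the first `k` latent dimensions. It is not the authors' implementation: the function names, the logistic head (`w`, `b`), the use of the latent means as features, and the weights `beta` and `lam` are all assumptions made for illustration.

```python
import numpy as np

def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, exp(logvar)) || N(0, I)), summed over latents, averaged over the batch."""
    kl_per_dim = 0.5 * (np.exp(logvar) + mu ** 2 - 1.0 - logvar)
    return kl_per_dim.sum(axis=1).mean()

def binary_cross_entropy(y_true, logits):
    """Numerically stable binary cross-entropy computed from logits."""
    return np.mean(
        np.maximum(logits, 0.0) - logits * y_true + np.log1p(np.exp(-np.abs(logits)))
    )

def split_joint_loss(x, x_hat, mu, logvar, y, w, b, k, beta=4.0, lam=1.0):
    """Hypothetical split task-specific loss.

    Only the first k latent means feed the (assumed) logistic prediction head,
    so the remaining latents are shaped by reconstruction and KL alone.
    """
    parts = {
        "recon": np.mean((x - x_hat) ** 2),        # signal reconstruction
        "kl": kl_to_standard_normal(mu, logvar),   # beta-VAE regularizer
        "pred": binary_cross_entropy(y, mu[:, :k] @ w + b),  # task term on k latents
    }
    total = parts["recon"] + beta * parts["kl"] + lam * parts["pred"]
    return total, parts
```

In this sketch, changing the latents beyond index `k` alters the KL and reconstruction terms but leaves the prediction term untouched, which is the property the rebuttal relies on when arguing that a small, dedicated subset of the latent space carries the LVF information.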

In response to the comparison with other methods and the validation as mentioned by R3 and R4: In addition to PCA and other baselines, we do compare with the state-of-the-art FactorVAE network from Van der Leur et al. (2022), which was shown to predict LVF on par with a residual CNN with the same architecture as the FactorVAE encoder, and show improvements. We will clarify this point in the paper, specifically in Table 1. With respect to the PCA baseline, we would like to point out that although on par with our method in the AUC metric, the more appropriate F1 metric shows that we outperform the PCA method, see Figure 2. It is to be expected that prediction based on PCA features performs on par with the task-naive VAEs, as they can both be considered unsupervised feature extractors. However, the task-naive VAEs still create more interpretable features than PCA, because the KL-divergence, not used in PCA, promotes disentanglement of features.
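The remark that two methods can be on par in AUC yet differ in F1 can be illustrated with a small NumPy sketch. AUROC depends only on how the scores rank positives against negatives, whereas F1 depends on the decisions made at a threshold. The toy scores, the 0.5 threshold, and the helper names below are assumptions for illustration and are unrelated to the paper's data.

```python
import numpy as np

def auroc(y_true, scores):
    """AUROC as the probability that a positive outranks a negative (ties count half)."""
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (len(pos) * len(neg))

def f1_at_threshold(y_true, scores, threshold=0.5):
    """F1 score of the hard decisions obtained by thresholding the scores."""
    pred = scores >= threshold
    tp = np.sum(pred & (y_true == 1))
    fp = np.sum(pred & (y_true == 0))
    fn = np.sum(~pred & (y_true == 1))
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom

y = np.array([0, 0, 1, 1])
scores_a = np.array([0.1, 0.2, 0.8, 0.9])  # well-separated around the threshold
scores_b = 0.4 * scores_a                  # same ranking, but all below 0.5

# Both score sets rank every positive above every negative, so the AUROC matches,
# yet only scores_a yields any positive decision at the 0.5 threshold.
same_auroc = auroc(y, scores_a) == auroc(y, scores_b)
f1_a, f1_b = f1_at_threshold(y, scores_a), f1_at_threshold(y, scores_b)
```

Because of this, a threshold-dependent metric such as F1 can separate methods that a ranking metric treats as equivalent, especially on imbalanced labels.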

In response to the generalizability of the pipeline as pointed out by R1 and R3: We purposely focused on LVF as it is the most commonly used indicator of cardiac health and has general applicability in a range of cardiovascular diseases, beyond myocardial infarctions. We agree that other prediction targets, such as mortality or myocardial infarct location, can easily be plugged into our framework as well, and we have provided the proof of concept in this study. We considered this extension beyond the scope of the current conference publication, but it is certainly the next step in our work. Furthermore, we would like to point out that no public ECG dataset with LVF labels is available at the moment, and we would also like to emphasize that the dataset used is large and includes various types of acute coronary syndrome (STEMI, NSTEMI and angina pectoris). It is therefore not unlikely that the current pipeline generalizes to other cohorts of patients. In addition, we performed a 5-fold cross-validation as a form of generalizability experiment, even though this is not as robust as external validation. Ultimately, a user study would provide more insight into the interpretability aspects of our method, but this was beyond the scope of the MICCAI contribution.
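A 5-fold cross-validation of the kind mentioned above could be set up as in the sketch below. Whether the authors split at the patient level is not stated on this page, so the patient-level grouping, the function name `patient_kfold`, and the seed are assumptions; grouping by patient is simply a common precaution when one patient contributes several ECGs.

```python
import numpy as np

def patient_kfold(patient_ids, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) pairs so that no patient appears in both sets."""
    rng = np.random.default_rng(seed)
    unique_patients = np.unique(patient_ids)
    rng.shuffle(unique_patients)
    # Split the shuffled patients, not the individual recordings, into folds.
    for held_out in np.array_split(unique_patients, n_folds):
        test_mask = np.isin(patient_ids, held_out)
        yield np.flatnonzero(~test_mask), np.flatnonzero(test_mask)
```

Each recording ends up in exactly one test fold, and all recordings of a held-out patient stay out of the corresponding training set.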

In response to the use of the RR interval in the prediction, as raised by R1: The RR interval was added to the prediction because it contains information that is lost when averaging the ECG heartbeats. If the RR interval were very predictive of disease outcome and the latent codes were not, then the task-naive method would already give a high prediction score at the lowest latent dimension (L=1), and this score would not improve for higher L. This is, however, not the case; see Figures 2c and 2d.

In response to the training workflow mentioned by R4: Although we first freeze the VAE when training the classification layers (to ensure more stable training), we then jointly train the reconstruction as well as the prediction. We will briefly clarify this in the method section.




Post-rebuttal Meta-Reviews

Meta-review #1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    I appreciate the authors addressing the reviewers’ concerns. After the rebuttal, all three reviewers give positive evaluations. I would suggest ‘Accept’.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors made compelling arguments on technical novelty, comparison with other methods, and generalizability of the work. However, the overall contribution seems incremental, which makes this a borderline case.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors propose a β-VAE approach for extracting task-specific features from ECG signals to predict the left ventricular function (LVF) of the heart. The paper lacks significant novelty in terms of methodological contributions. The use of β-VAEs has already been investigated in the literature.


