Authors
Ke Xiao, Erik Learned-Miller, Evangelos Kalogerakis, James Priest, Madalina Fiterau
Abstract
Mitral regurgitation (MR) is a heart valve disease with potentially fatal consequences that can only be forestalled through timely diagnosis and treatment. Traditional diagnosis methods are expensive, labor-intensive and require clinical expertise, posing a barrier to screening for MR. To overcome this impediment, we propose a new semi-supervised model for MR classification called CUSSP. CUSSP operates on cardiac magnetic resonance (CMR) imaging slices of the 4-chamber view of the heart. It uses standard computer vision techniques and contrastive models to learn from large amounts of unlabeled data, in conjunction with specialized classifiers to establish the first ever automated MR classification system using CMR imaging sequences. Evaluated on a test set of 179 labeled – 154 non-MR and 25 MR – sequences, CUSSP attains an F1 score of 0.69 and a ROC-AUC score of 0.88, setting the first benchmark result for detecting MR from CMR imaging sequences.
Link to paper
DOI: https://doi.org/10.1007/978-3-031-43990-2_23
SharedIt: https://rdcu.be/dnwLy
Link to the code repository
https://github.com/Information-Fusion-Lab-Umass/CUSSP_UKB_MR
Link to the dataset(s)
N/A
Reviews
Review #1
- Please describe the contribution of the paper
This paper proposes an ML-based pipeline for detecting mitral valve regurgitation from cine MRI in the 4-chamber view, without requiring any additional flow-based acquisition sequence. To achieve that, this work builds a clever pipeline using existing LEGO bricks: segmentation of the cardiac chambers is done with TernausNet, and a convolutional neural network trained with Barlow Twins is used to obtain a good feature representation from a large unlabeled dataset in a self-supervised fashion. A Siamese network is then used to fine-tune this representation on a much smaller annotated dataset.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The idea of sequentially refining the feature representations (exploiting a large unlabeled dataset and then a smaller labeled dataset) is interesting. Barlow Twins is a refreshing way to use large unlabeled datasets (here with 30000 images) in a self-supervised fashion and learn strong initial image representations that can then be refined with labels using Siamese networks.
Strong validation of the results on a large dataset.
Comparison with an intuitive baseline based on cardiac volumes.
Excellent supporting material
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
Trivial mitral regurgitation can be a common finding of a well functioning heart. The proposed method does not currently allow differentiation between light and severe mitral regurgitation. Is the output probability any indication of that?
Resnet18 is far from the currently best performing networks for image classification, which might limit the final performance.
Have you considered taking Resnet (or other) pre-trained on ImageNet instead of doing the self-supervised step and fine-tuned those? Would there be a big difference?
A reference to some more recent work is missing, e.g. "Automatic Aortic Valve Pathology Detection from 3-Chamber Cine MRI with Spatio-Temporal Attention Map".
Missing reference to figure in the appendix
- Please rate the clarity and organization of this paper
Excellent
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
The paper seems rather straightforward to reproduce. Since it is possible to get access to the datasets, it would be great if the extra labels could be made publicly available, since significant effort is needed to annotate the data as was done in this study and to achieve comparable consistency.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
I believe that the caption of Figure 9 in the appendix is very well done and would rather like to see that in the main paper.
“We used the segmentation model in 2.1 to locate the mitral valve and the orientation of the left ventricle.” - It is not obvious how the segmentation can be used for the orientation estimation. Can you explain this in a bit more detail?
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
7
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
This is a very interesting paper that brings together several interesting components and techniques from our community. This promotes the re-usability of the tools and datasets we have at our disposal. The proposed technique is a valuable tool for large-scale cardiac MR image collections. The paper is clearly written and the method is well validated. This constitutes a solid baseline for further improvements in mitral valve regurgitation detection. I strongly recommend publication at MICCAI.
- Reviewer confidence
Very confident
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
N/A
- [Post rebuttal] Please justify your decision
N/A
Review #3
- Please describe the contribution of the paper
The paper presents a methodology for classifying mitral regurgitation from 4-chamber MR images. In this work, a clever use of both consolidated supervised techniques and unsupervised representation learning is shown to overcome the relative scarcity of labeled data. The proposed framework outperforms a baseline random forest and another deep model based on CNN-LSTM. In general, the authors provide a clear description of the methodology and an appropriate discussion of their experiments.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Clear propositions and descriptions of the work.
- Appropriate comparison against existing methods.
- Very relevant clinical application
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- Only 2D MRI slices are used. 3D imaging may be better suited to detect anatomical and functional mitral valve defects.
- Limited information is included concerning the different training steps. More details on training steps and fine tuning strategies need to be added.
- No information about time required for training (each step) and inference.
- It would be interesting to discuss more in depth the role played by each training step in CUSSP. It is difficult to grasp how important, for example, the Barlow Twins training is.
- Poor reproducibility
- Please rate the clarity and organization of this paper
Very Good
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
Poor reproducibility in general. This reviewer understands that sharing data might be impossible, but code publication would likely be a useful resource for other researchers.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
Excellent paper in terms of data usage for a clinically relevant problem. I suggest an editing phase aimed at making the training process more reproducible and at going into more depth on what makes the quite advanced training strategies necessary and how much they matter.
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
7
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Solid and technically advanced paper addressing a relevant clinical application.
- Reviewer confidence
Very confident
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
N/A
- [Post rebuttal] Please justify your decision
N/A
Review #2
- Please describe the contribution of the paper
Early detection of mitral regurgitation is crucial for optimal treatment outcomes. The authors develop an automated approach to detect mitral regurgitation from the 4-chamber (4CH) view of cardiac magnetic resonance (CMR).
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The approach is the “first” attempt at fully automatic detection of mitral regurgitation from CMR. However, it is important to note that while the authors claim their approach is the “first ever automated MR classification system,” other automatic approaches have been developed for echo (such as Zhang et al., 2021, doi: 10.1155/2021/2602688 and Wang et al., “Automatic Detection and Quantification of Mitral Regurgitation on TTE with Application to Assist Mitral Clip Planning and Evaluation”). Additionally, the authors’ system is a “detection” system that provides a binary yes/no answer rather than a “classification” system that provides severity, as was done by Zhang et al.
- The authors compare their method with a traditional global-metric-driven approach (a random forest) and the existing weakly supervised CNN-LSTM, and show that their method outperforms both.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- The writing quality is unsatisfactory. In particular, the introduction of the CUSSP approach lacks clarity in terms of the motivation/purpose, the inputs and outputs, and the model design of each step. A more detailed description of the rationale for each step and how the steps fit into the overall approach is needed.
- Although the authors’ approach is the first attempt at a challenging problem, the sensitivity (positive accuracy) of 0.62 is low. It would be helpful to compare this sensitivity (AI vs. human) to inter-reader sensitivity (human vs. human) to help readers understand whether the limitation is due to AI or the subjective nature of the problem.
- The logic behind the effectiveness of the CUSSP approach is not clear. In section 2.2.2 and table 2 the authors claim that existing weakly supervised learning approaches (CNN-LSTM) have difficulty understanding blood flow data from CMR images (sensitivity = 0.5), but it is not clear how CUSSP specifically addresses this issue. The use of unsupervised learning does not seem to directly target this difficulty (CUSSP-1 to CUSSP-3, which refer to unsupervised learning followed by supervised learning, show very low performance). It is also unclear why fine-tuning the last block of their model with the contrastive loss (Siamese network) improves performance (note that CUSSP-SIAM performs similarly to CNN-LSTM, meaning their design brings very limited improvement over the existing approach). The highest performing variant is CUSSP-SIAM-25, which uses fewer frames from CMR, but it is hard to find the logical connection between fewer frames and better blood flow understanding. The authors should provide more insight into the effectiveness of their approach.
- Please rate the clarity and organization of this paper
Poor
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
Good.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
- Please address all my comments in the weaknesses section.
- Referring to strength point 1, please soften your claim of “first mitral regurgitation detection system”.
- The authors do not mention which unlabeled data they used for unsupervised learning.
- Where is the AUC value in Fig. 5?
- Much of section 2.1 is a general introduction to existing segmentation methods and should be moved to the introduction section.
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
3
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Although it should be recognized that this is the first attempt at a challenging mitral regurgitation detection problem, the authors have not provided sufficient clarity and justification for their technical designs, especially how their designs tackle the difficulty of understanding blood flow data. Additionally, the writing quality should be improved.
- Reviewer confidence
Very confident
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
N/A
- [Post rebuttal] Please justify your decision
N/A
Primary Meta-Review
- Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.
This paper proposes an ML-based pipeline for detecting mitral valve regurgitation from cine MRI in the 4-chamber view, without requiring any additional flow-based acquisition sequence. The authors use a clever strategy to combine well-consolidated supervised techniques and unsupervised representation techniques, which helps to overcome the relative scarcity of labeled data.
Strengths of the paper:
- The approach is the first attempt at fully automatic detection of mitral regurgitation from CMR.
- The authors compare their method with a traditional global-metric-driven approach and the existing weakly supervised CNN-LSTM, and show that their method outperforms both.
- They use Barlow Twins to exploit large unlabeled datasets in a self-supervised fashion and learn strong initial image representations that can then be refined with labels using Siamese networks.
- Strong validation of the results on a large dataset.
- Comparison with an intuitive baseline based on cardiac volumes.
- Excellent supporting material
Weaknesses of the paper:
- Trivial mitral regurgitation can be a common finding of a well functioning heart. The proposed method does not currently allow differentiation between light and severe mitral regurgitation. It would be good to check if the output probability is an indication of that.
- It would be good to add some newer state-of-the-art methods to the literature review section.
- The logic behind the effectiveness of the CUSSP approach is not clear. It would be good to add a more detailed description of the rationale for each step in the CUSSP approach and how it fits into the overall approach.
- It would be helpful to compare the obtained sensitivity to inter-reader sensitivity to help readers understand whether the limitation is due to AI or the subjective nature of the problem.
Recommendation: The authors propose a novel method for mitral valve regurgitation detection. The paper would be a good contribution to the MICCAI community, but it could be improved by incorporating some of the reviewers’ suggestions.
Author Feedback
We thank the reviewers for their feedback, and for their appreciation of our “clever strategy to combine … supervised and unsupervised representation techniques, which helps to overcome the relative scarcity of labeled data”, “excellent supporting material” and “strong validation”.
In response to R1’s questions about the encoder architecture, we chose Resnet18 because we found that more complex encoders would easily memorize the limited labeled dataset, reducing the effectiveness of the embeddings. The Resnet was pre-trained on ImageNet. Following R1’s suggestion, we will move Fig 9 to the main text in the camera-ready. We will also add more details on the mitral valve localization and orientation estimation. In short, the segmentation of the LAX 4CH cardiac imaging provides locations (contours and centers) of all four chambers, which we use to determine the location and orientation of the mitral valve, which separates the left atrium and left ventricle.
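To make the localization idea concrete, here is a minimal sketch in Python of one way chamber centers could yield a valve position and orientation. It is not the authors' procedure: the midpoint heuristic, the function names (`centroid`, `estimate_valve_pose`, `block_mask`) and the toy masks are all assumptions made for illustration, and a real pipeline would more plausibly use the chamber contours as well.

```python
import math

def centroid(mask):
    """Return the (row, col) centroid of a binary mask given as nested lists."""
    points = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not points:
        raise ValueError("mask contains no foreground pixels")
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def estimate_valve_pose(la_mask, lv_mask):
    """Hypothetical heuristic, not the paper's method.

    Places the valve at the midpoint between the left-atrium (LA) and
    left-ventricle (LV) centroids and takes the LA-to-LV direction as the
    long-axis orientation, in degrees from the image column axis
    (rows increase downward).
    """
    la_r, la_c = centroid(la_mask)
    lv_r, lv_c = centroid(lv_mask)
    angle_deg = math.degrees(math.atan2(lv_r - la_r, lv_c - la_c))
    valve_center = ((la_r + lv_r) / 2.0, (la_c + lv_c) / 2.0)
    return valve_center, angle_deg

def block_mask(size, r0, r1, c0, c1):
    """Toy helper: a size x size mask with ones in rows r0..r1-1, cols c0..c1-1."""
    return [[1 if r0 <= r < r1 and c0 <= c < c1 else 0 for c in range(size)]
            for r in range(size)]

# Toy example: LA block directly above the LV block in image coordinates.
la = block_mask(10, 2, 4, 2, 4)
lv = block_mask(10, 6, 8, 2, 4)
center, angle = estimate_valve_pose(la, lv)
```

A patch around `center`, rotated by `angle`, would then give a crop aligned with the valve; how large that patch should be is not specified here.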
W.r.t. R2’s comment on previous MR classifiers: this paper classifies MR from cardiac MRI, as opposed to classifying MR from ECG, so our main claim to novelty stands. We will cite this work and ensure that our claims include the specification “from heart MRI”. We used binary labels, as introducing additional classes would be challenging without more data. The indicated methods detecting MR from ECG use tens of thousands of labeled samples, compared to ~700 for our method. Concerning the sensitivity, we adjusted our threshold to achieve higher precision, which resulted in lower recall/sensitivity. In the future, we will use the pipeline on the large unlabeled dataset to scan for and adjudicate more MR cases. We thus value precision more than recall/sensitivity. The suggestion of comparing labels from multiple clinicians is valuable for future investigations but is outside of the scope of this paper.
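The precision/recall trade-off described here can be illustrated with a short Python sketch. The first function inverts the F1 formula to show what precision is implied by the F1 of 0.69 from the abstract and the sensitivity of 0.62 quoted by R2, assuming both figures describe the same model and operating point. The second shows how raising a decision threshold trades recall for precision; its scores and labels are made-up toy values, not results from the paper.

```python
def implied_precision(f1, recall):
    """Solve F1 = 2PR / (P + R) for P, given F1 and recall R."""
    denom = 2.0 * recall - f1
    if denom <= 0:
        raise ValueError("no valid precision for this F1/recall pair")
    return f1 * recall / denom

def precision_recall_at(scores, labels, threshold):
    """Precision and recall when predicting positive for score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Reported figures, assumed to refer to the same operating point.
p = implied_precision(f1=0.69, recall=0.62)  # roughly 0.78

# Toy scores and labels (hypothetical, for illustration only).
scores = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]
labels = [0, 0, 1, 0, 1, 1]
low_thr = precision_recall_at(scores, labels, 0.35)   # more recall
high_thr = precision_recall_at(scores, labels, 0.65)  # more precision
```

Under that assumption, a sensitivity of 0.62 alongside an F1 of 0.69 corresponds to a precision of about 0.78, which is consistent with a threshold chosen to favor precision.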
About the request for explanations on why these particular models were chosen and why they work, our setting uses very limited labels, and a moderate number of unlabeled samples (tens of thousands). The design decisions reflect this. We focus on the patch relevant to the MR problem to reduce the number of parameters. The unsupervised learning component is necessary to extract a descriptive set of representations from the unlabeled samples, of which there are significantly more than the labeled ones. We chose Barlow Twins due to its versatility and robustness to distortions such as blurring, different sizes of the relevant areas and intensity variations, common in MRI images. Contrastive learning helps because it uses the very limited labels that we do have to obtain representations focused on encoding differences between classes. The input / output dimensions are indicated in the text, but we will add them in the figures as well. CUSSP explicitly works on the relevant part of the image (area around the valve) to capture the blood flow through the valve, while the CNN-LSTM relies on attention, which doesn’t seem to work as well. The performance gain for CUSSP comes from the combination of Barlow Twins and contrastive learning, with the decision to focus only on the frames relevant to the task bringing a lot of improvement because it essentially halves the number of parameters the model needs to learn. This made the model considerably more sample-efficient than the alternatives, essential in attaining high performance in the low label setting. Fig 5 describes the best performing model (CUSSP-SIAM-25) reported in table 2, so the AUC is 0.88, which we will add in the caption.
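For readers unfamiliar with the two objectives named above, the following NumPy sketch shows their general form. It follows the published Barlow Twins objective (push the cross-correlation matrix of two augmented views toward the identity) and the classic margin-based contrastive loss for Siamese pairs. The exact losses, the weight `lambda_offdiag`, and the `margin` used in CUSSP are not stated in this text, so the defaults below are illustrative assumptions.

```python
import numpy as np

def barlow_twins_loss(z_a, z_b, lambda_offdiag=5e-3, eps=1e-9):
    """Barlow Twins objective on two (batch, dim) embedding matrices.

    Each feature is standardized over the batch, then the cross-correlation
    matrix between the two views is pushed toward the identity.
    lambda_offdiag is an illustrative default, not a value from the paper.
    """
    n = z_a.shape[0]
    z_a = (z_a - z_a.mean(axis=0)) / (z_a.std(axis=0) + eps)
    z_b = (z_b - z_b.mean(axis=0)) / (z_b.std(axis=0) + eps)
    c = z_a.T @ z_b / n                      # (dim, dim) cross-correlation
    diag = np.diag(c)
    on_diag = np.sum((diag - 1.0) ** 2)      # invariance term
    off_diag = np.sum(c ** 2) - np.sum(diag ** 2)  # redundancy-reduction term
    return float(on_diag + lambda_offdiag * off_diag)

def contrastive_pair_loss(e1, e2, same_class, margin=1.0):
    """Margin-based contrastive loss for one pair of embeddings.

    Same-class pairs are pulled together; different-class pairs are pushed
    apart until they are at least `margin` away (margin is an assumption).
    """
    d = float(np.linalg.norm(e1 - e2))
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2

# Toy check: identical views score far lower than unrelated ones.
rng = np.random.default_rng(0)
view = rng.normal(size=(256, 4))
other = rng.normal(size=(256, 4))
loss_same = barlow_twins_loss(view, view)
loss_diff = barlow_twins_loss(view, other)
```

The first loss needs no labels, which is why it suits the large unlabeled set; the second uses class labels only through pair membership, which is one reason pairwise objectives are often chosen when labeled samples are few.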
In response to R3’s suggestion about using 3D MR, the UK Biobank only includes one 2D slice for the 4CH view (the most useful for identifying MR). It is not possible to construct other high-resolution slices in this view from the available data. The training of the pipeline can be done within 3-5 days using a single 12G GPU, and the inference on 100 samples takes minutes at most. We will publish our code on GitHub.