Authors
Yingjie Feng, Jun Wang, Dongsheng An, Xianfeng Gu, Xiaoyin Xu, Min Zhang
Abstract
We present a novel radiomics approach using multimodality MRI to predict the expression of an oncogene (O6-Methylguanine-DNA methyltransferase, MGMT) and the overall survival (OS) of glioblastoma (GBM) patients. Specifically, we employed EffNetV2-T, which was downscaled and modified from EfficientNetV2, as the feature extractor. In addition, we used evidential layers to control the distribution of prediction outputs. The evidential layers help to classify the high-dimensional radiomics features to predict the methylation status of MGMT and OS. Tests showed that our model achieved an accuracy of 0.844, making it a potential clinical enabling technique for the diagnosis and management of GBM. Comparison results indicated that our method performed better than existing work.
Link to paper
DOI: https://link.springer.com/chapter/10.1007/978-3-031-16437-8_27
SharedIt: https://rdcu.be/cVRtd
Link to the code repository
N/A
Link to the dataset(s)
N/A
Reviews
Review #1
- Please describe the contribution of the paper
This study attempts to predict the oncogene status of MGMT in glioblastoma patients using multiparametric MRI scans. This prediction has been previously investigated with mixed results. Using data from a prior challenge, the authors show that their evidential deep learning (EDL)-based approach achieves the highest performance compared with other state-of-the-art methods.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The manuscript has multiple strengths: (1) tackling a clinically meaningful problem that can potentially alter management of glioblastoma patients; (2) a novel formulation and application of the EDL model to fuse high-dimensional features from multiparametric MRI scans; and (3) superior performance of the model compared with other state-of-the-art methods.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
A few weaknesses are noted: (1) only a single dataset was used to train and test the model, making the generalizability of the model uncertain, and (2) it is not clear why the authors chose to discretize the overall survival prediction into three bins rather than perform a survival analysis (e.g., using a Kaplan-Meier curve).
- Please rate the clarity and organization of this paper
Very Good
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
The reproducibility of the work is adequate. The conceptual idea behind EDL is provided, Figure 2 summarizes the network architecture, and model parameters are provided in Table 1 / section 3.1. The code will not be publicly shared.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
Please provide additional details about the patient population used, and state whether any cases were excluded and for what reason(s). Also, clarify whether all MRI scans were acquired pre-surgery. Please also consider redoing the overall survival prediction task as a Kaplan-Meier analysis.
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
7
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
This work presents an innovative use of the evidential deep learning framework in classifying MGMT status from multiparametric MRI scans. The study utilizes a large publicly available dataset and outperforms many of the existing state-of-the-art methods.
- Number of papers in your stack
4
- What is the ranking of this paper in your review stack?
1
- Reviewer confidence
Very confident
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
6
- [Post rebuttal] Please justify your decision
My fundamental concerns about the generalizability of the study and the modeling approach (classification versus Cox regression) remain. While I still feel that this is an interesting and worthwhile paper, my enthusiasm has diminished taking into account the other reviewers’ comments.
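The Kaplan-Meier analysis that Review #1 suggests as an alternative to three survival bins can be sketched as follows. This is a minimal illustration of the standard product-limit estimator, not code from the paper; the function name `kaplan_meier` and the toy cohort are hypothetical.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct death time.

    durations: follow-up time per patient (e.g. months)
    events:    1 if death was observed, 0 if the patient was censored
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # everyone leaving the risk set at time t
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= exits[t]
    return curve


# Hypothetical toy cohort (months, event flag); not data from the paper.
toy_durations = [5, 8, 8, 12, 16, 20]
toy_events = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(toy_durations, toy_events)
```

Unlike a three-bin classification, the estimator uses censored patients up to their last follow-up instead of discarding them or forcing them into a bin.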
Review #2
- Please describe the contribution of the paper
Evidential deep learning for classification and regression is employed to predict the methylation status of MGMT and overall survival (OS) of GBM patients.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
(1) The idea of this paper is easy to understand. (2) The end-to-end evidence-efficient net is proposed to simultaneously classify the methylation status of MGMT and predict OS.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
(1) The authors claim that the motivation for introducing evidential deep learning is to improve the prediction accuracy, which is untenable; evidential deep learning is usually employed to address the “know unknown” flaws, rather than to improve the prediction performance of traditional models; (2) The description logic of the paper is confusing. MGMT methylation status prediction is a classification problem, while OS prediction is a regression problem, which should be solved by using evidence classification and evidence regression, respectively. However, the authors incorrectly used evidence regression to address the MGMT methylation status prediction problem; (3) Figure 2 illustrates the architecture of the proposed model, which obviously confuses the Dirichlet distribution and the evidence regression.
- Please rate the clarity and organization of this paper
Poor
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
It is hard to decide the reproducibility of the paper.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
(1) The paper should be written carefully; (2) The proposed algorithm should be introduced seriously; (3) The experiments should be improved.
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
2
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Both the motivation and architecture of the proposed model are incorrect.
- Number of papers in your stack
4
- What is the ranking of this paper in your review stack?
4
- Reviewer confidence
Very confident
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
Not Answered
- [Post rebuttal] Please justify your decision
Not Answered
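The evidence classification that Review #2 distinguishes from evidence regression can be sketched as below. It follows the Dirichlet formulation of Sensoy et al., the work cited in the author feedback, in which per-class evidence e_k gives alpha_k = e_k + 1; it is an illustration only, not the authors' implementation, and the function name `dirichlet_opinion` and the example evidence values are hypothetical.

```python
def dirichlet_opinion(evidence):
    """Map non-negative per-class evidence to (beliefs, uncertainty, expected probabilities)."""
    if any(e < 0 for e in evidence):
        raise ValueError("evidence must be non-negative")
    num_classes = len(evidence)
    alpha = [e + 1.0 for e in evidence]       # Dirichlet parameters
    strength = sum(alpha)                     # Dirichlet strength S
    belief = [e / strength for e in evidence]  # belief mass per class
    uncertainty = num_classes / strength      # mass left unassigned ("know unknown")
    expected_prob = [a / strength for a in alpha]
    return belief, uncertainty, expected_prob


# Hypothetical binary cases (e.g. MGMT negative vs. positive).
confident = dirichlet_opinion([18.0, 0.0])  # strong evidence for class 0
unsure = dirichlet_opinion([0.0, 0.0])      # no evidence at all
```

With no evidence the uncertainty is 1 and the expected probabilities are uniform, which is the "know unknown" behaviour the reviewer refers to; belief masses and uncertainty always sum to 1.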
Review #3
- Please describe the contribution of the paper
This paper proposes a deep learning-based framework for overall survival (OS) prediction and for predicting the methylation status of MGMT in patients diagnosed with glioblastoma from multimodal MR images. The status of MGMT is of vital importance for both diagnosis and prognosis, and its accurate prediction is of great interest to the community. The main contributions of this work can be summarized as: (1) uncertainty information is integrated into the final prediction through an evidential regression approach, and (2) a Gaussian distribution is integrated to characterize the final predicted values. The proposed pipeline is based on a tiny version of EfficientNet with some modifications. The model was tested on two standard challenge datasets that include multimodal MR scans of subjects diagnosed with glioblastoma. The model's performance on both MGMT status prediction and OS prediction shows outstanding accuracy.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The proposed method for integrating uncertainty into the final prediction score is the main strength of this paper. This method was well developed and described. It was tested on comprehensive datasets and the results were well analyzed. External comparisons against several advanced models were done and show the superiority of the model.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
More details and descriptions of the data preparation, preprocessing, and training are necessary to avoid confusing readers. More importantly, the main contribution of the model is to perform the classification without needing segmentation labels. It is very important to verify whether the model really learns from the target tumoral region to conduct the final classification.
- Please rate the clarity and organization of this paper
Very Good
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
As the authors did not share some important details of the model development and training, it would be hard to reproduce the exact same experiments.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
Page 2: It is mentioned that some recent studies highlighted the limitations of the studied models. It would be very interesting to briefly state some of these limitations and relate them to the results reported by other studies. For example, while some studies reported prediction accuracy above 90 percent, the best performance in the challenge is less than 65 percent. The authors could connect such a significant difference in performance with the limitations and challenges of the task.
Section 2.2: Please add a reference for “aleatoric uncertainty”.
Section 3.1: I assume the cross-validation was done subject-wise, as several slices are extracted for each subject. Please state this in the text to avoid confusing readers.
Section 3.1: The description of the preprocessing steps seems to be valid only for the MGMT dataset, in which 16 slices from the axial view are extracted for each subject's scan, whereas the BraTS dataset was fed to the feature extractor as 4-channel volumetric images. Please specify the preprocessing steps for these two datasets and clarify whether the EffNetV2-T backbone is a 3D or a 2D model. An important question concerns the 16 extracted slices: how were these slices extracted? Please also clarify how the final classification metrics were calculated for each subject if they were based on 16 extracted slices. In general, this confusion over 2D or 3D data classification should be addressed, also to avoid inconsistency when comparing against other external references.
As an important comment, when performing classification over the whole image instead of the segmented area, there is always the risk of learning features from irrelevant texture or spatial information. As quite a few methods have been introduced to visualize the learned features over the image space, it would be of great interest to include such an analysis and investigate whether the model was successful in capturing the features from the target regions or from irrelevant regions.
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
6
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The paper introduced a novel method and tested it on comprehensive standard datasets. The results were analyzed carefully and external comparisons have been performed. Model justification and result discussion were relatively well done.
- Number of papers in your stack
5
- What is the ranking of this paper in your review stack?
2
- Reviewer confidence
Confident but not absolutely certain
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
6
- [Post rebuttal] Please justify your decision
I do believe this study presents the potential of a novel approach. The technical part of the method definitely needs more clarification. The authors tried to address the major concerns, which, within the 4000-character limit, seems acceptable.
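The subject-wise cross-validation that Review #3 asks the authors to confirm in Section 3.1 can be sketched as follows, for the case where several slices per subject are treated as separate samples. This is an illustration under that assumption only; the rebuttal states that the model is 3D and that the splits were made per patient. The function name `subject_wise_folds` and the toy subject IDs are hypothetical.

```python
import random


def subject_wise_folds(sample_subject_ids, n_folds, seed=0):
    """Assign sample indices to folds so that all samples of a subject share one fold."""
    subjects = sorted(set(sample_subject_ids))
    random.Random(seed).shuffle(subjects)
    # Round-robin assignment of shuffled subjects to folds.
    fold_of = {subject: i % n_folds for i, subject in enumerate(subjects)}
    folds = [[] for _ in range(n_folds)]
    for index, subject in enumerate(sample_subject_ids):
        folds[fold_of[subject]].append(index)
    return folds


# Hypothetical toy set: 4 subjects with 16 slices each.
toy_ids = [f"subj{p}" for p in range(4) for _ in range(16)]
folds = subject_wise_folds(toy_ids, n_folds=2)
```

Splitting by slice instead of by subject would let slices from one patient appear in both training and test folds, which typically inflates the reported accuracy.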
Primary Meta-Review
- Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.
The rebuttal mainly needs to address Reviewer 2's concerns: Is evidential deep learning meant to improve the prediction accuracy? Evidence classification or evidence regression? Is the Dirichlet distribution confused with evidence regression?
- What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).
4
Author Feedback
R1Q1: Only a single dataset was used. We wish to clarify that the dataset we used, BraTS, is a large collection of brain MRIs from multiple institutions around the world. Therefore, our model was actually tested on a diverse body of images.
R1Q2: Why discretize the overall survival prediction into three bins. We discretize the OS prediction into bins to conform with clinical practice, where it is impossible to predict a patient's survival to the scale of days or even weeks. Thus, discretizing the OS prediction into bins of long-, medium-, and short-term survival allows us to robustly evaluate our model and compare it with existing methods, which generally place their predictions of OS in several bins.
R2Q1: EDL is usually employed to address the “know unknown” flaws, rather than to improve the prediction performance. We agree that EDL is usually used to “know unknown”. Because EDL is powerful, as Sensoy et al. showed in “Evidential Deep Learning to Quantify Classification Uncertainty”, one can treat the “prediction of a neural network as subjective opinions and learn the function that collects the evidence leading to these opinions by a deterministic neural net from data.” From this perspective, EDL can be extended to improve the prediction performance of learning models. In our case, we used EDL to produce evidential distributions that better separate features arising from binary or multi-class populations, e.g., radiomics features from MGMT-positive vs. MGMT-negative cases. Starting from the EDL-generated distributions, the learner can then achieve higher classification performance.
R2Q2: MGMT methyl status prediction is a classification problem, while OS prediction is a regression problem. We would like to clarify our logic here. For both the MGMT and OS tasks, we first used evidential regression to generate outputs over a range, e.g., [0, 1] for MGMT and number of months for OS. We then bin the outputs to classify them into groups. For MGMT, our threshold is 0.5; that is, an output less than 0.5 is classified as MGMT negative and an output greater than 0.5 as MGMT positive. Similarly, we set three bins to classify the predicted months of OS into long-, medium-, and short-term survival.
R2Q3: Figure 2 confuses the Dirichlet distribution and the evidence regression. Thank you for pointing out the error. We will make the change.
R2Q4: The proposed algorithm should be introduced seriously. We would like to clarify that our algorithm consists of a pipeline of several major steps, namely feature extraction, EDL for generating evidential distributions, and classifiers for the tasks at hand. Specifically, we trained two classifiers, one for classifying MGMT status and another for categorizing the OS prediction into three bins: long-, medium-, and short-term survivors. In the final version of the paper, we will add the above description.
R3Q1: More details and descriptions regarding the data preparation and training. We used BraTS21 for MGMT classification and BraTS19 for OS classification. The MGMT dataset provides DICOM data; the number of sequences ranges from 10 to 300. We converted and resampled the DICOM data to 3D 16x256x256 NIfTI data. The OS dataset is 3D NIfTI data, which does not need further preprocessing. The model is 3D, and training, validation, and testing are based on each patient, not on extracted slices.
R3Q2: Verify if the model really learns from the target tumor region. We used the Grad-CAM algorithm to produce visual heat maps to study the impact of the regions. The maps show that features learned from the tumor area always carry high weight. In some cases, features are learned from other regions. We ran our model using both the tumor area and the whole brain; the AUC of the whole brain is 8% higher. One limitation is that whole-brain analysis, while it may provide more information about the underlying task, may also introduce interference from non-tumor regions. We believe that more judicious feature design and selection can achieve high robustness.
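The thresholding and binning described in the response to R2Q2 can be sketched as follows. The 0.5 threshold for MGMT comes from the rebuttal; everything else is an assumption made for illustration. In particular, the rebuttal does not state the survival cut-offs, so the 10- and 15-month boundaries below are hypothetical defaults borrowed from the convention commonly used for the BraTS survival task, and the handling of a score of exactly 0.5 is also a guess.

```python
def mgmt_label(score, threshold=0.5):
    """Turn a regression output in [0, 1] into an MGMT methylation label.

    The rebuttal only covers scores below or above 0.5; treating a score
    of exactly 0.5 as negative is an assumption made here.
    """
    return "positive" if score > threshold else "negative"


def survival_bin(months, short_max=10.0, long_min=15.0):
    """Place a predicted overall survival (in months) into one of three bins.

    short_max and long_min are hypothetical cut-offs; the rebuttal does
    not give the values the authors used.
    """
    if months < short_max:
        return "short"
    if months > long_min:
        return "long"
    return "medium"


# Hypothetical regression outputs, not results from the paper.
example_labels = [mgmt_label(s) for s in (0.2, 0.73)]
example_bins = [survival_bin(m) for m in (6.0, 12.0, 22.5)]
```

Because both tasks end in a hard cut on a continuous output, predictions close to a boundary can change class with small changes in the regression output, which is one reason Reviewers 1 and 2 asked about the modeling choice.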
Post-rebuttal Meta-Reviews
Meta-review # 1 (Primary)
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
Some of the concerns could be clarified.
- After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.
Accept
- What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).
lower
Meta-review #2
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
Reviewer 2 raised concerns, asked for clarifications, and recommended rejection. In my opinion, the authors have tried to address those in the rebuttal. The other reviewers have recommended acceptance. Reviewer 2 has not responded to the rebuttal. Given these circumstances, and the fact that the paper addresses important questions that are relevant to MICCAI, I recommend acceptance.
- After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.
Accept
- What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).
6
Meta-review #3
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
In this paper the authors address the important question of leveraging different Brain MRI modalities to predict MGMT expression and overall survival. The results of the proposed method are very promising.
- After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.
Accept
- What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).
3