
Authors

Zhenyu Tang, Zhenyu Zhang, Huabing Liu, Dong Nie, Jing Yan

Abstract

Pre-operative survival prediction for diffuse glioma patients is desirable for personalized treatment. Clinical findings show that tumor types are highly correlated with the prognosis of diffuse glioma. However, tumor types are unavailable before craniotomy and therefore cannot be used in pre-operative survival prediction. In this paper, we propose a new deep learning based pre-operative survival prediction method. Besides the common survival prediction backbone, a tumor subtyping network is integrated to provide tumor-type-related features. Moreover, a novel ordinal manifold mixup is presented to enhance the training of the tumor subtyping network. Unlike the original manifold mixup, which neglects the feature distribution, the proposed method forces the feature distributions of different tumor types to follow the order of risk grade, by which consistency between the augmented features and labels can be strengthened. We evaluate our method on both in-house and public datasets comprising 1936 patients and demonstrate up to a 10% improvement in concordance index compared with state-of-the-art methods. An ablation study further confirms the effectiveness of the proposed tumor subtyping network and the ordinal manifold mixup.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43901-8_75

SharedIt: https://rdcu.be/dnwEs

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper proposes a new survival prediction model based on patients’ MR images. Besides the common survival prediction backbone, a tumor subtyping network is integrated to provide tumor-type-related features. A novel ordinal manifold mixup is used to enhance the training of the tumor subtyping network.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The intuition is clear. The paper is easy to follow.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Could you please compare your method with the top 4 winners of BraTS 2019 using the accuracy metric rather than the C-index?
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    It is hard to reproduce their model and its performance based on their description. In addition, only limited source code is provided to reproduce their results.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. The method should be benchmarked against the top 4 winners of BraTS 2019 using the accuracy metrics.
    2. All code should be provided to reproduce the results.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    3

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    1. This paper can be accepted if it is compared with the state of the art on the BraTS 2019 survival prediction task using the accuracy metrics.
    2. This paper can be accepted if the authors provide everything needed to reproduce their results.
  • Reviewer confidence

    Somewhat confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    The authors provide a good reason for choosing the C-index as their evaluation metric and carried out experiments on the new dataset. They also provide the source code.



Review #2

  • Please describe the contribution of the paper

    The paper proposes a new deep learning based pre-operative survival prediction method.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper addresses an important problem in personalized treatment for diffuse glioma patients by proposing a new deep learning-based pre-operative survival prediction method. The proposed method integrates a tumor subtyping network to provide tumor-type-related features and a novel ordinal manifold mixup to enhance the training of the tumor subtyping network.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The paper does not provide details on the datasets used for evaluation, such as demographics and clinical characteristics of the patients. The paper does not discuss the limitations or potential biases of the proposed method, such as the generalizability to diverse patient populations or the potential for overfitting.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Reproducibility is uncertain.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. Ablation experiments are not adequate; ablations such as with/without manifold mixup, with/without age, and so on are missing.
    2. In addition to DeepConvSurv, the proposed method should be compared with other advanced deep learning methods.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The experiments are not adequate.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The paper proposes a method for pre-operative survival prediction of diffuse glioma patients, where a tumor subtyping network is integrated into the prediction backbone to boost survival prediction performance. A novel ordinal manifold mixup scheme is presented in the tumor subtyping network to address class imbalance among the tumor types, where an ordinal constraint is imposed so that the feature distributions of different tumor types follow the order of risk grade. Patient age and tumor location information are also leveraged to boost classification performance.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    • Extends the existing manifold mixup approach to multiple classes while ensuring that the mixed-up label of two classes does not overlap with a third class, by using an ordinal constraint and limiting the mixup to neighboring classes (a sketch of this idea follows the list).
    • As the model for survival prediction is a deep Cox proportional hazards model, it can leverage both censored and non-censored data, unlike models that predict survival days and can use only non-censored data.
    • Achieves SOTA performance and is extensively compared to 4 baselines.
    • Ablation studies are done for various design choices of the network, showing rigor in study design.
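
    For illustration, here is a minimal sketch of manifold-mixup-style feature mixing restricted to neighboring risk-ordered classes, the idea referenced in the first bullet. The function name, the Beta(alpha, alpha) mixing weight, and the interpolated ordinal label are assumptions made for illustration, not the paper's exact implementation, which additionally constrains the feature distributions themselves:

```python
import numpy as np
import torch


def neighboring_class_mixup(features, labels, alpha=0.2):
    """Mix hidden features only between samples of the same or adjacent
    risk-ordered classes (e.g. oligodendroglioma < astrocytoma < glioblastoma),
    so that a mixed-up label never overlaps with a third class.

    features: (N, D) float tensor of activations from an intermediate layer
    labels:   (N,) integer tensor of class indices ordered by risk grade
    """
    lam = float(np.random.beta(alpha, alpha))            # mixing weight
    perm = torch.randperm(features.size(0))
    # keep a mixing partner only if its class is the same or an ordinal neighbor;
    # otherwise fall back to mixing the sample with itself (a no-op)
    ok = (labels - labels[perm]).abs() <= 1
    perm = torch.where(ok, perm, torch.arange(features.size(0)))
    mixed_features = lam * features + (1.0 - lam) * features[perm]
    mixed_labels = lam * labels.float() + (1.0 - lam) * labels[perm].float()
    return mixed_features, mixed_labels
```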

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    • The authors mention that the tumor mask is used to draw a bounding box for training the subtyping model, but it is not clear where the tumor mask is coming from – is it manually annotated or automatically segmented by a different model or otherwise?
    • It is not clear what kind of cross-validation was performed to determine hyperparameters.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Code and internal datasets used for training are not available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    • Instead of reference 10, the authors should cite the latest WHO 2021 classification paper (https://doi.org/10.1093/neuonc/noab106).
    • Regarding the method used for determining the tumor position: if a tumor spans multiple blocks, are all of them set to 1? This does not consider how much of the tumor is present in which block. Could using a method like FSL’s atlasquery improve the performance, since it gives soft assignments instead of hard binary labels? The authors can refer to https://doi.org/10.1093/noajnl/vdad023 for details of how atlasquery can be used for this.
    • How are the tumor masks made available to the network? Do they need to be segmented separately, either manually or by another network? Perhaps using some kind of detection/classification approach as the tumor subtyping network could circumvent this.
    • Were any ablation studies performed to find the optimal value of lambda (the mixing weight)?
    • It would be good to have statistical tests for the comparisons between the baseline and the proposed method, as well as for the different ablation studies (a sketch of one such test follows this list).
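
    On the last point, one simple option is a paired non-parametric test over per-fold (or bootstrapped) C-indices of the baseline and the proposed method. The sketch below uses SciPy’s Wilcoxon signed-rank test; the per-fold numbers are hypothetical and serve only to illustrate the call, and the choice of test is merely a suggestion.

```python
from scipy.stats import wilcoxon

# Hypothetical per-fold C-indices, for illustration only.
baseline_cindex = [0.71, 0.69, 0.72, 0.70, 0.73]
proposed_cindex = [0.75, 0.74, 0.76, 0.73, 0.77]

stat, p_value = wilcoxon(proposed_cindex, baseline_cindex)  # paired, non-parametric test
print(f"Wilcoxon signed-rank p-value: {p_value:.4f}")
```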

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper has methodological novelty, rigorous evaluation and ablation studies, comparison to baselines, and SOTA performance, and it is well written.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper presents a framework for pre-operative survival prediction of glioma patients. Based on the clinical findings that tumor types are highly correlated with the prognosis, the proposed framework includes an independently trained tumor subtyping network which provides additional features to the survival prediction backbone. The authors also propose the ordinal manifold mixup method for feature augmentation.

    The use of tumor subtyping to improve survival prediction is reasonable and the ordinal manifold mixup method seems novel. On the other hand, the reviewers have major concerns regarding the experiments and the reproducibility.

    Please clarify the following points in the rebuttal:

    1. Why was the concordance index used to measure performance rather than more common metrics such as accuracy? As the BraTS’19 dataset was used, why not use the metrics from its leaderboard?

    2. How the BraTS’19 dataset was used is unclear. Was it used only for testing the trained models? It seems that only the official training dataset of BraTS’19 was used. Should the official validation dataset, for which no ground truth is released, be used to improve persuasiveness?




Author Feedback

We thank all reviewers for the valuable comments. We appreciate that they agree on the strengths of our work in terms of novelty (RM (Meta-Reviewer), R1, R2, R3), a rational idea (RM, R1), clear writing (R1, R3), and promising results (R3). Our detailed responses are presented below:

Q1 [RM, R1]. Why the C-index rather than the metrics used in BraTS’19, how was BraTS’19 used, and why not compare with the BraTS’19 leaderboard? A: The C-index is the most common metric for the Cox model and reflects how well a Cox model predicts the ordering of patients’ survival risk. The main reasons why we choose the Cox model rather than the overall survival (OS) days used in BraTS’19 are: (1) In survival analysis, the Cox model is widely accepted by physicians, while OS days are rarely adopted in the clinic because OS days are not always the primary endpoint of interest. (2) Censored data are very common in survival analysis, and 57% of our in-house data are censored. For OS-day prediction, censored data cannot be used, which is a huge waste of data, whereas the Cox model can use all data.
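
For illustration, a minimal sketch of the two ingredients discussed above, assuming a generic formulation rather than the exact implementation used in the paper: a simplified Harrell’s C-index, in which censored patients still contribute as the longer-lived member of comparable pairs, and a Cox negative log partial likelihood (with Breslow handling of ties), in which censored patients enter only through the risk sets. Function names are assumptions.

```python
import torch


def concordance_index(time, risk, event):
    """Harrell's C-index, simplified (ties in event times are ignored).

    time:  sequence of observed survival or censoring times
    risk:  sequence of predicted risk scores (higher = worse prognosis)
    event: sequence with 1 if death was observed, 0 if censored
    """
    n_pairs, n_concordant = 0, 0.0
    for i in range(len(time)):
        if event[i] != 1:
            continue                       # the shorter-lived member must be an observed event
        for j in range(len(time)):
            if time[j] > time[i]:          # patient j outlived patient i
                n_pairs += 1               # censored patients still count here
                if risk[i] > risk[j]:
                    n_concordant += 1.0
                elif risk[i] == risk[j]:
                    n_concordant += 0.5    # tied risk scores count as half-concordant
    return n_concordant / max(n_pairs, 1)


def cox_partial_likelihood_loss(risk, time, event):
    """Negative log partial likelihood of a Cox model (Breslow handling of ties).

    risk, time, event: 1-D torch tensors. Censored patients (event == 0) have
    no term of their own but remain in the risk sets of earlier events, which
    is why no data has to be discarded.
    """
    order = torch.argsort(time, descending=True)       # risk set = prefix of this ordering
    risk, event = risk[order], event[order].float()
    log_risk_set = torch.logcumsumexp(risk, dim=0)     # log sum of exp(risk) over the risk set
    return -((risk - log_risk_set) * event).sum() / event.sum().clamp(min=1.0)
```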

Since there is no tumor type information in BraTS’19, we use it as the independent testing dataset.

To compare with the top 3 methods of BraTS’19, we changed our model to OS-day prediction by replacing the Cox loss with an MSE loss and trained it on the non-censored patients in the in-house dataset; the BraTS’19 dataset is used as the independent testing dataset. The resulting Accuracy, MSE (mean SE), and mSE (median SE) are 0.589, 113455.6, and 17622, respectively. The corresponding results (Validation) for the top 3 methods of BraTS’19 are [0.586, 105061.874, 16460] for Top 1, [0.50, 99707, 18218] for Top 2, [0.31, 107639.326, 77906.27] for Top 3 (tie), and [0.448, 100000, 49300] for Top 3 (tie). Thus our method has the best Accuracy and the second smallest mSE. Our MSE is large, and the main reasons could be: (1) BraTS’19 is used as an independent testing dataset, and distribution shift is inevitable when the in-house data are used for training. (2) The number of non-censored patients in our in-house dataset is relatively small for OS-day prediction with deep learning (the top 3 are radiomics methods based on traditional ML and can work well with less data).
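
For reference, a minimal sketch of how the reported metrics can be computed, assuming the three BraTS survivor classes are split at roughly 10 and 15 months; the exact day cut-offs below are an assumption, so the official evaluation script should be consulted:

```python
import numpy as np


def brats_style_metrics(pred_days, true_days, cutoffs=(304.4, 456.6)):
    """Accuracy over three survivor classes plus mean and median squared error.

    The short / mid / long survivor classes are split at roughly 10 and 15
    months; the day cut-offs used here are an assumption, not the official
    BraTS values.
    """
    pred_days = np.asarray(pred_days, dtype=float)
    true_days = np.asarray(true_days, dtype=float)
    bins = np.asarray(cutoffs, dtype=float)
    pred_cls = np.digitize(pred_days, bins)    # 0 = short, 1 = mid, 2 = long survivor
    true_cls = np.digitize(true_days, bins)
    se = (pred_days - true_days) ** 2
    return {
        "accuracy": float((pred_cls == true_cls).mean()),
        "MSE": float(se.mean()),               # "mean SE" above
        "mSE": float(np.median(se)),           # "median SE" above
    }
```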

Q2 [RM, R2]. Details on the datasets used for evaluation. A: The in-house dataset was collected during 2011–2022. Besides the age (49.7±13.1 years), the number of censored patients (983), and the distribution of tumor types (274 oligodendroglioma / 324 astrocytoma / 1128 glioblastoma) reported in our manuscript, the 1726 patients comprise 728 females and 998 males, the OS times (in months) are 14.9±11.1 (non-censored) and 30.4±18.2 (censored), and 980 patients have KPS≥80.

Q3 [R2]. Ablation experiments, such as with or without manifold mixup and age. A: We train our model without manifold mixup and without age, respectively; the corresponding C-indices are 0.771 and 0.763 for the in-house dataset, and 0.752 and 0.746 for the independent testing dataset (BraTS’19).

Q4 [R2]. Limitations of the proposed method. A: Currently, three tumor types are adopted, which can be further divided into 6 types by considering grades, i.e., O2, O3, A2, A3, A4, and GBM (O: oligodendroglioma, A: astrocytoma, GBM: glioblastoma). Since the prognosis varies across different grades, our method still has large room for improvement.

Q5 [R3]. How are the tumor masks available to the network? A: The tumor masks are manually labeled and can also be generated by tumor segmentation methods.

Q6 [R3]. Coding of tumor position. A: If a block contains part of the tumor, the corresponding bit is set to 1. To make the position coding more precise, one way is to use smaller blocks; the other way, proposed by Reviewer 3, which uses soft assignment, is better.
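
A minimal sketch of the hard and soft position coding schemes discussed above, assuming a simple regular grid over the image volume (the 4x4x4 grid and regular partition are assumptions for illustration, not the exact coding used in the paper):

```python
import numpy as np


def tumor_position_code(tumor_mask, grid=(4, 4, 4), soft=False):
    """Encode tumor location on a coarse grid over the image volume.

    Hard coding (as described above): a block's bit is 1 if any tumor voxel
    falls inside it. Soft coding (Reviewer 3's suggestion): each block stores
    the fraction of the tumor it contains.
    """
    mask = np.asarray(tumor_mask) > 0
    code = np.zeros(grid, dtype=float)
    # split every axis into roughly equal-sized blocks
    splits = [np.array_split(np.arange(dim), g) for dim, g in zip(mask.shape, grid)]
    total = max(int(mask.sum()), 1)
    for i, xs in enumerate(splits[0]):
        for j, ys in enumerate(splits[1]):
            for k, zs in enumerate(splits[2]):
                block = mask[np.ix_(xs, ys, zs)]
                code[i, j, k] = block.sum() / total if soft else float(block.any())
    return code.ravel()
```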

Q7 [RM, R1, R2, R3]. Reproducibility. A: We have uploaded the code for review purposes at https://anonymous.4open.science/r/OSPred and will officially release the code as soon as possible after acceptance.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors have addressed the main concerns in the rebuttal; thus, I prefer to accept the paper.

    Note: as instructed in the reviewer guidelines, I do not consider the new experimental results presented in the rebuttal.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper presents a framework for pre-operative survival prediction of glioma patients. In the rebuttal the authors provided further details and clarifications as requested by the reviewers. Please include these additional explanations/results in the camera-ready version.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The rebuttal addressed many of the concerns raised by the reviewers, such as the concordance index and the BraTS’19 dataset. The response is adequate. I would suggest accepting this paper.


