
Authors

Xiaohan Xing, Zhen Chen, Meilu Zhu, Yuenan Hou, Zhifan Gao, Yixuan Yuan

Abstract

The fusion of multi-modal data, e.g., pathology slides and genomic profiles, can provide complementary information and benefit glioma grading. However, genomic profiles are difficult to obtain due to the high costs and technical challenges, thus limiting the clinical applications of multi-modal diagnosis. In this work, we address the clinically relevant problem where paired pathology-genomic data are available during training, while only pathology slides are accessible for inference. To improve the performance of pathological grading models, we present a discrepancy and gradient-guided distillation framework to transfer the privileged knowledge from the multi-modal teacher to the pathology student. For the teacher side, to prepare useful knowledge, we propose a Discrepancy-induced Contrastive Distillation (DC-Distill) module that explores reliable contrastive samples with teacher-student discrepancy to regulate the feature distribution of the student. For the student side, as the teacher may include incorrect information, we propose a Gradient-guided Knowledge Refinement (GK-Refine) module that builds a knowledge bank and adaptively absorbs the reliable knowledge according to their agreement in the gradient space. Experiments on the TCGA GBM-LGG dataset show that our proposed distillation framework improves the pathological glioma grading significantly and outperforms other KD methods. Notably, with the sole pathology slides, our method achieves comparable performance with existing multi-modal methods. The code is available at https://github.com/CityU-AIM-Group/MultiModal-learning.
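The idea of absorbing teacher knowledge "according to their agreement in the gradient space" can be illustrated with a minimal sketch. This is not the authors' GK-Refine module (which, per the abstract, also builds a knowledge bank); it is a generic, assumed gating rule in which the distillation gradient is kept only when it points in a direction compatible with the supervised gradient. All function names are hypothetical, and the gradients are taken with respect to the student logits at temperature 1.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def task_grad(student_logits: Sequence[float], label: int) -> List[float]:
    """Gradient of the cross-entropy loss w.r.t. the student logits: p_s - y."""
    p_s = softmax(student_logits)
    return [p - (1.0 if i == label else 0.0) for i, p in enumerate(p_s)]


def distill_grad(student_logits: Sequence[float],
                 teacher_logits: Sequence[float]) -> List[float]:
    """Gradient of KL(p_t || p_s) w.r.t. the student logits: p_s - p_t."""
    p_s = softmax(student_logits)
    p_t = softmax(teacher_logits)
    return [s - t for s, t in zip(p_s, p_t)]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)


def gated_grad(student_logits: Sequence[float],
               teacher_logits: Sequence[float],
               label: int) -> Tuple[List[float], float]:
    """Assumed gating rule: add the distillation gradient only if it agrees
    (positive cosine similarity) with the supervised gradient."""
    g_task = task_grad(student_logits, label)
    g_kd = distill_grad(student_logits, teacher_logits)
    agreement = cosine(g_task, g_kd)
    if agreement > 0.0:
        return [a + b for a, b in zip(g_task, g_kd)], agreement
    return g_task, agreement
```

With a teacher that favours the true class, the two gradients are aligned and the distillation term is kept; with a teacher that favours a wrong class, the agreement is negative and only the supervised gradient is returned.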

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16443-9_61

SharedIt: https://rdcu.be/cVRzf

Link to the code repository

https://github.com/CityU-AIM-Group/MultiModal-learning

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper addresses the clinically relevant problem where paired pathology-genomic data are available during training, while only pathology slides are accessible for inference. To ensure effective knowledge transfer, a DC-Distill module is proposed to allow the teacher to provide knowledge via reliable contrastive samples with teacher-student discrepancy. A GK-Refine scheme is proposed to allow the student to selectively absorb the beneficial knowledge according to the gradient-based agreement.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1. The proposed problem is clinically relevant: one data modality is present and another is missing.
    2. The problem is clearly motivated and formulated, and each component is well explained.
    3. The proposed methods properly address the challenge of a missing modality at test time.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The improvement to the accuracy seems incremental.
    2. Please clearly state the relationship between Eq. 1,2 and Eq. 3.
    3. How is g_{ens} in Sec. 2.2 defined?
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Data is publicly available. Code and split will be made available after review. So it’s reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    Fixing the minor issues would make the paper better.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    7

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Except for several small issues, the paper is well-written. The two major contributions properly address the presented challenges.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The authors propose a novel multimodal model using genomic+histopathology information for glioma grading. The model jointly uses three components: 1) state-of-the-art knowledge distillation techniques, 2) a contrastive loss over teacher-student model features, and 3) gradient-guided knowledge refinement. The model is compared with current multimodal models and outperforms them using the image modality, reaching an AUC of 92.35%.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The multimodal framework is highly novel and includes recent advances in computer vision in a comprehensive manner.

    • The results clearly show that the multimodal approaches, which use genomic information, outperform the unimodal pathology approaches.

    • The proposed DC-Distill is more effective at transferring the knowledge than state-of-the-art knowledge distillation methods.

    • While the model has many components, each one with its own technical properties and rationale, the authors explained it clearly and with the necessary formal support.

    • The paper is very well written.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Neither p_m (the ground-truth labels) nor the fusion F of genomic and CNN features is properly defined.

    • There are no qualitative results showing meaningful genomic expression + morphology in image regions.

    • There is no statistical significance test for the differences between the methods' results.
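To make the first weakness concrete: a fusion F of a genomic feature vector and a CNN feature vector could be any of several common operators. The sketch below shows four such candidates on toy vectors. It does not claim which one the paper uses, and the function names are hypothetical.

```python
from typing import List, Sequence


def fuse_concat(g: Sequence[float], h: Sequence[float]) -> List[float]:
    """Concatenation: output dimension is len(g) + len(h)."""
    return list(g) + list(h)


def _check_same_length(g: Sequence[float], h: Sequence[float]) -> None:
    if len(g) != len(h):
        raise ValueError("element-wise fusion needs vectors of equal length")


def fuse_add(g: Sequence[float], h: Sequence[float]) -> List[float]:
    """Element-wise addition: both inputs must share one embedding size."""
    _check_same_length(g, h)
    return [x + y for x, y in zip(g, h)]


def fuse_hadamard(g: Sequence[float], h: Sequence[float]) -> List[float]:
    """Hadamard (element-wise) product: same size requirement as addition."""
    _check_same_length(g, h)
    return [x * y for x, y in zip(g, h)]


def fuse_kronecker(g: Sequence[float], h: Sequence[float]) -> List[float]:
    """Kronecker product of two vectors: output dimension is len(g) * len(h)."""
    return [x * y for x in g for y in h]
```

The operators differ in output dimension and in whether the two modalities must first be projected to a common embedding size, which is why the definition of F matters for reproducing the teacher.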

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The dataset is open, and the authors mentioned that the source code will be released. Therefore, I’m confident the results could be easily reproduced.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    • Figure 1 is great, but I think it is a little bit cluttered. Numbering each loss function component is a good idea, but it would have been ideal to also reference and explain the components in the same order in the text; currently this is done only for L_{DCD}^{m}.

    • In the first stage (training of the teacher), the fusion of the genomic feature vector and the CNN features is denoted by F. What is F? Concatenation? A Kronecker or Hadamard product? Addition? I guess the 80-dimensional genomic vector is transformed to an embedding with the same dimension as the multimodal (and unimodal) embedding vector before classification. In any case, this should be clearly explained (since it is the multimodal part of the method!), but currently it is not.

    • This might be a great contribution to computational pathology if the code to reproduce the experiments is released with proper documentation and an explanation of how to extend it to other paired genomic+pathology datasets.

    • Extending this framework and showing the potential for different tasks would be a great contribution for the community and for a prominent journal in the field.

    • Qualitative results? It would have been great to see a few heatmaps (for the different approaches) with paired gene expression data and to compare them with what is known about glioma prognosis.

    • Is the reported AUC the average of the one-versus-rest AUCs over the binary classification problems?
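For reference, the averaging that the last question asks about can be sketched as follows. This is a minimal illustration of a macro-averaged one-versus-rest AUC on toy data, assuming each class is scored against all others and the per-class AUCs are averaged without weighting. Whether the paper reports this quantity is exactly what the reviewer is asking, and the function names are hypothetical.

```python
from typing import List, Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample receives the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(labels: Sequence[int],
                  probs: Sequence[Sequence[float]]) -> float:
    """Unweighted mean of the per-class one-versus-rest AUCs.

    labels[i] is the class index of sample i; probs[i][c] is the predicted
    probability of class c for sample i.
    """
    n_classes = len(probs[0])
    aucs: List[float] = []
    for c in range(n_classes):
        is_c = [1 if y == c else 0 for y in labels]   # class c vs. the rest
        score_c = [row[c] for row in probs]
        aucs.append(binary_auc(is_c, score_c))
    return sum(aucs) / n_classes
```

A weighted variant (weighting each class by its frequency) or a one-versus-one average would give different numbers on imbalanced data, so stating the averaging scheme makes the reported AUC easier to compare.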

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    8

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    • Clearly written paper.
    • Solid mathematical insights and knowledge of how to include state-of-the-art advances in ML in a clinically relevant problem.
    • Open access datasets and (not yet released) code.
    • Awareness of relevant works in the area/task and properly comparing to them.
  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The authors propose a two-stage knowledge distillation framework for pathological glioma grading. At stage I, a multi-modal network is trained with both histopathological images and genomic data as inputs. At stage II, the privileged knowledge of the trained multi-modal network is distilled to a unimodal model which only takes histopathological images as inputs. The authors further propose discrepancy-induced contrastive distillation and gradient-guided knowledge refinement to improve the performance of knowledge distillation. According to the experimental results, the performance of the learned unimodal model can be very close to the multi-modal upper bound.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The main strength is the novel idea of transferring multi-modal knowledge to a unimodal model via knowledge distillation. After training, the performance of the learned unimodal model can be very close to that of the pathology-genomic model while requiring only histopathological images as inputs. In addition, the authors propose discrepancy-induced contrastive distillation and gradient-guided knowledge refinement to improve the performance of knowledge distillation.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The description of stage I training is insufficient.
    2. The explanation of DC-Distill loss is insufficient.
    3. The description of data preprocessing is too brief.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Authors claim they will release the code. If so, the study could be reproduced.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    1. The authors should give more details on stage I training and explain the motivation for simultaneously using a mean teacher.
    2. More explanation of the DC-Distill loss should be given to facilitate understanding.
    3. The authors should describe the data preprocessing carefully. How are the ROIs chosen? At which magnification does the method work?
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    7

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Good performance and an inspiring idea of transferring multi-modal knowledge to a unimodal model via knowledge distillation.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The authors propose a two-stage knowledge distillation framework for pathological glioma grading. At stage I, a multi-modal network is trained with both histopathological images and genomic data as inputs. At stage II, the privileged knowledge of the trained multi-modal network is distilled to a unimodal model which only takes histopathological images as inputs. The authors further propose discrepancy-induced contrastive distillation and gradient-guided knowledge refinement to improve the performance of knowledge distillation. The paper addresses a clinically relevant problem and is well-written. A few minor concerns were raised, which the authors should consider addressing in the final submission.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    1




Author Feedback

N/A


