
Authors

Niharika S. D’Souza, Hongzhi Wang, Andrea Giovannini, Antonio Foncubierta-Rodriguez, Kristen Beck, Orest Boyko, Tanveer Syeda-Mahmood

Abstract

In a complex disease such as tuberculosis, the evidence for the disease and its evolution may be present in multiple modalities such as clinical, genomic, or imaging data. Effective patient-tailored outcome prediction and therapeutic guidance will require fusing evidence from these modalities. Such multimodal fusion is difficult since the evidence for the disease may not be uniform across all modalities, not all modality features may be relevant, or not all modalities may be present for all patients. All these nuances make simple methods of early, late, or intermediate fusion of features inadequate for outcome prediction. In this paper, we present a novel fusion framework using multiplexed graphs and derive a new graph neural network for learning from such graphs. Specifically, the framework allows modalities to be represented through their targeted encodings, and models their relationship explicitly via multiplexed graphs derived from salient features in a combined latent space. We present results that show that our proposed method outperforms state-of-the-art methods of fusing modalities for multi-outcome prediction on a large Tuberculosis (TB) dataset.

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16449-1_28

SharedIt: https://rdcu.be/cVRU8

Link to the code repository

N/A

Link to the dataset(s)

https://tbportals.niaid.nih.gov


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper presents a framework for multi-modal fusion of medical data based on multiplexed graph neural networks, applied to multi-class prediction of clinical outcomes in tuberculosis. The authors applied the framework to a large cohort of TB patients with imaging (CT), genomic, treatment, demographic, and clinical data modalities. Comparisons are made to the classic fusion schemes (early, intermediate, and late fusion), as well as several methods based on graph convolutional networks.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper is well written and contains clear descriptions, explanations, and illustrations throughout. The use of multiplexed graph neural networks for multi-modal fusion appears to be novel, and the rationale for its use is intuitive and sound. The clinical task, data types, and sample size are appropriate for testing multi-modal fusion for classification. The results demonstrate that the proposed framework was consistently the best performer across the 5 different outcomes.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The literature review is very light, and only mentions the classic fusion schemes. Some modalities of data were only available for subsets of the patients, and it is unclear how this impacted the models’ performance. The procedure for forming edges between feature nodes is somewhat heuristic (e.g., saliency threshold), but this is fine for a proof-of-concept.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The data is publicly available through the Tuberculosis Data Exploration Portal, but some of the genomic data received additional processing to generate functional domains, and this process was only briefly described and may be difficult to reproduce. However, most of the data will be available, as will the code, so reproducibility is likely feasible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    Other than the basic fusion techniques, perhaps a high-level summary of multi-modal learning approaches (e.g., for text and images) would be useful to provide more context.

    To understand the data better, please provide the prediction class frequencies for each data modality.

    If possible, it would be informative to see which relationships between modalities are most important for each outcome.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    7

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper presents an apparently novel but intuitive solution to an important problem in machine learning analysis of multi-modal data. The data used for testing is appropriate, and the results are promising.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    In this paper, the authors adapt Multiplex-GNN theory for the purpose of multimodal/multi-omics latent fusion. They demonstrate their method on a multimodal dataset for Tuberculosis (TB) treatment outcome prediction. The dataset features 5 input modalities (chest CT, genomics, demographics, clinical data, regimen data). The proposed Multiplex-GNN method is compared against various baselines and ablations, and has the highest and most consistent AUC values across the 5 output classes.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Demonstration of the method was done on a complex dataset which is “truly” multimodal, with 5 modalities: CT/Genomic/Regimen/Demographic/Clinical data. It would be great if the authors, along with their method (which apparently will be open-sourced), also published their pre-processing pipeline which extracts latent representations from the individual models (e.g. genomic vector representation from genomic sequence raw data). This in itself is a complex task which requires domain knowledge from several experts. It would also shorten the entry of new teams into this dataset and multi-omics fusion. If the data was already present in tabular form, and pre-processing consisted only of modality-wise d-AEs, please ignore this suggestion.
    • Authors propose to use Multiplexed-GNNs as a novel approach for latent fusion method. As far as I understand, its strength comes from the capability of the model to find salient correspondence across sub-dimensional feature spaces of the input modalities. Multiplexed-GNNs are theoretically well founded, a foundational book is cited in the paper for further study.
    • Strong baseline analysis, comparing 7 latent fusion methods. Two methods serve as ablation study - one model replacing Multiplex-GNN with Relational-GCN, the other removing the latent encoder.
    • Statistical robustness of results by comparing AU-ROC values by a DeLong test and reporting p-values.
    • Convincing results on the Tuberculosis dataset, with consistent performance of Multiplex-GNN on all 5 classes (and weighted avg).
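
The DeLong comparison of two AU-ROC values mentioned in the strengths above can be sketched as follows. This is a minimal illustration for the binary case with two classifiers scored on the same samples, written with NumPy; it is not the authors' implementation, and how the paper applies the test across its 5 outcome classes is not stated here. The function names and the toy labels and scores are hypothetical.

```python
import math
import numpy as np

def _auc_components(scores, labels):
    """Return the AUC and its per-positive / per-negative structural components."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    # Kernel: 1 if positive ranked above negative, 0.5 on ties, 0 otherwise.
    psi = (diff > 0).astype(float) + 0.5 * (diff == 0)
    return psi.mean(), psi.mean(axis=1), psi.mean(axis=0)

def delong_test(labels, scores_a, scores_b):
    """Two-sided DeLong test for two correlated AUCs (needs >= 2 samples per class)."""
    labels = np.asarray(labels)
    auc_a, v10_a, v01_a = _auc_components(np.asarray(scores_a, float), labels)
    auc_b, v10_b, v01_b = _auc_components(np.asarray(scores_b, float), labels)
    # 2x2 covariance matrices of the components over positives and negatives.
    s10 = np.cov(np.vstack([v10_a, v10_b]))
    s01 = np.cov(np.vstack([v01_a, v01_b]))
    cov = s10 / len(v10_a) + s01 / len(v01_a)
    var_diff = cov[0, 0] + cov[1, 1] - 2.0 * cov[0, 1]
    if var_diff <= 0:
        # Degenerate case (e.g. identical scores): the statistic is undefined.
        return auc_a, auc_b, float("nan")
    z = (auc_a - auc_b) / math.sqrt(var_diff)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return auc_a, auc_b, p_value

# Toy data (hypothetical): model A separates the classes perfectly, model B does not.
labels = [0, 0, 0, 1, 1, 1]
scores_a = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
scores_b = [0.1, 0.6, 0.3, 0.5, 0.8, 0.2]
auc_a, auc_b, p_value = delong_test(labels, scores_a, scores_b)
```

With so few samples the p-value is not meaningful; the toy data only shows the mechanics of the computation.
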
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Demonstration of results on only 1 dataset (though page limit wouldn’t allow to explore more datasets, see 8.)
    • The method does not seem to have a built-in mechanism to deal with missing modalities (as compared to e.g. XGBoost), which is a common problem in clinical multi-omics. Instead, imputation needs to be performed as pre-processing (in this case simple mean-imputation from the training set). This could be an aspect for further study.
    • The paper is very complex and hard to understand, especially for readers not familiar with Multiplex-GNN theory. Difficult to improve, given the page limit, but still somewhat of a weakness of the paper.
    • Several points remain unclear: 1) Are the d-AEs and c-AEs pre-trained and then frozen during multiplexed-graph training, or are they learned end-to-end with multi-target learning? 2) If there are 5 modalities, why is the “multiplexed graph [only used for fusion of 3 modalities, i.e.] for multimodal fusion of imaging, genomic and clinical data for outcome prediction in TB” (cf. bottom of page 2)? 3) In Fig 1, c-AE yields a 1D latent vector, so why are 3 latents illustrated, and why does this lead to 3 multiplex planes? And does this mean that K=3 in the figure, but the experiment actually uses K=32 concepts? I think it would be helpful to tie notation to figure contents, e.g. by indicating the values of k, K, P, i etc. in the figure wherever possible.
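
To illustrate the mean-imputation pre-processing mentioned in the weaknesses above, here is a minimal sketch in Python with NumPy. It assumes missing entries are encoded as NaN and that feature means are computed on the training split only, then reused for the test split. The function names and toy arrays are hypothetical and are not taken from the paper.

```python
import numpy as np

def fit_feature_means(train):
    """Per-feature means over the observed (non-NaN) training entries."""
    return np.nanmean(train, axis=0)

def impute_with_means(data, means):
    """Replace NaNs with the supplied means; observed values are left unchanged."""
    filled = data.copy()
    rows, cols = np.where(np.isnan(filled))
    filled[rows, cols] = means[cols]
    return filled

# Toy feature matrices (rows = patients, columns = features); NaN marks a missing value.
train = np.array([[1.0, np.nan],
                  [3.0, 4.0],
                  [np.nan, 8.0]])
test = np.array([[np.nan, 5.0],
                 [2.0, np.nan]])

means = fit_feature_means(train)          # statistics come from the training split only
train_filled = impute_with_means(train, means)
test_filled = impute_with_means(test, means)
```

Computing the means on the training split and reusing them at test time avoids leaking test-set statistics into the model inputs.
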
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
    • The dataset preparation is complex. It is well explained in 3.1., but for reproducibility (and ease of entry for other research teams), I again recommend the co-publishing of the data pre-processing code, from Tuberculosis Data Exploration Portal to the level of input into the d-AEs.
    • The dataset is publicly available.
    • According to the reproducibility statement, the method is planned to be published open-source (the license model would be interesting to mention in the paper). This is not mentioned in the paper itself, however. Importantly, the method may otherwise be difficult to reimplement purely from the paper, unless the reader is very familiar with multiplexed-GNNs.
    • Hardware requirements for reproducibility: Authors only mention the CPU. Was a GPU used for training? If not, was there a reason (e.g. multiplexed-GNN training cannot be easily accelerated by GPU)? If GPU, how much VRAM would be necessary to reproduce, or at least how much VRAM did the training GPU have?
  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    • The authors demonstrate the performance of Multiplex-GNN on only 1 dataset (Tuberculosis). This is perfectly sufficient for MICCAI, as it would probably be hard to fit more experiments into the page limit. But for a journal extension, I would really be curious to see the performance on more datasets, and maybe also strong baseline methods from “conventional” ML on the d-AE joint latents, especially Gradient Boosted Classifiers (e.g. XGBoost).
    • The presentation of the multiplexed-graph framework is extremely condensed in this work. This may be hard to avoid within the page limit of MICCAI, but for a journal extension, I would appreciate a much clearer introduction (almost tutorial-style) into the theory
    • I believe citations 3&4 refer to the same book. Probably good to check&merge.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    7

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Interesting and novel approach for multimodal fusion via multiplexed-GNN. Demonstration on a “truly multimodal” dataset with 5 modalities (internally even 6, considering categorical and numeric clinical features as separate modalities). Convincing results, including an interesting ablation study and comparison to several baselines.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    2

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #4

  • Please describe the contribution of the paper

    This paper proposes a multiplex graph based representation for fusion of 5 clinically relevant types of data (drug regimen, chest x-ray, demographic, clinical, genomic) and a Graph Neural Network based learning algorithm performing multi-class outcome prediction for Tuberculosis disease

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    -A large number of baseline models (including one using a monoplex graph representation) are considered and experimented on
    -The proposed model shows statistically significant improvements over the baseline models for the majority of the comparisons
    -Dataset is relatively large with 5 types of clinically relevant data for patients

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    -There is no explicit discussion on possible ways to incorporate interpretability/explainability of the model predictions

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors are using the Tuberculosis Data Exploration Portal dataset, and they specify that training and evaluation code will be available for download

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    Could you please comment on possible ways to incorporate interpretability/explainability into your model? Since this is a clinically relevant problem, this aspect would also be important for the practitioners

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The problem is clinically relevant and it proposes to use a large variety (drug regimen, chest x-ray, demographic, clinical, genomic) of types of data for outcome prediction. The experiments are adequate and clearly described. Statistical significance is shown to prove improvements over a large number of baselines including a monoplex graph based representation

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The paper is well written and addresses an important clinical problem. One thing that the authors could consider is how to handle the missing-data problem, which is always a concern with multi-modal datasets.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    2




Author Feedback

N/A


