
Authors

Zijian Dong, Yilei Wu, Yu Xiao, Joanna Su Xian Chong, Yueming Jin, Juan Helen Zhou

Abstract

Under the framework of network-based neurodegeneration, brain functional connectome (FC)-based Graph Neural Networks (GNN) have emerged as a valuable tool for the diagnosis and prognosis of neurodegenerative diseases such as Alzheimer’s disease (AD). However, these models are tailored for brain FC at a single time point instead of characterizing FC trajectory. Discerning how FC evolves with disease progression, particularly at the predementia stages such as cognitively normal individuals with amyloid deposition or individuals with mild cognitive impairment (MCI), is crucial for delineating disease spreading patterns and developing effective strategies to slow down or even halt disease advancement. In this work, we proposed the first interpretable framework for brain FC trajectory embedding with application to neurodegenerative disease diagnosis and prognosis, namely Brain Tokenized Graph Transformer (Brain TokenGT). It consists of two modules: 1) Graph Invariant and Variant Embedding (GIVE) for generation of node and spatio-temporal edge embeddings, which were tokenized for downstream processing; 2) Brain Informed Graph Transformer Readout (BIGTR), which augments previous tokens with trainable type identifiers and non-trainable node identifiers and feeds them into a standard transformer encoder to readout. We conducted extensive experiments on two public longitudinal fMRI datasets of the AD continuum for three tasks, including differentiating MCI from controls, predicting dementia conversion in MCI, and classification of amyloid positive or negative cognitively normal individuals. Based on brain FC trajectory, the proposed Brain TokenGT approach outperformed all the other benchmark models and at the same time provided excellent interpretability.



Link to paper

DOI: https://doi.org/10.1007/978-3-031-43904-9_34

SharedIt: https://rdcu.be/dnwHd

Link to the code repository

https://github.com/ZijianD/Brain-TokenGT.git

Link to the dataset(s)

https://adni.loni.usc.edu/

https://www.oasis-brains.org/


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper proposes a framework to generate effective embeddings for AD diagnosis and prognosis based on longitudinal functional connectome data. It consists of two major components, i.e., GIVE for node and edge representation generation, and BIGTR for representation fusion via a Transformer. Experiments on the OASIS and ADNI datasets validate its superiority against several baselines. Visual interpretation of the results is also provided.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    (1) The motivation of this work is meaningful and well explained in the introduction. (2) The network design is reasonable and tightly connected to the key challenge of leveraging longitudinal information. (3) Experiments were conducted on two different datasets with three tasks, and various baseline methods were employed for comparison. Visual interpretation is also provided to demonstrate the potential clinical impact.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    (1) The innovation of the algorithm in this work is limited. Specifically, it integrates several previous works, i.e., EvolveGCN [14] (in the submission), DHT [6], and Graph Transformer [9], to build the GIVE and BIGTR modules for AD-related applications. (2) Some network designs are not well motivated. For example, the authors adopted different paradigms to include the longitudinal information in the node embeddings and the edge embeddings, respectively. (3) Some formulations are incorrect, namely the size of Z, the augmented token feature matrix, in the last paragraph of Section 2.3. (4) Some implementation details are missing, such as the hyperparameters for the network implementation.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The reproducibility of this work is relatively low considering that they provide neither the source code nor the network design and training details.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    (1) In this work, the authors used two different strategies, i.e., GRU and graph convolution, to extract the longitudinal information for node embeddings and edge embeddings, respectively. It would be better to clarify the advantages of these two strategies over each other and to compare different strategies in the experiments.
    (2) It would be great to share the implementation and training details in the manuscript, which would promote the reproducibility and impact of this work.
    (3) The trainable type identifier in Section 2.3 indicates the different positions of each vertex and edge in the Transformer-based BIGTR, so the model size may increase substantially as the number of ROIs and the number of longitudinal scans grow. Please compare the model sizes in Table 1, given that the dataset size is limited.
    (4) Please compare with more deep learning methods that can handle multiple time points, such as “Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition” (CVPR 2020), to demonstrate that the proposed network outperforms other methods in modeling longitudinal information.
    (5) As the authors mentioned that they used self-supervised learning to alleviate the over-fitting problem, please provide some details on this point. It would help this work to be more interesting and influential.
    (6) Given that the authors employed a Transformer-based framework, they could perform visual interpretation by analyzing the attention matrices in the Transformer layers. It would be better to compare the two interpretations, from the Transformer and from their HyperDrop method [6].

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The limited innovation and missing implementation details.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    This work introduces a framework for longitudinal brain function connectome embedding by considering both spatial and temporal associations of brain regions.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • As the authors mentioned, longitudinal FC embedding is a novel approach for fMRI-based diagnosis.
    • The idea of evolving graph convolution dual hyper-graph transformation is interesting. The authors carefully designed the embedding networks for longitudinal FC embedding.
    • Both edge-level and node-level interpretation can offer detailed insights for neurological analysis.
    • The authors conducted experiments on two public datasets and compared the proposed method with many comparative models.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The dataset is too small to train this big model. As the number of tokens increases, the number of parameters also increases (this model uses ROI nodes, ROI-ROI edges for all time points, and time-time edges for all ROIs). How can this big model be trained with only about 125, 60, and 91 subjects for the three classification tasks? With 5-fold cross-validation, the effective training set is even smaller.
    • The claim that token-level self-supervised classification could alleviate the over-fitting problem on small-scale datasets needs more explanation.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    It needs to provide more details about the network architectures and learning strategies.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    • What if the evolving graph convolution works on the original graph instead of the neighborhood graph?
    • The authors need to conduct experiments repetitively (e.g., ten repetitions) to validate the model.
    • Please explain more about how the tokens are augmented. How can the trainable Type Identifier alleviate the over-fitting issue?
    • Is it more effective to create a unified graph from multiple longitudinal scans, rather than using each individual longitudinal fMRI scan as a standalone input during training?
    • For the extension version, it can be helpful to visualize the dual hyper graph transformation for better understanding.
    • To this reviewer’s understanding, HyperDrop results can be obtained for each subject. This reviewer wonders about the consistency or variation of the results across subjects. How did the authors obtain Fig. 2?
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    3

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed framework applies an existing method to rsfMRI data for AD diagnosis, making the technical contribution marginal.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    3

  • [Post rebuttal] Please justify your decision

    Concerns remain about the limited sample size for training the big model and the marginal technical improvement.



Review #3

  • Please describe the contribution of the paper

    This paper constructs a Brain Tokenized Graph Transformer (Brain TokenGT) for brain functional connectome (FC) embedding. It essentially consists of two modules: Graph Invariant and Variant Embedding (GIVE) for generation of node and spatio-temporal edge embeddings, and Brain Informed Graph Transformer Readout (BIGTR) for token augmentation and readout. The authors validate the method on three classification tasks, outperforming methods from the literature by using GIVE embedding and BIGTR readout on relatively small-scale datasets.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The main strength of the proposed method is that the spatial and temporal information of the FC trajectory is obtained based on GIVE embedding. The spatio-temporal embeddings will be useful for describing the disease progression.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The proposed method uses trainable type identifiers and non-trainable node identifiers to augment tokens to alleviate over-fitting issues. The authors are recommended to further discuss the augmentation procedure in detail because it is important for small-scale datasets. The proposed method includes two modules involving three procedures; the authors are recommended to explain whether it is an end-to-end method and to add a description of the computational efficiency of the proposed method.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The experimental environment and configurations are provided in the supplementary file.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    The authors are recommended to further discuss the performance of token augmentation in detail because it is important for small-scale datasets.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    It is an interesting work. The spatio-temporal embeddings will be useful for describing disease progression, but some descriptions of the method need to be more specific.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    6

  • [Post rebuttal] Please justify your decision

    Authors have answered my questions.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This work introduced a framework for longitudinal brain functional connectome embedding by considering both spatial and temporal associations of brain regions. It consisted of two major components, i.e., GIVE for node and edge representation generation, and BIGTR for representation fusion via a Transformer. Experiments on the OASIS and ADNI datasets validated its superiority against several baselines. It is an interesting paper. During the review period, the reviewers raised several concerns, including limited technical novelty, an incorrect formula, a lack of implementation details, and whether the small datasets are sufficient for the proposed research. The authors are encouraged to address these in their rebuttal.




Author Feedback

We thank reviewers and AC for the valuable comments, and we are delighted by the positive comments including “interesting paper/work/idea” (AC&R2&R3), extensive experiments (R1&R2&R3), “demonstrate the potential clinical impact” (R1), “offer detailed insights for neurological analysis” (R2), “useful for describing the disease progression” (R3).

Q1 The size of datasets and model (R1&R2). Due to the demanding nature of fMRI experiments, brain fMRI datasets are commonly small, e.g., N = 70 for HIV diagnosis using fMRI in IBGNN+ (MICCAI 2022). Here, our sample size is ~100 participants (with 2-3 time points per participant), which is typical for longitudinal fMRI research in AD. Furthermore, our approach has an appropriate model size for small datasets. Using a transformer encoder of up to 2 layers, our model contains 0.67/0.78/0.79M parameters for the three tasks, respectively. The number of parameters is comparable with a CNN-based model (~0.9M) [8] and much smaller than graph transformer-based models (~4M) [7]. Lastly, we acknowledge that more ROIs/time points would lead to a larger model. Our model requires O(n^2) running memory (for n tokens), ~5 s/epoch (to R3); however, we could lower this to O(n) by utilizing the Performer kernel (ICLR 2021). Taken together with the fact that most typical longitudinal brain fMRI studies include only 3-4 time points per participant (Yu et al., MICCAI 2022), our approach is unlikely to suffer from an inappropriate model size.

Q2 Token augmentation (R1&R2&R3). We augment token embeddings with non-trainable node identifiers (NI) and trainable type identifiers (TI). NI are normalized orthogonal random features used to represent the connectivity structure in G^T. TI are trained jointly with the model end-to-end; specifically, we maintain a dictionary (torch.nn.Embedding) in which the keys are token types and the values are learnable embeddings that encode the corresponding token types. This facilitates the model’s learning of type-specific attributes in tokens, compelling the attention heads to focus on disease-related token disparities and thereby alleviating overfitting caused by non-disease-related attributes. Besides, it inflates one G^T for an individual-level task into thousands of tokens, which can also alleviate overfitting from the perspective of small-scale datasets.
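The TI/NI augmentation described above can be sketched as follows. This is a minimal illustration, not the authors' released code: the dimensions (h, d_p, d_q), the number of token types, and the `augment` helper are all assumed for demonstration; only the use of torch.nn.Embedding for trainable type identifiers and orthogonal random features for frozen node identifiers follows the rebuttal's description.

```python
import torch
import torch.nn as nn

num_token_types = 3      # e.g., node, spatial-edge, temporal-edge tokens (assumed)
h, d_p, d_q = 64, 16, 8  # token dim, TI dim, NI dim (assumed values)
num_nodes = 100

# Trainable TI: a dictionary mapping token type -> learnable embedding,
# trained jointly with the rest of the model.
type_identifier = nn.Embedding(num_token_types, d_p)

# Non-trainable NI: take the Q factor of a random matrix; its columns are
# orthonormal, and each row serves as one node's fixed identifier.
q_factor = torch.linalg.qr(torch.randn(num_nodes, num_nodes))[0]
node_identifier = q_factor[:, :d_q]  # (num_nodes, d_q), requires_grad=False

def augment(token_emb, token_type, u, v):
    """Concatenate a token embedding with its TI and the NIs of its two
    incident nodes (u == v for a node token). Output dimension is
    h + d_p + 2*d_q, matching the corrected second dimension of Z."""
    ti = type_identifier(torch.tensor(token_type))
    ni = torch.cat([node_identifier[u], node_identifier[v]])
    return torch.cat([token_emb, ti, ni])

# Example: augment one hypothetical spatial-edge token between nodes 3 and 7.
z = augment(torch.randn(h), token_type=1, u=3, v=7)
```

The augmented token `z` then has dimension h + d_p + 2*d_q, and a sequence of such tokens is what a standard transformer encoder would consume for readout.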

Q3 Technical novelty and design of GIVE (R1). Instead of simply integrating previous work, we performed innovations based on the unique characteristic of an FC trajectory, i.e., an invariant set of nodes with variant connections. We “carefully designed the embedding networks for longitudinal FC embedding” (R2). Specifically, 1) in GIVE, we innovatively define spatial and temporal domains in the FC trajectory by adding temporal edges, leading to spatio-temporal embeddings (R3). 2) In BIGTR, we generalize the previous graph transformer to a multi-timepoint version so that it can model FC trajectories. Importantly, we innovatively perceive the neighborhood encompassing each node as a “dynamic” graph, whereas G^T is regarded as a “static” graph, based on the fundamental observation of the FC trajectory, which makes the paradigms for nodes and edges different.

Q4 Baseline (R1). We added the CVPR 2020 model to our experiments. (%) HC vs. MCI: Acc 71.02 (10.30), AUC 71.74 (10.95); AD conversion: Acc 73.33 (13.33), AUC 72.50 (13.84); Amyloid: Acc 68.89 (16.31), AUC 68.67 (16.97). Ours significantly outperformed the CVPR 2020 model on all tasks.

Q5 HyperDrop (R2). We first averaged the scores across all subjects for each node/edge and then presented the top 5 nodes/edges. The top 5 nodes and edges were among the top 10 in ~83% of subjects, indicating good consistency of the salient nodes/edges shown in Fig. 2.

Q6 Incorrect formula (R1). Sorry for the typo. The second dimension of Z is h + d_p + 2d_q.

Q7 Implementation details (R1&R2). “The experimental environment and configurations are provided in the supplementary file” (R3). We will release the code upon acceptance.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors provided a good rebuttal, showing an excellent understanding of the state of the art. Two reviewers provided post-rebuttal comments, and both maintained their original scores. The AC also carefully studied the paper. Although the paper has certain merits, the AC thought the weaknesses outweighed the strengths. Therefore, a rejection is recommended. The manuscript will become a competitive one after further improvements.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Most of the major concerns (such as model capacity with limited training samples) have been well clarified.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Key strengths:

    • Interesting paper that considers spatio-temporal analysis of longitudinal fMRI data.
    • Evaluation was performed on multiple baselines and ablations

    Key weaknesses:

    • Token augmentation, which is key to handling the small dataset sizes common in fMRI studies, needs clarification
    • Limited technical novelty in that existing approaches are combined

    The rebuttal, I think, helped clarify token augmentation, which needs to be incorporated into the paper. While it is true that existing methods are utilized, I do think that some adaptation is required to handle the specific brain/fMRI analysis problem. While transformer model size can be a problem, the rebuttal shows that the model size is reasonable, as only a 2-layer encoder is used. Overall, I think this paper would be of interest, so I lean toward accept.


