Authors

Jiangcong Liu, Le Xu, Yun Guan, Hao Ma, Lixia Tian

Abstract

Effective representations of human brain function are essential for fMRI-based predictions of individual traits and classifications of neuropsychiatric disorders. Contrastive learning techniques can be favorable choices for representations of human brain function, if it were not for their requirement of large batch sizes. In this study, we proposed a novel method, namely, contrastive learning with amplitude-driven data augmentation (CL-ADDA), for effective representations of human brain function and ultimately fMRI-based individualized predictions. SimSiam, which sets no requirement on large batches, was used in this study to obtain discriminative representations among subjects to facilitate later predictions of individuals’ traits. The fMRI data in this study was augmented based on recent neuroscience findings that fMRI frames with high- and low-amplitude are of quite different functional significance. Accordingly, we generated a positive pair by concatenating the fMRI frames with high-amplitude into one augmented sample and the frames with low-amplitude into another sample. The two augmented samples were used as inputs for CL-ADDA, and individualized predictions were made in an end-to-end way. The performance of the proposed CL-ADDA was evaluated with individualized age and IQ predictions based on a public dataset (Cam-CAN). The experimental results demonstrate that the proposed CL-ADDA can substantially improve the prediction performance as compared to the existing methods.
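The augmentation described in the abstract can be illustrated with a short sketch. It assumes that frame amplitude is measured by the root sum of squares (RSS) of the co-fluctuation (edge) time series, as the reviews below suggest, and that frames are split at a fixed fraction (one half by default). The function names, the `high_fraction` parameter, and the use of Pearson correlation for each half are illustrative assumptions, not the authors' released implementation.

```python
import numpy as np

def cofluctuation_rss(ts):
    """Frame-wise RSS of co-fluctuations for a (T frames, N regions) array."""
    # z-score every regional time series over time
    z = (ts - ts.mean(axis=0)) / ts.std(axis=0)
    rows, cols = np.triu_indices(z.shape[1], k=1)
    # edge time series: frame-wise product of each region pair, shape (T, E)
    edges = z[:, rows] * z[:, cols]
    return np.sqrt((edges ** 2).sum(axis=1))

def amplitude_split_fc(ts, high_fraction=0.5):
    """Split frames into high/low amplitude sets and compute one FC matrix per set."""
    rss = cofluctuation_rss(ts)
    order = np.argsort(rss)  # ascending amplitude
    n_high = int(round(high_fraction * len(rss)))
    low_idx = np.sort(order[:len(rss) - n_high])
    high_idx = np.sort(order[len(rss) - n_high:])
    # concatenate the selected frames and correlate regions (assumed FC definition)
    fc_high = np.corrcoef(ts[high_idx].T)
    fc_low = np.corrcoef(ts[low_idx].T)
    return fc_high, fc_low, high_idx, low_idx
```

With this split every frame belongs to exactly one of the two augmented samples, so the pair jointly covers the whole scan.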

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43907-0_37

SharedIt: https://rdcu.be/dnwcO

Link to the code repository

https://github.com/tianbjtu/CL-ADDA.git

Link to the dataset(s)

N/A

Reviews

Review #2

Please describe the contribution of the paper

The authors propose to extend normative models for multi-modal neuroimaging data using a variational approach. Authors also propose a deviation score in the normative modelling context for multi-modal data. They validate their model on large-scale datasets (UKB, ADNI) using significance ratio based on their new deviation metric, showing improvement over previous VAE-based models.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

1) Readability: We wish to thank the author for the overall readability of their work. The writing and the explanations are globally very clear and concise. 2) Related works: Related works about multi-modal VAEs have been extensively described and addressed in a fully understandable way and the model contributions with respect to these works are well motivated. 3) Novelty: Novelties with respect to model training and deviation scores computation appears to be valuable as shown by the experimental section. Concerning the deviation score computation, the Mahalanobis deviation score is computed with respect to the posterior distribution parameters. It would have been interesting to compare with the prior distribution (isotropic Gaussian with a diagonal covariance matrix). 4) Future works: conclusion discusses the potential use of conditional VAEs and confound variable integration to derive conditional normative modelling. I think that these perspectives are appealing and their mention is on point.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

1) Normative modelling vs Anomaly Detection: I think that it is unusual to address Anomaly detection under the name of « Normative Modelling ». Perhaps, we would have expected the use of a confounding variable such as age, sex or site to call it Normative Modelling. As there is no confounding variable to condition with, this work seems more similar to Anomaly Detection. In that case, the deviation score computation the use of traditional measures could have been discussed such as [1]. 2) Experimental Results: In experimental results, a measure of AUC would have been more relevant as a reader would have been more likely to understand to which extent this kind of method enables to classify Disease vs Healthy thanks to an accuracy-like metric. Additionally, the authors should explicitly mention how they cross-validated their hyperparameters (learning rate, early stopping, etc.), although I would assume it is based on the deviation metric computed on UKB. [1] Outlier Exposure: DEEP ANOMALY DETECTION WITH OUTLIER EXPOSURE, Dan Hendrycks et al., ICLR 2019
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Code is released and the datasets can be accessed on demand so this work should be reproducible.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

See the weaknesses.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

6
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper is overall clear and concise. The motivations are well described and justified. The novelty is well grounded and comparisons with concurrent methods and uni-modal cases have been extensively performed which is valuable.
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #1

Please describe the contribution of the paper

The paper takes the neuroscience finding that fMRI frames with high and low amplitude are of different functional significance as guidance to augment the data and generate the positive pairs in contrastive learning.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

1) Propose a neuroscience knowledge-oriented approach to data augmentation and positive pair generation in contrastive learning. 2) Introduce SimSiam as a method to learn brain function representation based on fMRI data.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

1) If the output of Encoder F serves as the input of the predictor, the SimSiam network’s impact may be limited. This could explain why removing Contrastive Learning didn’t significantly affect the network’s accuracy in the ablation study. 2) The design of the experiments could be better.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

It is ok.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

1) Using the term ‘Individualized Prediction’ may not be appropriate for this new prediction model, as almost all prediction models provide output for each individual, rather than for a group. This may cause confusion.
2) The experimental results were obtained from only one round of 10-fold cross-validation. To increase the robustness of the findings, it would be better to repeat the experiment at least five times and report the mean and standard deviation of the results. Additionally, it may be beneficial to include some traditional machine learning models as competing methods.
3) Please provide information about the basic parameters of the fMRI data used in the experiments, such as the number of frames and TR of each fMRI scan. Given that only half of the length (High-amp/Low-amp) is used to generate the FC matrix during training, it would be helpful to know how long a scan needs to be to achieve an effective result.
4) In the RSS of the co-fluctuation matrix shown in Fig. 2, it may be better to ensure that the high-amp and low-amp windows cover the entire time length, since the sum of frames with high co-fluctuation RSS and frames with low co-fluctuation RSS should equal the entire length of the fMRI signal.
5) Some mathematical symbols in Equation (3) should be italic.
6) For training, one hyperparameter may be sufficient for Equation (6).
7) Please think about how to use the SimSiam network better in your future study.
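The repeated cross-validation requested in point 2 can be sketched as follows. This is a minimal illustration, not the authors' protocol: `repeated_kfold`, `summarize`, and the `least_squares` stand-in model are hypothetical names, and any real predictor would replace the stand-in.

```python
import numpy as np

def repeated_kfold(n_samples, n_splits=10, n_repeats=5, seed=0):
    """Yield (repeat, train_idx, test_idx) with a fresh shuffle per repeat."""
    rng = np.random.default_rng(seed)
    for rep in range(n_repeats):
        folds = np.array_split(rng.permutation(n_samples), n_splits)
        for k in range(n_splits):
            train_idx = np.concatenate([f for j, f in enumerate(folds) if j != k])
            yield rep, train_idx, folds[k]

def summarize(x, y, fit_predict, n_splits=10, n_repeats=5, seed=0):
    """Mean and standard deviation of MAE and Pearson r across repeats."""
    preds = np.zeros((n_repeats, len(y)))
    for rep, tr, te in repeated_kfold(len(y), n_splits, n_repeats, seed):
        preds[rep, te] = fit_predict(x[tr], y[tr], x[te])
    mae = np.abs(preds - y).mean(axis=1)  # one MAE per repeat
    r = np.array([np.corrcoef(p, y)[0, 1] for p in preds])
    return {"mae": (mae.mean(), mae.std()), "r": (r.mean(), r.std())}

def least_squares(x_train, y_train, x_test):
    """Stand-in model (ordinary least squares with intercept), for illustration only."""
    design = np.c_[x_train, np.ones(len(x_train))]
    coef, *_ = np.linalg.lstsq(design, y_train, rcond=None)
    return np.c_[x_test, np.ones(len(x_test))] @ coef
```

Each repeat produces one out-of-fold prediction per subject, so the reported spread reflects variation across random fold assignments rather than across folds within a single split.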
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The idea of data augmentation is innovative. The application of neuroscience knowledge-oriented data augmentation can be effective and useful in other related scenarios. Therefore, mechanism and knowledge-based data augmentation is of great importance and warrants further research.
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #3

Please describe the contribution of the paper

This paper proposes CL-ADDA, an amplitude-driven data augmentation, to implement contrastive learning for fMRI-based individualized predictions. Since the high- and low-amplitude fMRI frames carry different functional significance, they are input into SimSiam (a contrastive learning method) as a positive pair. The authors verify that CL-ADDA outperforms the classical and existing augmentations with SimSiam on both age and IQ prediction tasks.
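For readers unfamiliar with SimSiam, its objective on a positive pair is a symmetrized negative cosine similarity between one branch's predictor output and the other branch's encoder output. The sketch below shows only that loss value in NumPy; the names `p1`, `z1`, `p2`, `z2` are assumed here for the predictor and encoder outputs of the two views, and the stop-gradient that SimSiam applies to the encoder outputs is noted in a comment because NumPy does not track gradients.

```python
import numpy as np

def negative_cosine(p, z):
    """Mean negative cosine similarity between rows of p and z."""
    p = p / np.linalg.norm(p, axis=1, keepdims=True)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    return -(p * z).sum(axis=1).mean()

def simsiam_loss(p1, z1, p2, z2):
    """Symmetrized SimSiam loss for a batch of positive pairs.

    In training, z1 and z2 are treated as constants (stop-gradient);
    only the loss value is computed here.
    """
    return 0.5 * negative_cosine(p1, z2) + 0.5 * negative_cosine(p2, z1)
```

Because the loss uses only positive pairs and no negatives, it places no requirement on large batches, which is the motivation stated in the abstract.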
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. CL-ADDA is a simple data augmentation method that can be combined with SimSiam, which can effectively alleviate the problem of insufficient data in fMRI related tasks to a certain extent.
2. The framework is clear and this paper is well-organized.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
1. Limited novelty: data augmentation driven by amplitude is a common approach in the medical field. For example, [1] achieved data augmentation by changing the amplitude in the Fourier Transform. Theoretically, CL-ADDA can be regarded as a special case of [1].
2. Limited discussion of scalability for contrastive learning: the authors want to verify that the proposed amplitude-driven data augmentation can be used in contrastive learning. However, they only verify it with the SimSiam framework. Although the authors emphasize that the SimSiam framework is used to alleviate the need for large batches, there are currently some papers [2,3] that use a batch size of 96 or 128 to train large-batch-required contrastive learning methods on fMRI-related tasks. Therefore, the scalability of amplitude-driven data augmentation is not clear.
3. Applicability might be limited: If the data augmentation method proposed in this paper can only be used on the SimSiam framework, then its applicability is limited.
[1] Anaya-Isaza, Andrés, and Martha Zequera-Diaz. “Fourier transform-based data augmentation in deep learning for diabetic foot thermograph classification.” Biocybernetics and Biomedical Engineering 42.2 (2022): 437-452.
[2] Wang, Xuesong, et al. “Contrastive Functional Connectivity Graph Learning for Population-based fMRI Classification.” Medical Image Computing and Computer Assisted Intervention–MICCAI 2022: 25th International Conference, Singapore, September 18–22, 2022, Proceedings, Part I. Cham: Springer Nature Switzerland, 2022.
[3] Liu, Yulong, et al. “BrainCLIP: Bridging Brain and Visual-Linguistic Representation via CLIP for Generic Natural Visual Stimulus Decoding from fMRI.” arXiv preprint arXiv:2302.12971 (2023).
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Good, the parameters involved in the paper and the details of the method are given.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
1. Extension to other contrastive learning methods: this can help to verify the scalability of the proposed method.
2. It is necessary to supplement the hyperparameter analysis: what effect do different values of alpha in Eq. (2) have on the results? Is it right to give high-amplitude FC maps a greater weight?
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

According to the novelty and suitability of the experiment to the purpose of the method, I recommend a weak accept for this paper.
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

All the reviewers agree that the proposed method is a neuroscience knowledge-oriented approach for data augmentation. The manuscript is well presented and clear. All the reviewers gave high scores regarding the novelty and effectiveness.

Author Feedback

We thank the reviewers and the meta-reviewer for their positive assessment of our work and helpful suggestions for improvement. The following are our responses to the major critiques and the changes to be made in the manuscript.

The SimSiam network’s impact (R1). We did use the output of Encoder F as the input of the predictor, but this does not mean that the SimSiam network has a limited impact. Though an encoder F can be trained based on a single branch of SimSiam, the performance of such an encoder would not be as good as that based on the SimSiam structure. In fact, when we replaced SimSiam with a single branch structure, the MAEs for age and IQ predictions increased by 5.51% (from 6.992 to 7.377 years) and 11.28% (from 4.531 to 5.042), respectively.
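The percentage increases quoted above follow directly from the reported MAEs. A quick check, using only the four numbers given in the response (the helper name `relative_increase` is illustrative):

```python
def relative_increase(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0

# MAEs with SimSiam (before) and with a single-branch structure (after)
age_increase = relative_increase(6.992, 7.377)
iq_increase = relative_increase(4.531, 5.042)
print(f"age: {age_increase:.2f}%  IQ: {iq_increase:.2f}%")
```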

The design of the experiments (R1, R3). We accept that our experiments were insufficient. According to the comments, we further performed two sets of experiments: 1) we performed 5 rounds of 10-fold cross-validation, and observed only subtle changes in the accuracies for both age (r = 0.888 ± 0.012, MAE = 6.939 ± 0.198 years) and IQ predictions (r = 0.614 ± 0.017, MAE = 4.573 ± 0.397). As changes in the main results are not permitted, we will not change the main results in the paper; 2) we extended the approach to other contrastive learning methods (MoCo [1] and SimCLR [2]) to verify the scalability of the proposed amplitude-driven data augmentation strategy. Our analyses demonstrate that both MoCo and SimCLR perform worse than SimSiam, and we will add the results to the Supplementary Materials (Table S1).

The novelty of amplitude-driven data augmentation (ADDA) (R3). The “Fourier transform-based data augmentation” proposed by Anaya-Isaza et al. (2022) was designed for augmentation of 2D thermal images, while our ADDA was specially designed for fMRI time series. Data augmentation methods for 2D images are not directly applicable to time series. One may argue that a 1D Fourier transform can be applied to fMRI time series, and augmented samples can then be obtained by changing the amplitude in the Fourier Transform. However, such a strategy will largely destroy the inter-relationships among multiple time series. Therefore, the proposed ADDA can NOT be regarded as a special case of Fourier transform-based data augmentation.

The scalability and applicability of amplitude-driven data augmentation (ADDA) (R3) We performed additional studies by replacing SimSiam with MoCo [1] and SimCLR [2] to check the scalability of ADDA. The results will be added to the Supplementary Materials (Table S1). Its combination with SimSiam is only an application case of ADDA. In other words, ADDA can be used independently, e.g., in combination with graph convolutional network (GCN), BrainNetCNN, etc.

The influence of the hyper-parameter alpha (R3) We realized that the hyper-parameter alpha is unnecessary on seeing this comment. A better solution is to concatenate the outputs of the two encoders and make age/IQ predictions with a single predictor based on the concatenated features.
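The alternative described here, concatenating the two encoder outputs and feeding a single predictor, can be sketched as below. The linear predictor, the feature dimensions, and the names `h_high`, `h_low`, `weights`, and `bias` are assumptions made for illustration; the response does not specify the predictor architecture.

```python
import numpy as np

def predict_from_pair(h_high, h_low, weights, bias):
    """Predict one value per subject from concatenated branch features.

    h_high, h_low: (n_subjects, d) encoder outputs for the two augmented samples.
    weights: (2 * d,) coefficients of a hypothetical linear predictor.
    """
    features = np.concatenate([h_high, h_low], axis=1)  # (n_subjects, 2 * d)
    return features @ weights + bias
```

With this design the relative contribution of the high- and low-amplitude features is learned through the predictor weights, so no separate weighting hyper-parameter is needed.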

Other Problems (R1) 1) we will redraw Fig. 2 to ensure that the high-amp and low-amp windows cover the entire time length; 2) we will improve our Equations (3) and (6); 3) we will try to improve our expressions in the paper.

[1] He, K., Fan, H., Wu, Y., Xie, S., Girshick, R.: Momentum contrast for unsupervised visual representation learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 9729–9738 (2020).
[2] Chen, T., Kornblith, S., Norouzi, M., Hinton, G.: A simple framework for contrastive learning of visual representations. In: International Conference on Machine Learning. pp. 1597–1607. PMLR (2020).

CL-ADDA: Contrastive Learning with Amplitude-Driven Data Augmentation for fMRI-Based Individualized Predictions