
Authors

Ana Lawry Aguila, James Chapman, Mohammed Janahi, Andre Altmann

Abstract

Understanding pathological mechanisms for heterogeneous brain disorders is a difficult challenge. Normative modelling provides a statistical description of the 'normal' range that can be used at the subject level to detect deviations, which relate to disease presence, disease severity or disease subtype. Here we trained a conditional Variational Autoencoder (cVAE) on structural MRI data from healthy controls to create a normative model conditioned on confounding variables such as age. The cVAE allows us to use deep learning to identify complex relationships that are independent of these confounds, which might otherwise inflate pathological effects. We propose a latent deviation metric and use it to quantify deviations in individual subjects with neurological disorders and, in an independent Alzheimer’s disease dataset, subjects with varying degrees of pathological ageing. Our model is able to identify these disease cohorts as deviations from the normal brain in a way that reflects disease severity. Code and trained models are publicly available at https://anonymous.4open.science/r/normativecVAE-395C.

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16431-6_41

SharedIt: https://rdcu.be/cVD6X

Link to the code repository

https://github.com/alawryaguila/normativecVAE

Link to the dataset(s)

https://adni.loni.usc.edu/data-samples/access-data/

https://www.ukbiobank.ac.uk/enable-your-research/apply-for-access


Reviews

Review #1

  • Please describe the contribution of the paper

    The authors aim to detect outliers (neurodegenerative diseases) by learning a distribution of atrophy patterns from healthy MRI.

    Healthy brain MRIs from UK Biobank were processed using FreeSurfer, resulting in gray matter volumes for 16 subcortical nuclei and 66 neocortical areas (so presumably 82 features per scan). A variational autoencoder was used with a latent space of dimension 10. Additional variables are added as conditions (input to the encoder and concatenated with the latent space).

    The proposed criterion for detecting outliers is a z-score over the latent space.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Simple idea with reasonable validation.

    Interesting finding in Figure 4 that the pattern detected seems consistent with known neurodegenerative patterns associated with AD.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    It is unclear what the VAE approach adds beyond standard outlier detection on the underlying data; the only comparison provided, in Figure 3, seems to be with a mean squared error on the original data.

    It is also unclear how the hyperparameters were chosen and influenced the results.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The method is not clear enough to be reproduced; the network architecture is not provided.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    More investigation of the network design could be informative (dimension of the latent space, network complexity, etc.), as well as including a state-of-the-art statistical analysis on the raw data.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Clear and simple paper with promising results.

  • Number of papers in your stack

    4

  • What is the ranking of this paper in your review stack?

    4

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered



Review #2

  • Please describe the contribution of the paper

    The paper proposes a normative modeling framework with confounder removal for Alzheimer’s disease datasets.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper tackles an important problem of normative modeling + confounder removal.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    There are major novelty concerns:

    1. The confounder removal method (cVAE) is largely outdated. There exist many other options (e.g., domain adversarial networks, conditional normalizing flows, etc.)
    2. The proposed deviation metrics are identical to the normative probability map (NPM) by Marquand et al. [16]. In fact, [16] suggests extreme value statistics to perform statistical tests on the NPM (which are essentially z-scores) for subject-wise statistics.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Reasonably reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    The paper tackles an important problem which, if successful, would have a broad impact. However, there are critical concerns about the novelty of the contributions. I would appreciate it if the authors could comment on those.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    2

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    For the above reasons, the paper offers little technical or analytical contribution.

  • Number of papers in your stack

    4

  • What is the ranking of this paper in your review stack?

    4

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered



Review #3

  • Please describe the contribution of the paper

    The authors propose a latent deviation metric and use it to quantify deviations in individual subjects with neurological disorders. They claim the model is able to identify these disease cohorts as deviations from the normal brain in a way that reflects disease severity.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    A variational autoencoder is a better model than a plain autoencoder for parameterizing the data, and I am happy the authors used it. However, I am a bit unsure of the paper's main contribution. Why do the authors begin discussing deviation metrics on page 4 without linking them to Eq. 2? Maybe I am missing something, but it would be great to explain Pinaya's method.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Why is the GitHub code link in the abstract? Fig. 4 needs some work. In Table 1, the red text should be black; different symbols could be used to emphasize the proposed model.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The anonymous Git repository is a good idea, but the link should be in the main text.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    The method section needs rewriting. The main contribution seems very minor without a proper link to the baseline.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Overall, the paper is a good one, but in the rebuttal I want to see a clearer methods section. If the paper presents a clear VAE model, it also needs the relevant theory from the literature.

  • Number of papers in your stack

    4

  • What is the ranking of this paper in your review stack?

    3

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    All reviewers acknowledged the importance of this work, given the problem it is tackling. However, R2 raised some concerns about the novelty of the work. During the rebuttal phase, the authors should address the questions raised by all reviewers, with particular focus on those pointed out by R2.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    9




Author Feedback

We would like to thank the reviewers and meta-reviewer for their insightful comments and suggestions.

Novelty of work (R1, R2, R3). Existing normative modelling work operates in the feature space (i.e., individual brain regions/voxels). There exists no subject-level quantitative measure on the latent space of autoencoder models. Making use of the parameters of a probabilistic autoencoder (VAE), we derive a robust latent deviation metric which incorporates variability in healthy subjects as well as subject level uncertainty. By forming our metric as a z-score, it is easy to interpret and admits standard statistical analysis, similarly to [16]. This new latent metric improved detection of deviations over metrics calculated in the feature space, i.e., our baseline reference (Fig 3). We also provide a method for mapping deviations in each latent vector to the feature space (Fig 4).

Comparison with NPM in [16] (R1, R2). We respectfully disagree with R2’s comment and highlight the following differences between our latent deviation metric and the NPM in [16]. Firstly, NPMs are calculated for each individual brain region and then combined via extreme value statistics. We instead calculate deviations in the latent space, thus incorporating the interactions between brain regions into our metric. Secondly, [16] train a Gaussian process (GP) for each brain region, whereas we train a single VAE for all regions; GPs are also computationally costly compared to VAEs. Thirdly, [16] use the covariates as input to the GP to predict the brain regions, whereas we remove the effect of covariates from the analysis. The key similarity between our work and [16] is that both use subject-level estimates of mean and uncertainty provided by the learned model (VAE and GPR, respectively) to compute subject-level z-scores, which are common in normative modelling.

In response to R1, we trained GPs on the raw data using the PCNToolkit. Due to time and computational restrictions, we limited the training data to 1000 subjects. We generated p-values using the same approach as for Fig 3: UK Biobank: HC vs MS p=3.19E-3, HC vs BP p=3.61E-3. ADNI: HC vs EMCI p=0.0245, HC vs LMCI p=4.22E-5, HC vs AD p=1.67E-7. These p-values are broadly less significant than those calculated with our deep learning model, and training incurred a significantly higher computational cost.

Choice of method (R2). For normative modelling, autoencoders are the deep learning method of choice. We extend the autoencoder approach to a conditional VAE to handle confounds within the modelling framework. The methods suggested by R2 are interesting and could be explored in future work. However, normalizing flows have been shown to struggle with out-of-distribution data (arXiv:2006.08545), which could be problematic given that our aim is outlier detection. Likewise, adversarial learning would require an additional network and loss term per confound, which could prove more difficult to optimise.

Model parameter screening (R1). In response to R1, we screened different latent space dimensions (5,10,15 and 20) and encoding/decoding layer sizes (20, 40, 60, 80). We monitored the separation of CN vs AD. Our chosen parameters (listed in Fig 1) gave average performance. Whilst there is room for improvement, changing parameters could lead to worse performance on other disease cohorts. It is unclear how to best optimise for all cohorts simultaneously.

Clarification on methods (R1, R3). For subject k, [21] assume variation from the healthy population is encoded in X̂_k, the decoder reconstruction, and use the difference between the original data X_k and X̂_k to derive a deviation metric in the feature space, DMSE. Instead, we use the parameters of the VAE's encoding distribution, μ_k and σ_k. We use the training samples to calculate the expected values (i.e., the population norm): we approximate point estimates as a Gaussian such that μ̄ = E(μ_train) and σ̄ = E(σ_train) in Eqn 4. We will fix the confusing change of variable name between Eqn 3 and Eqn 4.
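The latent deviation described above can be sketched as follows. This is a minimal illustration, not the authors' released code: the exact combination of population variability and subject-level uncertainty in the denominator is an assumption based on the rebuttal's description, and all function and variable names are hypothetical.

```python
import numpy as np

def latent_deviation(mu_train, mu_k, var_k):
    """Z-score of subject k's latent encoding against the healthy population.

    mu_train : (N, D) encoder means for the healthy training cohort
    mu_k     : (D,)   encoder mean for subject k
    var_k    : (D,)   encoder variance for subject k (subject-level uncertainty)
    """
    mu_bar = mu_train.mean(axis=0)   # population norm, mu_bar = E(mu_train)
    pop_var = mu_train.var(axis=0)   # variability across healthy subjects
    # Per-dimension z-score combining population variability
    # with the subject's own encoding uncertainty.
    return (mu_k - mu_bar) / np.sqrt(pop_var + var_k)
```

Because the result is a z-score per latent dimension, standard statistical testing (e.g., thresholding or extreme value statistics, as in [16]) can be applied to flag deviating subjects.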




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors aimed to address the major comments on the paper. However, in my opinion, the main concerns raised by R2 regarding alternative methods have not been fully addressed. In addition, the reviewers decided to keep their ratings.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Reject

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    18



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors seem to have convinced the reviewers; I vote for accept.
  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    na



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The rebuttal clearly addressed the major concern about the paper's novelty with additional experiments that should improve the exposition of the work. The authors are encouraged to improve the writing to address the clarification concerns, and to add discussion and insights on how the work differs from [16] and other relevant works.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    5


