Authors
Duo Xi, Dingnan Cui, Jin Zhang, Muheng Shang, Minjianan Zhang, Lei Guo, Junwei Han, Lei Du, Alzheimer’s Disease Neuroimaging Initiative
Abstract
Brain imaging genetics is a rapidly growing neuroscience area that integrates genetic variations and brain imaging phenotypes to investigate the genetic underpinnings of brain disorders. In this field, using multi-modal imaging data can leverage complementary information and thus stands a chance of identifying comprehensive genetic risk factors. Due to privacy and copyright issues, many imaging and genetic data are unavailable, and thus existing imaging genetic methods cannot work. In this paper, we propose a novel multi-modal brain imaging genetic learning method that can study the associations between imaging phenotypes and genetic variations using genome-wide association study (GWAS) summary statistics. Our method leverages the power of multi-modal brain imaging phenotypes and GWAS. More importantly, it does not need to access the imaging and genetic data of each individual. Experimental results on both the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database and GWAS summary statistics suggested that our method has the same learning ability, including identifying associations between genetic biomarkers and imaging phenotypes and selecting relevant biomarkers, as counterparts that depend on individual-level data. Therefore, our learning method provides a novel methodology for brain imaging genetics without individual data.
Link to paper
DOI: https://doi.org/10.1007/978-3-031-43904-9_60
SharedIt: https://rdcu.be/dnwH6
Link to the code repository
N/A
Link to the dataset(s)
N/A
Reviews
Review #1
- Please describe the contribution of the paper
In this work the authors propose a method that can address imaging genetics questions using GWAS summary statistics and does not require individual-level data. The base method is a version of multi-task CCA. They compare their proposed method on ADNI to the classic version, which requires individual-level data, and find very similar results. Furthermore, they apply it to available summary statistics for UK Biobank.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- innovative method that leverages data that is available in abundance (GWAS summary statistics)
- good comparison to the classic version of DMTSCCA providing the ‘gold standard’
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- The method has some tuning parameters that cannot be tuned in the absence of individual-level data. It would be good to show how sensitive the results are to changes in the tuning parameters.
- The method, like many CCA based approaches, seems to be limited in the number of genetic markers that can be modeled due to the size of the resulting matrices and the computation cost of the optimization. The used size of 5000 genetic markers is quite low and will not allow for many meaningful analyses in the field. However, the approach is an interesting step in the direction.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
It would be good to be more precise on the genetic range that is being examined, e.g. chr19:start-end for the ADNI analysis and the specific locations for the UKB analyses.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
- The approach replaces the SNP data covariance matrix with estimates from publicly available SNP data. This approach is quite elegant and has been previously used (among others) when computing polygenic risk scores [https://doi.org/10.1002/gepi.22050]. The phenotype covariance matrix is based on the correlation of SNP effect sizes. Here the authors may want to consider more advanced methods that take the genetic LD structure into account (such as LD score regression, which under total sample overlap of the GWAS reduces to phenotype correlations). A better estimate of the phenotypic correlation might bring the results even closer to the DMTSCCA values.
- It would be good if the paper stated information on the computational burden. Will it be possible to scale this approach to genome-wide analyses? If not, what is a realistic upper limit on the number of SNPs that can be modeled?
- The figures are really small and hard to read, please (if at all possible within the page limit) make them more readable
- Can the authors comment on/clarify why W and S (e.g., in Figure 1) are quite similar? The S content should be similar by construction, but W should be individual for each imaging modality.
- Can the authors comment on/clarify how the genetic regions for the UKB analyses were selected?
- Also, for the UKB analysis, the two FS-based features are typically quite similar; it would be more interesting to use summary statistics based on, e.g., DTI.
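The covariance ingredients discussed in these comments can be illustrated with a minimal sketch. It assumes a reference genotype panel is available for the LD matrix and that the SNP effect sizes are arranged as a SNPs-by-phenotypes matrix on a standardized scale. All function and variable names here are hypothetical, and this is not the authors' implementation of S-DMTSCCA; it only shows how a CCA-style correlation could be evaluated from summary-level quantities.

```python
import numpy as np

def ld_matrix(ref_genotypes):
    """SNP-by-SNP correlation (LD) matrix from a reference panel (n_ref x p).

    Assumption: no SNP is monomorphic in the panel (nonzero variance).
    """
    g = ref_genotypes - ref_genotypes.mean(axis=0)
    g = g / g.std(axis=0)
    return g.T @ g / g.shape[0]

def phenotype_correlation(effects):
    """Phenotype-by-phenotype correlation from a p x q matrix of SNP effects.

    Each column holds the effect sizes of all SNPs on one phenotype; the
    correlation between columns is used as a stand-in for the phenotypic
    correlation, as described in the review comment.
    """
    return np.corrcoef(effects, rowvar=False)

def canonical_correlation(u, v, sigma_xx, sigma_xy, sigma_yy):
    """CCA objective u' Sxy v / sqrt(u' Sxx u * v' Syy v) for given weights."""
    numerator = u @ sigma_xy @ v
    denominator = np.sqrt((u @ sigma_xx @ u) * (v @ sigma_yy @ v))
    return numerator / denominator
```

Here the standardized effect-size matrix plays the role of the SNP-phenotype cross-covariance. An LD-aware estimator such as LD score regression could replace `phenotype_correlation` without changing the other two functions.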
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
6
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
An innovative method that leverages data that is available in abundance (GWAS summary statistics). Although there are some drawbacks, which are shared with many CCA-based approaches, the proposed method is an interesting step in the direction of leveraging summary statistics.
- Reviewer confidence
Very confident
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
6
- [Post rebuttal] Please justify your decision
I don’t think the rebuttal really addressed the comments raised by the reviewers and therefore I don’t increase my score. It is still an interesting approach.
Review #3
- Please describe the contribution of the paper
While the proposed method, S-DMTSCCA, for identifying disease-sensitive brain imaging phenotypes and genetic factors associated with brain disorders using GWAS summary statistics is a promising contribution to the field, the lack of thorough comparison with state-of-the-art methods, limited experimental evaluations, and insufficient details in the figures and tables are significant weaknesses. Additionally, the method only focuses on a limited number of SNPs and a single chromosome, which may not be representative of the entire genome. Overall, while the proposed method shows potential, these limitations prevent it from being a significant advancement in the field.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The paper proposes a novel method, S-DMTSCCA, for identifying disease-sensitive brain imaging phenotypes and genetic factors associated with brain disorders using GWAS summary statistics, which has the potential to be more applicable to large-scale studies. However, the strength of this contribution is uncertain due to the lack of thorough experimental evaluations and comparison with state-of-the-art methods in the field.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- The study’s focus on a single chromosome and a limited number of SNPs could limit the generalizability of the findings to the rest of the genome. Future studies should consider analyzing more comprehensive genetic data to fully understand the genetic basis of brain disorders.
- The paper lacks a comparison of the proposed method with state-of-the-art methods. This omission makes it challenging to assess the novelty and effectiveness of the proposed method in comparison to other available approaches.
- The paper does not provide sufficient details on the DMTSCCA method used for analysis, making it challenging for readers to fully understand the approach and its potential limitations. Future studies should provide more comprehensive explanations of the methodology to facilitate replication and validation of the results.
- The experimental results lack sufficient details in the figures, making it difficult for readers to fully understand the results. Future studies should provide more comprehensive explanations of the figures to facilitate the interpretation and replication of the findings.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
It is unclear how reproducible the experiments in the paper are, as the authors did not provide detailed information on the code or data used. This lack of reproducibility makes it difficult for other researchers to validate or build upon the findings.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
The paper presents promising results that demonstrate the effectiveness of S-DMTSCCA in identifying important SNPs and their related imaging QTs simultaneously using multiple modalities GWAS summary statistics. However, it is worth noting that the experiments conducted in this study may not be enough to fully evaluate the performance of S-DMTSCCA. For example, the paper did not compare its proposed method with state-of-the-art methods in the field of brain imaging genetics, which could provide a more comprehensive evaluation of its performance. Additionally, the paper only analyzed a limited number of SNPs and focused on one chromosome, which may not be representative of the entire genome. Therefore, future studies should consider analyzing a larger number of SNPs across multiple chromosomes to obtain a more comprehensive understanding of the genetic basis of brain disorders and to further evaluate the performance of S-DMTSCCA.
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
4
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The paper lacks a thorough comparison of the proposed method with existing state-of-the-art methods in the field of brain imaging genetics.
The experiments conducted in the study may not be sufficient to fully evaluate the performance of the proposed method, such as the limited number of SNPs and chromosomes analyzed.
The paper does not provide sufficient details on the methodology used for analysis, making it challenging for readers to fully understand the approach and its potential limitations.
The experimental results lack sufficient details in the figures, making it difficult for readers to fully understand the results.
The paper lacks a discussion on the limitations of the proposed method and potential areas for future research.
- Reviewer confidence
Confident but not absolutely certain
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
N/A
- [Post rebuttal] Please justify your decision
N/A
Review #4
- Please describe the contribution of the paper
The paper describes a method to perform a sort of canonical correlation between multiple quantitative traits (in this case brain related measures from PET and MRI) and multiple biological measurements (SNPs). The method describes how this can be achieved without individual level data, i.e. only from summary level data that is often publicly available.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The paper approaches an interesting problem: how to jointly investigate SNPs and multiple phenotypes in a unified framework using only summary-level data. This makes the research and analysis accessible to a wider research community.
The paper is well structured and generally clear.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
The major weakness is lack of novelty. The researchers are not solving a new problem, rather showing how it can be approached in a more constrained setting. This constrained setting is however quite important.
There are certain hyperparameters that the authors set based on investigation of individual-level data. The authors do not clearly describe how these parameters could be found when no individual-level data are available, or how sensitive the results are to these parameters.
It is still not clear to me how this method can improve over a standard GWAS; the authors do not argue this point well enough. They do not even mention linkage disequilibrium. They mention SigmaXX, which is probably the LD data, but it is not clear enough.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
In my opinion, the authors do not take the reproducibility report seriously. They mark yes to everything, including, e.g., that they discuss the memory footprint of the method. This is not done in any obvious manner in the manuscript.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
Here are some comments, questions, and speculations:
1) How is linkage disequilibrium taken into account? Is it the SigmaXX matrix? How sensitive is the method to different LD data being used?
2) There is a lot of wasted white space on the sides of both figures.
3) The locus plots in panel a of figure 1 have no x-labels.
4) I don’t understand panel b) in Fig 1. same for Fig 2.
5) What are the red dots in panel c of Fig 1? Multiple dots outside the brains…
6) In section 2.1. you talk about bold V, yet I cannot find it in the equations.
7) In section 2.2. you say that in a GWAS people make the SNPs have unit variance. This is actually not always done. Very often this is not done, and then you would need frequency information to further scale this…
8) You need to state more clearly the benefit of doing this kind of joint analysis. Specifically when summarising the results in the conclusion.
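Comment 7 can be made concrete. When a GWAS reports per-allele effects on unstandardized genotypes, a common rescaling to the unit-variance genotype scale multiplies each effect by the genotype standard deviation, which is sqrt(2f(1-f)) for allele frequency f under Hardy-Weinberg equilibrium. The following is a minimal sketch under that assumption; the function and argument names are hypothetical and are not taken from the paper.

```python
import numpy as np

def standardize_effects(beta_per_allele, allele_freq):
    """Rescale per-allele GWAS effects to the unit-variance genotype scale.

    Assumes Hardy-Weinberg equilibrium, so a SNP coded 0/1/2 with effect
    allele frequency f has standard deviation sqrt(2 * f * (1 - f)).
    """
    beta = np.asarray(beta_per_allele, dtype=float)
    freq = np.asarray(allele_freq, dtype=float)
    genotype_sd = np.sqrt(2.0 * freq * (1.0 - freq))
    return beta * genotype_sd
```

If the summary statistics do not include allele frequencies, they would have to be taken from a reference panel, which is a further assumption about ancestry match.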
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
6
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The paper approaches an interesting problem and describes the approach well. I am not entirely convinced of how useful this is.
- Reviewer confidence
Confident but not absolutely certain
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
N/A
- [Post rebuttal] Please justify your decision
N/A
Primary Meta-Review
- Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.
Summary
- The authors proposed a CCA-based approach that can work with summary statistics
Strength
- A method that works with summary statistics is quite valuable since many large-scale studies only release summary stats
Weakness
- Two reviewers point out that focusing on only 5K SNPs is by no means sufficient. Since the effect size is very small, a more accurate model is probably closer to a polygenic model where the effect is distributed among a great many SNPs. Hence, a method should be able to handle a large number of SNPs.
- This is more of a suggestion than a weakness: “[…] genetic LD structure (such as LD score regression - which in total sample overlap of the GWAS reduces to phenotype correlations), a better estimate on the phenotypic might bring the results even closer to the DMTSCCA values”
Author Feedback
Thanks for the rebuttal invitation. We itemize our responses to major points as follows:
1) The number of SNPs (R1+R2): We agree that 5,000 is not a very large number of SNPs, and have tested our method on the 19th chromosome with 150,000 SNPs. We find that our method identifies sets of risk SNPs similar to those obtained from 5,000 SNPs. That is, for a methodology paper, using either 5,000 SNPs or 150,000 SNPs can validate the effectiveness. In imaging genetics, many methodology papers use a small set of SNPs to validate their methods. In addition, using very large SNP data will increase the computational time for both the comparison methods and our method, so substantial additional work would be needed to reduce the computational time, which could be another topic for imaging genetics. Finally, the major contribution of our study is to develop a GWAS summary statistics-based imaging genetics method that does not use individual-level imaging and genetic data, rather than to develop an effective computational imaging genetics method.
2) Figures (R1+R2+R3): The red dots in the figures denote the top 10 imaging QTs identified by DMTSCCA and S-DMTSCCA (R2+R3). In Fig. 1.c, red dots indicate the identified important brain regions. Where the surface file of the visualization software did not contain every brain region, some imaging QTs fell outside the brain; we will fix this. In Fig. 1.a, the x-labels were the names of SNPs, and we only marked the significant loci due to the page limitation (R3). Fig. 1.b and Fig. 2.b were heat maps showing the important imaging QTs. We will rearrange Fig. 1.c, Fig. 1.b, and Fig. 2 as suggested.
3) Tuning parameters (R1+R3): First, on the ADNI data, the parameters of the comparison methods were tuned conventionally using a grid search. For a fair comparison, we also used ADNI data to tune the parameters for our method. Second, to evaluate the performance on UK Biobank GWAS summary statistics, we used a two-step method to tune the parameters for our method. We first tuned the parameters using datasets at hand whose individual-level data were accessible, and then we applied our method to GWAS summary statistics. The comparison methods cannot handle GWAS summary statistics, so we did not tune parameters for them in the experiments using GWAS summary statistics. The details of the parameter tuning were included in the second paragraph of Section 3.
4) Task-relevant vs task-specific (R1): Some of the most significant SNPs with strong correlations to AD, e.g., rs429358, were contained in both the task-relevant and task-specific markers. However, the effect sizes of the task-relevant S and task-specific W for these SNPs were different. Importantly, when we pay attention to the top 10 identified SNPs, besides the most significant SNPs, W and S found different task-relevant and task-specific SNPs: e.g., rs10414043 was identified only by S, rs73052335 was identified only by v1 and v2, corresponding to different modalities, and rs5117 was identified only by v3, corresponding to the other imaging modality. This demonstrated the value of identifying task-relevant S and task-specific W.
5) Details of DMTSCCA (R2): DMTSCCA was not the major contribution of this study. The details of the model and the optimization can be found in the reference where DMTSCCA was proposed.
6) Comparison with SOTA methods (R2): DMTSCCA was one of the SOTA imaging genetics methods, and it can reduce to many SCCA-based methods. Thus, we used it as the comparison method. In addition, several SOTA methods, including DMAAN (2023) and DS-SCCA (2022), were deep learning methods that capture the nonlinear relationship between imaging QTs and SNPs. They could not be re-developed to handle GWAS summary statistics and were therefore not suitable for comparison. Finally, some other methods, e.g., PGS or PRS, mainly focus on predicting an individual’s risk rather than identifying risk markers, and were therefore also inappropriate for comparison.
Post-rebuttal Meta-Reviews
Meta-review # 1 (Primary)
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
The major points are addressed.
Meta-review #2
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
The paper describes a method to perform a sort of canonical correlation between multiple quantitative traits and multiple biological measurements (SNPs).
key strengths:
- using only summary level data for CCA analysis
- well motivated and clearly written
key weaknesses:
- work on a small set of samples
- insufficient comparison.
The rebuttal gives a good explanation of why a small set of SNPs were used. The explanation for not comparing with some SOTA methods is not convincing.
Meta-review #3
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
The overall consensus among reviewers is an accept although concerns raised by R1 have not been adequately addressed in the rebuttal. I recommend acceptance but suggest that the authors revise their manuscript to address R1’s questions.