
Authors

Julio E. Villalón-Reina, Clara A. Moreau, Talia M. Nir, Neda Jahanshad, Simons Variation in Individuals Project Consortium, Anne Maillard, David Romascano, Bogdan Draganski, Sarah Lippé, Carrie E. Bearden, Seyed Mostafa Kia, Andre F. Marquand, Sebastien Jacquemont, Paul M. Thompson

Abstract

Multi-site imaging studies can increase statistical power and improve the reproducibility and generalizability of findings, yet data often need to be harmonized. One alternative to data harmonization in the normative modeling setting is Hierarchical Bayesian Regression (HBR), which overcomes some of the weaknesses of data harmonization. Here, we test the utility of three model types, i.e., linear, polynomial and b-spline - within the normative modeling HBR framework - for multi-site normative modeling of diffusion tensor imaging (DTI) metrics of the brain’s white matter microstructure, across the lifespan. These models of age dependencies were fitted to cross-sectional data from over 1,300 healthy subjects (age range: 2-80 years), scanned at eight sites in diverse geographic locations. We found that the polynomial and b-spline fits were better suited for modeling relationships of DTI metrics to age, compared to the linear fit. To illustrate the method, we also apply it to detect microstructural brain differences in carriers of rare genetic copy number variants, noting how model complexity can impact findings.
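The three age-trend parameterizations compared in the paper (linear, polynomial, b-spline) can be illustrated with a minimal sketch. This is not the authors' PCNtoolkit-based HBR pipeline; it is a simple least-squares toy on synthetic, FA-like data, assuming hypothetical knot positions and polynomial degree, only to show why nonlinear bases can track lifespan curves better than a line.

```python
# Illustrative sketch (NOT the authors' PCNtoolkit/HBR pipeline): fitting a
# synthetic DTI-like metric against age with the three basis choices compared
# in the paper. Knot positions and degrees here are arbitrary assumptions.
import numpy as np
from scipy.interpolate import splrep, splev

rng = np.random.default_rng(0)
age = np.sort(rng.uniform(2, 80, 300))
# Synthetic fractional-anisotropy-like curve: rises in development, declines in aging.
fa = 0.45 + 0.10 * np.exp(-((age - 30.0) / 25.0) ** 2) + rng.normal(0, 0.01, age.size)

# 1) Linear fit
lin = np.polyval(np.polyfit(age, fa, 1), age)
# 2) Cubic polynomial fit
poly = np.polyval(np.polyfit(age, fa, 3), age)
# 3) Least-squares cubic b-spline with a few interior knots (splrep switches to
#    least-squares mode automatically when explicit knots `t` are supplied)
tck = splrep(age, fa, k=3, t=[20.0, 40.0, 60.0])
spl = splev(age, tck)

def rmse(pred):
    """Root-mean-square error of a fitted curve against the observed data."""
    return float(np.sqrt(np.mean((fa - pred) ** 2)))

print({"linear": rmse(lin), "poly3": rmse(poly), "bspline": rmse(spl)})
```

Because the cubic-polynomial and cubic-spline function spaces both contain all straight lines, their least-squares residuals can only match or undercut the linear fit's, mirroring the paper's finding that the nonlinear parameterizations were better suited to lifespan DTI trajectories.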

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16431-6_20

SharedIt: https://rdcu.be/cVD42

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #2

  • Please describe the contribution of the paper

    This paper evaluated, under the normative modeling Hierarchical Bayesian Regression framework, three model fitting strategies for the age effects on brain white matter microstructure across the lifespan. Using multi-site diffusion tensor imaging data, the authors found that compared to the linear fit, the polynomial and b-spline fits were better for modeling age trajectory. The authors further found that compared to the linear model, the b-spline model resulted in fewer ROIs with significant effect of a rare neurogenetic syndrome on microstructural brain differences, suggesting modeling complexity can impact statistical findings and therefore must be determined with caution.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • A novel application of the normative modeling Hierarchical Bayesian Regression framework in modeling age effects on white matter diffusion tensor imaging metrics.
    • Strong evaluation of different model fitting strategies under the HBR framework, illustrating their impact on 1) modeling age effects in multi-site brain DTI data, and 2) subsequent hypothesis testing based on the fitted outcomes.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Normative modeling (https://dx.doi.org/10.1016%2Fj.biopsych.2015.12.023) and the Hierarchical Bayesian Regression framework are not particularly novel.
    • Some aspects of the clarity and organization need to be improved significantly. A general introduction to the Hierarchical Bayesian Regression framework should be provided first, followed by the mathematical description in Methods. Text introducing 16pDel still appears in the Methods section and should be moved to the Introduction. All the ROIs from the JHU atlas were given as abbreviations without being fully spelled out at their first occurrence. Lastly, at the end of Results, it was stated that “For additional data, please see the Supplements,” but no Supplementary materials were found.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Sufficient details have been provided regarding the datasets. For code, it’s stated that PCNtoolkit (https://github.com/amarquand/PCNtoolkit) was used which is a publicly available package, but it is unclear whether the specific codes for the data analysis in this work were part of the package or will be made publicly available to enable reproducing the results.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    • Per the suggestions above, please improve the clarity and organization of the paper. 1) A brief introduction to the Hierarchical Bayesian Regression framework should be provided in the Introduction for a general audience. 2) Introducing 16pDel in Methods is unnecessary. 3) All ROI abbreviations should be fully spelled out at their first occurrence. 4) Provide Supplementary materials.
    • Page 2: Citation(s) are needed for the statement “These deletions increase the risk for a myriad of neuropsychiatric disorders, including neurodevelopmental delay, autism spectrum disorder and attention deficit hyperactivity disorder.”
    • Table 1: Fix typo for “Sire 1”
    • Fig. 3: The texts are hard to read. Also I was wondering whether the analysis for the polynomial fit was performed and may be shown (if not, should justify in the texts more clearly), to compare the three fitting approaches.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The reason I lean towards weak acceptance is that this work presents an interesting application of normative modeling in the Hierarchical Bayesian Regression framework, with a strong evaluation of the different model fitting strategies. The results clearly show the utility of this framework for evaluating the feasibility of analysing (and interpreting results of) neuroimaging data pooled from different clinical studies, a known challenge in data harmonization. The merits slightly outweigh the weaknesses, which the paper may address in its revised form.

  • Number of papers in your stack

    2

  • What is the ranking of this paper in your review stack?

    4

  • Reviewer confidence

    Somewhat Confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    6

  • [Post rebuttal] Please justify your decision

    Sufficient detail was provided to address the main issues raised by the meta-reviewer.



Review #3

  • Please describe the contribution of the paper

    This work compares three harmonization models (linear, polynomial, and b-spline) on a large-scale dataset.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This study uses a large-scale dataset pooled from multiple sites, which supports a more sensible explanation of the findings.

    Also, the work provides new insight into the detection of rare genetic copy number variants, noting how model complexity can impact findings.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The motivation and evaluation of the study are limited in most clinical applications.
    • The explanation from the results about genetic-driven neuroscience seems a bit far-fetched.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    It should be easy for authors to provide source code for reproducibility analysis.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    • Unclear why the study involves the evaluation of rare genetic copy number variants.
    • The paper seems to be poorly prepared and very rough, e.g., equation, table and reference.
    • Although the authors used HBR normative modeling to infer the distributional properties of the brain’s WM microstructure based on large datasets from healthy subjects, there is lack of results in age analysis, or across lifespan.
    • The discussion about the association between research motivation and genetic-related analysis is underexplained.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Lack of preparation. The manuscript is poorly organized, and the interpretation of the results needs to be improved.

  • Number of papers in your stack

    4

  • What is the ranking of this paper in your review stack?

    3

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    4

  • [Post rebuttal] Please justify your decision

    I don’t think the authors have properly solved my concerns about the current work. The author’s feedback hardly changed my initial judgment of this manuscript.



Review #4

  • Please describe the contribution of the paper

    The authors test the usage of Hierarchical Bayesian Regression for multi-site imaging data, to adjust for site. They experiment with linear, spline and polynomial models for age and sex, since the distribution of these can be confounded by site.

    They compare the models in terms of the associations they can find, comparing brain microstructure measures between controls and carriers of a rare deletion on chromosome 16 (16p11.2).

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper is generally well written. It is work on an exciting ensemble of data in relation to clinically relevant metrics. The authors are not creating a new method, but rather trying to understand the impact of modelling choices on downstream analysis. This is important work with direct relevance for applications in this domain.

    The major finding of the paper is that it surely matters how you model age in your data. This is very often overlooked, but yet a very important thing to investigate. Nonlinear age effects can be pervasive in biological data.

    The authors do a good job of describing the dataset and the way they process and model their data. They provide clear references to methods used and clearly cite the source of all software that they use in their analysis.

    The visualizations are ok, and the statistical comparisons are mostly good.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    There are some weaknesses worthy to point out:

    1) The language is generally good, but there are some typos, e.g. “Sire” instead of “Site” in the first row of Table 1, and a missing space in “Site4” in row 4 of Table 2. There are a few more issues like this that need to be fixed. Please use a spell-checking tool or something like grammarly.com to fix this.

    2) In the 3rd paragraph of section 2.2, the authors claim: “Consequently, the algorithm does not yield a set of “corrected” data, i.e., with the batch variability removed, as in ComBat. It instead preserves the sources of biological variability that correlate with the batch effects.” Firstly, I would not call this an algorithm, but a model. If the biological variability correlates strongly with batch effects, then I doubt that you will be able to preserve them.

    3) Links to software should be in footnotes, e.g. PCNtoolkit.

    4) The paper suffers a bit from lack of novelty. This is mostly applying known and published methods to a new dataset, trying to solve a problem that many have attempted before.

    5) Captions of images are lacking. There are so many acronyms, these should be spelled out in the caption, e.g. fig 1.

    6) The y-axis numbers in Fig 1 should be numerically sorted, not alphabetically, the order is not 1,10,2,3,4…., 10 should appear at the end. It is not explained in the caption what the colors in the image mean.

    7) Fig 2 also needs acronyms spelled out.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The reproducibility report seems to be honestly filled out.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    Investigate the points listed in the main weakness comment.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This is a decent dataset and relevant problem to tackle.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    2

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    6

  • [Post rebuttal] Please justify your decision

    I keep the same opinion as before.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The submission describes a comparison of three harmonization models for multi-site diffusion tensor imaging under the Hierarchical Bayesian Regression framework (linear, polynomial, and b-spline) in a large-scale dataset in order to demonstrate how modeling complexity can impact statistical findings. This study uses a large-scale dataset consisting of multiple site datasets.

    The level of technical novelty in the paper is low. The aim is to apply known and published methods to a new dataset and interpret the results. The experimental section helps illustrate the impact of modeling choices on 1) modeling age effects in multi-site brain DTI data, and 2) subsequent hypothesis testing based on the fitted outcomes.

    The submission is sufficiently clear; however, there are many places where clarifications should be made: study motivation, resolving acronyms, correcting typos, labeling color correspondences in figures, index ordering, etc. The discussion section could also include more details about interpreting the results.

    No Supplementary materials were submitted, although they are referenced in the submission.

    In the rebuttal, I call on the authors to focus on discussing the following:

    • more details on the choice of study focus, e.g. why the evaluation of rare genetic copy number variants was selected
    • there is a lack of results in age analysis
    • technological contribution
    • interpretation of the results
  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    4




Author Feedback

We incorporated all reviewers’ excellent feedback - thank you. We now better justify and motivate our focus on 16p11.2 deletion syndrome for our multi-site data analysis: because it is a rare condition (1:4000 live births), multisite data must inevitably be combined to achieve adequate statistical power. Rare genetic variants (CNV: copy number variants) are an illustrative (and extreme) case of the ubiquitous problem where clinical sites can only scan a few dozen patients each, so multisite data pooling is crucial and beneficial. To address our reviewers’ concerns, we now added text to the Introduction to clarify the high clinical relevance of our work (for neurogenetics) and show why rare variant studies motivate the method (which is of course applicable to other brain diseases).

We did not initially include an analysis of age effects on brain metrics, to focus on the abnormality detection performance of different models in detecting CNV effects on the brain, using normative data from multiple sites across the lifespan. In clinical settings, the main goal is to detect or flag abnormal deviations caused by brain disease at specific ages, albeit using an appropriate age reference. To address the reviewer request, we will add a succinct summary of the age trajectory of the DTI metrics that show strongest age effects and a corresponding figure in the supplements.

On the technical novelty of our work: we systematically test and validate a new multi-site normative modeling method, not previously applied to DTI. This is the first study adapting nonlinear Hierarchical Bayesian Regression theory to study rare neurogenetic conditions, and the first to analyze multisite brain DTI. It is crucially important to test new open-source medical imaging algorithms on new data modalities (e.g., diffusion MRI), and in novel contexts (rare genetic variants), to offer a roadmap to generate rigorous, reproducible findings. Our work aims to ameliorate the reproducibility crisis in the field, as rare variant effects detected in small samples would be unlikely to be robust or reproducible. By adapting the mathematics of normative models beyond structural MRI to DTI, we thoroughly compare alternative models and parameterization choices when merging diverse international data into a single coherent model. Our results reveal how model selection and diverse reference data affect the conclusions regarding abnormalities, in a field (rare variant genetics) where secure knowledge is lacking. We will clarify the technical contribution of this work (non-linear Bayes modeling, boosting rigor and reproducibility, novel extension to diffusion MRI metrics) in the Conclusion of the revised manuscript.

Biological interpretation of the results will be added. Abnormal tissue excesses associated with rare neurogenetic syndromes suggest cell migration failures or aberrant pruning, arising from loss (deletion) of genes crucial for these processes. Only with our normative-model derived maps can we test the reproducibility of these anomalies in new international samples.

Finally, we adjusted the figures per reviewers’ recommendations. We added full names for all atlas ROIs in the Methods section; as recommended, we moved all information regarding the 16p CNV problem statement to the Introduction.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This work compares three harmonization models, linear, polynomial, and b-spline, on a large-scale dataset drawn from three publicly available data sets.

    The introduced and tested methods are not new. The contribution lies in comparing their effect on large scale data analysis in the case of DTI over a large age range.

    The authors reacted to the reviewers’ comments and clarified many sources of confusion from the reviews.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    2



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    I agree that the novelty is low, but the method can be useful to the community for practical applications.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    15



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The work is relevant and well motivated. The rebuttal addressed the points mentioned by the MR well, and one of the reviewers increased their score, making the overall score clearly positive.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    8


