
Authors

Chenyu Xue, Fan Wang, Yuanzhuo Zhu, Hui Li, Deyu Meng, Dinggang Shen, Chunfeng Lian

Abstract

In addition to model accuracy, current neuroimaging studies of brain development, degeneration, or disorders require more explainable model outputs to uncover atypical local alterations. For this purpose, existing approaches typically explicate network outputs in a post-hoc fashion. However, for neuroimaging data with high-dimensional and redundant information, end-to-end learning of explanation factors can conversely ensure fine-grained explainability while boosting model accuracy. Meanwhile, most methods only deal with gridded data and do not support brain cortical surface-based analysis. In this paper, we propose an explainable geometric deep network, the NeuroExplainer, with applications to uncovering altered infant cortical development patterns associated with preterm birth. Given fundamental cortical attributes as network input, our NeuroExplainer adopts a hierarchical attention-decoding framework to learn fine-grained attention and the respective discriminative representations in a spherical space to accurately distinguish preterm infants from term-born infants at term-equivalent age. NeuroExplainer learns the hierarchical attention-decoding modules under subject-level weak supervision coupled with targeted regularizers deduced from domain knowledge regarding brain development. These prior-guided constraints implicitly maximize the explainability metrics (i.e., fidelity, sparsity, and stability) during network training, driving the learned network to output detailed explanations and accurate classifications. Experimental results on the public dHCP benchmark suggest that NeuroExplainer leads to quantitatively reliable explanation results that are qualitatively consistent with representative neuroimaging studies.



Link to paper

DOI: https://doi.org/10.1007/978-3-031-43895-0_19

SharedIt: https://rdcu.be/dnwx1

Link to the code repository

https://github.com/ladderlab-xjtu/NeuroExplainer

Link to the dataset(s)

http://www.developingconnectome.org/


Reviews

Review #4

  • Please describe the contribution of the paper

    This paper proposes a novel attention-inspired framework for providing visual explanations for surface convolutions, applied to a cortical classification task. My understanding is that the attention mechanism implements a linear projection of the feature maps from the left and right hemispheres to yield a loading that indicates how much each location, across both hemispheres combined, contributes to the prediction. This seems a good idea. I'm not so sure about the proposed regularisation, nor about the results, which seem confounded by data leakage and focused on group-average differences.
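
    For concreteness, a minimal sketch of such a cross-hemisphere attention loading could look as follows (the module name, tensor shapes, and sigmoid gating are my assumptions for illustration, not the authors' implementation):

        import torch
        import torch.nn as nn

        class CrossHemisphereAttention(nn.Module):
            """Sketch: linearly project concatenated left/right-hemisphere
            features to a per-vertex loading in [0, 1] that indicates how
            much each location contributes to the prediction."""

            def __init__(self, channels: int):
                super().__init__()
                self.proj = nn.Linear(2 * channels, 1)  # shared across vertices

            def forward(self, f_left, f_right):
                # f_left, f_right: (batch, vertices, channels) spherical maps
                f = torch.cat([f_left, f_right], dim=-1)  # (B, V, 2C)
                a = torch.sigmoid(self.proj(f))           # (B, V, 1) loading
                return a * f, a   # re-weighted features and the attention map

    Calling this on two (2, 10242, 32) tensors, for instance, would return a (2, 10242, 1) attention map at ico-6 resolution.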

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Good idea to combine across hemispheres. Good idea to constrain the model via class-constrained attention. Interesting ideas about problem-specific regularisation to improve performance on small datasets.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Problems with notation. The dataset has not been used correctly, and there is likely data leakage affecting the reported performance. The visualisation results point to the model focusing on group-average properties, not individual differences.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Currently, reproducing these results would not be straightforward. The authors should release a link to code. The authors should fix their notation; in the absence of code, this needs to be much clearer. The authors should give more details about the dataset and which examples were used for train and test. As dHCP is open, this could be released with the code.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    In general I think this is an interesting paper, with some good features. However, I have concerns over the robustness of the validation and the inference from the results.

    My first comments refer to the clarity of the related works section, where I think some of the points are not made clearly, and perhaps some related work is not discussed.

    First the authors write: “Notably, such post-hoc approaches are established upon a common assumption that reliable explanations are the results caused by accurate predictions. This assumption could work in general applications that have large-scale training data, while cannot always hold for neuroimaging and neuroscience research, where available data are typically small-sized and much more complex”. If this is true, then it implies we should not trust the predictions of a classifier network full stop, which undermines the motivation for the paper. Is the idea here to motivate the integration of class-weighted attention and the proposed regularisations? If so, this could be much clearer.

    I think another point is to highlight the limitations of traditional CNNs, related to their inductive bias (the limited receptive field of CNN filters), which precludes a global view of how distributed features combine. See also Dahan S et al., MIDL 2022. Note that as this paper previously implemented self-attention for the cortex (as a vision transformer network) and is benchmarked on the same data, it should be referenced in the related work.

    There are problems with notation. It's not clear where f0 is used. Why does the notation for the attention gates change from A0 to Ain, and where do Ain and Fin come from? It would be more helpful to stick with a consistent notation which reflects the ico-level notation or the U-Net level (whichever way you wish to see it). Personally I think something like FE4 to reflect EB-4 and FD4 for DB-4, etc., would make more sense. Later you use A^l_i and A^h_i. Please address.

    I like the idea of problem-inspired regularisations, but I'm not sure I agree with all of those used here. Dimitrova et al. 2021 shows that the preterm phenotype is highly heterogeneous. In my opinion, this likely means that the disease-specific variation is very difficult to disentangle against the backdrop of natural cortical variation, and that it perhaps does not make sense to implement a contrastive loss that seems to imply these things are completely separable.
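
    For reference, the kind of class-contrastive loss this critique targets (a generic sketch, not the authors' exact formulation) pulls same-class embeddings together and pushes different-class ones apart, which implicitly presumes the preterm and term groups are cleanly separable:

        import torch
        import torch.nn.functional as F

        def class_contrastive_loss(z, labels, margin: float = 1.0):
            """Generic contrastive loss sketch: attract same-class embeddings,
            repel different-class ones up to a margin."""
            d = torch.cdist(z, z)                              # pairwise distances
            same = labels.unsqueeze(0) == labels.unsqueeze(1)  # same-class mask
            pos = d[same].pow(2).mean()                        # pull together
            neg = F.relu(margin - d[~same]).pow(2).mean()      # push apart
            return pos + neg

        # e.g. class_contrastive_loss(torch.randn(8, 16), torch.randint(0, 2, (8,)))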

    It also shows that the impact of preterm birth is diffuse, not local, and affects a significant part of the cortex, so does it make sense to impose sparseness? Indeed, the final maps don't look sparse at all.

    I think there might be problems with the dataset, as the dHCP released repeat scans of several preterm infants scanned twice, at birth and at term-equivalent age. It looks like the authors are using both scans for each individual, which firstly doesn't make sense, since classifying whether an infant is preterm or term is simple if the scan ages of the babies are many weeks apart. The authors should only be using the preterm infants' scans taken at term-equivalent age, for which there should be fewer than 100. There's likely data leakage here, as the same subject is probably appearing in both train and test, but from different timepoints.
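
    For illustration, a subject-level split would avoid this kind of leakage (a minimal scikit-learn sketch; the scan table here is hypothetical):

        from sklearn.model_selection import GroupShuffleSplit

        # Hypothetical scan table: some infants have both a birth scan and a
        # term-equivalent scan.
        scans = ["s01_birth", "s01_term", "s02_term",
                 "s03_birth", "s03_term", "s04_term"]
        subject_ids = ["s01", "s01", "s02", "s03", "s03", "s04"]

        # Grouping by subject guarantees that repeat scans of one infant
        # never appear in both the training and the test split.
        splitter = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
        train_idx, test_idx = next(splitter.split(scans, groups=subject_ids))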

    As a result, it is very difficult to read anything from the classification performance and results tables. A more meaningful challenge would be to regress the degree of prematurity of the infants, using only scans taken from individuals close to term-equivalent age. See the benchmark described in Fawaz, Abdulah, et al., “Benchmarking geometric deep learning for cortical segmentation and neurodevelopmental phenotype prediction,” bioRxiv (2021), and used in Dahan et al.

    It's very difficult to read anything from the visual results. This is partly to be expected, as the ground truth is not known. However, the authors compare against the group-average results from [1], which emphasises my concern that the contrastive loss is forcing the network to look at commonalities across individuals within each group rather than being sensitive to an individual's specific pattern of disease. Perhaps it would have been better to compare against a normative (Gaussian process) model with respect to how much that one individual deviates from normal?
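
    As a sketch of this normative-modelling alternative (all data here is synthetic, for illustration only): fit a Gaussian process of a cortical metric against age in term-born controls, then score an individual by their deviation z-score from the predicted norm.

        import numpy as np
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import RBF, WhiteKernel

        rng = np.random.default_rng(0)
        # Synthetic controls: a cortical metric (e.g. mean thickness) vs.
        # scan age in weeks for term-born infants.
        age = rng.uniform(37, 44, size=(80, 1))
        metric = 1.5 + 0.05 * age.ravel() + rng.normal(0, 0.05, 80)

        gp = GaussianProcessRegressor(kernel=RBF(length_scale=2.0) + WhiteKernel(0.01))
        gp.fit(age, metric)

        # Deviation z-score for one preterm individual's observed value.
        mu, sigma = gp.predict(np.array([[40.0]]), return_std=True)
        z = (1.3 - mu[0]) / sigma[0]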

    Minor: this is the dHCP data release citation: Edwards, A. David, et al., “The developing human connectome project neonatal data release,” Frontiers in Neuroscience 16 (2022).

    The network architectures of the compared methods should be explained. Are they matched? How were the MoNet kernels parameterised?

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I have concerns over data leakage. I have concerns that the chosen regularisation terms do not make sense and are driving predictions towards the group average.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    In this paper, the authors propose a new geometric network, named NeuroExplainer, to learn fine-grained explanation factors and further boost model performance and representation extraction. The paper studies the problem of altered infant cortical development patterns associated with preterm birth, and according to the experimental results, the proposed method achieves better performance than current methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    (1) The paper is well-written and easy to follow. Fig. 1 is slightly hard to understand by itself, but with the content in the main text it is also readable; (2) The problem studied in the paper is interesting and useful; (3) The proposed method is novel; (4) According to the results, the proposed method achieves significantly better performance than current methods.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    (1) Both the classification results and the quantitative explanation results contain only one run of each method. The authors should provide confidence intervals; (2) Figure 1 is a little hard to follow, and some additional captioning would help understanding; (3) The training procedure doesn't seem to be optimal, since it includes neither a learning rate scheduler nor an early-stopping criterion. Therefore, it's possible that the models weren't trained to their best (see the sketch below).
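
    As a minimal sketch of point (3), a learning rate scheduler plus a patience-based early-stopping rule could look as follows in PyTorch (the tiny model and single-batch “validation” here are stand-ins so the loop runs, not the authors' training code):

        import torch
        import torch.nn as nn

        # Stand-in model and data; in practice these would be the real
        # network, training epochs, and validation passes.
        model = nn.Linear(10, 2)
        x, y = torch.randn(64, 10), torch.randint(0, 2, (64,))
        criterion = nn.CrossEntropyLoss()

        optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
        scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
            optimizer, factor=0.5, patience=5)

        best_val, patience, bad_epochs = float("inf"), 15, 0
        for epoch in range(200):
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()

            val_loss = loss.item()    # stand-in for a real validation pass
            scheduler.step(val_loss)  # decay the LR when validation plateaus
            if val_loss < best_val - 1e-4:
                best_val, bad_epochs = val_loss, 0
            else:
                bad_epochs += 1
                if bad_epochs >= patience:  # early stopping
                    break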

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The method is reproducible if the code is made public.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Please refer to previous sections.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Both strengths and weaknesses lead to the final overall score.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #1

  • Please describe the contribution of the paper

    This paper introduces an explainable geometric deep network, NeuroExplainer, to uncover cortical development patterns of preterm infants. Basically, the NeuroExplainer takes high-resolution cortical attributes as input to a hierarchical attention-decoding architecture using hexagonal convolutions. Particularly, the NeuroExplainer works in an end-to-end manner, where fine-grained explanation factors are fully learnable. By comparing the experimental results of both classification accuracies and explanation factors, the effectiveness of the proposed method is qualitatively and quantitatively evaluated.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The paper is well-written and easy to follow.

    2. The technical part is detailed, as the implementation settings, equations, and figures are provided.

    3. The experiments are comprehensive, given that both quantitative data and qualitative visualizations are presented.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The experimental data is limited: in addition to the dHCP dataset, it is recommended to test the NeuroExplainer on other datasets.

    2. The reasons for NeuroExplainer's lower sparsity (0.73) shown in Tab. 2 are missing. Compared to the other methods (>0.9), this result deserves deeper investigation (one common definition of map sparsity is sketched after this list).

    3. The paper needs to present more examples in order to show its potential as a promising AI tool for other related cortical surface-based neuroimaging studies. Given the experimental results and visualization figures in the current paper, its practical usage cannot be fully verified.

    4. Some typos are found: (i) on page 7, “3) To check the …” should be “2) To check the …”, and (ii) on page 8, “As shown in Fig. 3 …” should be “As shown in Fig. 4 …”.
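
    Regarding weakness 2 above, one common way to quantify the sparsity of an explanation map (a sketch of a conventional definition; the paper's exact metric may differ) is the fraction of locations whose normalized attention falls below a threshold, so a diffuse map naturally scores lower than a peaked one:

        import numpy as np

        def sparsity(attention, tau: float = 0.5) -> float:
            """Fraction of locations with min-max-normalized attention below
            tau; higher means a sparser explanation."""
            a = (attention - attention.min()) / (attention.max() - attention.min() + 1e-8)
            return float((a < tau).mean())

        diffuse = np.random.rand(10242)             # ico-6-sized random map
        peaked = np.zeros(10242); peaked[:100] = 1.0
        print(sparsity(diffuse), sparsity(peaked))  # ~0.5 vs ~0.99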

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Could be reproduced upon release of the code and data.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Please refer to the “weaknesses” above.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed method is technically sound, coupled with sufficient quantitative experiments and qualitative visualization.

  • Reviewer confidence

    Somewhat confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This work proposes a geometric deep-learning network to study cortical development in infant datasets. All reviewers see the technical contribution and the value of the method in studying infants' brains. The reviewers also provide constructive feedback, in particular the comments from R#4, and point out minor typos/formatting issues. The authors should make the proper updates accordingly.




Author Feedback

N/A


