
Authors

Hannah Spitzer, Mathilde Ripart, Abdulah Fawaz, Logan Z. J. Williams, MELD project, Emma C. Robinson, Juan Eugenio Iglesias, Sophie Adler, Konrad Wagstyl

Abstract

Focal cortical dysplasia (FCD) is a leading cause of drug-resistant focal epilepsy, which can be cured by surgery. These lesions are extremely subtle and often missed even by expert neuroradiologists. “Ground truth” manual lesion masks are therefore expensive, limited and have large inter-rater variability. Existing FCD detection methods are limited by high numbers of false positive predictions, primarily due to vertex- or patch-based approaches that lack whole-brain context. Here, we propose to approach the problem as semantic segmentation using graph convolutional networks (GCN), which allows our model to learn spatial relationships between brain regions. To address the specific challenges of FCD identification, our proposed model includes an auxiliary loss to predict distance from the lesion to reduce false positives and a weak supervision classification loss to facilitate learning from uncertain lesion masks. On a multi-centre dataset of 1015 participants with surface-based features and manual lesion masks from structural MRI data, the proposed GCN achieved an AUC of 0.74, a significant improvement against a previously used vertex-wise multi-layer perceptron (MLP) classifier (AUC 0.64). With sensitivity thresholded at 67%, the GCN had a specificity of 71% in comparison to 49% when using the MLP. This improvement in specificity is vital for clinical integration of lesion-detection tools into the radiological workflow, through increasing clinical confidence in the use of AI radiological adjuncts and reducing the number of areas requiring expert review.
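The abstract describes three training signals: per-vertex segmentation, an auxiliary distance-to-lesion prediction, and a weak-supervision classification loss. The sketch below shows one way such terms might be combined into a single objective. The choice of binary cross-entropy and mean squared error, the weights `w_dist` and `w_cls`, the hemisphere-level label, and all function names are assumptions made for illustration; they are not taken from the paper or the `meld_graph` code.

```python
import math


def binary_cross_entropy(p, y, eps=1e-7):
    """Cross-entropy for one predicted probability p and a 0/1 label y."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def combined_loss(seg_probs, seg_labels, dist_pred, dist_true, hemi_prob,
                  w_dist=1.0, w_cls=1.0):
    """Weighted sum of three hypothetical loss terms for one hemisphere.

    seg_probs / seg_labels: per-vertex lesion probability and 0/1 mask.
    dist_pred / dist_true:  per-vertex predicted and target distance to lesion.
    hemi_prob:              predicted probability that the hemisphere is lesional.
    """
    n = len(seg_labels)
    if n == 0 or not (len(seg_probs) == len(dist_pred) == len(dist_true) == n):
        raise ValueError("inputs must be non-empty and of equal length")

    # Per-vertex segmentation term (assumed: mean binary cross-entropy).
    seg = sum(binary_cross_entropy(p, y) for p, y in zip(seg_probs, seg_labels)) / n

    # Auxiliary distance-regression term (assumed: mean squared error).
    dist = sum((a - b) ** 2 for a, b in zip(dist_pred, dist_true)) / n

    # Weak-supervision term: does the hemisphere contain any lesional vertex?
    hemi_label = 1 if any(seg_labels) else 0
    cls = binary_cross_entropy(hemi_prob, hemi_label)

    return seg + w_dist * dist + w_cls * cls
```

In this formulation the classification term depends only on whether a lesion is present, not on its exact boundary, which is one plausible reason such a term could help when masks are uncertain.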

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43993-3_41

SharedIt: https://rdcu.be/dnwNM

Link to the code repository

https://github.com/MELDProject/meld_graph

Link to the dataset(s)

N/A


Reviews

Review #4

  • Please describe the contribution of the paper

    This paper presents a surface-based segmentation method for focal cortical dysplasia that uses nnUnet + GCN with a distance loss + hemisphere classification loss.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. large and multicenter dataset
    2. well presented preprocessing, well-defined surface generation process
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Although the method itself makes sense, what is the reason for using a surface-based approach instead of 2D slice or 3D volume segmentation? Is it necessary? Please provide reasoning.
    2. The performance experiments are not comprehensive: there is only an ablation study on the loss plus an MLP method. I would like to see comparisons with more commonly used methods, such as 3D resUnet, pure nnUnet, etc. Please provide comparison experiments for segmentation directly applied to T1/FLAIR.
    3. What is the Dice performance?
    4. The performance improvement from MLP to GC-nnU-Net is relatively low. Could the authors provide a little more detail about the MLP approach in the rebuttal so I can get a basic idea of it?
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The code will be available, but the segmentation annotations will be hard to obtain even if researchers try to reproduce the work on their own data.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    As stated in the weaknesses section, I think the paper would benefit from a better explanation of why a surface-based method is used, as the preprocessing required for this method is non-trivial. More method comparisons and segmentation performance results are also needed to give the audience a better idea of the model performance.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    FCD detection is definitely a clinically challenging problem, and the authors have a valuable dataset. The method itself is a common combination of approaches in neuroimaging, such as a weighted sum of losses. More comprehensive experiments would add value, as the audience may want to see the segmentation performance of many approaches on this relatively big dataset for reference.

  • Reviewer confidence

    Somewhat confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    This paper uses a GCNN for semantic segmentation of focal cortical dysplasia. Significant performance gains are achieved over vertex-based approaches using an MLP.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    It’s clear how they contrast methods and have an appropriate benchmark. A large number of subjects are studied.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Performance is likely limited by inter-rater variability but no comments are provided on who labelled the data, how many people labelled the data, etc. Little consideration is given to single lesions vs multiple lesions in individuals.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Sufficient details are given but I didn’t find links to code or data

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    See my comments on weaknesses

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper shows clear benefits with a large dataset. Inter-rater variability is an issue.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This work proposed to approach the problem as semantic segmentation using graph convolutional networks (GCN) to learn spatial relationships between brain regions and for robust segmentation of subtle epilepsy-causing lesions.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The model was developed using a large multi-center dataset and validated using independent data.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Only a standard MLP was compared. More advanced state-of-the-art approaches should be included to confirm the advantages of the proposed model. It would be helpful to visualize the training and validation performance along the training epochs. Also, I would suggest examining the effects of varying the number of training samples on the model performance.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    No code link was provided in the manuscript. The authors indicated that they will share the code publicly after the paper is accepted.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Only a standard MLP was compared. More advanced state-of-the-art approaches should be included to confirm the advantages of the proposed model. It would be helpful to visualize the training and validation performance along the training epochs. Also, I would suggest examining the effects of varying the number of training samples on the model performance.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    See my comments above.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The study focuses on segmenting subtle epilepsy-causing lesions using a graph convolutional approach, and addresses the very challenging problem of detecting these subtle lesions on MRI scans that are considered normal at first visual review.

    The strengths of this study include:

    • Novel use of graph convolutional networks to segment the cortex, and FCD in particular, which are able to learn spatial relationships between brain regions.
    • Implementation of a graph convolutional network within the nnUnet framework, to use a basic Unet for the task.
    • Well-written and well-organized manuscript.

    Please provide additional details about the spiral convolution and how one converts a convolution with a 3x3 grid into a spiral. The used features are already predefined. I’m wondering if there is work focused also on optimizing the features to be used.




Author Feedback

We thank the reviewers for their positive and constructive feedback. The main comments relate to inter-rater reliability in lesion masks, comparison to SOTA volumetric approaches, and surface-based feature optimisation.

Inter-rater reliability in lesion masks: Given the subtlety of the pathology, lesion masking is extremely challenging. At each of the 22 participating sites, lesions were masked by a single expert neuroradiologist. At one test site, 3 neuroradiologists masked the same 10 patients. On average, a given radiologist’s lesion mask only captured 42% of a second radiologist’s mask. Expanding the first mask by 20mm increased this mean overlap to 84%. As expert inter-rater agreement is low, we do not report Dice. Instead we report whether a predicted cluster overlaps within a 20mm borderzone of the manual mask (sensitivity) and whether there are false positives (specificity).
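The overlap figures above (how much of a second rater's mask is captured by the first, before and after expanding the first mask) can be illustrated with a minimal sketch. It assumes masks are boolean lists over mesh vertices and that expansion uses straight-line (Euclidean) distance between vertex coordinates in mm. The rebuttal does not state how distance was measured, and a surface-based pipeline may well use geodesic distance instead. The function names `expand_mask` and `fraction_captured` are hypothetical.

```python
import math


def expand_mask(mask, coords, radius_mm):
    """Return a mask that also includes every vertex within radius_mm
    (Euclidean distance, an assumption) of any vertex already in the mask."""
    seeds = [coords[i] for i, inside in enumerate(mask) if inside]
    return [any(math.dist(c, s) <= radius_mm for s in seeds) for c in coords]


def fraction_captured(mask_a, mask_b):
    """Fraction of mask_b's vertices that also lie inside mask_a."""
    total_b = sum(1 for b in mask_b if b)
    if total_b == 0:
        raise ValueError("mask_b is empty")
    both = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    return both / total_b


# Toy example: five vertices on a line, 10 mm apart.
coords = [(0, 0, 0), (10, 0, 0), (20, 0, 0), (30, 0, 0), (40, 0, 0)]
rater_1 = [True, True, False, False, False]
rater_2 = [False, True, True, True, False]

print(fraction_captured(rater_1, rater_2))                             # one of three vertices
print(fraction_captured(expand_mask(rater_1, coords, 20.0), rater_2))  # all three vertices
```

The measure is asymmetric: the fraction of rater 2's mask captured by rater 1 generally differs from the reverse, so a mean over rater pairs would need to fix or average over the direction.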

Surface-based approach / comparison to volumetric SOTA approaches: MRI data sharing in clinically-acquired datasets is challenging due to ethical constraints. To collate a large dataset in a rare pathology, epilepsy centres preprocessed their MRI data, extracting only surface-based features for subjects. Centres shared anonymised matrices of surface-based features registered to a template space, circumventing data privacy constraints on sharing identifiable volumetric MRI. This approach nevertheless limited our ability to benchmark against SOTA volumetric approaches such as nnUnet, which require the volumetric MRI data.

Optimisation of surface-based features: Choice of surface-based features was based on prior work[1,2], which demonstrated these features have power to differentiate lesional and healthy cortex, and quantified the significant added value of our current preprocessing procedures. In order to test the added benefit of network changes against this baseline MLP approach, the preprocessing was kept unchanged from this benchmark study. Future work may further improve performance through optimisation of surface-based feature preprocessing and collection of a volumetric MRI dataset.

Other comments:

Visualising performance along the training epochs: We have included curves tracking train and validation losses (Fig S1).

Effects of varying the number of training samples on performance: We evaluated performance while varying the training sample size, and assessed whether further performance gains could be expected with a larger cohort (Fig S2).

Code availability: The code is available now at www.github.com/MELDProject/meld_graph

Single vs multiple lesions: FCD is primarily associated with a single lesion. There are some case reports of multiple lesions in patients with FCD but these are very rare. All patients included in this study had a single lesion.

Details about the spiral convolution: A 2D convolutional filter is defined by the n x n neighborhood of the current pixel, whereas in a spiral convolution, the filter is defined by an outward spiral of length n, starting on the current vertex. This captures a ring of information around the current vertex, similar to how a 2D filter captures a ring of information around the current pixel.

Limited performance improvement from MLP: The AUC improves from 0.64 (MLP) to 0.74 (GC-nnU-Net+dc). Fixing sensitivity at 67%, this translates to an improvement in specificity from 49% to 71%. This is a significant and meaningful improvement in performance. For radiologists reviewing putative lesions, a reduction in the maximum number of clusters from 13 to 3 will save time and improve confidence in the tool.

MLP details: In brief, two hidden layers [40, 10] and 1 output node were used to classify each vertex as lesional/non-lesional. To adjust for class imbalance, for each patient 2000 random lesional and non-lesional vertices were sampled per epoch. The threshold value was set using the Dice score on the train cohort [2].

Refs:
[1] Adler, Wagstyl et al., Neuroimage Clin 2017
[2] Spitzer, Ripart et al., Brain 2022
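The spiral convolution described in the author feedback can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: it orders vertices by breadth-first rings using the order in which neighbours are listed, which only approximates a true spiral, it pads short sequences by repeating the start vertex, and it uses one scalar feature per vertex. The names `spiral_indices` and `spiral_conv` are hypothetical.

```python
from collections import deque


def spiral_indices(neighbors, start, length):
    """Visit vertices outward from `start`, ring by ring, and return the
    first `length` of them. Ordering within a ring follows the neighbour
    lists (a simplification of a true spiral ordering)."""
    order, seen, queue = [], {start}, deque([start])
    while queue and len(order) < length:
        v = queue.popleft()
        order.append(v)
        for u in neighbors[v]:
            if u not in seen:
                seen.add(u)
                queue.append(u)
    # Assumed padding scheme: repeat the start vertex if the mesh is too small.
    order += [start] * (length - len(order))
    return order


def spiral_conv(features, neighbors, weights, bias, length):
    """Apply one filter of `length` weights along each vertex's sequence."""
    out = []
    for v in range(len(features)):
        idx = spiral_indices(neighbors, v, length)
        out.append(sum(w * features[i] for w, i in zip(weights, idx)) + bias)
    return out


# Toy mesh: vertex 0 surrounded by a ring of six vertices (1..6).
neighbors = {
    0: [1, 2, 3, 4, 5, 6],
    1: [0, 6, 2], 2: [0, 1, 3], 3: [0, 2, 4],
    4: [0, 3, 5], 5: [0, 4, 6], 6: [0, 5, 1],
}
features = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]

print(spiral_indices(neighbors, 0, 7))  # centre first, then its ring
print(spiral_conv(features, neighbors, [1 / 7] * 7, 0.0, 7)[0])  # mean over centre and ring
```

Because every vertex gets a sequence of the same fixed length, the same weight vector can be shared across the whole mesh, which is the analogue of sharing an n x n kernel across an image.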


