
Authors

Benjamin Billot, Colin Magdamo, Steven E. Arnold, Sudeshna Das, Juan Eugenio Iglesias

Abstract

Retrospective analysis of brain MRI scans acquired in the clinic has the potential to enable neuroimaging studies with sample sizes much larger than those found in research datasets. However, analysing such clinical images "in the wild" is challenging, since subjects are scanned with highly variable protocols (MR contrast, resolution, orientation, etc.). Nevertheless, recent advances in convolutional neural networks (CNNs) and domain randomisation for image segmentation, best represented by the publicly available method "SynthSeg", may enable morphometry of clinical MRI at scale. In this work, we first evaluate SynthSeg on an uncurated, heterogeneous dataset of more than 10,000 scans acquired at Massachusetts General Hospital. We show that SynthSeg is generally robust, but frequently falters on scans with low signal-to-noise ratio or poor tissue contrast. Next, we propose "SynthSeg+", a novel method that greatly mitigates these problems using a hierarchy of conditional segmentation and denoising CNNs. We show that this method is considerably more robust than SynthSeg, while also outperforming cascaded networks and state-of-the-art segmentation denoising methods. Finally, we apply our approach to a proof-of-concept volumetric study of ageing, where it closely replicates atrophy patterns observed in research studies conducted on high-quality, 1mm, T1-weighted scans. The code and trained model are publicly available at https://github.com/BBillot/SynthSeg.

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16443-9_52

SharedIt: https://rdcu.be/cVRy6

Link to the code repository

https://github.com/BBillot/SynthSeg

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper
    • Novel, robust automated segmentation framework for brain MRIs that adopts a hierarchy of conditional segmentation and denoising deep neural networks.
    • Thorough evaluation of the proposed model by comparing it against the SOTA and demonstrating its effectiveness in one of the volumetric studies related to ageing.
  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Proposal of a novel segmentation framework for clinical MRI scans that is robust against varying MRI protocols including contrast, resolution, deformations, low SNR and partial volume effects.
    • No retraining necessary.
    • Clearly states the issues with current models, how SynthSeg+ tackles them with its novel architecture, and how configurational changes to the intermediate stages affect the robustness and accuracy of the respective frameworks. The paper also provides the rationale behind the outcomes of those changes.
    • Proof-of-concept volumetric study of age-related atrophy demonstrating the volumetric trajectories of the respective brain structures. This supports adoption of the framework in neuroclinical use cases.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    I don’t see any mention of when the proposed method fails, either by not extracting the desired structures or by producing outliers. It would also be beneficial to know which MR conditions affect the model’s outcome.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    It looks like the authors are planning to release the code and trained models once the paper is accepted. This will help reproduce the results and support further improvements to the model.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    • It would be beneficial to include the conditions under which the proposed framework does not perform well. This will help the reproducibility of the model and allow data to be managed accordingly.
    • Why has a Gaussian mixture model been employed in step (b) of SynthSeg+? Could a more general mixture model be used instead, considering the MR noise distribution (Rician) and possibly other unknowns?
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    7

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed framework can become a benchmark for automated MRI segmentation of brain structures. It will also help clinicians better analyse neurological disorders. The paper is very well written, providing sufficient detail in each section given the space constraints. Additional details were clearly provided in the supplementary material.

  • Number of papers in your stack

    3

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered



Review #2

  • Please describe the contribution of the paper

    The authors present SynthSeg+, a novel hierarchical architecture that enables large-scale robust segmentation of brain MRI scans in the wild, without retraining. According to the authors, the method shows considerably improved robustness relative to SynthSeg, while outperforming cascaded CNNs and some state-of-the-art denoising networks. The authors demonstrate SynthSeg+ in a study of ageing using 10,000 highly heterogeneous clinical scans, where it accurately replicates atrophy patterns observed on research data of much higher quality.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The authors evaluate a previous tool, SynthSeg, on an uncurated, heterogeneous dataset of more than 10,000 scans, which is a remarkable number. The authors propose SynthSeg+, a novel method which uses a hierarchy of conditional segmentation and denoising CNNs. The authors show that this method is considerably more robust than SynthSeg, while also outperforming cascaded networks and state-of-the-art segmentation denoising methods. The paper is clear and readable. The figures are high quality.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The authors only provide comparisons in terms of the Dice coefficient; other metrics should also be provided. The authors could provide a fair comparison with other state-of-the-art techniques (from recent challenges) on public databases, so that the benefits of SynthSeg+ could be better displayed. A deeper discussion of related works should be provided, as the authors focus mainly on SynthSeg as a precedent.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The code is not available yet, but the authors guarantee that they will provide it if the paper is accepted. Until then, I have no way to verify the reproducibility of the method without the database and code.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    The authors only provide comparisons in terms of the Dice coefficient; other metrics should also be provided. The authors could provide a fair comparison with other state-of-the-art techniques (from recent challenges) on public databases, so that the benefits of SynthSeg+ could be better displayed. A deeper discussion of related works should be provided, as the authors focus mainly on SynthSeg as a precedent.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The authors propose an improved version of SynthSeg. However, the comparisons implemented are not sufficient, as there are plenty of works (from challenges, for instance) with public databases that could have been used for comparison purposes.

  • Number of papers in your stack

    3

  • What is the ranking of this paper in your review stack?

    2

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    6

  • [Post rebuttal] Please justify your decision

    The authors have addressed correctly most of the reviewer concerns.



Review #3

  • Please describe the contribution of the paper

    This paper proposes SynthSeg+, a method based on the SynthSeg network, for medical image segmentation.
    The proposed SynthSeg+ consists of two U-Net components and one denoiser in a hierarchical architecture.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The description of the issues is very clear. The method yields promising results and is feasible in the clinic.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1. Technically, the novelty of this paper is relatively weak, because its main idea is a modified SynthSeg network: both the U-Net and the denoiser network used in the paper have been proposed in other references, and this paper simply combines them in a new pipeline to segment brain images.
    2. The article mentions that the training datasets for S1 and S2 are augmented with four linear transformations. However, the denoising model is trained with distorted images, which will have a negative effect on the training of S2; the experiments therefore need a more reasonable scheme.
    3. The composition of rotations, scalings, shearings, and translations is itself a linear transformation, not a nonlinear one. The authors do not explain how the parameters of the four transformations are chosen or how many mappings are used.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The reproducibility of this method is a challenge, since several aspects remain ambiguous, such as some details of S2 and the parameters used to generate the low-resolution images.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    No

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    3

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This method is based on the SynthSeg network, and its three components (two U-Net blocks and one denoiser block) have already been proposed and widely used in other references.

  • Number of papers in your stack

    4

  • What is the ranking of this paper in your review stack?

    3

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    3

  • [Post rebuttal] Please justify your decision

    The authors have made many efforts to give some thorough one-to-one responses. However, I do not agree with some of the arguments. 1) Arbitrary resolution could be handled by upsampling and downsampling techniques, while the contrast issue can be overcome by data normalisation; these are common methods in many data augmentation packages. 2) There might be new problems in an uncurated clinical dataset that need more detailed discussion; however, I did not find any new challenges beyond arbitrary resolution and contrast. 3) For the experiments, nnU-Net might be an option for comparison, since it handles the arbitrary-resolution and contrast issues well.

    Therefore, I maintain my decision.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper describes a robust segmentation method with validation on clinical data. Although the reviewers are impressed by the large scale clinical validation work, concerns on lack of novelty and lack of comparison to other state of the art methods were raised as well. The authors should provide their feedback to these concerns in the rebuttal.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    7




Author Feedback

We thank the reviewers for their useful feedback. The main concerns with our submission are its novelty and the lack of comparison.

  • Novelty: We propose SynthSeg+, the first method that can readily segment clinical scans of arbitrary resolution and contrast, at large scale, and without retraining; the only previously existing method (SynthSeg) faltered on a considerable number of scans. We emphasise that MICCAI explicitly welcomes application-oriented submissions that “demonstrate the clinical value of a method, or adapt state-of-the-art methods to a new problem”. We believe our manuscript fully fits in that category, since we apply our new method to a large (N > 10,000), uncurated clinical dataset from the PACS of our hospital. Moreover, we achieved this purpose by combining state-of-the-art methods in a novel fashion. By inserting a denoiser within a cascaded network, we overcome the shortcomings of each module: the denoiser improves the robustness of the first segmenter, while the second segmenter corrects the smoothness of the denoiser.

  • Lack of comparison: The only existing method that can segment scans of any resolution and contrast is SynthSeg, which we used as baseline in our submission – and whose robustness we vastly improve. As explained in the introduction, the only other possible alternatives would be Bayesian segmentation and domain adaptation. The former cannot handle changes in resolution and performs worse than SynthSeg (see Billot et al., 2021), whereas the latter is hardly usable for heterogeneous data as it needs retraining for every new contrast or resolution.
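The denoiser-inside-a-cascade idea from the novelty response above can be sketched in miniature. This is purely an illustrative toy, not the authors' implementation: the three callables stand in for trained CNNs, and the 1-D "scan" stands in for a 3-D volume.

```python
import numpy as np

def cascade_with_denoiser(image, s1, denoiser, s2):
    """Toy sketch of the hierarchy described in the rebuttal: a denoiser
    inserted between two segmenters (hypothetical stand-ins, not the real models)."""
    coarse = s1(image)          # first segmenter: robust but error-prone
    cleaned = denoiser(coarse)  # denoiser: regularises labels, but over-smooth
    final = s2(image, cleaned)  # second segmenter conditioned on the cleaned prior
    return final

# Toy stand-ins operating on a 1-D "scan" of intensities in [0, 1]
s1 = lambda img: (img > 0.5).astype(int)                     # naive thresholding
denoiser = lambda lab: np.round(
    np.convolve(lab, np.ones(3) / 3, mode="same")).astype(int)  # majority smoothing
s2 = lambda img, prior: np.where(prior == 1, (img > 0.3).astype(int), 0)

image = np.array([0.1, 0.2, 0.8, 0.9, 0.85, 0.2, 0.1])
print(cascade_with_denoiser(image, s1, denoiser, s2))  # [0 0 1 1 1 0 0]
```

The point of the structure is that each stage compensates for the previous one: the denoiser cannot invent image detail, so the final segmenter re-reads the image with the denoised labels as a prior.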

The other concerns are:

  • Inclusion of metrics other than Dice: we have added surface distances to Figure 5 without using any additional space. They show the same trend as Dice.
  • Comparison on public benchmarks: we are not aware of any publicly available dataset of heterogeneous clinical brain scans. We believe that our dataset is sufficiently large and heterogeneous to support the conclusions of this study.
  • Lack of nonlinear augmentation in S1 and S2: As explained in the manuscript: “the nonlinear transform is composed with three rotations, scalings, shearings, and translations.” Here we meant “composition” in the mathematical sense, indicating that we use both nonlinear and linear augmentations; the overall transform is therefore nonlinear. We apologise for the confusion, and we have rephrased this in the manuscript.
  • Number of spatial transforms: As explained in 2.2, all augmentations (including spatial deformations) are performed on the fly, precisely to avoid relying on a fixed number of precomputed transforms.
  • Lack of details on augmentation parameters: these are already given in the supplement (Table S1).
  • Inclusion of cases where SynthSeg+ fails: these have been added to the supplement.
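The composition point above (a nonlinear deformation composed with an affine built from rotations, scalings, shearings, and translations) can be sketched as follows. The parameter ranges and the sinusoidal deformation here are illustrative assumptions only; the paper's actual ranges are in its supplementary Table S1.

```python
import numpy as np

# Hypothetical 2-D parameter ranges, for illustration only
rng = np.random.default_rng(0)
theta = rng.uniform(-0.2, 0.2)        # rotation (radians)
s = rng.uniform(0.9, 1.1, size=2)     # per-axis scaling
k = rng.uniform(-0.1, 0.1)            # shearing
t = rng.uniform(-5, 5, size=2)        # translation (voxels)

R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
S = np.diag(s)
K = np.array([[1, k], [0, 1]])
A = R @ S @ K                          # linear part of the affine

def augment(coords):
    """Compose the affine with a smooth displacement field. The result is
    nonlinear even though the affine factor on its own is linear."""
    affine = coords @ A.T + t
    nonlinear = 0.5 * np.sin(coords / 10.0)  # toy smooth deformation
    return affine + nonlinear

grid = np.stack(np.meshgrid(np.arange(4), np.arange(4), indexing="ij"),
                axis=-1).reshape(-1, 2).astype(float)
warped = augment(grid)
print(warped.shape)  # (16, 2)
```

Sampling the parameters on the fly, as described in the rebuttal, means every training iteration sees a freshly drawn transform rather than one of a fixed precomputed set.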

Again, we thank the reviewers, and hope that they will value our method. We are convinced that, by enabling the analysis of huge amounts of existing clinical scans, SynthSeg+ will largely increase sample sizes and statistical power of neuroimaging studies.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    In my opinion, the rebuttal addressed most concerns well. I agree with the authors that the work fits MICCAI under the demonstration of clinical value.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    9



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    I still hold a borderline opinion on this paper. One concern is that it is not benchmarked against the state-of-the-art nnU-Net. On the other hand, I agree that it has clinical significance. I recommend borderline acceptance.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    9



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    My judgment is aligned with that of MR2: the main value of this work is the large dataset used in the experiments. In that light, the paper can be seen as a validation paper and can provide interesting insights. However, I agree with the remarks on the limitations of the work regarding the methods used in the benchmark.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    5


