
Authors

Lennart Bastian, Alexander Baumann, Emily Hoppe, Vincent Bürgin, Ha Young Kim, Mahdi Saleh, Benjamin Busam, Nassir Navab

Abstract

Statistical shape models (SSMs) are an established way to represent the anatomy of a population with various clinically relevant applications. However, they typically require domain expertise, and labor-intensive landmark annotations to construct. We address these shortcomings by proposing an unsupervised method that leverages deep geometric features and functional correspondences to simultaneously learn local and global shape structures across population anatomies. Our pipeline significantly improves unsupervised correspondence estimation for SSMs compared to baseline methods, even on highly irregular surface topologies. We demonstrate this for two different anatomical structures: the thyroid and a multi-chamber heart dataset. Furthermore, our method is robust enough to learn from noisy neural network predictions, potentially enabling scaling SSMs to larger patient populations without manual segmentation annotation.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43999-5_44

SharedIt: https://rdcu.be/dnwwX

Link to the code repository

https://github.com/alexanderbaumann99/S3M

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper describes a method to find automatically correspondences between complex and variable shapes in order to build a Statistical Shape Model.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Interesting problem.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Poor introduction. Lack of technical details and consistency in the presentation of the methods. Poor presentation of the results.

  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The entire code will be open-source.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    The introduction is poorly organized. First, the authors state that creating SSMs is cumbersome because it requires finding correspondences with a deformable image method using “manual landmark annotations”, but in the next sentence they present “unsupervised methods” that do this automatically! They also claim that “smooth surfaces” are required to generate correspondences, whereas many mesh registration methods use curvature features, which are significant only when the surface is not smooth. The authors should refer in particular to https://www.sciencedirect.com/science/article/abs/pii/S1361841521003169 What do you mean by “label noise” or “segmentation inaccuracies”? Topological variations are in general due to true anatomical variants and not “noisy annotations”.

    Figure 1 is interesting, but it is unclear what the inputs and outputs of each block are. Moreover, the blocks are not explained in the paper: why share weights in the GNN? What is the link between the LBO and FMNet? What exactly is the representation of T? We have the feeling that the method is a concatenation of tools previously developed by the team rather than a consistent pipeline dedicated to one precise task.

    Results are poorly presented. Why do you not present the average model of the thyroid and its variation through some modes? A thyroid measures around 4 to 5 cm, so the results in Table 1 are not very significant. What do you mean by “topological noise”? Figure 3 is more informative, but why do you present ShapeWorks results when “SURFMNet can represent the more complex compositions 2 and 3 better than ShapeWorks”? There is much research on 3D and 4D heart models (see e.g. https://www5.cs.fau.de/conrad/data/heart-model/ or https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1010196 ). How would your method compare with these results?

    In conclusion, this paper presents some interesting preliminary results which should be carefully analyzed in order to explain clearly the method and to extend experiments on large datasets.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    2

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Poor introduction. Lack of technical details and consistency in the presentation of the methods. Poor presentation of the results.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    4

  • [Post rebuttal] Please justify your decision

    I thank the authors for taking the time to answer the reviewers’ questions precisely. The context (unsupervised) and motivation are now clear. Supplementary references will be analyzed. A better visualization of results (modes of variation) is proposed. Nevertheless, the description of the method is still quite short. Section 3 should be extended with more scientific justifications and explanations. Considering the authors’ rebuttal, I change my opinion to: “weak reject”.



Review #2

  • Please describe the contribution of the paper

    • This paper proposes an unsupervised method that leverages deep geometric features and functional correspondences to simultaneously learn local and global shape structures across complex anatomies for statistical shape model creation.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Based on geometric deep learning and functional correspondence, this paper proposes a novel unsupervised correspondence method for SSM curation.
    2. The proposed model is tested on two datasets and the experimental results are satisfactory.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The writing and organization need to be improved. Specifically, the methodology can be simplified while the visual comparison of ablation study can be appended.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    • This work is reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. A detailed description or schematic (architecture) of the graph neural network and FMNet can help the readers to reproduce the results.
    2. The results section can be enhanced by incorporating the compared methods’ output images.
    3. The analysis of the computational complexity of the proposed method and the compared methods can boost the significance of the approach.
    4. A deep literature survey and comparing the approaches is an excellent addition to the research.
    5. The writing and organization need to be improved. Specifically, the methodology can be simplified while the visual comparison of ablation study can be appended.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    • A novel unsupervised correspondence method for SSM curation based on geometric deep learning and functional correspondence is proposed, which is helpful for diagnosis. Thus I recommend acceptance.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    Statistical shape models lack automated approaches to derive corresponding landmarks in all of the datasets used. This paper evaluates the applicability of strategies for unsupervised correspondence and proves their applicability in various medical domains. Thus, no manual labor for segmentations and landmark labelling is required. Key contribution: simultaneous training of deep geometric features and functional correspondence. State-of-the-art approaches in unsupervised correspondence are outperformed. The proposed method is robust on noisy datasets, while other approaches work on smooth surfaces only.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The state of the art and the related work are presented in a very accurate and precise way. Evaluation is performed in an objective way, comparing results achievable by various frameworks.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    While the utilized (public) datasets are described in depth, the algorithmic details are not explained in depth, which reduces reproducibility.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    While the utilized (public) datasets are described in depth, the algorithmic details are not explained in depth, which reduces reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Instead of “µMatch [24] recently”, the sentence could be rephrased so that it begins with an upper-case letter. “To construct a graph, we first extract a surface mesh from a 3D volumetric grid using marching cubes [27]”: this lacks detail. Is marching cubes computed on an iso-value basis, taking into account the scalar distance relative to the provided segmentation thresholds? The cited original paper [27] at least uses a binary strategy, whereas incorporating the distances relative to the thresholds yields smoother results because the triangle sizes can vary. “We refer to the supplementary materials for the complete definition” is acceptable for the draft, but the supplementary material will not be available for the final version. Key aspects should therefore be discussed directly in the paper! The data augmentation process is not described in detail, cf. “and small surface deformations”… how??

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Evaluations prove the applicability of the method.

    The paper is well-structured and sound.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper introduces a new approach to automatically finding correspondences between complex and variable shapes to build a Statistical Shape Model (SSM). The proposed method utilizes deep geometric features and functional correspondences to simultaneously learn local and global shape structures across complex anatomies for SSM creation. The method is tested on two datasets, and the experimental results are promising. However, using the term “unsupervised” is somewhat misleading, as segmentations are still required for the SSM task. The introduction is poorly organized, and the methodology could be simplified. The visual comparison of the ablation study could also be appended. The authors should clarify their use of terms such as “label noise” and “segmentation inaccuracies” and better explain the input/output for each block of the proposed architecture. The results are not presented effectively, and the authors should consider presenting the average model of the thyroid and its variation through some modes for all methods reported. Overall, the paper presents interesting preliminary results but requires further explanation and analysis.




Author Feedback

We thank the reviewers for their feedback and positive comments on the paper’s potential for diagnosis, well-structured presentation (R2, R3), and recognition as the best paper in both R2 and R3’s selections.

We would like to clarify several concepts to address the reviewers’ comments about the motivation and structure of the paper (R1, R2). While correspondence estimation is unsupervised, we will emphasize that our method requires segmentation masks (R1, MR). However, inaccuracies in segmentation labels can significantly affect downstream SSM models. The labels of the thyroid surface are noisy due to physical properties inherent to Ultrasound (US), such as phase aberrations and attenuation, which translate to artifacts in the labels. Fig. 2 illustrates this noise as well as the heterogeneity of the cohort.

While SSMs must encapsulate anatomical variations, they should be robust to noise. We push the limits of noisy conditions by generating an SSM from network segmentation predictions (Dice score of 0.94 [25]), demonstrating that our method can reconstruct the surface better under adverse circumstances. To further highlight this robustness, we conduct an additional experiment. We measure the generalizability of SSM models built on noisy network predictions with respect to the ground-truth thyroid labels (Ours: 9.30, ShapeWorks: 29.56). Our model built on noisy labels outperforms the SSM created from GT labels, suggesting that more data can improve the SSM even if the labels are inaccurate! This is supported by the differences in shape (Suppl. Fig. 1). We can include these results in the main paper if desired.

Expanding the literature review will indeed enhance the manuscript (R2). R1 provides several references we will include. The first distinguishes between pairwise and groupwise SSM construction methods, emphasizing the potential bias in pairwise approaches, which neglect the cohort during correspondence generation (Goparaju, 2022). The second (Unberath, 2015) and third (Rodero, 2021) works mentioned construct heart SSMs using Procrustes and LDDMM, respectively, pairwise optimization methods. In contrast, groupwise optimization approaches like ShapeWorks overcome this bias by jointly optimizing over a cohort. They demonstrate superior prediction of clinically relevant anatomical variations. In this MICCAI paper, we focus specifically on groupwise methods, as they are used most commonly for SSM curation. We have identified several situations where existing methods struggle.

We learn parameters groupwise with a Siamese network (shared weights) and achieve robustness by optimizing global structure with a functional mapping (T) between shapes under various augmentations. FMNet consists of a single linear MLP. We will clarify these points in the manuscript and Fig. 1 (R2, R3), including how we deform the surface and that marching cubes was run on binary labels (R3). The units used should be in mm^2 due to the Chamfer distance, giving the results high clinical relevance (R1).
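To illustrate why a Chamfer-based metric is reported in mm^2, the following is a minimal sketch of a symmetric squared Chamfer distance between two point sets. It assumes the common variant that averages squared nearest-neighbor distances in both directions; the exact definition used in the paper is not given on this page, and the function names and sample points are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points (mm^2 if inputs are in mm)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def chamfer_distance(points_a, points_b):
    """Symmetric squared Chamfer distance between two point sets.

    Assumption: the mean squared nearest-neighbor distance from A to B
    plus the same quantity from B to A. Brute force, O(|A| * |B|).
    """
    if not points_a or not points_b:
        raise ValueError("both point sets must be non-empty")
    a_to_b = sum(
        min(squared_distance(p, q) for q in points_b) for p in points_a
    ) / len(points_a)
    b_to_a = sum(
        min(squared_distance(q, p) for p in points_a) for q in points_b
    ) / len(points_b)
    return a_to_b + b_to_a


# Hypothetical surface samples, coordinates in mm.
surface_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
surface_b = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
print(chamfer_distance(surface_a, surface_b))
```

Because each term is a squared distance, coordinates in mm give a result in mm^2, so such values are not directly comparable to a linear organ size in cm.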

We strongly disagree with the claim (R1) that our method is a simple “concatenation of tools developed by our team.” Our work is motivated by the fact that SSM methods have yet to leverage recent developments in functional correspondence. While our method builds upon recent literature from various research teams, the modeling of our proposed method is complex and extends beyond a simple concatenation of tools. The MICCAI community will benefit from these contributions, particularly where existing unsupervised methods fall short.

Lastly, we will incorporate modes of variation into Fig. 2 and include a visual comparison of all models in Fig. 3 to improve presentation (R1, R2). Regarding computation complexity (R2), our network has 0.68M parameters, while SURFMNet has 1.2M and µMatch has 0.2M. Moreover, ShapeWorks cannot be applied to unseen shapes and needs to be re-run for each new acquisition. Code will be released for full reproducibility.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors provided additional clarification and addressed some of the concerns raised by the reviewers. They clarified that while correspondence estimation is unsupervised, the method requires segmentation masks. They also explained the impact of segmentation inaccuracies on downstream SSM models and demonstrated the robustness of their method under noisy conditions. They proposed to include additional results and expand the literature review. However, the paper would need significant revisions and more experimental results to be ready for publication. The current version of the manuscript may not be ready for acceptance and would require another round of review before being accepted.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The paper proposes a technique for building a statistical shape model by automatically finding correspondences in a groupwise fashion. The method uses geometric deep learning to learn features and then learns functional correspondences. Some mild concerns were raised by the reviewers, which were addressed by the authors in the rebuttal. I recommend the paper to be accepted.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The rebuttal did not address the concern about the discrepancy between the proposed expected outcomes and the actual results. For example, the paper showed examples of simple Y-shaped bifurcations with uniform radii instead of more complicated structures with spread-out branches and higher geometrical variability. The main concern was that the paper did not make a strong case for the benefit of this approach, because it did not compare the actual reconstructions from the method (as opposed to the performance) against other established approaches.


