Authors

Krithika Iyer, Shireen Y. Elhabian

Abstract

Statistical shape modeling is the computational process of discovering significant shape parameters from segmented anatomies captured by medical images (such as MRI and CT scans), which can fully describe subject-specific anatomy in the context of a population. The presence of substantial non-linear variability in human anatomy often makes the traditional shape modeling process challenging. Deep learning techniques can learn complex non-linear representations of shapes and generate statistical shape models that are more faithful to the underlying population-level variability. However, existing deep learning models still have limitations and require established/optimized shape models for training. We propose Mesh2SSM, a new approach that leverages unsupervised, permutation-invariant representation learning to estimate how to deform a template point cloud to subject-specific meshes, forming a correspondence-based shape model. Mesh2SSM can also learn a population-specific template, reducing any bias due to template selection. The proposed method operates directly on meshes and is computationally efficient, making it an attractive alternative to traditional and deep learning-based SSM approaches.
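To make the term "correspondence-based shape model" concrete, here is a minimal sketch of the classical linear analysis that becomes possible once every subject is described by the same ordered set of points. It uses PCA on synthetic data and is not the Mesh2SSM analysis itself, which the reviews below describe as VAE-based. The function names, array shapes, and toy data are assumptions made for illustration.

```python
import numpy as np

def fit_linear_shape_model(correspondences):
    """Fit a PCA shape model to an array of shape (subjects, points, 3)."""
    n_subjects = correspondences.shape[0]
    flat = correspondences.reshape(n_subjects, -1)   # one row per subject
    mean_shape = flat.mean(axis=0)
    centred = flat - mean_shape
    # Rows of `modes` are the principal modes of shape variation.
    _, singular_values, modes = np.linalg.svd(centred, full_matrices=False)
    variances = singular_values ** 2 / (n_subjects - 1)
    return mean_shape, modes, variances

def synthesize(mean_shape, modes, scores):
    """Generate a shape from scores on the first len(scores) modes."""
    flat = mean_shape + scores @ modes[: len(scores)]
    return flat.reshape(-1, 3)

# Toy population (hypothetical): one base shape plus a single mode of variation.
rng = np.random.default_rng(0)
base = rng.normal(size=(50, 3))
direction = rng.normal(size=(50, 3))
coefficients = rng.normal(size=20)
shapes = base + coefficients[:, None, None] * direction
shapes = shapes + 0.01 * rng.normal(size=shapes.shape)  # small noise

mean_shape, modes, variances = fit_linear_shape_model(shapes)
explained = variances[0] / variances.sum()  # fraction of variance in mode 1
```

Because point i means the same anatomical location in every subject, the rows can be averaged and decomposed directly; without correspondences this flattening would be meaningless.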

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43907-0_59

SharedIt: https://rdcu.be/dnwdG

Link to the code repository

https://github.com/iyerkrithika21/mesh2SSM_2023/tree/main

Link to the dataset(s)

Pancreas: http://medicaldecathlon.com/

Reviews

Review #2

Please describe the contribution of the paper

The presented work proposes a method for generating sets of meshes with corresponding landmarks given meshes with unordered landmarks as input. The target application is statistical shape modelling. Correspondences are established using an autoencoder that creates a latent space shape descriptor and another network that deforms a template mesh based on the individual shape descriptor. The template is optimised within the training process to gradually become more representative of the input shape space. The method has been compared to FlowSSM on a challenging pancreas dataset and shows improved performance.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- the proposed architecture is simple and produces very good results
- evaluation has been performed on a challenging non-linear shape distribution
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- the method has been compared to only one other method. Other DL statistical shape methods such as DeepSSM have not been included in the evaluation
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Sufficient detail is provided.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

The paper is generally well written, but descriptions are sometimes drawn out, which makes it difficult to catch the main points. For example, the purpose of the analysis model is quite straightforward, but it is described in an overly complicated way in the abstract and methods section. The state-of-the-art discussion could also be more specific about the pros and cons of other methods. For example, there is a variant of DeepSSM that can handle non-linear distributions as well (DeepSSM: A blueprint for image-to-shape deep learning models). I think the evaluation has been done well and shows an advantage over FlowSSM, but it would benefit from considering other methods as well.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

I think the results are very good and the architecture is simple which should allow for easy adaptation and application. Evaluation would be more solid with additional methods for comparison.
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #1

Please describe the contribution of the paper

This paper proposes Mesh2SSM, an unsupervised correspondence generation framework for generating statistical shape models directly from meshes. The experiments suggest it outperforms the baseline method by a large margin.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. The proposed method can work on mesh directly in a fully unsupervised way.
2. They use a variational autoencoder to learn a data-specific template from the latent space.
3. The improvement is prominent compared with FlowSSM.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
1. It is not very clear why P-VAE is necessary in this model. Why does the data-informed template facilitate correspondence generation? Can you provide an ablation study of this component?
2. This model is only compared with FlowSSM. It would be more convincing if more baselines were provided.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The authors will release the code for this model, after which all experiments can be reproduced.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

Please see the weaknesses.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The overall method is straightforward and easy to understand. The experiments show an improvement over the prior baseline model. However, there are still some weaknesses that need to be addressed.
Reviewer confidence

Somewhat confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #3

Please describe the contribution of the paper

The paper presents an approach called Mesh2SSM for generating non-linear statistical shape models (SSM) using deep learning directly from meshes. The proposed method uses an autoencoder to extract the shape descriptor of the mesh and uses this descriptor to transform a template point cloud. Mesh2SSM also includes a variational autoencoder (VAE) that operates on the learned correspondence points and is trained end-to-end with the correspondence generation network. The VAE branch serves as a shape analysis module for the non-linear shape variations and learns a data-specific template from the latent space of the correspondences that is fed back to the correspondence generation network. The method is demonstrated to have superior performance in identifying shape variations using fewer parameters on synthetic and clinical datasets.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The main strengths of the paper include:
- Mesh2SSM is an approach that leverages unsupervised, permutation-invariant representation learning to estimate how to deform a template point cloud to subject-specific meshes, forming a correspondence-based shape model. This approach overcomes the limitations of traditional and deep learning-based SSM approaches.
- Mesh2SSM operates directly on meshes, which is an original way to use data compared to traditional SSM approaches that rely on landmarks or correspondence points.
- Mesh2SSM has the potential to establish statistical shape modeling from non-invasive imaging as a powerful diagnostic tool. The method is demonstrated to have superior performance in identifying shape variations using fewer parameters on synthetic and clinical datasets.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
There are some weaknesses in the paper that need to be addressed, including:
- This paper is limited to comparing with only one method.
- The overall pipeline of this method is not very clear (see the detailed comments).
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The paper would not be too difficult to reproduce.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
- Can you explain the distinction between the mesh encoder and the point encoder? The article states that DGCNN is employed in the mesh encoder, but isn’t DGCNN specifically designed for point clouds? What operation is used for the point encoder?
- Why not create a template mesh with a set number of points and then utilize techniques to produce a mesh with a consistent topology?
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The method is novel but not very clear.
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

The paper proposes a novel method in the context of deep learning-based statistical shape modeling. More specifically, the method contains an autoencoder-based correspondence generator, which can establish point correspondences between meshes in an unsupervised way by using graph convolutions, a latent space representation, and a pre-defined template. An additional shape analysis part of the approach utilizes a VAE that operates on top of the established correspondences and which can be utilized to perform non-linear shape modeling and to infer an unbiased, data-driven template that can then be used in the optimization of the correspondence generator.

Point-based statistical shape modeling is a classical approach for shape analysis, and establishing point correspondences between the meshes to be analyzed is always a major problem. As highlighted by the reviewers, the paper introduces a rather simple but novel method to automatically infer the point correspondences between shapes via graph convolution-based/permutation-invariant representation learning that directly operates on the meshes. The approach is nicely evaluated on a challenging pancreas dataset that also highlights the advantages of the VAE component for non-linear shape analysis and on which it outperforms FlowSSM (a SOTA approach from MICCAI 2022). Identified weaknesses are rather minor and include the missing evaluation against additional deep learning-based shape modeling techniques such as DeepSSM and the absence of an ablation study. In my mind, the paper would also benefit from an analysis of the model’s performance on meshes/shapes with more complicated topologies (e.g., non-genus-zero shapes).

Solid paper with a novel method that outperforms a SOTA approach on challenging data and for which consensus among the reviewers exists: I recommend its acceptance.

Author Feedback

Response: We appreciate the time and effort the reviewers have dedicated to providing valuable feedback, which has tremendously helped us improve the quality of our work.

What is the difference between mesh AE and point VAE? We apologise for the confusion caused by the terminology used. To clarify, we will rename the “point-VAE” model to “shape VAE” in order to better convey its functionality. The main difference between a mesh autoencoder (AE) and a shape VAE lies in the input and output representations they handle. A shape VAE operates directly on sets of landmarks or correspondences, aiding in the analysis of shape models. It takes a set of correspondences describing a shape as input and aims to learn a compressed latent representation of the shape. Importantly, the shape VAE maintains the same ordering of correspondences at the input and output, so it does not use permutation-invariant layers or operations like pooling. In contrast, a mesh autoencoder operates on the entire surface mesh representation of an object. Instead of working with individual points, it considers the connectivity and topology of the mesh. Crucially, the mesh AE learns a permutation-invariant latent representation of the mesh, meaning that the order of the vertices does not matter.

Why do we need point VAE? The point VAE serves two primary purposes:
- Non-linear shape analysis: The point VAE branch in Mesh2SSM allows non-linear shape analysis of the learned correspondence points. This is important because anatomical variability often exhibits non-linear patterns, such as bending fingers, soft tissue deformations, or variations across vertebrae of different types.
- Data-informed template facilitates correspondence generation: As seen in the results of the box-bump experiments in Figure 1, the learned shape model is affected by the choice of template. The shape model performs poorly when the template is a sphere or a box without the bump, as these shapes are far from the medoid shape, i.e., a box with a bump in the center (discussed in section 1). Hence, by learning the template point cloud from the data, Mesh2SSM reduces potential bias from selecting an arbitrary initial template, leading to improved accuracy and robustness of the generated SSM.

Why not create a template mesh with a set number of points and then utilize techniques to produce a mesh with a consistent topology? We agree with the reviewer; we can consider a mesh as the template, where the vertices become our template particle positions. To achieve good surface sampling, the template mesh will have to be pre-processed such that the vertices are uniformly spread across the surface of the anatomy with uniform edge lengths.

How is DGCNN adapted for meshes? DGCNN (Dynamic Graph Convolutional Neural Network) is a popular deep learning architecture initially designed for point cloud data, but it can be adapted for meshes by leveraging the connectivity information present in the mesh structure. Edge features: In the first convolutional layer of the network, the geodesic distance calculated on the mesh surface is used as edge features. By utilizing this adapted DGCNN architecture, mesh data can be effectively processed and analyzed, leveraging the connectivity information and capturing relationships between vertices.

Why is Mesh2SSM not compared to DeepSSM? We apologise for the lack of detailed comparison between Mesh2SSM and other techniques. Mesh2SSM was not compared to DeepSSM because DeepSSM focuses on generating SSMs directly from volumetric images, whereas Mesh2SSM works from meshes. DeepSSM and its variants rely on supervised losses, require volumetric images, segmented images, and corresponding PDMs for training, and rely on linearity for generating ground truth. TL-DeepSSM does not utilise PCA scores as shape descriptors; instead, it employs an established correspondence model similar to vanilla DeepSSM and learns a linear model. Mesh2SSM can serve as a ground-truth feeder for DeepSSM.
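The contrast drawn in the feedback between a permutation-invariant mesh encoder and an order-preserving shape VAE can be illustrated with a minimal NumPy sketch. It assumes a single EdgeConv-style layer (the building block of DGCNN) followed by global max pooling, and compares it with a dense layer applied to the flattened, ordered points. Euclidean k-nearest neighbours stand in for the geodesic neighbourhoods mentioned above, and all function names, weights, and sizes are hypothetical rather than taken from the Mesh2SSM code.

```python
import numpy as np

def knn_indices(points, k):
    """Indices of the k nearest neighbours of each point.

    Euclidean distance is an assumption here; it stands in for geodesic
    distance on the mesh surface.
    """
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)            # a point is not its own neighbour
    return np.argsort(dist, axis=1)[:, :k]

def edgeconv_descriptor(points, weight, k=4):
    """One EdgeConv-style layer followed by global max pooling."""
    idx = knn_indices(points, k)                                  # (N, k)
    neighbours = points[idx]                                      # (N, k, 3)
    centres = np.repeat(points[:, None, :], k, axis=1)            # (N, k, 3)
    edge_feat = np.concatenate([centres, neighbours - centres], axis=-1)
    hidden = np.maximum(edge_feat @ weight, 0.0)                  # shared linear map + ReLU
    per_point = hidden.max(axis=1)                                # max over neighbours
    return per_point.max(axis=0)                                  # max over all points

def dense_descriptor(points, weight):
    """Order-dependent encoder: flatten the ordered points, apply a dense layer."""
    return points.reshape(-1) @ weight

rng = np.random.default_rng(0)
vertices = rng.normal(size=(32, 3))
perm = rng.permutation(32)                     # shuffle the vertex order
w_edge = rng.normal(size=(6, 8))
w_dense = rng.normal(size=(32 * 3, 8))

invariant = np.allclose(edgeconv_descriptor(vertices, w_edge),
                        edgeconv_descriptor(vertices[perm], w_edge))
order_dependent = not np.allclose(dense_descriptor(vertices, w_dense),
                                  dense_descriptor(vertices[perm], w_dense))
```

Max pooling discards the vertex order, so the first descriptor is unchanged by shuffling. The dense layer ties each weight to a fixed point index, which is only meaningful when the input points are already in correspondence.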

Mesh2SSM: From Surface Meshes to Statistical Shape Models of Anatomy