
Authors

Zhu Chen, Ina Laube, Johannes Stegmaier

Abstract

Zebrafish are widely used in biomedical research and developmental stages of their embryos often need to be synchronized for further analysis. We present an unsupervised approach to extract descriptive features from 3D+t point clouds of zebrafish embryos and subsequently use those features to temporally align corresponding developmental stages. An autoencoder architecture is proposed to learn a descriptive representation of the point clouds and we designed a deep regression network for their temporal alignment. We achieve a high alignment accuracy with an average mismatch of only 3.83 minutes over an experimental duration of 5.3 hours. As a fully-unsupervised approach, there is no manual labeling effort required and unlike manual analyses the method easily scales. Besides, the alignment without human annotation of the data also avoids any influence caused by subjective bias.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43993-3_58

SharedIt: https://rdcu.be/dnwN3

Link to the code repository

N/A

Link to the dataset(s)

https://www.nature.com/articles/srep08601


Reviews

Review #1

  • Please describe the contribution of the paper

    In this paper, the authors present an unsupervised approach to extract descriptive features from 3D+t point clouds of zebrafish embryos and subsequently use those features to temporally align corresponding developmental stages. An autoencoder architecture is proposed to learn a descriptive representation of the point clouds, and a deep regression network is designed for their temporal alignment.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Some strengths of the paper are listed below. 1) The paper is well written and well organized, covering embryo development, point clouds, unsupervised learning, and autoencoders. 2) The authors present a deep learning-based method for the temporal alignment of 3D+t point clouds of developing embryos. 3) The approach is fully automatic and unsupervised; it does not require any time-consuming human labelling effort and additionally avoids subjective biases.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) Is any preprocessing needed before applying the encoder to extract descriptive features?

    2) The authors need to add more figures and tables to explain and justify the contents. 3) The approach is limited to the specific dataset.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors have conveyed clear, specific, and complete information about the data, code, models, computational methods, and analysis that support the contents and results presented in the paper.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    The authors have contributed a paper entitled “Unsupervised Learning for Feature Extraction and Temporal Alignment of 3D+t Point Clouds of Zebrafish Embryos”. The content of the paper is very interesting and well written. Here I would like to provide the following comments.

    1) It would be interesting to include a figure illustrating zebrafish embryos. 2) What is the basis for choosing FoldingNet to extract the point cloud features? What modifications were made to the network and the loss function? 3) How did the authors choose the synthetic variants for each embryo in the training sets? Some justification would be important to mention. 4) How was the labelling performed for the datasets, and how did the authors validate the results? 5) It would be helpful to describe the dataset in tables to make it more self-explanatory and understandable. 6) Is there any need to create training and testing datasets for validating the proposed work?

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The authors have written the manuscript well and have researched the related work thoroughly to support the content of the paper.

    The authors have adopted the method and amended the techniques as needed for this experimental setup, which makes their contribution significant.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    This paper describes a deep neural network (autoencoder) that can extract features from point clouds and that can align temporal 3D point clouds. The work is based on FoldingNet, proposing (1) a spherical surface as the folded structure and (2) a modified version of the Chamfer distance loss function. The authors conduct experiments on synthesized datasets to show the effect of the alignment.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    • The target problem of the paper (i.e., aligning temporal point clouds of zebrafish embryos) is well motivated.
    • The paper proposes modifications of FoldingNet to adapt it to point clouds of embryos. In particular, the paper shows that the modified Chamfer Distance is effective in training FoldingNet, yielding accurate reconstructions and alignment results.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    • It is important to have correspondence of landmarks when comparing the level of growth between different individuals. How is the correspondence considered in this work?
    • The whole framework is not fully understandable for the following reasons:
      o How was the spherical surface folding included in FoldingNet? What is the benefit of this folded structure?
      o Section 2.2 needs to be rewritten so that the motivation and the method are more clearly explained.
      o It would be good to provide an illustration of how the point clouds of embryos are aligned by the proposed methods.
    • It is not clear how the experimental datasets were created/collected. Specifically, how were the point clouds shifted (cos/sin-shifted)?

  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    It is difficult to reproduce for the following reasons:

    1. The model/framework is not clearly explained or illustrated
    2. The code is not published.
    3. The synthesized dataset is not clearly explained.
  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. It would be good to provide an overview diagram to illustrate the methods, including the autoencoder and the regression network. The inputs and outputs should be clearly marked in the diagram.
    2. Polish the text so that the methods can be better understood.
    3. It would be good to provide clear and insightful experimental analysis.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    3

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper shows little improvement over the existing research. The methods are not clearly explained. Also, the experiments can hardly be interpreted. A major revision of the paper is needed.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    4

  • [Post rebuttal] Please justify your decision

    After revisiting the paper and reading the rebuttal letter, the work is clearer to me. Still, some important details regarding the methods and experiments remain unclear, making it difficult to evaluate the work. Thus, I am inclined to maintain my initial decision (i.e., reject) for this paper. While the paper proposes several new ideas (modified FoldingNet and modified Chamfer Distance) and novel evaluation methods (simulating developmental stages with cos/sin shifting), it may need more pages to explain these methods to readers who are not familiar with zebrafish embryo development. From a high level, it is understandable that the paper produces good features via the modified FoldingNet given point clouds at various time frames. The features are then consumed by the regression network and postprocessing algorithms to produce temporally aligned indices. However, due to the complexity of this research, it would be helpful to readers to explicitly show (with figures) or elaborate in the text on some important details (e.g., the details in the Validation strategy section of the rebuttal letter).

    (1) FoldingNet
    a. How do the spherical surface unfolding and the modified Chamfer Distance play a role in training FoldingNet?
    b. How does the data augmentation based on 4 zebrafish embryos play a role in training FoldingNet?

    (2) Regression network
    a. Since this network is trained independently of FoldingNet, why do the authors use the variants of the unseen test embryo to train this network, as described in the rebuttal?
    b. How is the generated sequence compared with the reference time sequence to produce temporally aligned point clouds? It is still not quite clear to me how the upper/lower boundaries of the time frames (shown in Supp Figure 3) are generated.
    c. The evaluation relies on synthetic developmental time sequences (cos/sin shifting). Neither the text nor the rebuttal explains how the shifting is produced.



Review #3

  • Please describe the contribution of the paper

    The authors propose an unsupervised approach to extract features from point clouds and temporally align corresponding developmental stages. They modify the existing FoldingNet to suit the 3D structure of the zebrafish embryo, propose temporal alignment based on a regression method, and introduce a method for creating a validation set to evaluate the performance.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The modified FoldingNet and Chamfer distance could be widely used in other areas. The authors changed the folded structure to a spherical surface instead of a plane, which makes reconstruction much easier if the target structure resembles a hemisphere or an ellipsoid.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Data set is quite small and specific, and validation is weak and done without any comparison methods. No ablation studies.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Meet all criteria.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. How are the point clouds extracted? Does the dataset contain only the point clouds, or are they extracted from the 3D image stacks?

    2. The authors state that the dataset contains four wild-type embryos, and that four embryos were also used for training (see “For network training with the four wild-type embryos”).

    Were the training and test sets clearly separated? In k-fold cross-validation, all data are used for training.

    If the test embryo was used to train the autoencoder, the validation might not be well designed, even considering that it was tested by aligning to the shifted variants.

    3. Where does the randomness of the results come from? (See “To reduce the influence of randomness, we repeat …”.)

    4. One question is whether the modified PointNet and Chamfer distance played an important role in this temporal alignment of embryos. Did you compare against training the regression network with features extracted using the default PointNet?

    5. In the Impact of Spatial Transformations section, the authors show that rotation and translation introduce an increase in variance or reduce overall performance.

      • For rotation, the average performance increased, but the variance also increased.
      • For centering each point cloud at the origin, the average performance increased considerably.

    There was data augmentation by rotating the point clouds. I understood that the current network does not have translation invariance (i.e., if it had, shifting the point clouds by the same vector would not change the performance), and this translation invariance would be important for applying the method in real-world scenarios. This result raises the question of whether the model “overfits” to this dataset, which consists of only four embryos.

    Did you try data augmentation by shifting? Alternatively, training with centering as a default and aligning the data after centering at test time could be a way to increase the test performance (although there is no test data…).

    6. Overall, there is a lack of comparison; the paper shows only “custom” experiments without comparing to other methods. As the authors are proposing new components, at least an ablation study is needed. For example: (1) checking the significance of the modified PointNet and loss function by replacing them with the default PointNet and loss function; (2) removing the temporal information (“t”) from the point clouds and checking whether the performance degrades; (3) adding data augmentation by rotation and shifting and checking the performance. In fact, for (3), if the performance increased when adding shifting augmentation, this would become the main algorithm and removing it would be the ablation study.

    7. How is the temporal information used in the point clouds? By adding an additional dimension to the point clouds?

    Followings are minor comments:

    1. Not sure if the “Cos” example and the “Sin” example have significantly different meanings.
    2. It would be nice to check whether the modified FoldingNet works well on other widely used benchmarks for 3D point clouds. It is more natural to reconstruct a 3D object from a spherical surface than from a plane.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Overall good idea but validation seems weak even considering the difficulty of the validation. More detailed description and clarification would be needed.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    6

  • [Post rebuttal] Please justify your decision

    The authors resolved many of the concerns well. It would be ideal if they could modify or add the figures that other reviewers also asked for, which they could not do in the rebuttal due to the guidelines.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The authors introduce an auto-encoder architecture to learn features from 3D point clouds of zebrafish embryos and a regression approach that uses these features to predict developmental time points. The method is well motivated and performs well on the described datasets, and the automation it provides helps accelerate downstream analyses. While the method is built for a niche application, the reviewers have noted a lack of detail on the method. Also, since zebrafish data is not commonly encountered at MICCAI, it would help for the authors to describe the preprocessing further. Overall, the reviewers raised several concerns about the clarity of the paper that the authors should address. Lastly, the paper lacks a comparison to baselines, even if they might be trivial. If no baseline exists, the authors should explain why. In conclusion, the paper is very interesting but requires some clarification by way of a rebuttal and a revision.




Author Feedback

We thank all reviewers and area chairs for the very constructive feedback, and we are happy to clarify the following issues in a slightly revised version of the manuscript:

Input data details (Sec. 3.1): We directly operate on 3D point clouds that represent tracked centroids of fluorescently labelled cell nuclei obtained from 3D+t light-sheet microscopy images as described in (Kobitski et al., 2014). There is no preprocessing involved and temporal information is not added to the point clouds.

Motivation for the modified FoldingNet (Sec. 2.1): We learn a time-independent representation of separate time points of the 3D point clouds using the modified FoldingNet, which is a more intuitive approach for working with point clouds compared to voxelization-based methods. This step is unsupervised and using spherical templates instead of a planar template provides a better initialization for reconstructing sphere-like objects such as embryos from the learned representations. In preliminary experiments we found that our spherical representation performs better (see qualitative reconstruction results in Fig. 1, Suppl. Fig. 2). In the implementation, we use M evenly distributed 3D points on a spherical surface instead of a grid of M equidistant 2D points that are then concatenated with the learned representation and iteratively turned into a reconstructed 3D shape by MLPs using the modified Chamfer Distance loss (we refer to Yang et al., 2018 for details of the original FoldingNet).
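The spherical template described above can be illustrated with a minimal sketch. It is not the authors' implementation: the rebuttal does not say how the M points are distributed, so the Fibonacci lattice used here is an assumption, as are the function names and the values M = 2025 and a 512-dimensional codeword, which serve only as placeholders.

```python
import numpy as np

def fibonacci_sphere(m: int) -> np.ndarray:
    """Return m approximately evenly distributed points on the unit sphere.

    Assumption: a Fibonacci lattice stands in for the unspecified
    "evenly distributed" sampling mentioned in the rebuttal.
    """
    i = np.arange(m, dtype=np.float64)
    golden_angle = np.pi * (3.0 - np.sqrt(5.0))
    z = 1.0 - 2.0 * (i + 0.5) / m       # heights strictly inside (-1, 1)
    r = np.sqrt(1.0 - z * z)            # radius of the circle at height z
    theta = golden_angle * i            # azimuth advances by the golden angle
    return np.stack([r * np.cos(theta), r * np.sin(theta), z], axis=1)

def build_decoder_input(template: np.ndarray, codeword: np.ndarray) -> np.ndarray:
    """Concatenate every template point with the same learned codeword."""
    tiled = np.tile(codeword[None, :], (template.shape[0], 1))
    return np.concatenate([template, tiled], axis=1)

# Placeholder sizes (hypothetical): M template points and a codeword of length 512.
template = fibonacci_sphere(2025)
codeword = np.zeros(512)
decoder_input = build_decoder_input(template, codeword)
print(decoder_input.shape)
```

Compared with a planar grid of 2D points, each template row here carries three coordinates, so the per-point input to the folding MLPs grows by one dimension; the MLPs themselves are not sketched.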

Temporal alignment (Sec. 2.2): We use neither spatiotemporal landmarks nor manual labels to compare different individuals and instead exploit the time-independent feature representation learned by the modified FoldingNet to characterize the developmental stages of the embryos. The temporal alignment is learned by the regression network that intentionally overfits to a selected reference embryo and is then used to perform the temporal alignment of the validation data sets to the reference. We note that the focus is solely on matching corresponding developmental stages and not a spatial alignment.

Validation strategy (Sec. 3.2): Training of the unsupervised FoldingNet was performed in a 4-fold cross-validation setting, i.e., the respective test embryo of each fold was not (!) part of the training procedure of the auto-encoder. Randomness originates from training the FoldingNet on randomly selected subsets of 4096 points of randomly selected frames. The regression network was then trained solely on variants of the unseen (!) test embryos including a large degree of additional randomization for improved generalizability of the results (random point samples, point jitter, randomized development via sin, cos, Gaussian, and linear-based stretching/compression of the time axis by interpolating time points) and we report the average results of multiple runs for multiple constellations of training vs. test data. The quantitative evaluations shown in Fig. 3 and Table 1 indicate that a representation suitable for temporal alignment on an unseen embryo is obtained. This is also confirmed by visualizing the feature vectors using t-SNE in Fig. 4.
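As a rough illustration of the sin/cos-based stretching and compression of the time axis mentioned above, the following sketch warps a sequence of per-frame arrays by linear interpolation between neighbouring frames. The exact warp used by the authors is not given in the rebuttal, so the warp functions, the `amplitude` parameter, and all names below are assumptions.

```python
import numpy as np

def warped_frame_index(num_frames: int, amplitude: float = 0.5, kind: str = "sin") -> np.ndarray:
    """Map each output frame to a fractional index into the original sequence.

    Assumed warp on normalized time u in [0, 1]:
      sin: w(u) = u + amplitude * sin(2*pi*u) / (2*pi)
      cos: w(u) = u + amplitude * (1 - cos(2*pi*u)) / (2*pi)
    Both keep the endpoints fixed and are monotonic for |amplitude| < 1.
    """
    u = np.linspace(0.0, 1.0, num_frames)
    if kind == "sin":
        offset = np.sin(2.0 * np.pi * u)
    elif kind == "cos":
        offset = 1.0 - np.cos(2.0 * np.pi * u)
    else:
        raise ValueError("kind must be 'sin' or 'cos'")
    warped = u + amplitude * offset / (2.0 * np.pi)
    return np.clip(warped, 0.0, 1.0) * (num_frames - 1)

def warp_sequence(frames: np.ndarray, frac_index: np.ndarray) -> np.ndarray:
    """Linearly interpolate between neighbouring frames at fractional indices.

    frames has shape (T, ...). For point clouds of shape (T, N, 3) this only
    makes sense if row n refers to the same tracked object in every frame.
    """
    lo = np.floor(frac_index).astype(int)
    hi = np.minimum(lo + 1, len(frames) - 1)
    w = (frac_index - lo).reshape(-1, *([1] * (frames.ndim - 1)))
    return (1.0 - w) * frames[lo] + w * frames[hi]

# Toy usage: every "frame" stores its own index, so the output shows the warp directly.
frames = np.arange(10.0)[:, None]
index = warped_frame_index(len(frames), amplitude=0.5, kind="sin")
variant = warp_sequence(frames, index)
```

In such a setup the fractional indices double as ground truth: a perfect temporal alignment of the warped variant to the original sequence would recover `index`.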

Ablation studies (Sec. 3.3): Performance minimally increased when adding rotation and decreased (!) when adding both rotation and centering (in Table 1, lower values are better). This indicates that rotation augmentation can slightly compensate for alignment errors of the initial point clouds, whereas centering the point cloud may remove important cues about the developmental stage of the embryos, such as the percentage of the sphere that is covered (epiboly). The benefits of the modified Chamfer distance (MCD) over the classical one (CD) are shown in Suppl. Fig. 1, 4, 5, 6. We successfully applied the approach to mouse embryos to validate the general applicability of our method for other 3D specimens (data not shown due to limits).
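For readers unfamiliar with the classical Chamfer distance (CD) that serves as the baseline here, a minimal NumPy sketch follows. It assumes the formulation that takes the larger of the two directional mean nearest-neighbour distances; a sum of the two terms is another common variant. The modified version (MCD) is not specified in this discussion and is therefore not sketched. The function name and the `combine` parameter are illustrative only.

```python
import numpy as np

def chamfer_distance(a: np.ndarray, b: np.ndarray, combine: str = "max") -> float:
    """Classical Chamfer distance between point sets a (N, 3) and b (M, 3).

    Uses a dense N x M distance matrix, so it is only meant for small clouds.
    """
    diff = a[:, None, :] - b[None, :, :]        # (N, M, 3) pairwise differences
    dist = np.sqrt(np.sum(diff * diff, axis=-1))  # (N, M) Euclidean distances
    a_to_b = dist.min(axis=1).mean()            # each point of a to its nearest in b
    b_to_a = dist.min(axis=0).mean()            # each point of b to its nearest in a
    if combine == "max":
        return float(max(a_to_b, b_to_a))
    if combine == "sum":
        return float(a_to_b + b_to_a)
    raise ValueError("combine must be 'max' or 'sum'")

# Toy usage: the second cloud is the first one shifted by one unit along z.
a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
b = a + np.array([0.0, 0.0, 1.0])
print(chamfer_distance(a, b))
```

Because the loss only looks at nearest neighbours, it needs no point-to-point correspondences between the input and the reconstruction, which is why it suits unordered point clouds.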

As demanded by the MICCAI rebuttal guidelines we cannot add additional figures/experiments at this stage.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The reviewers have all upgraded their scores after the rebuttal, warranting an accept decision for this paper. In the final version, please update the figures as suggested by the reviewers.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper presents an unsupervised approach using an autoencoder to extract descriptive features from 3D point clouds of zebrafish embryos and subsequently use those features to temporally align corresponding developmental stages. While the problem studied in this paper is interesting (though it is not frequently seen in the MICCAI community), the description of the problem and the method is not clear. As listed by R#2 in the post-rebuttal comments, several key issues should be addressed. Considering that the studied problem is not well known to readers and more details should be provided, it would be better for the authors to expand the current paper and submit it to a journal that allows more space. The current form of the paper is not ready for publication.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Reviewers had several concerns and suggestions for improvement regarding the clarity and presentation of the paper. The method is interesting, and its performance is showcased on the described datasets. The rebuttal has clarified some aspects. However, it failed to address significant concerns, including the level of detail needed for a problem that is uncommon to most of the MICCAI audience, the lack of comparisons to baselines, and the lack of details on the evaluation and validation strategies. Overall, the paper has potential but requires significant revisions to address the clarity, presentation, and comparison concerns raised. Therefore, I am inclined to reject the paper at this stage, as it would need another round of review before being accepted at MICCAI.


