Authors

Netanell Avisdris, Leo Joskowicz, Brian Dromey, Anna L. David, Donald M. Peebles, Danail Stoyanov, Dafna Ben Bashat, Sophia Bano

Abstract

Fetal growth assessment from ultrasound is based on a few biometric measurements that are performed manually and assessed relative to the expected gestational age. Reliable biometry estimation depends on the precise detection of landmarks in standard ultrasound planes. Manual annotation can be a time-consuming and operator-dependent task, and may result in high measurement variability. Existing methods for automatic fetal biometry rely on initial automatic fetal structure segmentation followed by geometric landmark detection. However, segmentation annotations are time-consuming and may be inaccurate, and landmark detection requires developing measurement-specific geometric methods. This paper describes BiometryNet, an end-to-end landmark regression framework for fetal biometry estimation that overcomes these limitations. It includes a novel Dynamic Orientation Determination (DOD) method for enforcing measurement-specific orientation consistency during network training. DOD reduces variability in network training and increases landmark localization accuracy, and thus yields accurate and robust biometric measurements. To validate our method, we assembled a dataset of 3,398 ultrasound images from 1,829 subjects acquired in three clinical sites with seven different ultrasound machines. Comparison and cross-validation of three different biometric measurements on two independent datasets show that BiometryNet is robust and yields accurate measurements whose errors are lower than the clinically permissible errors, outperforming other existing automated biometry estimation methods. Code is available at https://github.com/netanellavisdris/fetalbiometry.

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16440-8_27

SharedIt: https://rdcu.be/cVRvS

Link to the code repository

https://github.com/netanellavisdris/fetalbiometry

Link to the dataset(s)

https://doi.org/10.5281/zenodo.3904280

https://zenodo.org/record/1327317

Reviews

Review #1

Please describe the contribution of the paper

The authors propose an end-to-end network, BiometryNet, for automatic fetal biometry estimation on ultrasound images. It uses simple landmark annotations instead of complex mask annotations. Moreover, a dynamic orientation determination mechanism is proposed to reduce variability and improve landmark localization. The results on the two independent datasets demonstrate the effectiveness of the proposed method.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

(1) The authors propose an end-to-end landmark regression framework, BiometryNet, for fetal biometry estimation, which uses only simple landmark annotations for training. This reduces manual labelling time and the high inter- and intra-operator variability. (2) The Dynamic Orientation Determination mechanism is further introduced to determine measurement-wise orientation and provide a consistent landmark class for the various measurements.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

(1) I suppose the third contribution should not be counted among the main contributions if the annotated dataset is not public. (2) The authors did not discuss the influence of simple landmark annotations versus fine annotations on accuracy. In addition, obstetricians differ in how they label, and the annotation protocol they followed should be given in the paper. (3) For the different detection tasks (BPD and OFD), do you train a separate network for each task or a single network for multi-task detection? If one network is trained for each biometric measurement, the overall process is complex. (4) A comparison with advanced methods for fetal biometry estimation is missing.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The Landmark Regression Network part seems reproducible, but the Dynamic Orientation Determination part is challenging to reproduce. Ideally, the source code should be released.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

(1) In Section 4, the evidence for the robustness of the proposed approach is not detailed enough. Results for training on the FC dataset and testing on the HC18 dataset should be added. (2) In Section 3.1, the details of landmark annotation should be given. Moreover, the authors extract the BPD and OFD biometry from the major and minor axes of a least-squares ellipse fit. Did this approach affect the accuracy of the results? (3) A comparison with advanced methods for fetal biometry estimation should be added. (4) The influence of simple landmark annotations versus fine annotations on accuracy should be illustrated. Moreover, the annotation protocol followed by the obstetricians should be given in the paper. (5) The femur length prediction task is not clear in the paper, and the authors should explain why femur length images are input to the network.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Due to less convincing experimental results, the reviewer thinks this paper cannot meet the standard of MICCAI and rates this paper as weak accept.
Number of papers in your stack

3
What is the ranking of this paper in your review stack?

1
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #3

Please describe the contribution of the paper

This paper reports a contribution to a previously reported method for automatic detection of landmarks in medical images. The contribution allows the method to adapt to rotating landmark pairs and was evaluated on 3 biometry tasks (biparietal diameter, occipitofrontal diameter and femur length) in two large data sets of fetal ultrasound images, with errors within clinically acceptable values.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

In my opinion the original contribution is significant and the extensive validation in fetal ultrasound biometry tasks provides enough evidence to support the possible clinical application of the methods.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

Previous works on automatic landmark detection in medical images, including ultrasound, are not mentioned in the introduction.
Please rate the clarity and organization of this paper

Excellent
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The work is fully reproducible since public data sets were used
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

I just would like to suggest to the authors to revise the introduction and include previous works on automatic landmark detection in medical images, such as:

Amir Alansary et al., (2019), “Evaluating reinforcement learning agents for anatomical landmark detection”, Med. Im. Ana., vol53, pp.156-164.

This work reported a 3D landmark detection method based on deep Q-network (DQN) architectures, which was evaluated on the detection of multiple landmarks in three different medical imaging datasets: fetal head ultrasound (US), adult brain and cardiac magnetic resonance imaging (MRI).
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

7
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

In my opinion the validation is remarkable and allows for an objective assessment of the clinical viability of the methods, which in turn seems to be very high.
Number of papers in your stack

3
What is the ranking of this paper in your review stack?

1
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #4

Please describe the contribution of the paper

This paper introduces a landmark regression network for directly computing biometrics in US. Specifically, it modifies the original landmark class reassignment with a novel dynamic orientation determination to generalize it to multiple scenarios. This work validated the proposed method on a large dataset, and the results are sound and promising.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The idea of this paper is simple but effective.
- The results are strong: the proposed method outperforms other methods, including in the ablation studies.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The motivation of the landmark class reassignment should be detailed.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

It is highly recommended for the authors to release their dataset and code for better reproducibility.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
- The authors should detail the motivation of the landmark class reassignment module as it is the main novelty of the proposed method. It is hard to follow the necessity of this module without reading the reference [3] in the original paper.
- Performing experiments to demonstrate the superiority of the method cannot be counted as a contribution; please remove it.
- The metric curve in Fig.4 is hard to recognize. Please put the zoom-in patch of the convergence stage on it for better visualization
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

6
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Overall, this paper is good. It proposes a novel dynamic orientation determination to generalize the landmark class reassignment to multiple scenarios. It evaluated the proposed method on a large dataset. The results seem sound and promising.
Number of papers in your stack

6
What is the ranking of this paper in your review stack?

2
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

The authors present a landmark regression network for computing biometrics in US. They validated the method on a large dataset, and the results are promising.
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

NR

Author Feedback

We thank the reviewers for their constructive feedback. Below we reply to the specific comments:

Code and data availability (R1,R4): We have released the code on GitHub and its link will be added in the camera-ready (CR). We plan to release the full annotated dataset when we complete the annotation of abdominal plane biometry.

Contributions (R1, R4): Based on the reviewers' suggestions, we have updated our CR to highlight only the key contributions.

Comparison to other methods (R1): We compared our method with a previous work (Sec. 4, study 2). In addition, our method achieved 0.77mm (MAD) and 0.04mm (MD <-> bias) on a larger test set. These errors are lower than those of [1] (the best BPD performer in the 2022 review [20], which did not include [5]), which achieved a mean absolute difference (MAD) of 2.33 ± 2.21mm and a mean difference (MD) of 1.49 ± 2.85mm on a 355 BPD dataset. Similar results are obtained for the other measurements. Because of lack of space, and because previous works use different metrics and datasets, which makes comparisons hard to interpret, we compared only to a recent method [5] (from 2021) rather than to all existing methods. We will add the above details to the discussion section in the CR.
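
The two metrics quoted above can be illustrated with a minimal sketch. It assumes that MAD is the mean of the absolute per-image differences between predicted and reference measurements and that MD is the mean signed difference (the bias). The function names and the sample values are hypothetical and are not taken from the paper or its code repository.

```python
from statistics import mean


def mean_absolute_difference(pred_mm, ref_mm):
    """Mean of |predicted - reference| over all images, in mm."""
    return mean(abs(p - r) for p, r in zip(pred_mm, ref_mm))


def mean_difference(pred_mm, ref_mm):
    """Mean signed difference (bias): positive means over-estimation."""
    return mean(p - r for p, r in zip(pred_mm, ref_mm))


# Hypothetical BPD measurements in mm, for illustration only.
predicted = [50.0, 52.0, 48.0]
reference = [51.0, 51.0, 48.0]

mad = mean_absolute_difference(predicted, reference)
md = mean_difference(predicted, reference)
print(f"MAD = {mad:.2f} mm, MD = {md:.2f} mm")
```

A small MD together with a larger MAD, as in the figures quoted in the rebuttal, indicates that over- and under-estimations largely cancel out on average.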

DOD Motivation (R4): Due to lack of space, the DOD motivation part was short. We will expand it and explain the motivation in the manuscript: “As fetal structures may appear in any orientation (Fig. 1), any solution should handle the orientation variability. A common way to handle such variability is augmentation. However, rotation augmentation may cause landmark class labeling (e.g., left/right landmarks) to be inconsistent with image coordinates, i.e., the left and right points may be switched, which will hamper the network training (Fig 1 Supp.). This inconsistency can be corrected by performing landmark class reassignment (LCR) [3], which preserves horizontal (left/right) landmark class consistency after all augmentations and has been shown to improve biometry estimation accuracy. However, different biometric measurements may have different spatial orientation, e.g., OFD is mostly vertical and BPD is mostly horizontal… ”
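
The class-switching problem described in this reply can be illustrated with a minimal sketch. It assumes two landmarks per measurement, a rotation about a fixed point, and a reassignment rule that orders the pair along one image axis. The `axis` argument is only a stand-in for the per-measurement orientation; how DOD actually selects that orientation is not described here, and all names below are hypothetical rather than taken from the authors' code.

```python
import math


def rotate_point(pt, angle_deg, center):
    """Rotate a 2D point (x, y) about `center` by `angle_deg` degrees."""
    a = math.radians(angle_deg)
    dx, dy = pt[0] - center[0], pt[1] - center[1]
    return (center[0] + dx * math.cos(a) - dy * math.sin(a),
            center[1] + dx * math.sin(a) + dy * math.cos(a))


def reassign_classes(p1, p2, axis="horizontal"):
    """Order a landmark pair so the first has the smaller coordinate.

    axis="horizontal" compares x (left/right classes);
    axis="vertical" compares y (top/bottom classes).
    """
    i = 0 if axis == "horizontal" else 1
    return (p1, p2) if p1[i] <= p2[i] else (p2, p1)


# Hypothetical left/right landmarks on a 100 x 100 image.
left, right = (10.0, 50.0), (90.0, 50.0)
center = (50.0, 50.0)

# A 180-degree rotation swaps the two points in image coordinates,
# so the original "left" label would now sit on the right-hand side.
rot_left = rotate_point(left, 180, center)
rot_right = rotate_point(right, 180, center)

# Reassignment restores the convention: first landmark = leftmost point.
new_left, new_right = reassign_classes(rot_left, rot_right, axis="horizontal")
print(new_left, new_right)
```

With a fixed horizontal rule, a mostly vertical measurement would have two landmarks with nearly equal x coordinates, so small rotations could flip their order; choosing the comparison axis per measurement avoids that ambiguity in this simplified setting.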

Introduction for landmark-based methods (R3): Based on the reviewer's suggestion, we will add [Amir Alansary et al., (2019), Zhang, Molin, et al. (2020)] in the CR; these works used reinforcement learning (RL) to perform automatic biometry or fetal pose estimation. However, training RL methods may be time-consuming, and they may not be as robust and accurate as direct estimation.

Influence of automatic extraction of landmarks from HC18 (R1): Automatic extraction of BPD/OFD as ground truth may affect performance, as it may differ from the BPD/OFD determined manually by an expert sonographer. The expert sonographer validated all automatically extracted landmarks. This method may influence the variance of the results (Table 1), as the variance of the network trained on HC18 is larger (on both datasets) than that of the FP-trained network. This, combined with the consistency of the variance on the test sets with respect to the training dataset, may indicate that automatic extraction yields less consistent results. However, both yield results that are better than the interobserver variability, as indicated in the manuscript.

Influence of annotations (R1): Due to lack of space, results for individual landmarks are not presented. The most relevant measure is the measurement accuracy, which is reported in the paper. Similarly, the observer variability of the measurements (and not of the individual landmarks) is reported.

Network training scheme (R1): Currently, we train one network for each task (as noted in Sec 2.1). In our experiments, without DOD, multi-task training harms the landmark estimation performance. Training each network (one per task) takes ~2 hours, so the burden of training multiple networks is low. As future work, we plan to adapt DOD to work in a multi-task scenario.

Figure 4 Clarity (R4): We have revised this figure as suggested; the revised version will be included in the CR.

BiometryNet: Landmark-based Fetal Biometry Estimation from Standard Ultrasound Planes