
Authors

Sidaty El hadramy, Juan Verde, Karl-Philippe Beaudet, Nicolas Padoy, Stéphane Cotin

Abstract

This paper proposes a method for trackerless ultrasound volume reconstruction in the context of minimally invasive surgery. It is based on a Siamese architecture, including a recurrent neural network that leverages the ultrasound image features and the optical flow to estimate the relative position of frames. Our method does not use any additional sensor and was evaluated on ex vivo porcine data. It achieves translation and orientation errors of 0.449 ± 0.189 mm and 1.3 ± 1.5 degrees respectively for the relative pose estimation. In addition, despite the predominantly non-linear motion in our context, our method achieves a good reconstruction with final and average drift rates of 23.11% and 28.71% respectively. To the best of our knowledge, this is the first work to address volume reconstruction in the context of intravascular ultrasound.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43999-5_29

SharedIt: https://rdcu.be/dnwwI

Link to the code repository

https://github.com/Sidaty1/IVUS_Trakerless_Volume_Reconstruction

Link to the dataset(s)

N/A


Reviews

Review #2

  • Please describe the contribution of the paper

    This paper proposed a trackerless ultrasound volume reconstruction network, especially for minimally invasive surgery. The network is based on a Siamese architecture with seq2vec RNN to model the US frames sequence.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The trackingless ultrasound volume reconstruction is of great interest to the image-guided intervention research community. The authors conducted the research mainly to contribute to minimally invasive surgery. This could draw more attention to deep learning technique’s potential in US volume reconstruction.

    2. Compared to previous work in US reconstruction, this work has three highlights (1) Siamese network structure, (2) ultrasound sequence modeling (instead of a fixed number of frames), (3) auxiliary information from optical flow. Including more frames for US probe trajectory estimation is a reasonable design. And this work combines the benefits of several previous research works into one with good theory support.

    3. A dataset with 137344 clips is a well-sized dataset for US volume reconstruction study. The authors also show the illustration of optical flow between US frames, which is the first one in this research field.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The rationale for setting the window size to “2k+3” is unclear to me. The authors propose to estimate the transformation (optical flow) between (I_0, I_(k+2)), but the gap seems to be too large. My assumption is that the transformation between neighboring frames is too subtle, and that using a large gap ensures a better optical flow estimation. The authors could clarify this setup.

    2. The design of “Siamese Network” needs more explanation. From my perspective, the network structure in Fig. 2 can also work without the Siamese design, i.e. feeding k+2 frames into the network for optical flow computation and transformation estimation. If the authors want to address the importance of Siamese design, then model ablation study is required.

    3. There are too few results shown in the paper. The authors only show the FDR and ADR metrics. Is there any qualitative evaluation of the reconstructed volumes? For example, you could compute the volume-wise similarity between the predicted and ground-truth volume reconstructions.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors claim “Code will be publicly released.” Considering the datasets are primarily private across research groups, I think the code should contribute to the paper reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Minor concerns:

    1. Some details in paper format should be further improved. For example, there should be a space between the word and reference index (overall [20] etal.)
    2. For the abbreviation of IOUS, I assume it should be intraoperative ultrasound, where the word “ultrasound” is missing.
    3. For the figures (eg. Fig. 2), please consider using PDF format for better resolution.
    4. I would suggest the authors to further polish the paper writing, with careful grammar checks.
    5. Please stick to one terminology, and use either tracker-less or trackerless throughout the paper.
    6. The introduction is too long. The authors could consider shrinking the background a little more and spending more space on the method, and especially on results analysis.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper discusses an interesting topic: US volume reconstruction for minimally invasive surgery, which has not been explored before. The methodology itself is well-founded, introducing sequence modeling for sequence registration. I think it can draw more attention to DL’s potential in trackingless US reconstruction.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    To benefit the liver laparoscopic resection surgery, the authors proposed an interesting trackerless 3D ultrasound reconstruction approach based on intravascular ultrasound images. By training and testing the proposed approach on data acquired from an ex-vivo swine liver, preliminary results showed promising translational error, rotational error, final drift rate and average drift rate.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    • This paper is written and organized very well. It is easy to follow, especially the introduction part.
    • An interesting trackerless deep learning approach that achieves relative pose estimation of ultrasound frames based on intravascular ultrasound images, which is of high clinical significance.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    • The training and validation dataset is from the same ex-vivo swine liver. With so little data, the performance cannot be demonstrated convincingly. In addition, the authors did not discuss whether ultrasound images acquired at different movement speeds would cause performance bias.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    This work has good reproducibility. To be reproducible, the code should be publicly available. And the dataset is important as well.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    As mentioned in the limitations part, please demonstrate or discuss whether ultrasound images acquired at different movement speeds would cause performance bias.

    The training and validation dataset is from the same ex-vivo swine liver. Please clarify whether the current results could well represent the algorithm's performance when applied to a large and diverse dataset.

    Table 1 shows comparison results with the MoNet and CNN approaches. The results of MoNet are directly copied from Luo et al.’s work, and Luo et al.’s model is trained on arm scans. Please clarify why this result is theoretically comparable to the result of the authors’ approach.

    Please elaborate on how the orientation error is calculated: is it sqrt(rot_x^2 + rot_y^2 + rot_z^2)?

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper is written and organized very well. The approach is well described and clinically useful.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #1

  • Please describe the contribution of the paper

    This paper presents a deep-learning approach to predict the relative pose of an ultrasound image for the purpose of 3D volume reconstruction without the need for extrinsic pose trackers. For each image, prominent spatial locations are determined using optical flow and fed into a Siamese network which is trained to predict the 6DoF pose of the US image relative to its immediate neighbor. Training, validation and testing are performed using data collected by scanning a porcine liver ex-vivo with an IVUS probe, and the results are reported. The results suggest that the network's performance is comparable to that obtained by methods that employ extrinsic tracking.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The proposed Siamese architecture to predict the IVUS image pose for the purpose of 3D volume reconstruction is novel. Based on the limited validation study, it appears that the method achieves clinically meaningful accuracy. Its performance seems to be comparable to methods that employ extrinsic tracking.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The validation is limited to data collected using an ex-vivo porcine liver. The performance of the method is compared to the state of the art not under the same experimental conditions, but based on numbers reported in papers. However, given the space restrictions, I understand that extensive validation should better be saved for future work.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors mention that the source code with parameters will be released upon the acceptance of the paper. In the text, the methods are described well with parameter values used for training. With access to the source code and parameter sets along with the clear description of the methods in the paper, I believe, that an interested reader will be able to reproduce the result without much effort.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    [1] The paper reads well. It starts with a good background on the clinical problem and a brief look at the state-of-the-art in the literature. Limitations in the methods described in the literature are identified. The computational framework is described well. Details of data collection, training and validation study are given. The illustration of the distribution of relative pose in data should be appreciated as it provides insight into the dataset. The results are discussed briefly as well.

    [2] In the abstract, the authors’ statement ‘To the best of our knowledge, this is the first work to address ultrasound volume reconstruction in the context of minimally invasive surgery.’ seems misleading when several papers can be found on the topic in the literature. Maybe the authors meant to say that this is the first attempt to do trackerless volume reconstruction with IVUS?

    [3] With help from sparse optical flow, prominent features are identified, which are then fed into the Siamese architecture for pose prediction. One could naively attempt to estimate the relative image pose by registering nearby frames using these prominent features. How would such a naïve approach perform compared to the machine-learning-based approach? Such a comparison would have highlighted the need for deep-learning methods.

    [4] The method seems to perform well on IVUS data that exhibits primarily rotational changes between frames (see Fig. 5). Does it work equally well in situations where translational changes are dominant? How does it work when both translational and rotational changes are observed in the data? In the results section, the proposed method is compared to MoNet [15] and the CNN-based method reported in [9], merely based on statistics reported in the respective papers. Without evaluating the methods under the same experimental settings, such comparisons do not carry significant scientific meaning. However, I understand that such extensive validation should better be saved for future work, especially when restrictions on the number of pages are in place.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed Siamese architecture-based method to predict the US image probe is novel. The authors demonstrate the performance of the method using data collected with an ex-vivo porcine liver. Although the validation study is not extensive, when coupled with the novelty of the method, I believe that the paper has adequate content suitable for a MICCAI audience.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper proposes a method for 3D volume reconstruction from trackerless intraoperative ultrasound. From a sequence of consecutive images, the method starts by sampling 2 pairs of images and tracking point features along the sequence. The results are fed into a Siamese network that predicts relative poses between the pairs of images. The results can then be assembled into probe motion trajectories and reconstruction results.

    Strengths:

    • There is a consensus among the reviewers that the method contributions in this paper are relevant and interesting

    Weaknesses:

    • Some reviewers state that the method validation is limited both in terms of method comparisons, and in terms of only testing it on a single ex-vivo dataset.

    Due to reviewer consensus I would recommend acceptance here, but I still recommend that the authors address the minor fixes suggested by the reviewers and comment on the limitations of the current experiments.
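
The assembly step mentioned in the meta-review summary, where predicted relative poses are chained into a probe trajectory, can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes each relative pose is a 4x4 homogeneous matrix mapping frame i+1 coordinates into frame i, and the names `rot_z` and `accumulate_poses` are hypothetical.

```python
import numpy as np


def rot_z(theta):
    """4x4 homogeneous rotation about the z axis by theta radians (toy helper)."""
    c, s = np.cos(theta), np.sin(theta)
    T = np.eye(4)
    T[:2, :2] = [[c, -s], [s, c]]
    return T


def accumulate_poses(relative_transforms):
    """Chain frame-to-frame transforms into poses expressed in frame 0.

    Assumption: relative_transforms[i] maps frame i+1 coordinates into frame i.
    Returns a list with one 4x4 pose per frame, starting with the identity.
    """
    poses = [np.eye(4)]
    for T_rel in relative_transforms:
        # Pose of frame i+1 = pose of frame i composed with the relative step.
        poses.append(poses[-1] @ T_rel)
    return poses


# Toy sweep: 90 steps, each a 1 degree rotation about the probe axis (z)
# plus a 0.4 mm advance along that axis. The numbers are illustrative only.
step = rot_z(np.deg2rad(1.0))
step[:3, 3] = [0.0, 0.0, 0.4]
poses = accumulate_poses([step] * 90)
```

Because every pose is a product of all earlier predictions, per-step errors compound along the sweep, which is one reason drift metrics are reported alongside per-pair errors.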




Author Feedback

Dear Area Chair, dear Reviewers,

We would like to thank you for the constructive comments. We address the main points below.

  • R2 commented on the input window size and the role of the Siamese architecture. A window of size 2k + 3 allows two equal sequences of size k + 2 each, sharing a common frame. The choice of the hyperparameter k is very important for the optical flow estimation and should be made based on the data acquisition speed. For instance, if data is acquired at high speed, k should be chosen very small to ensure that there are common features between the source and target ultrasound (US) frames. As mentioned in the results section, we have experimented with different values of k for our specific dataset. Regarding the architecture, the Siamese design permits prediction on both sequences of size k + 2 at the same time; this allows penalizing the Seq2Vec network with an accumulation loss over the whole window of size 2k + 3. Our experiments show that the accumulation loss contributes to reducing the drift errors.

  • R1 and R3 asked about the scientific meaning of our comparison with MoNet [15] and CNN [9]. We agree that the methods are not evaluated under the same experimental settings. In reality, both methods from the literature use IMU sensors as additional data. Such hardware is very difficult to integrate into the IVUS probe because of its size and calibration complexity. In addition, these methods deal mainly with linear probe motion (translation), while in our case non-linear motion (rotation) is predominant. Therefore, we argue that despite the complexity of our problem and the lack of any additional hardware, our method still achieves comparable statistics in terms of drift errors.

  • R2 and R3 commented on the metrics. We build on prior work and use the FDR and ADR metrics to quantify the quality of the reconstruction. The orientation and translation errors are the mean square errors between the predicted and ground-truth relative transformations.

  • Regarding R1 and R3’s comments on the dataset, as can be seen in Fig. 5, our dataset mainly contains rotations about the main axis of the IVUS probe. Our method was specifically conceived to deal with such scenarios. In future work, we will extend the validation by testing our method in different scenarios.
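
The window layout described in the reply to R2 (2k + 3 frames split into two sequences of k + 2 frames that share one frame) can be illustrated with a short sketch. The 0-based indexing and the function name `split_window` are assumptions made for this illustration; the paper's own indexing may differ.

```python
def split_window(frames, k):
    """Split a window of 2k + 3 frames into two overlapping halves.

    Each half has k + 2 frames, and the last frame of the first half
    is also the first frame of the second half (the shared frame).
    """
    expected = 2 * k + 3
    if len(frames) != expected:
        raise ValueError(f"expected {expected} frames, got {len(frames)}")
    first = frames[: k + 2]   # frames 0 .. k+1
    second = frames[k + 1:]   # frames k+1 .. 2k+2
    return first, second


# With k = 3 the window holds 9 frames; frame 4 is shared by both halves.
first, second = split_window(list(range(9)), k=3)
```

A smaller k shortens each half, which matches the rebuttal's advice to reduce k when the probe moves quickly so that the end frames of a half still share visible features.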

All the points above will be clarified in the final version of the paper.
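
The rebuttal names the FDR and ADR metrics without defining them. The sketch below uses definitions that are common in the trackerless reconstruction literature and are assumed here rather than taken from the paper: final drift rate as the final position error divided by the ground-truth path length, and average drift rate as the mean over frames of the position error divided by the path length travelled up to that frame. The function name `drift_rates` is hypothetical, and the code assumes the ground-truth probe moves between the first two frames so that no path length is zero.

```python
import numpy as np


def drift_rates(pred_positions, gt_positions):
    """Return (final drift rate, average drift rate) for one sweep.

    Both inputs are (N, 3) arrays of frame-center positions, with frame 0
    shared by prediction and ground truth. Assumed definitions, see lead-in.
    """
    pred = np.asarray(pred_positions, dtype=float)
    gt = np.asarray(gt_positions, dtype=float)

    # Per-frame drift: distance between predicted and true frame centers.
    drift = np.linalg.norm(pred - gt, axis=1)

    # Ground-truth path length travelled up to each frame.
    segment = np.linalg.norm(np.diff(gt, axis=0), axis=1)
    path = np.concatenate([[0.0], np.cumsum(segment)])

    fdr = drift[-1] / path[-1]
    adr = float(np.mean(drift[1:] / path[1:]))  # skip frame 0 (zero path)
    return float(fdr), adr


# Toy sweep along x with a growing lateral error (illustrative numbers).
gt = [[0, 0, 0], [1, 0, 0], [2, 0, 0]]
pred = [[0, 0, 0], [1, 0.1, 0], [2, 0.4, 0]]
fdr, adr = drift_rates(pred, gt)
```

Both rates are ratios, so they are usually quoted as percentages, as in the abstract.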


