
Authors

Maximilian Rohleder, Charlotte Pradel, Fabian Wagner, Mareike Thies, Noah Maul, Felix Denzinger, Andreas Maier, Bjoern Kreher

Abstract

Epipolar geometry is exploited in several applications in the field of Cone-Beam Computed Tomography (CBCT). By leveraging consistency conditions between multiple views of the same scene, motion artifacts can be minimized, the effects of beam hardening can be reduced, and segmentation masks can be refined. In this work, we explore the idea of enabling deep learning models to access the known geometrical relations between views. This implicit 3D information can potentially enhance various projection domain algorithms such as segmentation, detection, or inpainting. We introduce a differentiable feature translation operator, which uses available projection matrices to calculate and integrate over the epipolar line in a second view. As an example application, we evaluate the effects of the operator on the task of projection domain metal segmentation. By re-sampling a stack of projections into orthogonal view pairs, we segment each projection image jointly with a second view acquired roughly 90° apart. The comparison with an equivalent single-view segmentation model reveals an improved segmentation performance of 0.95 over 0.91 measured by the Dice coefficient. By providing an implementation of this operator as an open-access differentiable layer, we seek to enable future research.
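
The abstract's step of calculating the epipolar line from the available projection matrices can be illustrated with the textbook fundamental-matrix construction. The following is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function names and the two-view geometry (`K`, `R2`, `t`) are hypothetical, and the 3x4 projection matrices are assumed to follow the usual pinhole convention.

```python
import numpy as np

def camera_center(P):
    """Homogeneous X-ray source position: the right null vector of the 3x4 matrix P."""
    _, _, vt = np.linalg.svd(P)
    return vt[-1]

def fundamental_matrix(P1, P2):
    """3x3 matrix F with u2^T F u1 = 0 for projections u1, u2 of the same 3D point."""
    e2 = P2 @ camera_center(P1)          # epipole: source of view 1 seen in view 2
    e2_cross = np.array([[0.0, -e2[2], e2[1]],
                         [e2[2], 0.0, -e2[0]],
                         [-e2[1], e2[0], 0.0]])
    return e2_cross @ P2 @ np.linalg.pinv(P1)

def epipolar_line(F, u1):
    """Line (a, b, c) in view 2, scaled so that |a*x + b*y + c| is a distance in pixels."""
    line = F @ np.array([u1[0], u1[1], 1.0])
    return line / np.hypot(line[0], line[1])

def project(P, X):
    """Project a 3D point X to pixel coordinates with the projection matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Hypothetical geometry (an assumption): two views 90 degrees apart around the y-axis.
K = np.array([[1000.0, 0.0, 256.0], [0.0, 1000.0, 256.0], [0.0, 0.0, 1.0]])
R2 = np.array([[0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [-1.0, 0.0, 0.0]])
t = np.array([[0.0], [0.0], [600.0]])
P1 = K @ np.hstack([np.eye(3), t])
P2 = K @ np.hstack([R2, t])

X = np.array([10.0, -20.0, 30.0])
F = fundamental_matrix(P1, P2)
line = epipolar_line(F, project(P1, X))
u2 = project(P2, X)
distance = abs(line @ np.append(u2, 1.0))   # point-to-line distance in view 2
```

In this sketch the projection of `X` into the second view lies on the epipolar line of its projection in the first view, up to numerical precision. That consistency relation is the one the abstract refers to.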

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43898-1_6

SharedIt: https://rdcu.be/dnwAD

Link to the code repository

https://github.com/maxrohleder/FUME

Link to the dataset(s)

N/A


Reviews

Review #6

  • Please describe the contribution of the paper

    This is a nice topic of research about combining information from two X-ray projections with epipolar geometry properties in order to consider redundant information in CNN models, which are tested on a segmentation task in the proposed work.

    The requirement is to have at least two calibrated views (i.e., the projection matrices are known). A per-view U-Net is then trained with weight sharing through the epipolar operator, which considers the signal integral in the opposite view.

    The evaluation showed an improvement in the results (Dice) brought by the epipolar operator compared to (1) separate single-view segmentation or (2) a joint PA and LAT U-Net.
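
For reference, the Dice score used as the evaluation metric can be computed as in the minimal sketch below. The function name and the handling of two empty masks are assumptions made for illustration; the paper's own evaluation code may differ.

```python
import numpy as np

def dice_coefficient(pred, target):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0   # assumption: two empty masks count as perfect agreement
    return 2.0 * np.logical_and(pred, target).sum() / denom

# Toy masks: 3 foreground pixels each, 2 of them shared.
pred = np.array([[1, 1, 0], [0, 1, 0]])
target = np.array([[1, 0, 0], [0, 1, 1]])
score = dice_coefficient(pred, target)   # 2 * 2 / (3 + 3)
```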

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    • Originality of the proposed work
    • Consideration of the projection matrices (as opposed to numerous works that assume parallel projections, which cannot easily be transferred to real medical imaging systems)
    • The cited references constitute a good background on epipolar geometry
    • Well-written paper

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    I believe the main weaknesses concern the evaluation:

    Three configurations are compared: Single View (U-Net), Dual View (U-Net where PA and LAT images are concatenated along channels), and Dual View + Epipolar View Translation Operator (one U-Net per view + weight sharing through the epipolar operator).

    The Dual View configuration does not resemble the Dual View + Epipolar View Translation Operator architecture. I strongly suggest also comparing against a Dual View baseline (one U-Net per view + weight sharing) to better isolate the improvement brought by the epipolar operator, because in its present form the evaluation does not allow a firm conclusion about the performance.

    In Table 1, it is not clear how the segmentation results for the PA and LAT views are pooled. Please provide the results for Single View PA, Single View LAT, Dual View PA, Dual View LAT, etc. Perhaps two tables (one for PA and one for LAT) could be presented to separate the results for each view.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Some details are missing (e.g., dose level variation in data augmentation, step size of the operator integration). However, an implementation will be provided on GitHub.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Is it possible to embed more than two views? (I believe so, but it would be nice if the authors elaborated on that point for future work.)

    Is it possible to adapt the method to fan-beam projections (instead of cone-beam projections)?

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    An important research topic with merit: how to combine information from multiple views. Experiments and results should be improved before acceptance.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #4

  • Please describe the contribution of the paper

    The paper, “Enabling Geometry Aware Learning Through Differentiable Epipolar View Translation”, explores the idea of enabling deep learning models to access the known geometrical relations between views.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The authors introduce a new operator and a Dual View U-Net with EVT Operator, a novel formulation for jointly segmenting two projection images. It comprises two U-Net backbones with shared weights, which process the two given views.
    2. This operator provides a neural network with spatially registered feature information from a second view of known geometry, which is a novel and interesting approach when combined with epipolar geometry in Cone-Beam Computed Tomography imaging.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Figure 1 shows a standard depiction of epipolar geometry in cone-beam CT and is therefore not solely the authors' work. The authors should cite its source, as the same structure has been used in other research, albeit with a different data set: Luo, Shuang & Luo, Shouhua (2018). An Epipolar Based Algorithm for Respiratory Signal Extraction of Small Animal CT. Sensing and Imaging, 19. DOI: 10.1007/s11220-018-0187-x.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Most items on the reproducibility checklist were met.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. Very comprehensive work is presented in the Methods, Experiments, and Results sections, where the proposed operator is derived through formulae. It is informative and relevant to the work done.
    2. It would be better if the authors discussed the results in more detail in the Results section, rather than in the long and comprehensive Discussion and Conclusion section. Try to write a short and concise conclusion that includes future work.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    1. The authors managed to arrange the theoretical part (the derivation of the formulae) and the practical part of the research in a way that makes the paper readable in layman's terms.
    2. The Epipolar View Translation Operator is a novel approach built on epipolar geometry, and the quantitative result of 95.33% on average is a good indication that the model can be deployed.
  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The paper proposes a dual-view segmentation model based on a novel epipolar view translation operator. The model aims to register feature information from the first view to the second view. This is done by calculating the epipolar line and the epipolar map; during backpropagation, the contribution of matching points along the epipolar line is increased. This strategy is implemented via skip connections in a U-Net model.
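
The forward step summarized here, aggregating features of one view along an epipolar line, can be sketched as follows. This is an illustrative NumPy sketch under stated assumptions and not the released operator: the function name `integrate_along_line` is hypothetical, sampling is nearest-neighbour, and the integral is normalized to a mean over the samples that fall inside the image.

```python
import numpy as np

def integrate_along_line(feat, line, n_samples=256):
    """Mean of the feature map `feat` (H, W) sampled along the line a*x + b*y + c = 0."""
    h, w = feat.shape
    a, b, c = line
    if abs(b) >= abs(a):             # mostly horizontal line: parametrize by x
        xs = np.linspace(0, w - 1, n_samples)
        ys = -(a * xs + c) / b
    else:                            # mostly vertical line: parametrize by y
        ys = np.linspace(0, h - 1, n_samples)
        xs = -(b * ys + c) / a
    inside = (xs >= 0) & (xs <= w - 1) & (ys >= 0) & (ys <= h - 1)
    if not inside.any():
        return 0.0                   # the line does not cross the image
    xi = np.rint(xs[inside]).astype(int)   # nearest-neighbour sampling (assumption)
    yi = np.rint(ys[inside]).astype(int)
    return float(feat[yi, xi].mean())

# Toy feature map with a single bright row at y = 3.
feat = np.zeros((8, 8))
feat[3, :] = 1.0
on_row = integrate_along_line(feat, (0.0, 1.0, -3.0))    # line y = 3
off_row = integrate_along_line(feat, (0.0, 1.0, -5.0))   # line y = 5
```

A differentiable layer would additionally need the backward pass, which distributes the incoming gradient back onto the sampled pixels along the same line. That part is omitted from this sketch.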

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Interesting implementation of feature registration based on epipolar geometry.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Experimental evaluation is limited. The model is not compared against sufficient baselines.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Details might be missing for data preparation. I suggest adding supporting code for data (and model) to this publication.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    More experimental evaluation against other models or variants, including an ablation study, would make the paper more convincing. For example, additional baselines could include concatenating the two views at the input level and then running a single U-Net that outputs two segmentation maps, or a version that adds self-attention units to connect the two streams. More generally, previous models for “multi-view segmentation” (e.g., Learning Where to Classify in Multi-view Semantic Segmentation, and the many works that follow this paper) might be a good fit for comparison and discussion.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Interesting approach, but limited experimental evaluation.

  • Reviewer confidence

    Somewhat confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #1

  • Please describe the contribution of the paper

    The authors propose a differentiable View Translation Operator based on epipolar geometry and show that embedding the novel operator in a traditional U-Net improves segmentation performance on orthogonal view pairs.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The authors introduce both the forward and backward passes of the new operator via analytical formulations. How the gradient backpropagates is very important in neural network training.
    2. Metal segmentation in single-view projections is very hard, since metal and dense bone cannot be easily separated. The new operator, which leverages epipolar geometry consistency, makes sense, and the results show that the performance is improved.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The paper could be better organized: the notation in the Methods section is dense and this part is not easy to follow, while the discussion and conclusion occupy more than half a page.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors state that they will release the code of the view translation operator.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. It is strange that the dual-view model performs worse than the single-view model.
    2. The projection images in Fig. 3 are too dark to see the contours of the metal implants.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The new operator is very useful for future research.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper proposes a method to convert intermediate deep network features from one view to another. The feature transformation is done based on epipolar geometry, making it appropriate to model multiple views of a cone-beam CT. The idea is applied to the segmentation of metal objects from 2 rotated CT views. The operator enables designing a network architecture that performs 2-view joint segmentation of metal components with implicit geometry consistency.

    Strengths:

    • All reviewers agree that this is an interesting and useful idea
    • The application of the operator to a multi-view segmentation is clear and intuitive

    Weaknesses:

    • Some reviewers mention the lack of experimental comparisons to fully justify all the choices in the segmentation architecture design.

    All reviewers agree on accepting this paper. I recommend that the authors take into consideration the minor suggestions regarding Fig. 1 and Fig. 3, and comment on the suggested extra comparisons (noting that performing new experiments at this stage is out of scope).




Author Feedback

Dear Reviewers,

Thank you for your valuable feedback and constructive comments on our submission titled “Enabling Geometry Aware Learning Through Differentiable Epipolar View Translation.” We sincerely appreciate the time and effort you have dedicated to reviewing our work. We are delighted to learn that the overall verdict is positive, resulting in the provisional acceptance of our paper for the conference. In this response letter, we address the points raised by the reviewers.

We are grateful to the reviewers for recognizing the strengths of our paper, such as the introduction of the forward and backward passes of the new operator through analytical formulations, which are crucial for neural network training (R1, R4). Your appreciation of the challenging task of metal segmentation in single-view projection and the improvement achieved through our novel operator leveraging epipolar geometry consistency is encouraging (R1). We also value your positive remarks about the interesting implementation of feature registration based on epipolar geometry (R2, R4).

We would also like to express our gratitude for recognizing the originality of our work and our consideration of “real world” projection matrices (R6). Our aim was to move away from simplifying parallel beam assumptions commonly found in previous works and instead take into account the complexities of real medical imaging systems. We are glad to know that the background references provided were helpful, and the clarity of our paper was appreciated (R4, R6).

Regarding the suggestion for more extensive evaluation and experimentation (R2, R6), we agree with the need for further comparisons, including the incorporation of attention mechanisms (R2) and benchmarking against state-of-the-art methods in computer vision (R2). However, it is important to note that the scope of this work was to motivate the general-purpose operator and provide evidence of its usefulness for multi-view problems. The focus was not on finding the best segmentation method, but rather on demonstrating the effectiveness of the operator for metal segmentation. We appreciate the feedback in this direction and consider it as a valuable suggestion for future research to thoroughly evaluate our method against other segmentation approaches in the computer vision domain.

We appreciate the constructive feedback regarding Figures 1 and 3. We apologize for not properly acknowledging the source of the formal structure in epipolar geometry in cone-beam CT for Figure 1, and we will include the appropriate citation. For Figure 3, we will adjust the contrast window to enhance the visibility as suggested.

Lastly, we would like to address the follow-up questions by R6 regarding the applicability of our method to (A) multiple views and (B) fan-beam geometry. Regarding (A), while we believe the idea is extendable to multiple views, the challenge lies in joining multiple translated feature maps due to the loss of depth information. We consider a simple multiplicative join of feature maps translated from different perspectives as a potential approach. However, if an intermediate 3D representation is acceptable, this approach is similar to a consistency check as mentioned in the introduction. Regarding (B), the epipolar geometry simplifies significantly for fan-beam systems, rendering the operator less useful in the dual-view setup where a point detection in one view occupies the entire line detector in the second view.

Once again, we sincerely appreciate your feedback and the opportunity to improve our work. We will carefully consider your comments and suggestions while revising the paper. Thank you for your valuable contributions.

Sincerely, The authors


