
Authors

Ilyas Sirazitdinov, Axel Saalbach, Heinrich Schulz, Dmitry V. Dylov

Abstract

Localization of tube-shaped objects is an important topic in medical imaging. Previously it was mainly addressed via dense segmentation that may produce inconsistent results for long and narrow objects. In our work, we propose a point-based approach for explicit centerline segmentation that can be learned by fully-convolutional networks. We propose a new bi-directional encoding scheme that does not require any autoregressive blocks and is robust to various shapes and orientations of lines, being adaptive to the number of points in their centerlines. We present extensive evaluation of our approach on synthetic and real data (chest x-ray and coronary angiography) and show its advantage over the state-of-the-art segmentation models.

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16440-8_66

SharedIt: https://rdcu.be/cVRwT

Link to the code repository

N/A

Link to the dataset(s)

https://www.kaggle.com/competitions/ranzcr-clip-catheter-line-classification


Reviews

Review #2

  • Please describe the contribution of the paper

    A bi-directional encoding scheme without autoregressive blocks is proposed for lines of various shapes and orientations.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The centerline representation based on n connected points, used in place of dense segmentation, seems interesting.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Limited novelty: such heatmap-based methods are widely used in landmark detection/pose estimation tasks [11]; the authors merely apply them to this specific task.

    • Insufficient evaluation: the authors verify their method on synthetic data. Why not evaluate it on a real dataset?

    • Overclaiming: the authors argue that segmentation-based methods can generate false-positive pixels; however, the proposed method can also produce an incorrect centerline via wrong keypoint locations.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The implementation code is unavailable and some parts of the implementation are unclear.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    Typo: ‘we propose a different data structure for for the efficient segmentation of tube-shaped objects’ (duplicated ‘for’).

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The novelty.

  • Number of papers in your stack

    3

  • What is the ranking of this paper in your review stack?

    3

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    6

  • [Post rebuttal] Please justify your decision

    The authors have addressed all my concerns, and the idea behind the paper does offer insight to the community. Thus, I raise my score to accept.



Review #3

  • Please describe the contribution of the paper

    This paper proposes an encoding scheme for representing and segmenting tube-shaped objects in 2D medical images. The representation can be directly inferred by a neural network.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Although there are already many works that use connections of key points for different tasks (e.g., CurveNet for point cloud processing, DeepSnake for semantic segmentation), the proposed centerline encoding in this work is somewhat novel. The bi-directional design is also interesting and sound.

    • The empirical experiments and comparisons seem satisfactory. Qualitative results are also meaningful.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • On page 3, the authors mention that ‘p_i and p_{i+1} connect with an edge’. Does this denote a direct connection with a straight line, or can the edge be a curve depending on other characteristics (if so, how)? The ‘edges’ in the Fig. 1 prediction look like curves to me.

    • On page 3, the authors mention that non-end points are sampled from the centerline. I thought those key points were directly predicted by the HRNet. How is this sampling related to the n predicted key points?

    • On page 3, the authors mention that ‘we fix n to be identical for all tubes regardless of their shape and length’. If so, the problem is essentially a signal-sampling problem, where the authors use the HRNet to try to capture the most significant waves. Therefore, a smaller n leads to undersampling and a larger n to oversampling. Although the authors provide an ablation study w.r.t. n (Table 2), the evaluation is bound to the specific dataset and tube geometry and hence is not generic. This is actually a limitation of the proposed method, and the authors should at least discuss it.

    • Also, the width of the centerline can be another important factor for performance. Apparently, the assumption that all tube-shaped objects have the same width is invalid. It would be better if the authors explained their choice of width, and even better if an ablation study were provided.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    Please add more discussions (or ablation studies) and fix the unclearness as stated in the weakness section above.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Overall, this paper proposed a novel and interesting approach for an existing problem. The intuition was justified and the experiments are good. I think this paper has met the standard of MICCAI.

  • Number of papers in your stack

    6

  • What is the ranking of this paper in your review stack?

    3

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    I have seen authors’ rebuttal and they have answered one of my questions. I choose to keep my original rating.



Review #4

  • Please describe the contribution of the paper

    The authors address the problem of centerline segmentation in 2-D images and propose a bi-directional point-based centerline encoding as target for a neural network (HRNet) which circumvents an “implicit” pixel-based segmentation. The authors evaluate their approach on three datasets (synthetic, semi-synthetic, CLIP) and show improvements compared to recently published segmentation-based methods for most investigated centerline types.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The authors present an interesting encoding approach to improve centerline segmentation which utilizes a point-based encoding of the centerline.
    • The authors provide results on three different data sets (synthetic, semi-synthetic, CLIP)
    • The authors compare their results to multiple recent methods for centerline segmentation.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • From my understanding, the paper’s main contribution is the formulation of the bi-directional centerline encoding. The evaluation, however, does not focus on showing a benefit of this encoding. Specifically, different architectures are used for the segmentation tasks and for the proposed method. It is therefore not possible to judge whether improvements are mainly due to a better-suited architecture or due to the proposed encoding. This could have been easily avoided, since most (all) of the reference methods could also work with the proposed encoding as targets.
    • It is not clear how, or whether, the hyper-parameters used for the different methods were tuned. I would expect aspects such as learning rate, number of epochs, etc. to differ across architectures.
    • The focus and presentation of the results are not ideal from my perspective. Important results (standard deviations, quantitative values) are only presented in the supplementary material, and an analysis of failure cases (or non-optimal results) is missing.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    According to the reproducibility checklist, the authors do not plan to provide their training or evaluation code. The method itself is fairly straightforward and well described, and a re-implementation should be possible. Most hyperparameters used for training the models are provided in the supplementary material, however, only the CLIP data set is available for reproducing results.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    General comment: I have to admit that I have a hard time understanding why the method works with apparently very good performance. The network is required to obtain a substantial understanding of the global structure of the target object (with partially very different sizes and shapes) to be able to place landmarks equidistantly on the target image. I would have liked to see a more detailed discussion and potentially an analysis of this aspect, e.g.: Does this depend on the receptive field of the network? Is the HRNet an essential ingredient for this task? What happens to the predictions for a structure if an image is cropped or stretched? If I am overlooking something and this is a rather straightforward insight, I am happy to stand corrected.

    The description of the method can be improved:

    • It is a bit tricky to say that no post-processing is required when there is an obvious step to get from the heatmaps to the line. I would encourage the authors to soften such sentences.
    • From my perspective, the description of the ground truth heatmap should be part of the method description (not of the encoding) - I was missing this in “Training and Inference” and was surprised to find this later in the text.
    • The clarity of the method description could be improved: I was not sure what the authors mean by “horizontal coordinates of the endpoints are closer than some threshold”. Closer to what? How were these coordinates extracted? The output of the network should still be a heatmap. Also, how are s (scaling) and t (threshold) selected? Why n = 31?
    • There is very little information on the selected architecture (see also the comment above). One additional sentence summarizing its concepts would help put this into context.
    • The semi-synthetic data is described very briefly, which makes it rather difficult to assess how realistic these images are. Also, for the real and semi-synthetic data, multiple structures are segmented, but it is not clear how this was performed. Are separate networks trained for each centerline type? Was there a multi-task setting?

    The evaluation and description of the results can be improved:

    • The experiments do not reveal what role the architecture itself plays - potentially in combination with the proposed encoding scheme (see also comment above).
    • The results on the synthetic data are, from my perspective, not particularly interesting. They are fine in the supplementary material; however, I find the added value of Table 1 in the main paper to be relatively limited.
    • Figure 2 is rather hard to read/interpret (also: what is the variance of the results?). I understand that the authors also want to illustrate the ccs aspect, but this overloads the figure from my perspective and adds little value (e.g., a few false-positive pixels will have limited impact on a subsequent line fitting). A bar or box chart (that includes the standard deviation) would have been more expressive from my perspective.
    • The authors show impressive qualitative results; however, I am missing an analysis of failure cases, i.e., cases where the method did not work. At least for some structures, an SDR of 15% indicates that some points are missing. Also, in the real and semi-synthetic examples shown, the centerlines seem to be rather well behaved; it is not clear how such a method would behave in the case of loops. Did the authors observe any outliers?

    Additional comments and typos:

    • abstract: “mainly addressed via dense segmentation” - this is a bit simplified. There have also been approaches that formulate this as a regression task, e.g., Kordon et al., MICCAI 2019, https://doi.org/10.1007/978-3-030-32226-7_69
    • abstract: “agnostic in the number of points in their centerlines” - this seems to contradict the statement that the authors propose to use a fixed number of points. This should potentially be rephrased to improve clarity.
    • Convexity as such does not really play a role in pixel-based segmentation. It is true that it may be easier to design post-processing for some convex objects, but this could be formulated more precisely in the text.
    • p. 1: “and with the attention mechanisms [18] to address a guidewire segmentation and tracking in endovascular aneurysm repair problem. A U-Net model with the spatial location priors was used in [10].” - these sentences read a bit awkwardly and could potentially be improved.
    • p. 2: “to reconstruct [the] true shape of [the / a] device” - expression.
    • p. 2: “for for” - typo.
    • p. 2: “n-connected points” - no dash, otherwise this distorts the meaning (compare n-connected graphs)
    • p. 3: “We use the conventional for landmark detection heatmap regression loss” - expression.
    • p. 4: “fracture of points” - typo.
    • p. 6: “Hausdorf” - Hausdorff.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    It is appreciated that the authors compare their proposed method to multiple previously published methods, however, from my perspective, the experiments do not really assess the contribution proposed in the paper (e.g., the centerline encoding). Additionally, a description on how (whether) the reference methods were tuned and an analysis of failure cases are missing.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    3

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    Introduction: This paper investigates centerline segmentation and a new bi-directional point-based encoding for training the neural network.

    Strengths:
    • All three reviewers concur that the centerline encoding is interesting and novel.
    • The reviewers also highlight the comparison of the algorithm’s performance to other centerline segmentation-based methods.

    Weaknesses:
    • Although the approach seems interesting, R3 and R4 have substantial questions regarding the approach and the evaluation methodology. Specifically, R4 mentioned that validation of the main contribution of the paper (i.e., the bidirectional centerline encoding) was not performed.

    Points to be addressed by authors: • I would encourage the authors to rebut the suggestions/criticisms raised by the reviewers.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    8




Author Feedback

We want to thank the reviewers for their constructive feedback. The main criticisms are addressed below. — R2: …widely used in landmark detection/pose estimation tasks [11]… In [11], heatmap regression is used to localize static points (e.g., wrists, elbows), whereas in our work we localize n-2 dynamic, uniformly sampled points, which is considerably harder and makes our work substantially different from [11]. Quantitatively, this brings an advantage over the majority of models (ref. Table 1). We emphasized the difference in Sec. 4, Par. 1.
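For illustration of this point-based target, the following is a minimal sketch of how n points spaced uniformly by arc length could be sampled from an annotated centerline and turned into Gaussian heatmap targets. The function names and parameters (e.g., sigma, n=31) are illustrative assumptions and are not taken from the paper's code.

    import numpy as np

    def sample_centerline_points(polyline, n=31):
        """Sample n points spaced uniformly by arc length along a centerline polyline.
        polyline: (m, 2) array of (x, y) vertices ordered from one endpoint to the other.
        Returns an (n, 2) array whose first and last rows are the two endpoints."""
        seg = np.diff(polyline, axis=0)
        seg_len = np.linalg.norm(seg, axis=1)
        cum = np.concatenate([[0.0], np.cumsum(seg_len)])    # arc length at each vertex
        targets = np.linspace(0.0, cum[-1], n)               # equidistant arc-length positions
        pts = np.empty((n, 2))
        for k, t in enumerate(targets):
            i = min(np.searchsorted(cum, t, side="right") - 1, len(seg_len) - 1)
            alpha = 0.0 if seg_len[i] == 0 else (t - cum[i]) / seg_len[i]
            pts[k] = polyline[i] + alpha * seg[i]            # linear interpolation on segment i
        return pts

    def gaussian_heatmaps(points, height, width, sigma=2.0):
        """One Gaussian heatmap per point, as commonly used for landmark regression."""
        ys, xs = np.mgrid[0:height, 0:width]
        maps = [np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2 * sigma ** 2)) for x, y in points]
        return np.stack(maps)                                # shape: (n, height, width)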

R2: The author verified their methods on synthetic data. Why not evaluate it on a real-dataset? Actually, we evaluated our method on the real dataset of 6364, 2994, and 3177 images of different classes (please see Sec. 3.1 and Supp. Table 1). The semi-synthetic data was used only as an interim step.

R2: …author argues that segmentation-based methods can generate some false positive pixels… The beauty of our method is that even if it generates a wrong heatmap, it predicts merely a wrong curvature of the tube, keeping its structure intact. In contrast, false positive or false negative pixels in binary segmentation predictions may appear as multiple disconnected components (filaments), which requires challenging regularization or post-processing. We referred to this advantage in Sec. 2, Par. 1 and will further elaborate on the difference in the Discussion section.

— R3: Also, the width of the centerline can be another important factor for the performance… We confirm that we used a coarse width approximation, based on the average size of the real devices and the known pixel spacing. However, we are certain that the width can be extracted from the dispersion of the heatmaps, which would address the reviewer's excellent suggestion. We will add a paragraph to the Discussion to demonstrate this.

— R4: …the paper’s main contribution is the formulation of the bi-directional centerline encoding… …The experiments do not reveal what role the architecture itself plays… MR1: …the validation of the main contribution of the paper…

The main contribution of the work is two-fold. First, we indeed show the superiority of the bidirectional encoding; to do that, we used the HRNet encoder, which is one of the SOTA models for the landmark detection task. Second, we show that our method is competitive with the SOTA models in the task of tube and catheter segmentation (Fig. 2, Supp. Table 3). We agree that the proposed comparison of different encoders may be valuable in future work to understand the role of the architecture, and we will highlight that in the discussion. We have not done it with the segmentation architectures because there has been no evidence that such architectures are good at the heatmap regression task.

R4: It is not clear how / whether the hyper-parameters… For our method, we used a grid search to pick the hyperparameters. For the competing models, we kept the original hyperparameters.

R4: …I have a hard time understanding why the method works with apparently very good performance… We agree that we have not discussed the intuition behind the method, so we will add a paragraph to the discussion to cover that gap. In our opinion, our method indeed requires a large receptive field of the encoder to capture the global structure of the device. Similar evidence can be found in [15], where uniformly sampled points of a contour are regressed to the real contour of an object, requiring some implicit calculation of the distance to the contour. Intuitively, our method implicitly calculates the length of the device and uniformly places the points along the centerline.
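As a rough illustration of the inference direction described here, a minimal decoding sketch is given below. It assumes that each of the n predicted heatmaps encodes one ordered point and simply takes per-channel argmaxes; the paper's bi-directional scheme adds an ordering/merging rule that is not reproduced here.

    import numpy as np

    def decode_centerline(heatmaps):
        """Take the argmax of each of the n heatmaps and return the points in channel
        order; consecutive points are then joined by edges to draw the centerline."""
        n, h, w = heatmaps.shape
        pts = np.zeros((n, 2))
        for i in range(n):
            y, x = np.unravel_index(np.argmax(heatmaps[i]), (h, w))
            pts[i] = (x, y)
        return pts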

R4: …impressive qualitative results, however, I am missing an analysis of failure cases… …I find the added value of Table 1 in the paper to be relatively limited… We agree that a failure-case analysis is missing from the paper and will showcase such cases by freeing up the space occupied by the auxiliary Table 1. Thank you for the suggestion.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The original reviews of the paper were generally positive. The authors have responded to the suggestions of the reviewers to their satisfaction. I recommend acceptance of the article.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    NR



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The key strength of this work is a seemingly novel bidirectional point based encoding for localizing guidewires and other tubular structures in interventional images.

    From the reviews, a main concern that I share is the lack of clarification of what role the architecture and the bidirectional encoding actually play in the performance. This is not clear from the paper, and the authors did not convincingly address it in the rebuttal. An ablation study to carefully determine the influence of the bidirectional encoding would be required. Furthermore, the authors state in the rebuttal that the choice of HRNet was informed by the fact that there is no evidence that segmentation architectures are good at the heatmap regression task. I have to strongly object to this statement, since U-Net-like architectures have been widely and successfully used for landmark localization over the last five years. This, together with the fact that those approaches are not referenced in the literature survey, indicates that the authors did not study the available related work sufficiently. Moreover, it is strange that there are no comparisons to dedicated competing methods for the vessel and instrument segmentation tasks; the compared architectures are general ones, not tailored to these tasks. I find it hard to believe that no tailored state-of-the-art approaches exist for them.

    In summary, this work is at borderline, and my assessment is that the mentioned weaknesses slightly outweigh the benefits like the nice qualitative results.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Reject

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    14/19



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The rebuttal answers my questions well. Thus, I raise my score to accept.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    7


