
Authors

Doruk Oner, Hussein Osman, Mateusz Koziński, Pascal Fua

Abstract

Many biological and medical tasks require the delineation of 3D curvilinear structures such as blood vessels and neurites from image volumes. This is typically done using neural networks trained by minimizing voxel-wise loss functions that do not capture the topological properties of these structures. As a result, the connectivity of the recovered structures is often wrong, which lessens their usefulness. In this paper, we propose to improve the 3D connectivity of our results by minimizing a sum of topology-aware losses on their 2D projections. This suffices to increase accuracy and to reduce the effort needed to provide annotated training data.
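
To make the formulation concrete, here is a minimal sketch of such a combined loss. It assumes PyTorch, maximum-intensity projections along the three canonical axes, and a placeholder `topo_loss_2d` standing in for the 2D topology-aware loss; it is an illustration under these assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): a voxel-wise MSE term plus
# 2D topology-aware terms evaluated on three maximum-intensity projections.
import torch
import torch.nn.functional as F

def projections(vol):
    # vol: (B, 1, D, H, W); max-intensity projections along D, H, and W.
    return [vol.max(dim=d).values for d in (2, 3, 4)]

def total_loss(pred, target, topo_loss_2d, alpha=1.0):
    # `topo_loss_2d` is a placeholder for the 2D topology-aware loss.
    voxel_term = F.mse_loss(pred, target)
    topo_term = sum(topo_loss_2d(p, t)
                    for p, t in zip(projections(pred), projections(target)))
    return voxel_term + alpha * topo_term
```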

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16443-9_57

SharedIt: https://rdcu.be/cVRzb

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper proposes a novel method to improve continuity in the segmentation of 3D elongated structures by minimizing a 2D connectivity loss on multiple 2D projections. The 2D connectivity loss itself was recently proposed, but has not been used (extensively, or at all) in medical imaging, and its application to 3D segmentation is a neat idea.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Interesting, novel approach to ensure connectivity in elongated structure segmentation

    Smart way of incorporating a recent, successful 2D region separation loss in 3D segmentation

    Good validation showing convincing improvements compared to logical baselines as well as to previous methods proposed to improve topological correctness

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    I missed information on the datasets, especially the number of images in each set and train/test splits.

    Parameters were selected based on their performance on one of the test sets, whereas for the previous approaches, Perc and PHomo, the parameters suggested in the original papers were used. While there was no extensive tuning for the proposed method and the same parameters were successfully used on multiple datasets, this makes the improvements with respect to the previous topology-aware approaches less convincing.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Reproducibility seems OK: the data are public and the method is largely based on publicly available code.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    In the introduction, perhaps clarify that 2D annotations can only be used to train 3D models in some cases; in many other applications, 2D projections of 3D volumes would not provide sufficiently reliable information to enable annotation, but the proposed 2D connectivity loss would still work.

    Is Fig. 3 the brain dataset only? Please clarify in the caption.

    Would it be possible to include results for Perc and PHomo with parameters selected in a similar manner as for the proposed method?

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Original idea and convincing results (even taking into account that parameters of compared methods may not be optimal). Best paper in my stack.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    7

  • [Post rebuttal] Please justify your decision

    The rebuttal addressed my concerns, and seeing the other reviews did not raise any new ones. I appreciate the new results with parameter tuning of the baselines on one dataset and including ClDice. I am raising my score by 1 point.



Review #2

  • Please describe the contribution of the paper
    • better segmentation of 3D linear structures (e.g. vessels) by using 2D topological aware loss on projections
    • ground truth can be annotated on projections for higher efficiency
  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • effective translation of technology from 2D satellite image analysis (with its focus on connectivity for street network analysis) to 3D medical image analysis
    • elegant solution to prevent gaps in the segmentation of linear structures
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • reproducibility (see below for details)
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Apparently, the authors have not understood how to fill out the checklist: for many questions answered with yes, the information is actually missing in the paper (e.g. no tests for statistical significance, memory footprint, analysis of situations in which the method failed, etc.). Since no code will be published, these details will remain unknown.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    While the paper generally features a good description of the training parameters, some details remain unanswered, e.g., what the window size for the L_TOPO calculation is. Fig. 3: which use case does this describe? Table 1: does bold mean statistical significance? I hope so, but please clarify in the description.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The main idea is really nice and weighs more for me than the problems with reproducibility; I believe the MICCAI audience will like this work.

  • Number of papers in your stack

    4

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered



Review #3

  • Please describe the contribution of the paper

    The authors used an existing 2D loss function on three 2D projections of 3D datasets. In addition, they have performed experiments on multiple 3D datasets, which showed improved topology-aware scores.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The authors have done a detailed evaluation of three datasets and achieved an improved topology-aware score over existing topology-aware loss functions.
    2. The method can offer 3D segmentation from 2D annotation.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The authors mention that occlusion is an example where this formulation does not guarantee connectivity preservation. However, occlusion occurs very often in the dataset used in this example. This is a major contradiction. How would the authors still argue in favor of their approach?
    2. The technical contribution of this paper is heavily diminished by the plug-and-play application of an existing loss [23] to the three canonical projections of 3D space.
    3. The literature review is incomplete. The authors failed to cite the following papers on topology-aware loss functions, including methods that work directly on 3D data: Byrne et al., STACOM 2020; Shit et al., CVPR 2021.
    4. While APLS and TLTS were proposed for 2D scenarios, e.g., road networks, I am not convinced that they remain good topology metrics when computed from 2D projections, given the abundance of occlusion. Betti error would be a better alternative topological metric.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Details about the baseline models are missing, e.g., whether they were trained on 2D or 3D data.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    1. Do Perc and PHomo also work on the projection of the 3D output in combination with the MSE loss? Otherwise, the comparison will not be fair. The authors should provide more details on the baseline models for reproducibility.
    2. On page 3, the authors claim that “continuity of 3D structures implies continuity of their 2D projections.” However, that is irrelevant. One should instead consider how discontinuity in 3D can be captured by optimal 2D projections, which may require more than the three canonical projections.
    3. What is β? It is not explained anywhere in the text.
    4. Why does MSE in 3D perform worse than MSE in 2D for the Brain and Neurons datasets?
    5. Since the data are highly sparse and the errors are mostly false negatives, did the authors apply any weighting to the cross-entropy? Why didn’t they consider Dice as a baseline, since it handles mild class imbalance pretty well?
    6. Why has no 3D metric been reported? Since the main task is in 3D, the authors should consider evaluating in 3D.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    3

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Limited technical novelty, and weak and unexplained assumptions heavily undermine the empirical evidence of performance improvement. Hence I recommend rejection.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    3

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    4

  • [Post rebuttal] Please justify your decision

    Despite producing promising results in special cases of linear structures where occlusion is less abundant, the paper severely lacks technical contribution, since the idea of segmenting 3D structures from 2D projections is well known in the medical domain [16, 17]. Hence I can only improve my rating from reject to weak reject.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper generalizes a topological loss function developed for 2D satellite images, to 3D tubular structures, by utilizing the 2D loss on multiple projections. While the main novelty comes from the multi-view approach, this is a nice and simple trick if effective. For the rebuttal, it is important that the authors address:

    • The concerns of Reviewer 3 regarding occlusion
    • The reproducibility concerns of Reviewer 2
    • The concerns regarding experimental setup of Reviewer 1.
  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    4




Author Feedback

We thank the Reviewers for their comments.

R1.1 What is the number of images in each data set?

Brains: 14 scans of size 250x250x200, split 10:4 into training:validation. Neurons: 13 scans of size 216x238x151, split 10:3. MRA: 42 scans of size 416x320x128, split 31:11.

R1.2 The hyper-parameters of your method were selected based on test performance. Do the same for the baselines.

We did not tune the hyper-parameters of our method for each data set individually. We followed the approach of the baseline authors: search for the optimal value on one data set (Brain) and use the same value for the remaining ones.

As requested, we identified the optimal hyper-parameter for Perc and PHomo by running an ablation on the Brain, as we did for our method: We multiplied the weight of the topological loss term by 0.1 and 10, with little change:

              Qual   APLS   TLTS
  Perc  0.1   94.6   77.1   84.3
  Perc  1.0   94.5   76.6   84.1
  Perc  10    95.0   74.1   84.0
  PHomo 0.1   94.1   80.9   83.5
  PHomo 1.0   94.7   81.5   83.9
  PHomo 10    94.9   81.6   83.2

R1.3 and R2.1 Fig 3 contains results for the Brain?

Yes. The remaining results are in the supplementary.

R2.2 The window size for the L_TOPO calculation is not specified.

It is 48 pixels. For reproducibility, we will publish the code on GitHub.
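
For context, below is a minimal sketch of one way a windowed 2D loss could be evaluated; the non-overlapping 48x48 tiling and the `windowed_loss`/`loss_fn` names are assumptions made for illustration, not details from the paper or the forthcoming code.

```python
# Hedged sketch: average a 2D loss over non-overlapping 48x48 windows of a
# projection. The tiling scheme is an assumption, not the authors' code.
import torch

def windowed_loss(proj_pred, proj_gt, loss_fn, win=48):
    # proj_pred, proj_gt: 2D projections of shape (..., H, W), assumed to be
    # at least `win` pixels along each spatial dimension.
    losses = []
    for i in range(0, proj_pred.shape[-2] - win + 1, win):
        for j in range(0, proj_pred.shape[-1] - win + 1, win):
            losses.append(loss_fn(proj_pred[..., i:i + win, j:j + win],
                                  proj_gt[..., i:i + win, j:j + win]))
    return torch.stack(losses).mean()
```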

R3.1 The data contain occlusions, which contradicts evaluating connectivity in projections.

As seen in Fig. 4, occlusions by neurons and blood vessels are sparse. We counteract them by evaluating the loss in several projections. We use three for ease of implementation, but more could be used if needed. The use of projections at oblique angles was already demonstrated in [17].

R3.2 The contribution is diminished by plugging the existing loss [23] into projections.

Realizing that [23] is applicable to this setting was non-obvious and is a contribution in its own right, one that brings useful results.

R3.3 Two existing methods were not referenced.

We will reference them:

As explained in Sec. 4.2 of Shit et al., they focus on volumetric segmentation. For our neurons (top row of Fig. 4), volumetric annotation is impossible, because the apparent neurite thickness is modulated by the dye distribution; we only have centerline annotations. A naive way to adapt this algorithm is to thicken the centerlines. On the Brain dataset, after finding the optimal parameter (Alpha=0.2), this yields Qual 95.1, APLS 78.2, TLTS 85.4, which is competitive with Perc and PHomo, but not with our method.

Byrne et al. extend their previous work, which we already reference as [5], to multi-class segmentation. [5] uses the same mechanism as PHomo, but imposes a pre-defined Betti number on the output, which makes it less suited to our task, where each scan has a different topology.

R3.4 Evaluating 3D topology by metrics in 2D projections is unreliable.

All the metrics are evaluated in 3D.

R3.5 Were the baselines Perc and PHomo run on projections?

No. Because they are applicable in 3D, we see no reason to suspect they would work better in 2D. Our motivation for using [23] in projections is that it works better in 2D but cannot be extended to 3D directly.

R3.7 The supplementary contains an ablation on Beta, not introduced in the manuscript.

Beta balances the connectivity and dis-connectivity components of [23]. We will clarify this.

R3.8 Why is MSE-3D worse than MSE-2D in the Brain and the Neurons?

They perform on par. On the Brain data set, MSE-2D performs better on the pixel-wise metric, but worse on the connectivity one. On the Neurons data set, the situation is reversed. The differences are small. This might seem counter-intuitive, as the gradient of MSE-2D is sparse after back-propagation through the projection. But note that the projection preserves the strongest structures, and the gradient is channeled back to them.
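
As a toy illustration of this gradient-channeling argument (assuming a max-intensity projection, which is an assumption of this sketch rather than a detail stated in the thread), a loss computed on a 2D max projection back-propagates only to the voxels that attain the maximum along each ray, i.e. the strongest responses:

```python
# Toy sketch: gradients of a loss on a max projection reach only the
# arg-max voxel of each ray (the strongest structure along that ray).
import torch

vol = torch.rand(4, 4, 4, requires_grad=True)
proj = vol.max(dim=0).values                     # 2D max projection along depth
loss = ((proj - torch.ones_like(proj)) ** 2).mean()
loss.backward()

# Exactly one nonzero gradient entry per (h, w) ray: the arg-max voxel.
nonzero_per_ray = (vol.grad != 0).sum(dim=0)
print(nonzero_per_ray)                           # a 4x4 tensor of ones
```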

R3.9 Why use CE? Dice handles class imbalance better.

We use weighted cross-entropy to deal with class imbalance. Our results seem unaffected by the imbalance. For example, in Tab. 1, on Neurons the Completeness of CE is higher than its Correctness.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Reviewers 1 and 2 really appreciate the paper, while reviewer 3 is still concerned with the paper being too simple to warrant publication at MICCAI. I disagree with this assessment: If we are willing to publish complex solutions to important problems, we ought to also be willing to publish simple solutions to the same problems. With this in mind, I recommend acceptance of the paper.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    2



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The key strength of this work is its use of a technique originally developed to enforce the connectivity of roads in satellite imagery for the purpose of 3D vessel segmentation, thereby enforcing connectivity in a 3D setting. This is performed on three projection views, which seems to be a neat idea and, despite being simple, leads to an effective approach.

    Reviewer concerns regarding occlusions, reproducibility, and experimental settings were addressed well in the rebuttal. In particular, occlusion does not seem to be problematic thanks to the use of several projections. Reproducibility will be assessed once the code is made public. Clarifications of the experimental setup were provided.

    Overall, this work seems to be promising to provide a solution for a timely and important problem and therefore it is likely to be of interest for the MICCAI community.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    3


