Authors

Zheng Zhang, Xiaolei Zhang, Yaolei Qi, Guanyu Yang

Abstract

Coronary artery segmentation on coronary-computed tomography angiography (CCTA) images is crucial for clinical use. Due to the expertise-required and labor-intensive annotation process, there is a growing demand for the relevant label-efficient learning algorithms. To this end, we propose partial vessels annotation (PVA) based on the challenges of coronary artery segmentation and clinical diagnostic characteristics. Further, we propose a progressive weakly supervised learning framework to achieve accurate segmentation under PVA. First, our proposed framework learns the local features of vessels to propagate the knowledge to unlabeled regions. Subsequently, it learns the global structure by utilizing the propagated knowledge, and corrects the errors introduced in the propagation process. Finally, it leverages the similarity between feature embeddings and the feature prototype to enhance testing outputs. Experiments on clinical data reveal that our proposed framework outperforms the competing methods under PVA (24.29% vessels), and achieves comparable performance in trunk continuity with the baseline model using full annotation (100% vessels).

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43895-0_28

SharedIt: https://rdcu.be/dnwyH

Link to the code repository

https://github.com/ZhangZ7112/PVA-CAS

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper proposes a partial-label learning framework under the setting of requiring annotating only vessels within specified RoIs. Firstly, a local segmentation model learns within the RoI and produces noisy global pseudo-labels. Secondly, a global segmentation model learns from the global pseudo-labels and updates pseudo-labels guided by an RoI-quality test. Thirdly, a prototype-learning method is used to improve continuity.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • This paper proposes an improvement on the workflow of vessel annotation, allowing clinicians to focus their valuable labor only on high-value RoIs.
    • This paper proposes a sound framework under its proposed setting.
    • This paper evaluates vessel segmentation with proper vessel-specific metrics in addition to vanilla DSC.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The PVA setting can be sensitive to how the RoI is selected, as the RoI should be representative. This topic is not discussed in this paper.
    • The overall design of the proposed components and evaluation metrics encourages recall instead of precision (see details). This is less harmful here, where continuity is the goal, yet (1) it can be limiting when scaling to FP-sensitive tasks, and (2) competing partial annotation methods spend more effort on balancing FP reduction.
    • The FPA part is not clearly written. See detailed comments.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The code is claimed to be open, yet the dataset used remains private.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    FP Reduction:

    • It is unclear how the PLU confidence degree filters false positives, as it encourages recall yet imposes no constraint that rejects FPs.
    • It is unclear how FPA rejects FPs without distracting background features from the prototype.

    FPA Clarity:

    • How is $O(h,w,d)$ used?
    • What is $S(h,w,d)$ in eq (6)?
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The valuable improvement on the annotation workflow.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    This work proposes a partial annotation method for segmenting coronary arteries, hence avoiding the tedious task of annotating the entire coronary tree.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Annotating coronary arteries is tedious and requires significant experience. This work is a good step towards more sample-efficient methods that maximise value gained from the existing annotations.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    It would be good to include nn-UNet as a baseline, to ensure the comparison against other models isn’t affected by the choice of hyper-parameters and experimental setup, some of which can depend on the dataset characteristics.

    Some columns in Table 1 have a very high standard deviation compared to the actual difference in performance, so it is possible that this difference is due to random chance. However, overall I think Table 1 provides a reasonable expectation that this method outperforms other alternatives.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The manuscript includes details on the implementation, parameters and hardware used. Additional details on the dataset characteristics and ethics would be useful.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Addition of an nn-UNet baseline would increase confidence in the reported comparisons against other methods.

    There are some details missing, such as the resolution of the data, whether it is contrast enhanced, whether it is from a single/multiple center and using a single/multiple types of CT machines.

    It would be good to clarify how the vessels are labelled, i.e. if it is up to the clinician, or the same branches were labelled in all patients.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The methodology employed shows fairly good performance against alternative methods, and addresses an important problem of the lack of large training datasets in coronary artery segmentation.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper presents a weakly-supervised vessel segmentation framework based on Partial Vessel Annotation (PVA). It consists of two stages: 1) pseudo label generation; 2) pseudo label updating and enhancement. The experimental results show that this method achieves high efficiency with only very limited annotations required.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The investigated topic of label efficiency is crucial in medical image segmentation, especially for coronary artery segmentation in 3D volumes. The results reveal that the proposed method achieves higher accuracy and trunk continuity with only 24.29% of vessels annotated.
    • The framework is easy to follow in a local-to-global perspective, including the pseudo label generation and updating.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. My major concern focuses on Stage 2:
      • Self-training is used to update pseudo labels with the confidence-based EMA. In this way, the network suffers from the risk of over-confident predictions, where the error in pseudo labels may accumulate and become even more difficult to correct. The authors should discuss in more depth whether these errors with systematic biases (i.e., model over-confidence) can be corrected or reduced.
      • The feature prototype analysis block is not clear. Why can it contribute to improving the trunk continuity?
    2. Among the comparison methods, EWPA and DMPLS belong to partial annotation-related weakly supervised frameworks. However, they cannot achieve better performance than the baseline 3D UNet. The authors should explain why these methods can achieve satisfactory performance in their respective tasks, yet are limited in the coronary artery segmentation task. Furthermore, for a fairer evaluation, some weakly-supervised vessel segmentation methods (e.g., [17]) should be included in the method comparison.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
    • The dataset is private and the code and model are not released.
    • The exact model architecture is not provided. Not a problem if code is released.
    • Hyper-parameter strategy is not mentioned, but maybe no hyper-parameter search was performed (which would be fine)
    • Time/cost/memory not reported
    • No statistical significance tests
  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Please see the weakness part.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    • Major concern about the pseudo label correction, which could be trapped in systematic errors with over-confident network predictions.
    • Unfair experimental results without vessel segmentation methods involved in comparison.
  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    3

  • [Post rebuttal] Please justify your decision

    The risk of error accumulation still seems difficult to avoid. The authors use EWMA with confidence guidance to correct potential errors in pseudo labels. However, as mentioned in reference [18], some systematic errors come with even higher prediction confidence due to the over-confident nature of CNNs. Therefore, correcting the pseudo labels with only prediction confidence-based guidance (i.e., the EWMA module) could be highly limited in improving the label quality.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The paper describes a deep learning method for coronary artery segmentation in CT that can be trained with partially annotated images. A requirement for these annotations is that they contain fully segmented individual coronary arteries. This is an interesting approach. Although Dice scores are lower than those of a model trained with full annotations, the completeness of the artery appears to be comparable. The reviewers agree that this is an interesting paper for MICCAI, but raise some issues that should be addressed in a rebuttal. In particular, Rev. 3 raises some valid concerns about the pseudolabel correction.

    Strengths

    • Creating data sets for coronary artery segmentation (and other small artery segmentation) is a tedious job. A model that can learn on partial annotations would be valuable.
    • The authors not only evaluate the performance of the method in terms of Dice coefficients, but also using criteria that are more specific to coronary artery segmentation.

    Weaknesses

    • The baseline model is not SOTA; I concur with the reviewers that it would have been good to add nn-UNet as a baseline.
    • It would have been interesting to see if using this approach with the same ‘annotation budget’ as for full vessel tree segmentation, the model could learn more.
    • Reviewer 3 indicates a real risk in the self-supervised training, where errors can accumulate.




Author Feedback

We sincerely thank all reviewers for appreciating our work on Partial Vessels Annotation (PVA) (“The reviewers agree that this is an interesting paper for MICCAI”-Meta).

  1. Label-efficient setting (“avoiding the tedious task of annotating”-R2, “very limited annotations required”-R3).
  2. Effective framework (“valuable improvement”-R1, “fairly good performance”-R2, “achieves high efficiency”-R3).
  3. Good organization (“very good clarity”-R2, “easy to follow”-R3)

Q1: Coping with the risk of error accumulation. (Meta, R1, R3)
A1: Our quality control test and EWMA mechanism in LPU reduce the risk. 1) Our quality control test rejects low-confidence predictions to reduce the risk. As the confidence degree (Eq.2) assesses the quality of predictions, low-confidence predictions are more likely to generate errors. 2) Our EWMA is a variation of prediction ensemble [9], which filters noisy labels. EWMA gradually diminishes the negative influence of existing errors through the weighted average of predictions across multiple phases.
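
The two mechanisms named in A1 can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration: the rebuttal does not reproduce Eq. 2, so `roi_confidence` (mean predicted probability over annotated vessel voxels) is a hypothetical stand-in, and the names `ewma_update`, `threshold`, and `alpha` are not taken from the paper or its code.

```python
import numpy as np

def roi_confidence(pred, roi_mask):
    """Hypothetical confidence score (NOT the paper's Eq. 2):
    mean predicted foreground probability over annotated vessel voxels."""
    return float(pred[roi_mask].mean())

def ewma_update(pseudo, pred, confidence, threshold=0.5, alpha=0.9):
    """Blend a new prediction into a soft pseudo-label, gated by confidence.

    pseudo, pred: arrays of foreground probabilities with the same shape.
    confidence:   scalar quality score of `pred`.
    threshold, alpha: illustrative values, assumed rather than sourced.
    """
    if confidence < threshold:
        # Quality control test: reject the prediction, keep the old label.
        return pseudo.copy()
    # Exponentially weighted moving average across training phases.
    return alpha * pseudo + (1.0 - alpha) * pred

# Toy example: four voxels, the first two lie inside the annotated region.
pseudo = np.array([1.0, 1.0, 0.0, 0.0])
roi = np.array([True, True, False, False])
good = np.array([0.9, 0.8, 0.6, 0.1])   # agrees with the annotation
bad = np.array([0.2, 0.1, 0.9, 0.9])    # contradicts the annotation

updated = ewma_update(pseudo, good, roi_confidence(good, roi), alpha=0.5)
rejected = ewma_update(pseudo, bad, roi_confidence(bad, roi), alpha=0.5)
```

In this sketch the gate only inspects annotated voxels, so an accepted prediction is blended in everywhere, including unlabeled voxels. Whether the confidence degree in the paper behaves the same way is not stated in the rebuttal.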

Q2: Interesting exploration on annotation budget. (Meta)
A2: We conducted an experiment that aligns the “annotation budget” by using fewer labels for full annotation. Experimental results showed that “(a) our method under PVA” outperforms “(b) the baseline trained with full annotation” using the same budget. [Metric (a)/(b)] Dice 71.45/66.72 | RDice 83.14/77.87 | OV 75.46/65.63 | OF 0.895/0.717, 0.915/0.816, 0.879/0.750. Additionally, after the dataset is expanded, further experiments will be conducted to verify whether the conclusion remains consistent when the labels for partial vessels annotation are increased instead.

Q3: Reasons why the comparison methods (EWPA/DMPLS) are limited in coronary artery segmentation. (R3)
A3: 1) EWPA: The class center of the coronary artery is sensitive to noise. EWPA calculates the class centers by averaging the features of the corresponding classes for each image to generate pseudo labels. However, the coronary artery occupies a very small fraction of the image volume, which makes its class center sensitive to noise. 2) DMPLS: Pseudo labels are not improved. DMPLS randomly mixes the outputs of the two branches in its network to generate pseudo labels, but lacks mechanisms that adaptively allocate weights or perform selection based on prediction quality, which would improve the pseudo labels given the noise interference from veins and pulmonary arteries.

Q4: How FPA contributes to improving the trunk continuity. (R1, R3)
A4: FPA extracts additional feature similarity information to enhance the perception of vessels, resulting in improved trunk continuity. Specifically, FPA selectively aggregates vessel features within annotated regions across all samples to ensure the accuracy of the vessel feature prototype. Then, the vessel segmentation is improved by incorporating the information of “similarity between the feature prototype and the extracted features from convolution” into the final inference.
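
The idea described in A4 can be sketched roughly as follows. This is only an illustration under stated assumptions: the rebuttal does not say which similarity measure or fusion rule FPA uses, so cosine similarity and the convex combination in `enhance` are hypothetical choices, and features from all samples are assumed to be concatenated into one `(C, N)` array beforehand.

```python
import numpy as np

def vessel_prototype(features, vessel_mask):
    """Mean feature vector over annotated vessel voxels.

    features:    (C, N) array, one C-dimensional embedding per voxel.
    vessel_mask: (N,) boolean array marking annotated vessel voxels.
    """
    return features[:, vessel_mask].mean(axis=1)

def cosine_similarity_map(features, prototype, eps=1e-8):
    """Cosine similarity between each voxel embedding and the prototype."""
    f = features / (np.linalg.norm(features, axis=0, keepdims=True) + eps)
    p = prototype / (np.linalg.norm(prototype) + eps)
    return p @ f  # shape (N,)

def enhance(prob, sim, weight=0.5):
    """Hypothetical fusion rule (an assumption, not the paper's formula):
    convex combination of network probability and clipped similarity."""
    return (1.0 - weight) * prob + weight * np.clip(sim, 0.0, 1.0)

# Toy example: 2-D embeddings for four voxels; the first two are annotated.
features = np.array([[1.0, 1.0, 0.0, 1.0],
                     [0.0, 0.0, 1.0, 0.0]])
mask = np.array([True, True, False, False])
prob = np.array([0.9, 0.8, 0.1, 0.4])  # network output; voxel 3 is a weak vessel

proto = vessel_prototype(features, mask)
sim = cosine_similarity_map(features, proto)
enhanced = enhance(prob, sim)
```

In the toy data the last voxel is unlabeled and weakly predicted, but its embedding matches the prototype, so the fused score rises above 0.5. That is the kind of gap-filling that would help trunk continuity, under the assumptions above.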

Q5: Validity of comparison methods. (R3)
A5: Our comparison methods are valid for supporting the good performance of our method. 1) They are novel partial annotation-related weakly supervised methods, and they also incorporate techniques such as pseudo-labeling and prediction ensembling, as we do. 2) To the best of our knowledge, existing vessel segmentation methods are not directly related to partial annotation, as our work is the first method for coronary artery segmentation using partial annotation. For example, method [17] is concerned with cross-anatomy domain adaptation.

Q6: About more details of the data. (R2)
A6: Due to the double-blind requirement, detailed information about the data was withheld but will be released along with the code.

Q7: Labeling vessels under PVA. (R1, R2)
A7: 1) Only several vessels were labeled in each image from their ostia to distal ends. 2) Vessels of greater clinical significance were more likely to be annotated.

Thank you again for careful reading.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    In their rebuttal, the authors addressed several of the comments raised by the reviewers. Some concerns remain about error accumulation and it would be good to at least discuss this issue briefly in the camera-ready version. However, this is a limitation of many weakly or semi-supervised methods, and not specific to the current approach. Overall, the work has sufficient merit for MICCAI and might be of interest to its audience.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors propose a progressive weakly supervised learning framework to achieve accurate segmentation under partial vessels annotation (PVA). The work addresses a relevant problem and has achieved good performance. The authors have responded well to most of the reviewers’ concerns. In particular, they have clarified the methodological concerns. However, there remains one weakness, which is the lack of nnU-net as the baseline. Based on my experience, there is a good chance that nnU-net performs much better than the competing methods included in the experiments. Overall, the strength outweighs the weakness, and the paper can be accepted.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper presents a weakly supervised coronary artery segmentation framework based on partial vessel annotation. Overall an interesting paper, with strong practical relevance. However, there are concerns even after rebuttal due to limitations in baselines and the confirmation bias issue of pseudo-labeling. Recommendation is to reject.


