
Authors

Zhixing Zhang, Ziwei Zhao, Dong Wang, Shishuang Zhao, Yuhang Liu, Jia Liu, Liwei Wang

Abstract

Automatic labeling of coronary arteries is an essential task in the practical diagnosis process of cardiovascular diseases. For experienced radiologists, the anatomically predetermined connections are important for labeling the artery segments accurately, while this prior knowledge is barely explored in previous studies. In this paper, we present a new framework called TopoLab which incorporates the anatomical connections into the network design explicitly. Specifically, the strategies of intra-segment feature aggregation and inter-segment feature interaction are introduced for hierarchical segment feature extraction. Moreover, we propose the anatomy-aware connection classifier to enable classification for each connected segment pair, which effectively exploits the prior topology among the arteries with different categories. To validate the effectiveness of our method, we contribute high-quality annotations of artery labeling to the public orCaScore dataset. The experimental results on both the orCaScore dataset and an in-house dataset show that our TopoLab has achieved state-of-the-art performance.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43990-2_71

SharedIt: https://rdcu.be/dnwMv

Link to the code repository

N/A

Link to the dataset(s)

https://github.com/zutsusemi/MICCAI2023-TopoLab-Labels/


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper at hand proposes an automatic multi-stage pipeline for coronary artery labeling. In a first step, the method extracts image features using an image encoder and maps these features onto the centerlines. Second, these features are encoded for each subsegment/branch of the coronary tree with a Transformer. Next, the individual segments interact through a graph convolutional network, and finally they are passed through an anatomical-connections (AC) classifier, which ensures that the final predictions adhere to the anatomical definition of the subsegments. All aforementioned components are novel contributions to the field. Additionally, the authors promise to release annotations on a public data collection for this task.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • All the steps of this multi-stage pipeline are well motivated and reasonable
    • The performance on the task at hand is reported to be superior to related work algorithms
    • Providing annotations on a public data collection for this task would be a huge step towards being able to compare methods
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Validity of the results: This is the most severe weakness and already justifies a reject. The authors report precision, recall and F-score, with the latter being the harmonic mean of the others. However, calculating the harmonic mean of the reported precision and recall yields different values for the F-score than those reported. Therefore, the results stated in the paper are not reliable.
    • Clinical motivation: The authors state on multiple occasions that the accurate labeling of the coronary arteries is a crucial part of diagnosis. This statement is not backed by any reference, and from my understanding of this subject the main purpose of centerline labeling is reporting where potential lesions are located. Furthermore, the subset of coronary subsegments predicted does not adhere to the segments defined by the American Heart Association (AHA), which is, to the best of my knowledge, the reference standard for subdividing the coronary artery tree. For example, the RCA is also divided into a proximal, mid and distal part, which enables exactly the task behind the clinical motivation: accurately reporting the location of potential lesions. To my surprise, recent related work also skips this subdivision. Still, to me this lessens the clinical value of the paper and of the annotations provided within the scope of this paper.
    • Dependency on centerline extraction: The centerline extraction method used in this work is not reported to have been used on this exact task before. From my understanding, this relatively simple thinning approach is likely to overemphasize larger vessels and neglect smaller side branches. However, as the actual performance on this task is not reported, this remains an open question. Additionally, if one assumes a powerful centerline extractor, the anatomically defined sub-branches may bifurcate without leading to a new anatomical label for each of the downstream vessels. The algorithm presented does not support this case, hence I conclude that this might be a future problem.
    • Performance gap between related work and reimplementation results: There is a huge gap between the performance of the related work publications on their respective data collections and their performance reported in this paper. E.g., the paper in Reference [1] originally reported an F1 of 0.95, but only achieves 0.82 and 0.88 on this paper's private dataset and the orCaScore dataset, respectively.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    As the metrics reported are not valid they cannot be reproduced.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Please double check your evaluation script and how you calculate your metrics, as they are off. Also, consider adapting the clinical motivation behind your approach and actually labeling based on the AHA segments. Furthermore, consider reporting the quality of your centerline extraction approach, as it severely impacts the downstream task of labeling.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    2

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    As the results reported are not reliable/valid the paper cannot be published.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    3

  • [Post rebuttal] Please justify your decision

    The authors addressed the main points raised by me insufficiently. Regarding the metrics not adding up, they state that the mean performance over all segments is reported. While this line of reasoning may hold for the results stated in the paper itself, it does not hold for the supplementary material, specifically Table 2 for the SAN, AM, RPDA, RI and RVC segments. While the rebuttal makes good and convincing points on why there is a gap between the performance of related work on the original data and the data leveraged in this study, the authors do not state that this reasoning will be added to the paper, which was my main query. The same neglect of adapting the paper despite valid concerns holds for the clinical motivation. In the rebuttal, the authors agreed that they are merely providing the first step, and that subsequent manual subdivision needs to be performed by the physician. This is in line with my raised point, and therefore this limitation should be acknowledged in the manuscript. Furthermore, if the centerline extraction performance is argued with a prior manual segmentation, this reduces the clinical applicability of the method, as one can only apply it if a tedious prior segmentation is performed.



Review #2

  • Please describe the contribution of the paper

    The paper presents a novel framework called TopoLab for topology-preserving automatic labeling of coronary arteries. It incorporates anatomical connections into the network design explicitly using a hierarchical feature extraction module and an anatomy-aware connection classifier. The paper also contributes high-quality annotations for the orCaScore dataset, which will be released publicly. TopoLab demonstrates improved performance over previous state-of-the-art methods, particularly in topology-related metrics.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    A few things:

    • The TopoLab framework introduces a new approach to incorporating anatomical connections into the network design explicitly: a hierarchical feature extraction module and an anatomy-aware connection classifier enable the network to preserve topology by design, which is novel.

    • Unlike previous methods that classify each segment independently, the proposed anatomy-aware classifier performs classification for every connected segment pair, effectively prioritizing anatomically predetermined connections.

    • The proposed method's improved performance in artery labeling, especially in topology-related metrics, suggests its potential for enhancing the diagnosis of cardiovascular diseases in clinical settings.

    • High-quality dataset contribution: the paper contributes high-quality annotations for the dataset, which will be made publicly available.

    • The experimental results on both the public dataset and an in-house dataset demonstrate that this method outperforms previous state-of-the-art methods, showcasing its effectiveness and its potential for real-world clinical impact.

    • The paper is well-structured, making it easy to follow and understand. The visualizations provided are also helpful, making it simpler to grasp the concepts and the results. Overall, it adds to the readability and appeal of the work.

    • Supplementary materials are attached.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    No major weakness; overall I really enjoyed the paper. Just one point, which concerns this and many other publications: while the paper demonstrates the potential of TopoLab in the context of coronary artery labeling, there is no direct evidence of how the method would perform in real-world clinical settings or how it would impact clinical workflows and evaluations. Further validation in clinical practice would be necessary to confirm its feasibility and utility.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The reproducibility of the paper is strong based on the provided information in the reproducibility checklist. The authors have addressed most of the crucial aspects, which includes:

    • Clear descriptions of the software framework, assumptions, and mathematical setting, algorithm, and/or model.
    • Detailed information about the datasets, including relevant statistics, and links to downloadable versions if public.
    • Code availability, including specification of dependencies
    • Comprehensive reporting of experimental results, such as hyper-parameter selection and sensitivity analysis
  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    • The TopoLab framework effectively incorporates anatomical connections into the network design, which sets it apart from previous studies. The hierarchical feature extraction module and the anatomy-aware connection classifier work together to improve labeling performance while preserving anatomical topology. This is an innovative approach with strong potential for clinical applications; more clinical evaluations could potentially be provided.

    • The method is comprehensively evaluated against existing state-of-the-art methods on both the orCaScore and in-house datasets, demonstrating its superior performance in terms of recall, precision, and F1 score. Additionally, the introduction of the viola and violac metrics provides valuable insights into the topological accuracy of the method.

    • However, it would be beneficial to include a description of results with central tendency and variation (e.g., mean and error bars) as well as an analysis of statistical significance of reported differences in performance between methods to further support your findings.

    • The paper exhibits a strong focus on reproducibility, addressing most aspects of the reproducibility checklist which is fantastic and very valuable for community. Providing the code, models, and dataset links, along with detailed explanations of the methods and experimental setup, will significantly aid others in reproducing and building upon your work. Additionally, your contribution to the orCaScore dataset with high-quality annotations is commendable and will be valuable to the research community.

    • One wish/remark: while the paper presents the strengths and successes of TopoLab, it would be helpful to include an analysis of situations in which the method failed, along with a discussion of possible reasons and potential improvements. This information would provide a more balanced view of the method and offer valuable insights for future research.

    Overall, the paper presents a novel and promising approach to automatic labeling of coronary arteries while preserving anatomical connections.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I recommend accepting this paper based on the following major factors:

    • Novel approach: The TopoLab framework introduces a unique and innovative method for incorporating anatomical connections into the network design, setting it apart from previous studies and demonstrating its potential for clinical applications.

    • The paper thoroughly evaluates and compares TopoLab with existing state-of-the-art methods, showcasing its superior performance in terms of recall, precision, F1 score, and topological accuracy, as indicated by the viola and violac metrics.

    • The authors contribute high-quality annotations for the orCaScore dataset, which will be valuable to the research community and encourage further development in the field.

    • The paper provides a comprehensive and transparent description of the methods, experimental setup, and resources, making it easier for other researchers to reproduce and build upon the work.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    Labelling coronary arteries is an important step in both clinical and research contexts. This work proposes an automated labelling method which, for the first time, incorporates prior knowledge of the valid topologies of coronary arteries to ensure the generated labels are physiologically accurate.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This method for the first time integrates information on the valid topologies of the coronary arteries, such that the generated labels follow the conventions of how sub-branches are labelled. The results show significant improvement compared to the prior state of the art.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    It should be noted that while rare, in some cases patients will have abnormal topologies where a branch can originate from different parent vessels than expected.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The manuscript presents details of the implementation and parameters used. At least some of the dataset is publicly available, and the authors state that the additional ground-truth data used for this manuscript will be made publicly available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    It might be useful to discuss how the model performs if there are any abnormal features, which would likely be present in 1200 CTA scans.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The methodology is sensible and well validated, showing clearly better performance than prior state-of-the-art methods.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The authors propose a method for coronary artery segment labeling with topological constraints. The authors evaluate their method in two datasets and intend to publish their annotations to extend an existing public data set. Overall a nice paper, but there is some confusion w.r.t. the evaluation metrics that should be cleared up in a rebuttal.

    Strengths

    • The authors extend a public dataset and intend to publish their data.
    • Interesting integration of image-based feature extraction, graph neural networks, and topological regularization.
    • Well written paper.

    Weaknesses

    • The reported numbers don’t add up (Rev. 1). It’s quite obvious from the results that the F1 score is not the harmonic mean of the recall and precision. The authors should explain clearly how these numbers are computed. For example, are they averages of sample-wise precision, recall and F1-scores?
    • The kind of model-driven labeling that the authors propose has been previously proposed, e.g., by Yang et al. This in itself is not very novel, but the combination with deep learning is.
    • A weakness of the method is its dependence on coronary tree extraction from segmentation masks using thinning. It would be good to discuss the effect of the centerline extraction algorithm on the performance of the labeling algorithm.

    In the rebuttal, the authors should specifically try to clear up the confusion w.r.t. the reported metrics. I.e., how were these values determined? Moreover, it would be good to address:

    • centerline extraction
    • the gap between reported numbers for existing methods and numbers reported in original publications




Author Feedback

We thank all reviewers for their valuable comments and address the issues below.

R1/MR) Q1 “F1-score reported is not the harmonic mean of precision and recall.” As explained in Section 4.1 (Evaluation Metrics) of the main paper, precision, recall and F1-score are first calculated for each segment category separately. The reported metrics are then derived as the mean of these per-category metrics, weighted by the number of segments belonging to each category. Consequently, the reported F1-score does not exhibit a harmonic-mean relationship with the reported precision and recall. Therefore, we are confident that the results presented in the paper are valid and reliable.

Q2 “Performance gap between related work and reimplementation results.” The performance gap can be attributed to several key factors. a) Our paper addresses a more complex task than previous studies, focusing on a challenging 14-category classification task. This surpasses the scope of studies like TreeLab-Net and CPR-GCN, which only consider 10 and 11 categories, respectively. b) The public orCaScore dataset has only 72 CCTA images in total, which constrains the amount of data available for training and naturally impacts the overall performance. c) The anatomical structures of the coronary arteries in our in-house dataset exhibit higher complexity than in previous studies. For instance, our in-house dataset has an average of 23.0 vessel segments, while CPR-GCN’s dataset has an average of only 13.2 segments.

Q3 “Clinical motivation.” According to recently published authoritative guidelines on CCTA, such as CAD-RADS 2.0 (released in July 2022), which provides a standardized reporting template for radiologists, the description of arteries is accurate to the RCA without further subdivision into proximal, mid and distal parts. In clinical practice, radiologists usually label vessels on anatomical grounds first, based on which vessels are reconstructed and displayed as complete entities. Then some of the recognized vessels, like the RCA, are divided into 2 or 3 segments to describe the location of lesions. Therefore, the vessel labeling strategy in our paper and previous studies holds clinical value in applications such as report generation and image reconstruction. It also serves as a foundation for further vessel subdivision, as mentioned in your review.

Q4 “Dependency on centerline extraction.” Thanks for the comments. Firstly, as stated in Section 3.1 (Overview), the centerline extraction approach employed in our paper is applied to vessel segmentation annotations for vessel labeling, thus preserving all vessel structures, including both larger and smaller vessels. Secondly, we have evaluated multiple centerline extraction algorithms, including the thinning method in the paper, the skeletonization method in clDice (CVPR 2021) and the model-driven method in PointScatter (ECCV 2022), and found that the downstream labeling performance remained consistent across these methods (±1.0% on F1).

R2) We appreciate the valuable comments. We acknowledge the importance of conducting validations in clinical practice, and we will prioritize this aspect in our future work. The central tendency of the results and the analysis of statistical significance will also be supplemented in our revised paper.

R3) “Abnormal topologies.” Thanks for the valuable comments. In Section 3.3 (Inference), we outlined our approach of selecting the connection with the highest confidence score among all segment connections that cover the given segment. This selected connection then determines the predicted category for the specific segment. This method helps mitigate incorrect predictions caused by abnormal topologies. However, we acknowledge that the model’s performance may be lower on rare cases with abnormal topologies, due to design constraints in TopoLab and the scarcity of training samples with abnormal topologies. We will attempt to address this limitation in our future work.
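The weighted-averaging scheme described in Q1 can be made concrete with a small numerical sketch. The class names, counts and supports below are hypothetical, chosen only to illustrate why a support-weighted F1 is generally not the harmonic mean of the support-weighted precision and recall:

```python
def prf1(tp, fp, fn):
    """Per-class precision, recall, and F1 from raw counts."""
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    return p, r, 2 * p * r / (p + r)

# Hypothetical (TP, FP, FN) counts for two artery categories.
counts = {"LAD": (8, 2, 0), "RPDA": (1, 0, 4)}

# Support = number of ground-truth segments per category (TP + FN).
support = {k: tp + fn for k, (tp, fp, fn) in counts.items()}
total = sum(support.values())

# Support-weighted averages of the per-class metrics.
P = sum(support[k] * prf1(*counts[k])[0] for k in counts) / total
R = sum(support[k] * prf1(*counts[k])[1] for k in counts) / total
F1 = sum(support[k] * prf1(*counts[k])[2] for k in counts) / total

harmonic = 2 * P * R / (P + R)
# Weighted F1 (~0.675) differs from the harmonic mean of the
# weighted precision and recall (~0.774).
print(round(P, 3), round(R, 3), round(F1, 3), round(harmonic, 3))
```

With these illustrative counts, the averaged F1 comes out noticeably below the harmonic mean of the averaged precision and recall, which matches the kind of discrepancy Reviewer 1 observed in the tables.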




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    In their rebuttal, the authors have clarified the discrepancy between the precision, recall and F1 scores. It would be good to also clarify this more in the text of the camera-ready version (also in the caption). It would also be good to discuss the discrepancy between reported results and ‘original’ results of baseline methods, as is done in the rebuttal. Overall, the method and the dataset that the authors intend to share are of interest to the community, and I consider the work to be a good MICCAI contribution of sufficient quality.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    I have read the comments and rebuttal. This paper is about coronary artery segment labeling with topological constraints. Most of the concerns raised by the reviewers have been addressed, e.g., metrics not adding up and performance differences. Related to R1’s concerns, I suggest the authors include the explanations presented in the rebuttal into the revised paper.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The paper has an overall high quality and good clinical potential, including applications in other imaging modalities where topology needs to be preserved.


