Authors

Weihao Yu, Hao Zheng, Yun Gu, Fangfang Xie, Jiayuan Sun, Jie Yang

Abstract

Pulmonary airway labeling identifies anatomical names for branches in bronchial trees. These fine-grained labels are critical for disease diagnosis and intra-operative navigation. Recently, various methods have been proposed for this task. However, accurate labeling of each bronchus is challenging due to the fine-grained categories and inter-individual variations. On the one hand, training a network with limited data to recognize multitudinous classes sets an obstacle to the design of algorithms. We propose to maximize the use of latent relationships by a transformer-based network. Neighborhood information is properly integrated to capture the priors in the tree structure, while a U-shape layout is introduced to exploit the correspondence between different nomenclature levels. On the other hand, individual variations cause the distribution overlapping of adjacent classes in feature space. To resolve the confusion between sibling categories, we present a novel generator that predicts the weight matrix of the classifier to produce dynamic decision boundaries between subsegmental classes. Extensive experiments performed on publicly available datasets demonstrate that our method can perform better than state-of-the-art methods.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43990-2_37

SharedIt: https://rdcu.be/dnwLS

Link to the code repository

https://github.com/EndoluminalSurgicalVision-IMR/AirwayFormer

Link to the dataset(s)

https://github.com/yuyouxixi/airway-labeling/tree/main

Reviews

Review #1

Please describe the contribution of the paper

The paper proposes a Transformer-based airway anatomical labelling model. The model takes branch features and predicts branch labels bi-directionally at three hierarchies: 1) lobar labels, 2) segmental labels, and 3) subsegmental labels. The pairwise branch neighbouring information is encoded in the Transformer attention module. A learnable weight generator is encoded in the architecture to generate key and value vectors of the Transformer block for subsegmental label prediction from the segmental category to address the inter-subject variability issue.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The proposed model is novel in the following ways:
1. A novel way to encode pairwise branch distance information in the self-attention module in the Transformer block;
2. A novel way to encode the hierarchical labelling by using a U-shape prediction schema while concatenating the results from both ends of the U-shape to keep consistency;
3. A novel way to address the inter-subject variability by encoding the segmental-subsegmental anatomical relationship in the weight generator.
The evaluation is well designed, and the qualitative and quantitative experiments show a clear performance improvement of the proposed method over state-of-the-art methods, including: 1) GNN-based methods, 2) HGNN-based methods, 3) conventional methods, and 4) other transformer-based methods. The ablation study also shows a clear advantage of encoding all proposed components among the three different hierarchies.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The main weakness of the paper is the need for more clarity on some of the technical details, e.g., the construction of the SPD matrix and the codebook.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The description seems to be very clear and reproducible.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
- Section 2.1. The codebook is mentioned in constructing the self-attention module using the pairwise branch distances, however, the exact definition of the codebook is not found in the main paper or in the supplementary material.
- Section 2.1. The SPD matrix is calculated using the shortest path distance between each pair of branches, but it’s unclear whether this distance is calculated in the feature space or using the spatial distance between branches in the original image. More clarification should be given.
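One reading of the SPD matrix discussed above is a hop count between branches along the airway tree itself, rather than a distance in feature space or image space. The sketch below illustrates only that reading; the source does not confirm which definition the paper uses, and the function name `shortest_path_distances` and the toy tree are hypothetical.

```python
from collections import deque


def shortest_path_distances(num_branches, edges):
    """Return an N x N matrix of hop counts between branches of a tree.

    Assumption: each branch is a node and each (parent, child) pair is an
    undirected edge, so the distance is the number of edges on the path.
    """
    adjacency = [[] for _ in range(num_branches)]
    for parent, child in edges:
        adjacency[parent].append(child)
        adjacency[child].append(parent)

    spd = [[-1] * num_branches for _ in range(num_branches)]
    for source in range(num_branches):
        # Breadth-first search from every branch; -1 marks "not visited yet".
        spd[source][source] = 0
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if spd[source][neighbor] == -1:
                    spd[source][neighbor] = spd[source][node] + 1
                    queue.append(neighbor)
    return spd


# Hypothetical toy tree: branch 0 is the root with children 1 and 2,
# and branch 1 has children 3 and 4.
toy_edges = [(0, 1), (0, 2), (1, 3), (1, 4)]
spd = shortest_path_distances(5, toy_edges)
```

Under this reading, sibling branches 3 and 4 are two hops apart, while branches 2 and 3 are three hops apart because the path passes through the root.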
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

7
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The novelty of the paper is very clear and the whole architecture is clearly described. The experiments are well designed and clearly discussed.
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #4

Please describe the contribution of the paper

This article proposes an end-to-end architecture to label the pulmonary airways. Three contributions are developed: a self-attention mechanism called neighborhood information encoding (NIE), a U-shape architecture and a boundary-adaptive classifier. The proposed network consists of transformer blocks with NIE, which includes a prior on the spatial relationship of each branch. The classification task is split into three levels: the lobar, segmental and subsegmental branches. A classifier is added after each of the first three transformer blocks to classify the three levels. In order to refine the labels of the upper branches using information from the lower branches, the authors add two more transformer blocks and their corresponding classifiers, forming a global U-shape. Finally, the weights of the subsegmental classifier are computed based on the features of the following transformer block. The method is validated on a public dataset and compared with several other methods, and an ablation study is conducted.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The motivation of the paper is clear and makes sense
- The NIE is a smart way to integrate a structural prior
- The boundary adaptive classifier formulation is interesting
- The method is compared to many other approaches and outperforms them
- The ablation study shows that the 3 novelties contribute to the good results
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- Despite the clear writing style employed in the article, the confusing organization and nomenclature of ideas, coupled with the lack of important details, make the content difficult to comprehend (see detailed comments).
- The results lack standard deviations. Even though the authors state that they performed a 4-fold cross-validation, they did not report the standard deviations of the results. This prevents the reader from evaluating the variability of the proposed approach across folds.
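The reporting requested above amounts to summarizing each metric over the four folds as a mean and a standard deviation. A minimal sketch follows; the helper `summarize_folds` is hypothetical and the fold scores are made-up placeholders, not results from the paper.

```python
from statistics import mean, stdev


def summarize_folds(fold_scores):
    """Return (mean, sample standard deviation) of one metric over folds."""
    return mean(fold_scores), stdev(fold_scores)


# Placeholder accuracies for four folds; NOT values from the paper.
placeholder_accuracy = [90.0, 92.0, 91.0, 93.0]
avg, std = summarize_folds(placeholder_accuracy)
print(f"accuracy: {avg:.2f} +/- {std:.2f}")
```

`statistics.stdev` computes the sample standard deviation (dividing by n - 1), which is the usual choice when the folds are treated as a sample of possible data splits.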
Please rate the clarity and organization of this paper

Poor
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

There are several unclear aspects in the authors’ methodology description. This prevents the article from being fully understood and thus reproduced. I recommend that the authors provide the code associated with this article.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
- The input of the proposed method should be described in more detail. The authors say “Following previous work, each branch is regarded as a node in a graph space with well-designed 20-dimensional features.” I agree that a complete description could be out of scope; however, the authors should provide some details on how this 20-dimensional space is obtained in this article, especially as the previous article is not open access.
- To improve clarity and facilitate understanding, I suggest that the authors clearly identify their three contributions at the beginning of the article and maintain consistent naming throughout. As it stands, it was necessary for me to review the article several times to identify the distinct contributions and their corresponding descriptions. For instance:
  - NIE module and the U-shape architectures are not clearly identified separately in the introduction. Both should also be separated in the method section. Moreover, the exact composition of the NIE module is never explicitly presented, nor is it illustrated in Fig1.
  - The third contribution is referred to alternatively as the “novel/weight generator” or the “boundary-adaptive classifier”.
- The NIE module description is confusing
  - Eq (1): In classic attention modules, the matrices Q and K are obtained by Q=X_m.Wq and K = X_m.Wk, where Wq and Wk are the learned projection matrices. Thus the attention A is A = QK^T/sqrt(d). I am uncertain whether the authors employed a different convention or did something different. In any case, I recommend that the authors either harmonize their notation with the literature or provide an explanation for these differences.
  - The authors should explain in detail what the codebook C_m is. What is its purpose? Why not use D directly?
- Eq (3): M is equal to 5 in this article. However the bounds of m in Eq (3) are M/2 whereas m is an integer. The authors should clarify this point.
- If I understood correctly, each classifier is binary. I assume that the final prediction is the concatenation of the 3 classes. If so, the authors should explain what is done when predictions overlap, for example if a node is classified as both segmental and subsegmental.
- The terms “coarse” and “fine” are used interchangeably throughout the article to describe the airway level, network level, and feature level, which leads to significant confusion.
  
  The first paragraph of page 5 is particularly difficult to follow due to this inconsistency. I strongly recommend that the authors revise the article to clarify this point. In particular, I still do not understand, regarding the network level, if the output of the last transformer block is supposed to be fine (as the last layer), or coarse (as the classification of the lobar branches).
- On page 5, the authors mention that they stop the back-propagation from the segmental level to the subsegmental level. However, this statement lacks crucial details. Specifically, it is unclear which weights in the network architecture are actually updated during training. I suggest that this information should also be more clearly illustrated in Figure 1. Additionally, it seems likely that the frozen weights were initially updated for a few epochs before being frozen. It would be helpful for the authors to provide a more detailed description of this process.
- Eq (9): It seems that the cross entropy loss is computed on different prediction / ground truth depending on the value of m. To eliminate any confusion or ambiguity, I recommend that the authors clarify the inputs of L_ce within this equation.
- Fig1: green and blue seem to be inverted; the description in the text should be included or summarized in the caption
- Fig2 should include the notations of the article to improve readability. In particular: X_m, P_m, G_k, w, A, D/cm(D)
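The Eq (1) and codebook comments above can be made concrete with a small sketch: classic scaled dot-product attention, A = QK^T/sqrt(d), to which a bias looked up from a table indexed by the branch distance D is added before the softmax. The additive lookup-table form is an assumption made for illustration; the source does not state how the paper defines the codebook C_m, and every name below (`distance_biased_attention`, `codebook`, the toy matrices) is hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax of one row of logits."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def distance_biased_attention(x, w_q, w_k, spd, codebook):
    """Attention weights from QK^T/sqrt(d) plus a distance-indexed bias.

    Assumption: `codebook` maps an integer distance to a scalar bias, and
    distances beyond the table are clipped to its last entry.
    """
    q = matmul(x, w_q)
    k = matmul(x, w_k)
    d = len(q[0])
    k_transposed = [list(col) for col in zip(*k)]
    logits = matmul(q, k_transposed)
    last = len(codebook) - 1
    weights = []
    for i, row in enumerate(logits):
        biased = [
            value / math.sqrt(d) + codebook[min(spd[i][j], last)]
            for j, value in enumerate(row)
        ]
        weights.append(softmax(biased))
    return weights


# Toy inputs: 3 branches on a chain, 2-dimensional features, identity projections.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
spd = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
codebook = [0.0, -1.0, -2.0]  # placeholder biases; learned in a real model
attention = distance_biased_attention(x, identity, identity, spd, codebook)
```

One plausible motivation for a learned table over using D directly is that the model can then choose how strongly each distance should raise or lower attention, instead of being tied to a fixed function of the raw distance.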
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

6
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The contributions of the article are interesting and the method yield good results compared to SOTA.
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #2

Please describe the contribution of the paper

The task of this paper is automatic airway labeling, which aims to assign the corresponding anatomical names to the branches in airway trees. The paper proposes a transformer-based U-shape neural network to exploit the relationships between different tree levels. Furthermore, dynamic decision boundaries are introduced to improve accuracy under fine-grained category variations.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The motivation and organization of this paper are clear. The technical details are easy to read.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

This work relies heavily on the previous work [21]: the input data of this paper is prepared by the previous work [21], but the input content is not explained well in this paper. It is difficult to understand how the input aligns with the structure-aware network.
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The technical details are explained well. With the input from the previous work [21], the results should be reproducible.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

The input content is from the previous work [21] but is not explained well in this paper. It is confusing why each node can be classified at both the lobar level and the segmental level. What is the receptive field for each node? Adding a prerequisites section would be preferable.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

4
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

This paper is easy to read and the motivation is sound. However, I have one concern about the structure-aware network. It is not intuitive that the U-shape attention-based network can distinguish the airway anatomical levels.
Reviewer confidence

Somewhat confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

This paper proposes a transformer architecture for airway labeling, utilizing the attention mechanism to encode spatial priors.

The reviewers are largely positive about this paper, finding the architecture well designed, its performance convincing and its modelling foundations interesting.

The reviewers also have some concerns regarding writing, data description and the addition of standard deviations to results. The authors should make sure to address these concerns when preparing the final version of the paper.

Author Feedback

N/A


AirwayFormer: Structure-Aware Boundary-Adaptive Transformers for Airway Anatomical Labeling