Authors

Zhonghang Zhu, Qichang Chen, Lequan Yu, Lianxin Wang, Defu Zhang, Baptiste Magnier, Liansheng Wang

Abstract

Hip fractures are a common cause of morbidity and mortality and are usually diagnosed from X-ray images in clinical routine. Deep learning has achieved promising progress in automatic hip fracture detection. However, for fractures in which the displacement is not obvious (i.e., non-displaced fractures), a single-view X-ray image provides only limited diagnostic information, and integrating features from cross-view X-ray images (i.e., Frontal/Lateral-view) is needed for an accurate diagnosis. Nevertheless, finding reliable and discriminative cross-view representations for automatic diagnosis remains technically challenging. First, it is difficult to locate discriminative task-related features in each X-ray view due to the weak supervision of image-level classification labels. Second, it is hard to extract reliable complementary information between different X-ray views as there is a displacement between them. To address the above challenges, this paper presents a novel cross-view deformable transformer framework to model relations of critical representations between different views for non-displaced hip fracture classification. Specifically, we adopt a deformable self-attention module to localize discriminative task-related features for each X-ray view using only the image-level label. Moreover, the located discriminative features are further used to explore correlated representations across views, with the query of the dominant view as guidance. Furthermore, we build a dataset including 768 hip cases, in which each case has paired hip X-ray images (Frontal/Lateral-view), to evaluate our framework for the non-displaced fracture and normal classification task. Experimental results demonstrate the effectiveness of the deformable self-attention module and the designed cross-view correlation exploration mechanism.
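
The idea of using one view's query to gather complementary features from the other view can be illustrated with a minimal sketch. This is not the authors' implementation: it uses plain scaled dot-product cross-attention rather than deformable attention, has no learned projections, and the names `cross_view_attention`, `frontal`, and `lateral` are hypothetical. It assumes the frontal view is the guiding view and that each view has already been reduced to a list of feature vectors of equal dimension.

```python
import math
from typing import List

Vector = List[float]


def softmax(xs: Vector) -> Vector:
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_view_attention(frontal: List[Vector], lateral: List[Vector]) -> List[Vector]:
    """Each frontal token acts as a query over the lateral tokens.

    Assumption for illustration: lateral tokens serve as both keys and
    values, and the attended lateral feature is added to the frontal
    token as a residual.
    """
    dim = len(frontal[0])
    scale = 1.0 / math.sqrt(dim)
    fused = []
    for q in frontal:
        # Similarity of this frontal query to every lateral token.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in lateral]
        weights = softmax(scores)
        # Weighted mix of lateral tokens, one value per feature dimension.
        mixed = [sum(w * v[d] for w, v in zip(weights, lateral)) for d in range(dim)]
        # Keep the frontal feature and add the lateral complement.
        fused.append([qi + mi for qi, mi in zip(q, mixed)])
    return fused
```

The output has one fused vector per frontal token, so the frontal branch keeps its own spatial layout while absorbing information from the lateral view.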

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43987-2_43

SharedIt: https://rdcu.be/dnwJY

Link to the code repository

N/A

Link to the dataset(s)

N/A

Reviews

Review #3

Please describe the contribution of the paper

The paper proposes a cross-view deformable transformer framework for non-displaced hip fracture classification from paired hip X-ray images. The authors identify the challenges of the task, such as the low contrast of single-view hip X-rays for fracture detection, especially in non-displaced cases. The methodology builds on existing findings: two views are better than one, correlations between the views are important for fracture detection, and transformers are good at extracting patch dependencies. The proposed framework consists of two view-specific branches with four stages and performs well in bone fracture detection.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

- Clarity. The paper clearly states the background of hip fracture detection. The challenges, existing works, possible directions, and the proposed methods are all neat and detailed.
- Application novelty. Based on the identified challenges, the authors utilize the View-specific Deformable Transformer Network and the Cross-view Deformable Transformer Framework to solve the task. The transformers are well-justified in other papers for similar purposes. And they carry out extensive experiments to show the effectiveness of the proposed model.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

- Lack of original technical novelty. Although the authors utilize a deformable transformer and feature fusion of cross-view X-ray images, the novelty mainly comes from the application of existing methodology. From a disease understanding point of view, are the transformers adequate to address the symptoms of fractures? Is there a better way to formulate the problem, for accurate and exact representation?
- Repeated description. The methodology section contains a lot of repeated information. For example, the idea in the opening paragraph of Section 2.3 has already been stated earlier. Equations 4, 5, 6 and Equations 7, 8, 9 are just repetitions of Equations 1, 2, 3. Is there a more concise way to express them?
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The paper uses a closed-source dataset.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

The task description is clear and reasonable. The authors have identified the main challenges and proposed the view-specific transformer architecture for fracture classification. The methodology part can be improved, and a deeper logic behind the design is desired. Since each task is different, the proposed workflow should be shown to address the specific challenges and hard cases in double-view hip fracture detection. For example, the author could show comparisons between model variations, demonstrating the varied ability to detect hard cases in the ablation study.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The clarity and effectiveness of the proposed method. The authors work on hip fracture detection, addressing challenges with transformer architectures. Detailed description and analysis are provided. Experimental results justify the proposed workflow.
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #2

Please describe the contribution of the paper

This paper presents a cross-view deformable transformer model to classify non-displaced hip fractures using frontal and lateral X-ray images. The proposed method uses a deformable transformer to better localize discriminative features from X-ray images. The lateral view is used to provide complementary features to the dominant frontal view. Results on a private dataset show that the proposed method achieves the best performance in cross-view non-displaced hip fracture classification.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The idea to use lateral view as a complement of frontal view to classify hip fracture is reasonable and effective.
- The proposed method is compared with many state-of-the-art methods in Table 1, which provides a better idea of how well it works.
- The visualization in Fig. 3 shows the regions the model relies on for classification, which are consistent with experts’ diagnosis.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- The hip ROI is cropped manually. While this is acceptable for a small dataset, it will be difficult to scale up to large datasets. This process should be done automatically or semi-automatically using some landmark detection algorithms.
- The dataset used in this study is small, with only 768 paired hip X-ray images. Though the proposed method works well on this dataset, it is questionable whether the proposed method will generalize to larger datasets with more data variations. More details should be provided about the dataset used, such as the age, gender distribution, etc.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Reproducing might be difficult as the paper didn’t provide many implementation details, such as specific hyper-parameters of the model blocks and training hyper-parameters.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

This paper is overall well-written. It could be improved with a fuller description of the dataset and more technical and implementation details.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

This paper proposed a new method for cross-view hip fracture classification and demonstrated its effectiveness with corresponding experimental results. It would be best if the code and the dataset could be released for researchers in the field to reproduce the results.
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #1

Please describe the contribution of the paper

Hip fractures are among the most common fractures encountered in clinical practice, and accurately diagnosing them in X-ray images poses a significant challenge, even for experienced emergency room physicians. In this study, the authors present a novel cross-view deformable transformer classification approach designed to address this crucial and complex task, showcasing promising evaluation results.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. CAD solution for the non-displaced hip fracture diagnosis in X-ray is an important task for clinical practice, especially in the emergency room settings;
2. The proposed framework is technically sound and interesting;
3. The X-ray image dataset was acquired on equipment from multiple imaging manufacturers, which is important for model generalization. Thus, the dataset is valuable;
4. Promising classification performance demonstrated by the experiments;
5. The paper is well-written and easy to follow;
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
1. The authors only evaluate the proposed method on non-displaced fracture cases. A comprehensive CAD solution should cover both non-displaced and displaced fractures;
2. The authors should also report the AUC metric;
Please rate the clarity and organization of this paper

Excellent
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The authors claim that the code and data will be released upon acceptance. The implementation details discussed in the paper look good.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
1. The reviewer understands that the system is designed for detecting non-displaced hip fractures. However, some subtle displaced hip fractures are also hard to detect in a rushed ER environment. As a result, a more comprehensive solution should cover both non-displaced and displaced fractures. The reviewer looks forward to seeing a complete solution in the future;
2. The authors should also report the AUC classification metric. Meanwhile, how did the authors choose the operating point (Youden Index point or other preset cutoff like 0.5)?
3. The authors missed some important prior studies on detecting hip fractures in X-ray. For example: (1) Oakden-Rayner, Lauren, et al. “Validation and algorithmic audit of a deep learning system for the detection of proximal femoral fractures in patients in the emergency department: a diagnostic accuracy study.” The Lancet Digital Health 4.5 (2022): e351-e358. (2) Cheng, Chi-Tung, et al. “A scalable physician-level deep learning algorithm detects universal trauma on pelvic radiographs.” Nature communications 12.1 (2021): 1066.
Discussion: the authors use a relatively small image size for the model input (224 x 224). Will a larger resolution (e.g., 1024 x 1024) help improve the performance?
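
For reference, the AUC metric and the Youden Index operating point mentioned in comment 2 can be computed as in the sketch below. It is illustrative only and not taken from the paper: the function names `roc_auc` and `youden_threshold` are hypothetical, labels are assumed to be 0 (normal) or 1 (fracture), and a case is predicted positive when its score is at or above the threshold.

```python
from typing import List, Tuple


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def youden_threshold(labels: List[int], scores: List[float]) -> Tuple[float, float]:
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    Candidate thresholds are the observed scores; on ties in J the
    lowest threshold is kept.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")
    best_t, best_j = min(scores), -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < t)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j
```

Unlike a fixed cutoff such as 0.5, the Youden point depends on the score distribution of the data it is fitted on, so it would normally be chosen on a validation split rather than on the test set.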
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

6
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The recommendation is based on the novelty of the proposed framework, the clinical importance of the task, and the promising results demonstrated by the experiments.
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

This paper proposes a novel cross-view deformable transformer framework to model key representation relations between different views for non-displaced hip fracture recognition. A dataset including 768 hip cases was also built to evaluate the non-displaced fracture and normal hip classification tasks. The proposed method mainly uses a cross-attention learning scheme to explore the correlation between views, which limits its novelty. In the experimental part, the authors should compare against recently published methods on fracture recognition and add ablation experiments on interactive learning of different view features to verify the effectiveness of the proposed method. Secondly, many reviewers believed that the authors’ experimental setup needed to be revised and that the classification of non-displaced and displaced fractures, data distribution, and ROI acquisition should be added.

Author Feedback

N/A


Cross-view Deformable Transformer for Non-displaced Hip Fracture Classification from Frontal-Lateral X-ray Pair