
Authors

Yi Gu, Yoshito Otake, Keisuke Uemura, Mazen Soufi, Masaki Takao, Nobuhiko Sugano, Yoshinobu Sato

Abstract

We propose a method for estimating the bone mineral density (BMD) from a plain x-ray image. Dual-energy X-ray absorptiometry (DXA) and quantitative computed tomography (QCT) provide high accuracy in diagnosing osteoporosis; however, these modalities require special equipment and scan protocols. Measuring BMD from an x-ray image provides an opportunistic screening, which is potentially useful for early diagnosis. The previous methods that directly learn the relationship between x-ray images and BMD require a large training dataset to achieve high accuracy because of large intensity variations in the x-ray images. Therefore, we propose an approach using the QCT for training a generative adversarial network (GAN) and decomposing an x-ray image into a projection of bone-segmented QCT. The proposed hierarchical learning improved the robustness and accuracy of quantitatively decomposing a small-area target. The evaluation of 200 patients with osteoarthritis using the proposed method, which we named BMD-GAN, demonstrated a Pearson correlation coefficient of 0.888 between the predicted and ground truth DXA-measured BMD. Besides not requiring a large-scale training database, another advantage of our method is its extensibility to other anatomical areas, such as the vertebrae and rib bones.

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16446-0_61

SharedIt: https://rdcu.be/cVRT5

Link to the code repository

https://github.com/NAIST-ICB/BMD-GAN

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper proposes a method to estimate BMD (Bone Mineral Density) from a plain X-Ray image. The proposed approach combines the QCT in training and decomposes an x-ray image into a projection of a bone-segmented QCT.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The main idea of the paper is to estimate BMD from plain X-ray images, which are widely available and more accessible than QCT. Unlike previous methods for BMD estimation from X-ray images, the proposed method uses information from QCT in the training phase.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The paper is a bit hard to follow, due to a lack of detail and a lack of precision in the definition of the equations and variables used. Several steps of the proposed method are not described, e.g. cropping, registration, etc. The original data used for the study are not described. Experiments were conducted on a limited dataset and were not validated on different data.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    No proof of reproducibility. Experiments were conducted on an in-house dataset. Code should be made available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    This paper proposes a method to estimate BMD (Bone Mineral Density) from a plain X-Ray image. The proposed approach combines the QCT in training and decomposes an x-ray image into a projection of a bone-segmented QCT.

    The paper is a bit hard to follow, due to a lack of detail and a lack of precision in the definition of the equations and variables used. Several steps of the proposed method are not described, e.g. cropping, registration, etc. The original data used for the study are not described. Experiments were conducted on a limited dataset and were not validated on different data.

    It is unclear what the GAN brings to the proposed method.

    The structure of the paper could also be improved by restructuring the “Proposed Approach” section into different sections linked to the different steps of the proposed approach.

    Too many variables in the equations are left undefined. Please fix.

    Details are lacking about how the different models were implemented. Were they tuned favorably?

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed method is interesting. The paper is not well written. The paper deserves acceptance if slightly improved for readers.

  • Number of papers in your stack

    4

  • What is the ranking of this paper in your review stack?

    2

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The paper focuses on estimating bone mineral density from plain X-ray images. The main contributions are 1) a greatly improved method for estimating BMD from x-ray images, and 2) a method that combines QCT data and segmentations through hierarchical learning to improve performance.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Novel methods that make use of related data in a different representation. Specifically, the authors use a hierarchical learning approach, which allows them to take advantage of available quantitative CT data. The algorithm and the training approach greatly improve performance. The investigation presents a novel way to use the QCT data: it serves not only to measure ground truth BMD, but also to train a GAN for image translation from x-ray to a synthetic DRR that is useful for BMD assessment.
    • The article is well written
    • The clinical need is well established
    • The authors create a dataset, create a novel algorithm, investigate different training regimens and then demonstrate impressive performance. There is lots of novel and interesting work here
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The dataset could be better described. How much variability is there in vendor, time, and subjects? What image quality is required, particularly for the X-ray? This is an important question and it is not dealt with well.
  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
    • The code is made available.
    • The data is not made available.
    • There is an extensive note about the parameters used in training
    • Reproducibility is likely, dependent on the code released. There are a lot of details in creating the dataset that are probably crucial to reproducing this work, so without the dataset it may be hard to reproduce.
    • The source of the data is not well described: what is the fidelity (resolution, contrast, reconstruction) of the underlying images? This lack of clarity could limit reproducibility.
    • The methods are well described.
  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    • Figure 1 – landmarks are present on some of the 3D bone surfaces, however, these landmarks are not explained.
    • Figure 1 / dataset - It is unclear whether the x-ray images and QCT are matched. I think they are, given the comparisons of BMD derived from DXA and QCT with the synthetic BMD measurements. However, for the GAN training it would not be a requirement that the x-ray and QCT be matched.
      o Comment on: Are all the x-rays taken from a consistent orientation? How much variability is there in the x-ray image orientation and how will this affect the result?
    • Can you comment on the magnitude of the errors in BMD in z-score terms, which is often how osteoporosis is reported?
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    7

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    • Excellent paper, a useful goal, novel implementation, improvement over SOTA, consideration of clinical need.
  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    I congratulate the authors for this great work. The authors proposed a novel hierarchical learning approach to predict bone mineral density (BMD) from plain radiographs which provides opportunistic screening for osteoporosis.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The proposed hierarchical framework was novel to learn a Digital Reconstructed Radiograph (DRR) from X-rays, and then estimate BMD from the corresponding DRR generated using a generative adversarial network (GAN).

    • Extensive validation using different backbone architecture was performed.

    • The paper is very well written and I enjoyed reading through it.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • While it is a strength for the proposed framework to be trained on a small dataset with N=200, this is a weakness when it comes to validation. I would suggest testing your method on larger datasets without paired QCT to see how this may be generalised to other studies.
  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The data do not appear to be available, but most of the code will be available upon acceptance.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    • Please provide the baseline characteristics of the cohort: age, gender, mean BMD T-scores,…

    • The precision of the method is not reported here. If you can collect repeated X-rays from the same subject for a subset of data, for example N=30, you can also compare the computed BMDs and report the coefficient of variation (CV) to reflect the precision of the computational framework. We should have CV<3% for practical use; the lower the better.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    7

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is well presented with a novel formulation, interesting application, and comprehensive results.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The work proposes a method to estimate the bone mineral density (BMD) from a plain x-ray image which addresses an important topic and is suitable for presentation at the 2022 MICCAI meeting. The reviewers are all positive about this paper. After reading the comments raised by the reviewers I recommend conditional acceptance of the work. The authors should try to address the main points raised by the reviewers in the final version of the paper: an improved explanation of the proposed methods, more information on data, evaluation, and the results.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    1




Author Feedback

We thank all the reviewers for their highly constructive and positive feedback. We first respond to the two main comments summarized by the meta-reviewer: (1) an improved explanation of the proposed methods (raised mainly by Reviewer #1) and (2) more information on data, evaluation, and the results (raised by Reviewers #1, #2, #3). We then answer (3) other questions raised by the reviewers.

(1) We will improve the explanation of the proposed methods (specifically the image cropping and 2D-3D registration steps briefly described in Section 2.2 of our initial manuscript) as follows.

[image cropping] The aspect ratio of the original x-ray images varied from 0.880 to 1.228 (width/height). We first split them horizontally in half at the center. Then the side with the target hip was resized to the pre-defined image size (256×512 in this experiment) by aligning the center of the image and cropping the region outside the image after resizing to fit the shorter edge of width and height.

[2D-3D registration] Intensity-based 2D-3D registration using the gradient correlation similarity metric and the CMA-ES optimizer [15] was performed on each patient’s x-ray image and QCT.
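The cropping step in (1) can be sketched as a small geometry calculation. This is an illustration under stated assumptions, not the authors' implementation: the names `plan_hip_crop` and `CropPlan` and the `side` argument (which half of the image contains the target hip) are hypothetical, 256×512 is read as width×height, and "resizing to fit the shorter edge" is interpreted as scaling the half image until it covers the target size before a center crop. The rebuttal does not say whether padding is ever needed, so other readings are possible.

```python
from dataclasses import dataclass
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


@dataclass
class CropPlan:
    half_box: Box                  # half of the original image that is kept
    scale: float                   # resize factor applied to that half
    resized_size: Tuple[int, int]  # (width, height) after resizing
    crop_box: Box                  # center crop inside the resized half


def plan_hip_crop(width: int, height: int, side: str,
                  target_w: int = 256, target_h: int = 512) -> CropPlan:
    """Plan the split / resize / center-crop for one x-ray image.

    `side` is the image half ("left" or "right") holding the target hip.
    Assumption: the half image is scaled to *cover* the target size.
    """
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")

    # Split the image in half at the horizontal center.
    half_w = width // 2
    left = 0 if side == "left" else width - half_w
    half_box = (left, 0, left + half_w, height)

    # Scale so that both edges reach at least the target size.
    scale = max(target_w / half_w, target_h / height)
    resized_w = max(target_w, round(half_w * scale))
    resized_h = max(target_h, round(height * scale))

    # Align the centers and crop whatever falls outside the target frame.
    off_x = (resized_w - target_w) // 2
    off_y = (resized_h - target_h) // 2
    crop_box = (off_x, off_y, off_x + target_w, off_y + target_h)

    return CropPlan(half_box, scale, (resized_w, resized_h), crop_box)


if __name__ == "__main__":
    # Hypothetical image sizes at the two reported aspect-ratio extremes.
    for w, h in [(1760, 2000), (2456, 2000)]:
        print(w / h, plan_hip_crop(w, h, side="right"))
```

Under this reading, images at the wide end of the reported aspect-ratio range lose a strip on the left and right of the half image, while images at the narrow end lose a strip at the top and bottom.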

(2) We will add information about the data, evaluation, and results as follows.

[data in Section 3.1] Ethical approval was obtained from the Institutional Review Board of the institutions participating in this study. The dataset was obtained retrospectively from 200 patients (166 females) who underwent primary total hip arthroplasty between May 2011 and December 2015. The patients’ age and T-scores calculated from the DXA-based BMD of the proximal femur were 59.5 ± 12.9 (min, max: 26, 86) years and -1.23 ± 1.55 (min, max: -5.68, 4.47), respectively. All x-ray images used in this study were acquired in the standing position in the anterior-posterior direction.

[evaluation in Section 3.1] The DXA images of the proximal femur were acquired for the operative side (Discovery A, Hologic Japan, Tokyo, Japan) to obtain the ground truth DXA-BMD.

[results in Section 3.3 (following the comments raised by Reviewer #2)] We also evaluated the prediction error in terms of T-scores. The T-score was calculated based on the mean and standard deviation of DXA-BMD for Japanese young adult women reported in the literature (proximal femur: 0.875 ± 0.100 g/cm² [1]). We found the absolute error in T-score for HRFormer w/ HL to be 0.53 ± 0.47.

[1] Soen S, et al. (2013) Diagnostic criteria for primary osteoporosis: year 2012 revision. J Bone Miner Metab 31:247–257.
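The T-score conversion described in (2) is a standardization against the young-adult reference. The sketch below uses the reference mean and standard deviation quoted above (0.875 ± 0.100 g/cm²). The function names and the example BMD values are hypothetical and are not taken from the study.

```python
# Reference DXA-BMD of the proximal femur for Japanese young adult women,
# as quoted in the rebuttal (g/cm^2).
REF_MEAN = 0.875
REF_SD = 0.100


def t_score(bmd: float, ref_mean: float = REF_MEAN, ref_sd: float = REF_SD) -> float:
    """Number of reference standard deviations that `bmd` lies from the reference mean."""
    return (bmd - ref_mean) / ref_sd


def t_score_abs_error(predicted_bmd: float, true_bmd: float) -> float:
    """Absolute prediction error expressed in T-score units."""
    return abs(t_score(predicted_bmd) - t_score(true_bmd))


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    true_bmd, predicted_bmd = 0.752, 0.800
    print(round(t_score(true_bmd), 2))
    print(round(t_score_abs_error(predicted_bmd, true_bmd), 2))
```

Because the reference mean cancels in the difference, an absolute error in T-score equals the absolute BMD error divided by 0.100 g/cm². For a single prediction, an error of 0.53 in T-score therefore corresponds to 0.053 g/cm².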

(3) Other questions raised by the reviewers

Q1 (Reviewer #2): The landmarks presented in Figure 1 are not explained. A1: The landmarks used to define the proximal femur region in this study were explained in [25], which was not explicitly mentioned in our initial manuscript. We will modify Section 2.2 to clarify this point.

Q2 (Reviewer #3): While training on a small dataset is a strength, this is a weakness for validation. I suggest testing on larger datasets. A2: We appreciate the clear understanding of the strength of our method and an encouraging suggestion. A multi-center study using a larger cohort is our immediate next step.

Q3 (Reviewer #3): It is better to report the precision using the coefficient of variation (CV) for repeated X-rays from the same subject. We should have CV<3% for practical use; the lower, the better. A3: We performed an additional evaluation on 13 cases whose repeated X-rays (acquired in standing and supine position on the same day) were available. The CV was 3.06 ± 3.22% when the best model, HRFormer w/ HL, was used. We will add this result in Section 3.3.
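The precision metric discussed in Q3 can be sketched as follows. This is an illustration rather than the authors' evaluation code: it assumes the CV is computed per subject as the sample standard deviation of the repeated BMD estimates divided by their mean, and then summarized as mean ± SD across subjects. The rebuttal does not state which SD estimator was used, and the function names and example values are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def coefficient_of_variation(measurements: Sequence[float]) -> float:
    """Per-subject CV in percent: sample SD of repeated measurements / their mean."""
    if len(measurements) < 2:
        raise ValueError("need at least two repeated measurements")
    return 100.0 * stdev(measurements) / mean(measurements)


def summarize_cv(repeated: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Mean and SD of the per-subject CVs across subjects."""
    cvs = [coefficient_of_variation(subject) for subject in repeated]
    return mean(cvs), stdev(cvs)


if __name__ == "__main__":
    # Hypothetical predicted BMD (g/cm^2) from two x-rays of the same subject,
    # e.g. one standing and one supine acquisition.
    repeated_bmd = [(0.80, 0.84), (0.70, 0.70), (0.90, 0.93)]
    cv_mean, cv_sd = summarize_cv(repeated_bmd)
    print(f"CV = {cv_mean:.2f} +/- {cv_sd:.2f} %")
```

With only two measurements per subject, the sample SD reduces to the absolute difference divided by √2, so each per-subject CV is driven by a single pair of predictions.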

Q4 (Reviewer #1): Too many variables are not defined in the different Equations. A4: We realized that the definitions of D and E in equations (1)-(5) were insufficient. We will add them to our final manuscript.


