
Authors

Yi Gu, Yoshito Otake, Keisuke Uemura, Masaki Takao, Mazen Soufi, Yuta Hiasa, Hugues Talbot, Seiji Okada, Nobuhiko Sugano, Yoshinobu Sato

Abstract

Musculoskeletal diseases such as sarcopenia and osteoporosis are major obstacles to health during aging. Although dual-energy X-ray absorptiometry (DXA) and computed tomography (CT) can be used to evaluate musculoskeletal conditions, frequent monitoring is difficult due to the cost and accessibility (as well as high radiation exposure in the case of CT). We propose a method (named MSKdeX) to estimate fine-grained muscle properties from a plain X-ray image, a low-cost, low-radiation, and highly accessible imaging modality, through musculoskeletal decomposition leveraging fine-grained segmentation in CT. We train a multi-channel quantitative image translation model to decompose an X-ray image into projections of CT of individual muscles to infer the lean muscle mass and muscle volume. We propose the object-wise intensity-sum loss, a simple yet surprisingly effective metric invariant to muscle deformation and projection direction, utilizing information in CT and X-ray images collected from the same patient. While our method is basically an unpaired image translation, we also exploit the nature of the bone’s rigidity, which provides the paired data through 2D-3D rigid registration, adding strong pixel-wise supervision in unpaired training. Through the evaluation using a 539-patient dataset, we showed that the proposed method significantly outperformed conventional methods. The average Pearson correlation coefficient between the predicted and CT-derived ground truth metrics was increased from 0.424 to 0.857. We believe our method opened up a new musculoskeletal diagnosis method and has the potential to be extended to broader applications in multi-channel quantitative image translation tasks.
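The abstract describes the object-wise intensity-sum loss only in words. The following is a minimal NumPy sketch of the idea under stated assumptions, not the authors' implementation: the function name `intensity_sum_loss`, the mean-absolute-error form, the toy 8x8x8 volumes, and the axis-aligned parallel projection are all hypothetical choices made for illustration. It shows why the summed intensity of an object's projection does not depend on the projection direction in this simplified setting.

```python
import numpy as np

def intensity_sum_loss(pred_projections, target_sums):
    """Mean absolute difference between per-object intensity sums.

    pred_projections: array of shape (C, H, W), one 2D projection per object.
    target_sums: array of shape (C,), per-object sums derived from CT.
    The L1 form is an assumption; the paper's exact formula is not given here.
    """
    n_objects = pred_projections.shape[0]
    pred_sums = pred_projections.reshape(n_objects, -1).sum(axis=1)
    return float(np.mean(np.abs(pred_sums - target_sums)))

rng = np.random.default_rng(0)
# Two hypothetical objects, each a toy 8x8x8 density volume.
volumes = rng.random((2, 8, 8, 8))
target_sums = volumes.reshape(2, -1).sum(axis=1)

# Axis-aligned parallel projections along two different directions.
proj_front = volumes.sum(axis=1)
proj_side = volumes.sum(axis=2)

# Both projections carry the same total intensity per object,
# so the loss is (numerically) zero for either direction.
loss_front = intensity_sum_loss(proj_front, target_sums)
loss_side = intensity_sum_loss(proj_side, target_sums)

# A prediction with the wrong total mass is penalized.
loss_scaled = intensity_sum_loss(0.5 * proj_front, target_sums)
```

Because the loss compares only per-object totals, it constrains the predicted mass or volume without requiring the predicted and reference images to be pixel-aligned, which is consistent with the unpaired training setting described above.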

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43990-2_47

SharedIt: https://rdcu.be/dnwL2

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #2

  • Please describe the contribution of the paper

    The manuscript proposes an improvement to learning-based methods for musculoskeletal decomposition of radiographs, an accessible alternative to dual-energy X-ray absorptiometry (DXA) and CT imaging for diagnosis of musculoskeletal diseases, such as sarcopenia. This may facilitate opportunistic screening for disease based on images acquired in the course of unrelated treatment. The paper proposes an intensity-based loss that markedly improves the decomposition of a diagnostic radiograph into DRRs of individual anatomies, e.g. the gluteus and pelvis, based on CycleGANs. This is verified in a four-fold cross validation on a dataset of 539 CTs with corresponding radiographs.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The paper is well motivated by increasing accessibility of screenings for musculoskeletal diseases.
    • The proposed object-wise intensity-sum loss shows marked improvements over the conventional approach.
    • The validation is conducted over a large dataset in a rigorous manner.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The proposed method is validated primarily in terms of image quality rather than diagnostic value.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The code will be made available. Although data may not be released due to IRB requirements, the paper is highly reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    My primary concern has to do with the value of the proposed musculoskeletal decomposition for diagnosis of disease. Related work, namely [16,17], establishes this problem, but it may be worth emphasizing the connection between the quality of decomposition, which is what the four-fold validation ultimately measures, and diagnostic value. Future work may wish to establish this connection more directly by highlighting diagnostic structures that are visible in the improved decompositions but not in those produced by the conventional methods.

    The validation is essentially structured so that “good performance” reconstructs the DRR of isolated structures available from the segmentation of CT. For the purposes of this paper, it is acceptable to assume the accuracy of the underlying segmentation method, but future work may wish to consider validation methods that do not assume perfect segmentations. For instance, clearly establishing superior diagnostic value would bypass this issue.

    Minor Comments:

    • Fig. 3 is rather far from where it is referenced in the text.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Overall, the manuscript motivates a compelling clinical problem, namely expanding accessibility of diagnosis for musculoskeletal diseases, and proposes an improvement to learning-based solutions. Because the proposed method is intended for opportunistic screening, relying on images acquired in the course of unrelated treatment, the assumptions made are acceptable for improving the likelihood of identifying individuals for follow-up with a CT or DXA. The novel object-wise intensity-sum loss is a comparatively simple modification of conventional approaches using CycleGANs, making it highly reproducible, and yields significant improvements in the quality of decomposition in terms of Pearson correlation coefficient and qualitative evaluation.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #4

  • Please describe the contribution of the paper

    This paper proposes a method to derive information on muscle volume and lean muscle mass from conventional X-ray images. The development of the system is based on a dataset of paired CT and X-ray data for 539 patients. The authors introduce a novel loss function as well as a 2D-3D rigid alignment approach. Their cross-validation results show greatly improved performance.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The proposed method is an interesting approach to estimating muscle volume and lean muscle mass from conventional X-ray images.
    • The results are promising.
    • The paper is largely well written.
    • The figures as well as the supplementary material on visualising the results are great.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • There is limited information on the clinical application with only a hint at potential opportunistic screening of musculoskeletal diseases in routine clinical practice.
    • The definitions of the equations lack detail and may be difficult to follow.
    • The system has not been validated on an independent dataset.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors have included an anonymised link to the source code. It is not clear whether a pre-trained system utilising their data is also included. The latter would be a great contribution to the community as the system relies on paired CT and X-ray data which are not commonly available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    • “DXA allows the measurement of only overall body composition” -> DXA data can also be analysed for specific regions of interest. I would recommend rephrasing this.
    • The table captions lack detail and are not self-explanatory without reading the main text (for example, in Table 1, what does the conventional method refer to?). I would recommend adding further detail.
    • Table 2 shows that for lambda=1000 performance may not improve if L_B=true (contrary to when lambda is lower). I would recommend adding a discussion of this.
    • The paper would benefit from a more detailed explanation of the clinical application.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper presents an interesting approach to derive muscle properties from plain X-ray images. The proposed method describes a novel contribution that leads to improved performance. The system is evaluated on only a single dataset.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #1

  • Please describe the contribution of the paper

    The authors propose a technique for estimating muscle properties from X-rays of the pelvis. The method uses GANs to translate muscle properties between CT and X-ray representations.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The application is novel, and the use of an inexpensive imaging modality such as radiography for analysis of muscle properties is appealing.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    As with other GAN-based methods it is not clear if the technique can be used in a clinical setting. Mapping from radiographs to CT scans is an inverse problem that may have multiple solutions.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors provide sufficient details that support the reproducibility of the results. A URL of the source code repository will be released as well.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    It is an interesting paper and the idea of using X-rays for muscle quantification is appealing. However, I am not sure if such a technique can be adopted in clinical settings, because of the generative model that needs to be used to simulate morphometry in another modality. Finally, what is the advantage of this technique over DEXA?

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This is a good paper that presents an interesting idea. There are moderate weaknesses that the authors may address as they develop their method further.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The three reviewers agree that this work has merit, but point to several issues that should be addressed. First, the authors should address the problem of hallucinating features when going from radiographs to CT scans. Also, the authors should discuss whether this work is useful in clinical settings in terms of diagnostic value, instead of relying solely on image quality. There are also issues regarding clarity of the presentation and validation on independent datasets that should be addressed. Nevertheless, it is a very good paper that will be of interest to the MICCAI community.




Author Feedback

We appreciate all the reviewers for their highly constructive and positive feedback. We first respond to the two main comments summarized by the meta-reviewer, (1) the problem of hallucinating features going from radiographs to CT scans, (2) feasibility in clinical settings in terms of diagnostic value, then answer (3) other questions raised by the reviewers.

(1) We will improve the discussion regarding the ill-posed problem of translating radiographs to CT scans. We understand that inverting radiographs to CT scans (DRRs) admits multiple solutions, among which a GAN may give unrealistic answers. This is also what the intensity-sum loss was proposed for. The proposed loss constrains the predicted volume and mass to be consistent with those of the patient in the given X-ray. Surprisingly, this constraint also improved structural consistency, as shown in Fig. 4. The proposed method generated DRRs more similar to the reference DRRs than the conventional method in both image contrast and object structure. For future work, we will validate our method using multi-pose radiographs, as we know that muscles deform when the pose changes. We expect to see the same trend of muscle change in the DRRs decomposed by our method from multi-pose radiographs.

(2) We will add more discussion of clinical usage. MRI and CT are considered the gold standards for muscle quantification. Still, they are not commonly used in practice because of high costs and low accessibility (Alfonso J. Cruz-Jentoft et al., Sarcopenia: revised European consensus on definition and diagnosis, Age and Ageing, 48(1), pp. 16-31, Jan. 2019). In this paper, we showed that accurately recovering CT information from radiographs is possible. The metrics predicted from radiographs are consistent with those from CT scans. Thus, our method is feasible for clinical usage to quantify muscle metrics. We believe that diagnosing muscle disease from the detailed muscle information extracted by our method is possible, which is also our future work. Furthermore, our method can help to track detailed muscle changes (quantitatively and qualitatively) over time. Hip OA and/or knee OA patients are encouraged to train their muscles before and after surgery. However, questions such as (a) which individual muscles particularly need to be trained and (b) whether the training improved individual muscles as expected typically remain unanswered, and our method could potentially help answer them. OA patients are expected to undergo periodic X-ray examinations, from which our method could also provide detailed muscle information without extra scans.

(3) Other questions raised by the reviewers. Q1: (Reviewer #1) What is the advantage of this technique over DEXA? A1: This technique can be used for the images obtained by DEXA as well as radiography. That is, DEXA lean muscle images can be decomposed into individual muscles using our technique. The advantage of radiography over DEXA is its accessibility.

Q2: (Reviewer #2) Future work may wish to consider validation methods that do not assume perfect segmentations. A2: We plan to conduct a more comprehensive validation, including the automatic segmentation model regarding the diagnosis of sarcopenia.

Q3: (Reviewer #4) DXA data can also be analyzed for specific regions of interest. A3: Though DXA can measure regions, it cannot differentiate overlapping muscles and thus does not allow measurement of individual muscles (the iliacus, for instance). We will address this point in the revised version.

Q4: (Reviewer #4) Table 2 shows that for lambda=1000 performance may not improve if L_B=true (contrary to when lambda is low). A4: We agree this point is worth more discussion. In our supplementary materials, we reported more complete results in Fig. 1, showing that lambda=100 gave us the best results. Since we also use other losses (GAN, GC) in training, we need to balance them against each other. Too large a lambda on the IS loss made the generated DRR unstable, deforming objects and degrading the accuracy.
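The balancing argument in A4 can be illustrated with a small sketch. This is only an illustration under assumptions: the plain sum of the GAN and GC terms plus lambda times the IS term, the function names, and the numeric loss values are all hypothetical and are not taken from the paper.

```python
def total_loss(gan_loss, gc_loss, is_loss, lam):
    """Hypothetical combined objective: GAN + GC + lam * IS."""
    return gan_loss + gc_loss + lam * is_loss

def is_share(gan_loss, gc_loss, is_loss, lam):
    """Fraction of the combined objective contributed by the IS term."""
    return (lam * is_loss) / total_loss(gan_loss, gc_loss, is_loss, lam)

# Made-up loss magnitudes, chosen only to show the effect of lambda.
gan_loss, gc_loss, is_loss = 1.0, 0.5, 0.01

share_100 = is_share(gan_loss, gc_loss, is_loss, lam=100.0)
share_1000 = is_share(gan_loss, gc_loss, is_loss, lam=1000.0)
```

With these made-up magnitudes, the IS term accounts for 40% of the objective at lambda=100 and for most of it at lambda=1000, which is one way to picture how an overly large lambda could crowd out the other terms.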


