
Authors

Birgi Tamersoy, Felix Alexandru Pîrvan, Santosh Pai, Ankur Kapoor

Abstract

Accurate and robust estimation of the patient’s height and weight is essential for many clinical imaging workflows. Patient safety, as well as a number of scan optimizations, relies on this information. In this paper we present a deep-learning based method for estimating the patient’s height and weight in unrestricted clinical environments using depth images from a 3-dimensional camera. We train and validate our method on a very large dataset of more than 1850 volunteers and/or patients captured in more than 7500 clinical workflows. Our method achieves a PH5 of 98.4% and a PH15 of 99.9% for height estimation, and a PW10 of 95.6% and a PW20 of 99.8% for weight estimation, making the proposed method state-of-the-art in the clinical setting.
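
The PH and PW figures in the abstract are threshold-accuracy metrics. The following is a minimal sketch, assuming the usual convention that PWx is the percentage of weight estimates whose relative error is at most x% of the measured value (and PHx the same for height). The function name `percent_within` and the sample values are hypothetical and are not taken from the paper.

```python
def percent_within(estimates, references, tolerance_pct):
    """Percentage of estimates whose relative error is at most tolerance_pct.

    Assumed definition: |estimate - reference| / reference * 100 <= tolerance_pct.
    """
    if len(estimates) != len(references) or not references:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1
        for est, ref in zip(estimates, references)
        if abs(est - ref) / ref * 100.0 <= tolerance_pct
    )
    return 100.0 * hits / len(references)


# Illustrative values only (kg); these are not data from the paper.
true_weights = [60.0, 80.0, 100.0, 50.0]
estimated_weights = [63.0, 70.0, 101.0, 61.0]

pw10 = percent_within(estimated_weights, true_weights, 10)
pw20 = percent_within(estimated_weights, true_weights, 20)
```

The same function would give PH5 or PH15 when called with heights and a tolerance of 5 or 15.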

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43987-2_33

SharedIt: https://rdcu.be/dnwJO

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper presents a deep-learning based method for estimating the patient’s height and weight in unrestricted clinical environments with an end to end network.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1. Holds practical value for bed-bound patients who cannot be measured with a conventional approach.
    2. Improved performance over conventional approaches.
    3. End-to-end estimation without additional algorithms.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1. Is the proposed large dataset publicly available for reproduction?
    2. The work holds very practical meaning. It would be great to show whether the error is also within the tolerance of the mentioned applications, such as drug dosing or MRI.
    3. There is an existing public dataset suitable for the proposed approach, named SLP (NEU), with bed-bound patient depth data, people’s weight, and tailor measurements. It also includes heavily occluded cases, as mentioned in the paper. It would be helpful to also test on this public benchmark for reference and easy reproduction.
    4. No ablation studies. No comparison to the state of the art.
    5. Figure 1 uses a common encoder but the patterns are different (you used “common” in the main text). Do they share a backbone?
    6. P4. “The known 3D models of the patient tables are used” is not clear. How do you obtain the 3D models and how are they used? Are they from a scan? In practical applications beyond your dataset, how would one obtain the patient table model?
    7. No discussion of the “occlusion” issue faced by other algorithms. Is that going to be a problem here?
    8. Grammar: P6 “As the loss function we the symmetric mean absolute percentage error (SMAPE): “
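
For readers unfamiliar with the loss named in the quoted sentence, here is a minimal sketch of SMAPE. It assumes one common variant, the mean of |p - t| / ((|p| + |t|) / 2) expressed in percent. The paper's exact formula is not reproduced on this page, so the variant and the function name `smape` are assumptions.

```python
def smape(predictions, targets):
    """Symmetric mean absolute percentage error, in percent.

    Assumed variant: mean of |p - t| / ((|p| + |t|) / 2) * 100.
    """
    if len(predictions) != len(targets) or not targets:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, t in zip(predictions, targets):
        denom = (abs(p) + abs(t)) / 2.0
        # Treat a 0/0 term as zero error; heights and weights are positive in practice.
        total += abs(p - t) / denom if denom > 0 else 0.0
    return 100.0 * total / len(targets)


# Over- and under-estimating by the same amount gives the same penalty.
over = smape([110.0], [90.0])
under = smape([90.0], [110.0])
```

Unlike the plain percentage error, this form is bounded (at most 200% per sample) and symmetric in the prediction and the target.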
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    It is not clear whether the code or the large dataset will be publicly available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Please check the weakness part.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    3

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper addresses an interesting topic, but the technical studies are superficial.
    There are unclear parts, such as what the table 3D model is. Is it a scan? There is no ablation study and no comparison with other approaches.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The authors present a deep learning-based approach to estimating the weight and height of individuals based solely on single 3D depth maps. The method is trained on a large dataset of patients and volunteers. The method includes two preprocessing steps, namely viewpoint normalization and table removal. Two modified ResNets, whose encoders are pre-trained using landmark detection, are used to infer weight and height independently. According to the presented state of the art, the contributions are:

    • introducing a simple deep-learning approach for the weight and height estimation
    • training and evaluating it on a dataset significantly larger than those of previously published works
    • providing a high-performing approach capable of dealing with occlusions, blankets, different table types and random objects, as can be expected in a clinical setup
  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Methodologically, the paper does not introduce a ground-breaking approach, but in my view the main contribution is its simplicity. Given the significantly larger dataset used with respect to any previously presented weight/height estimation literature, the performance proves that it can be used in a clinical setting with a 23-fold cross validation. Despite not being quantified, the authors claim that the dataset includes images common in clinical routine. The shown examples indeed show blankets, occlusions and random objects around the patient. Under that assumption, the quantitative performance of the approach proves a good robustness as claimed by the authors.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The paper fails to explain the pre-processing steps in a way that allows them to be reproduced. The rationale given for them is clear, but since in my view they play a major role, this weakens the paper’s otherwise sound scientific quality. Alternatively, I would expect references to more detailed publications on this matter.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The large dataset the authors present is not publicly available. Also, as mentioned above, the pre-processing steps are not explained in a way that can be reproduced. Additionally, the pre-training with landmark detection needs more explanation: which landmarks, how many, how they were annotated…

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Abstract: “number of scan optimizations” is unclear, the sentence is confusing and needs rephrasing.

    Introduction:

    1. Same as abstract.
    2. If available, add a reference for the claim that clinical staff rely on previously acquired information or their own estimate.

    Related work:

    1. The second paragraph is missing several grammatical articles in the description of methods, e.g., “THE Buckley method…”. Review with a native speaker or a grammar tool.
    2. Add references 3, 9 and 10 after last sentence of paragraph 3.

    Approach, Obtaining “Normalized” Depth Images:

    1. IMPORTANT: There is far too little information on how normalization works. Either add a reference if you are using previously published methods, or extend the description so that someone can reimplement your idea.
    2. The caption of Fig. 2 is poor. What are we seeing in the first and second rows? What are the columns showing? Also, why is the head of the person on the right of the bottom row so blurred? The current caption does not help us understand the paper but rather confuses us.
    3. IMPORTANT: It is unclear how the table is subtracted. Is the type of table an input to your method? How does the method know which type of table to subtract? Your steps either need to be referenced (to existing literature) or require more explanation.

    Approach, Learning Accurate Low-Level Features:

    1. IMPORTANT: How are landmarks selected? How many are used? Without this information reproducibility is weak.

    Experiments, Dataset:

    1. IMPORTANT: How were landmarks annotated?
    2. If possible, provide a quantification of the complexity of the dataset. How often are patients occluded? And how often are blankets or other objects in the images? Can you tell how many patients were prone or supine?
    3. List or at least give examples of the clinical workflows you included in the dataset.

    Experiment, Training:

    1. Modify the last paragraph so that it is clear how you figured out that removing the table impacts height estimation. Was this done empirically?

    Experiment, Result:

    1. Add standard deviation or range to the provided results.
    2. You did a 23-fold cross validation, which performance do you report? The average, the best, etc?
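
The two requests above (report spread, and state which fold statistic is reported) can be illustrated with a short sketch that aggregates a per-fold metric into mean, sample standard deviation, and range. The function name `summarize_folds` is hypothetical, and the per-fold values are placeholders invented for illustration. They are not results from the paper.

```python
import statistics


def summarize_folds(fold_scores):
    """Summarize one metric (e.g. PW10 in percent) over cross-validation folds."""
    if not fold_scores:
        raise ValueError("at least one fold score is required")
    return {
        "mean": statistics.mean(fold_scores),
        # Sample standard deviation needs at least two folds.
        "std": statistics.stdev(fold_scores) if len(fold_scores) > 1 else 0.0,
        "min": min(fold_scores),
        "max": max(fold_scores),
    }


# Placeholder per-fold values (percent); NOT results from the paper.
placeholder_pw10 = [94.0, 96.0, 95.0, 97.0, 93.0]
summary = summarize_folds(placeholder_pw10)
```

Reporting all four numbers makes it unambiguous whether a headline figure is the average fold, the best fold, or something else.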
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The approach may be extremely useful in clinical practice and if the claims of the authors are true, the contribution is high. Yet, I cannot give it a “strong accept” due to the lack of reproducibility. I do believe that authors can correct this in the rebuttal process.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    6

  • [Post rebuttal] Please justify your decision

    I still keep my evaluation since I still believe the clinical impact is high. Still, I do agree with most of the criticisms of the other reviewers and the metareviewer. In that sense, my “wishlist” of changes requested to the authors increased since my own review.

    Still existing weaknesses (to be considered for final decision):

    1. The authors do not clearly explain why the SLP dataset (proposed by R1) was not used for reproducibility.
    2. My requests to add more information about preprocessing were not addressed. I still find that this is needed.

    Requests to camera-ready paper in case of acceptance:

    1. Mention why SLP dataset was not used, and ideally - if possible, add as supplementary material an evaluation of their method with that dataset.
    2. Include statement of the authors in the rebuttal regarding 3D model of patient table (bullet point 5).
    3. The authors should report on the performance outside the inclusion criteria as stated in the rebuttal text (bullet point 3).
    4. The authors should include the discussion on their performance compared to other methods as they include it in the rebuttal text (bullet point 4).
    5. Make explicit that the dataset is not publicly available.
    6. Include more information about preprocessing as requested in my review.
    7. Add the information provided in the rebuttal regarding landmark evaluation (bullet point 6).



Review #3

  • Please describe the contribution of the paper

    The authors proposed a deep-learning based method for estimating the patient’s height and weight in unrestricted clinical environments using depth images from a 3-dimensional camera. It provided an approach for estimating heights and weights for patients with limited mobility.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The method achieved high accuracy for both height and weight estimation with a common deep learning architecture. The results met the clinical criteria and requirements.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Inclusion criteria have been applied to the training and validation data. Thus, it was unclear how the proposed model worked for outlier data points. Those outliers (participants of extreme height or weight) may be the real difficulty in the clinical setting. Also, it was unclear how the proposed method could be incorporated into the clinical setting given the table and workflow limitations.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The deep learning architecture was common and reproducible. The data preprocessing was well explained.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Major comments:

    1. In Section 4.1, it seemed that table information and clinical workflow were required to get the data for weight estimation. Please elaborate on how to make the method generalizable to fit the complicated clinical settings.
    2. Inclusion criteria have been applied to the training and validation data. Thus, it was unclear how the proposed model worked for the outlier data point.

    Minor comments:

    1. In Section 4.2, please rephrase the sentence “As the loss function we the symmetric mean absolute percentage error (SMAPE):”.
    2. In Section 4.3, why evaluate the model in 23-fold cross validation? Please provide details about the clinical workflow and depth image corresponding to the same participant.
    3. In Fig. 5-6, please provide the unit for the error histogram (x-axis).
    4. In conclusion, “estimation. These results outperform all alternative methods to the best of our knowledge, including family estimates and clinical staff estimates.” Please provide quantitative results from prior methods.
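
Minor comment 2 touches on how workflows from the same participant are distributed over the 23 folds. The page does not say whether the authors kept folds subject-disjoint, so the following is only a sketch of one way to do it, under that assumption: every workflow of a participant lands in the same fold, which prevents the same person from appearing in both training and validation data. The function name `group_k_fold` and the toy identifiers are hypothetical.

```python
from collections import defaultdict


def group_k_fold(subject_ids, n_folds):
    """Assign sample indices to folds so that each subject appears in one fold only."""
    samples_by_subject = defaultdict(list)
    for index, subject in enumerate(subject_ids):
        samples_by_subject[subject].append(index)

    folds = [[] for _ in range(n_folds)]
    # Greedy balancing: place the largest subjects first, each into the
    # fold that currently holds the fewest samples.
    ordered = sorted(
        samples_by_subject.items(), key=lambda kv: (-len(kv[1]), str(kv[0]))
    )
    for _, indices in ordered:
        smallest = min(range(n_folds), key=lambda f: len(folds[f]))
        folds[smallest].extend(indices)
    return folds


# Hypothetical toy data: 6 workflows recorded from 3 participants.
subject_ids = ["s1", "s1", "s2", "s3", "s3", "s3"]
folds = group_k_fold(subject_ids, n_folds=2)
```

With the dataset sizes quoted in the abstract, the same call would take one subject identifier per workflow and `n_folds=23`.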
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed method accurately estimated the height and weight from a large dataset. However, there were two major concerns about the data collection and selection.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    The authors addressed my major points in the rebuttal. I hope the authors address the minor points as well in the camera-ready copy. Thus, I keep the original score.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This work proposes a simple, robust method for body height and body weight estimation of bed-bound patients, to enable height and weight based settings for MRI (SAR) and CT (dose). It uses a depth camera and performs height and weight regression via a network that is pre-trained on a landmark localization task on the images of a depth camera mounted above the patient.

    For certain patients it may be necessary to perform height and importantly weight estimation in a clinical workflow (before tomographic scanning) due to the lack of other means of getting these values accurately. In such cases, this work seems to provide a meaningful alternative, robust to a variety of situations (like sheet covers, orientations of tables, prone vs. supine positioning). A simple method is proposed here, and it is evaluated on a large private database of scans of seemingly large variety.

    Reviewers report some shortcomings that need to be addressed in the rebuttal:

    • Due to a lack of access to the dataset, reproducibility is very limited. One reviewer mentions a public dataset, and it is not clear why this has not been used to get a part of the evaluation that is reproducible.
    • There is no indication of which error in height and weight measurements is acceptable in clinical practice, or whether the height and weight estimates are within these error rates.
    • Moreover, it is also not clear whether human estimates or other methods for height and weight estimation in the clinic are within or outside these error rates, since neither literature is cited nor experiments are performed in this direction.
    • Regarding height estimation, it seems that the related literature does not focus on this, but solely on weight estimation. It seems to be a less important problem in practice, especially if one uses a measurement of the patient bed as a scale and estimates the patient height from that. Again, it is not clear if such a simple approach (for height estimation only) is already sufficient regarding its error in practice, without the need for cameras or deep learning.
    • The work does not include any comparison to related methods at all, neither imaging based nor non-imaging based. However, the conclusion states that the results outperform all alternative methods. This conclusion can not be supported by any evidence.

    • A final, very important issue is that the inclusion criteria depicted in Fig. 4 exclude overweight patients, who are more likely to be bed-bound. Younger, or normal weight patients are very likely to have good estimates of weight and height available, and can be measured before surgical interventions. It is not explained, why these patients were excluded, since they might be the most interesting patients for this approach.

    Please comment on/clarify the mentioned shortcomings in the rebuttal.




Author Feedback

Dear reviewers, thank you for your comments. Your suggested changes will make this a stronger paper. Please find our response as follows. The discussions in the relevant parts of the paper will be improved accordingly for the final submission.

  • Target Demographics: Please note that the proposed approach is designed for the common diagnostic imaging workflows. Bed-bound patients are only a small subset of this demographic. There may be age biases in this population, but there are no known height and/or weight biases. There is a large variety of clinical workflows that need to be covered (e.g., coils, heavy blankets, knee supports, positioning equipment, etc.), none of which are covered by any public dataset, including the SLP dataset.

  • Clinical Acceptance: Wells et al. (2017) suggests a minimum accuracy for weight estimation to be: PW10 > 70% and PW20 > 95%. The proposed approach achieves PW10 = 95.6% and PW20 = 99.8%.

  • Performance of Alternatives: The most common approach in the clinical workflow is the estimation of the patient weight by the clinical staff. Several studies put the performance of clinical staff at: Menon and Kelly (2005) PW10 = 78% for nurses, PW10 = 59% for physicians, Fernandes et al. (1999) PW10 = 66% for nurses, PW10 = 66% for physicians. Similarly, performance of patient self-estimates varies between studies, Fernandes et al. (1999) PW10 = 91% for patients, Blum et al. (2019) PW10 = 97.1% for patients. The proposed approach performs significantly better than clinical staff, and comparable to patient self-estimates.

  • Performance Outside Inclusion Criteria: The performance outside the inclusion criteria is as follows: for patients with BMI < 18.5, PW10 = 89.2% and PW20 = 98.9%, and for patients with BMI > 34.9, PW10 = 96.2% and PW20 = 99.8%. Even though the performance drops slightly for the underweight population, the main reason for keeping these populations outside the inclusion criteria is not the performance, but rather the limited support in the training dataset.

  • 3D Patient Tables: For this work, we leveraged the top surface information from the 3D CAD models of the corresponding patient tables. During inference, the type of the table is considered as an input, and the appropriate top surface is selected accordingly from the list of known tables. Please note: 1) information on the type of the patient table is available in an integrated system, 2) there are a relatively small number of different patient tables, and 3) only the top surface is needed for the proposed approach and this information may also be obtained during calibration by taking a depth image of the empty table surface.

  • Pre-training with Landmarks: For nearly 2000 pre-training workflows we also acquired the corresponding full-body 3D medical volumes (MRI acquisitions). Please note that the volunteers are asked not to move during the acquisitions so that the depth images can be matched to the corresponding medical volumes through calibration. The anatomical landmarks are then manually annotated by a group of experts on these 3D medical volumes. A total of 31 anatomical landmarks are annotated, but for pre-training we only used a subset of 10 major joints (knees, elbows, shoulders, ankles, and wrists).

  • Dataset coverage: The dataset used for the training of the proposed approach is collected from multiple sites in multiple countries, over a span of several years. The target clinical workflows such as a variety of coils, a variety of positioning equipment, unrestricted covers (light and heavy blanket), occlusions by technicians, etc. are covered. Due to volunteer and patient consents, the dataset will not be publicly available.
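
The table handling described under "3D Patient Tables" can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the camera looks straight down, both the depth image and the table top surface are given as camera-to-surface distances in metres on the same pixel grid, and the table type is used as a lookup key. The names `remove_table`, `KNOWN_TABLES`, `table_a`, and the noise margin are all hypothetical.

```python
def remove_table(depth_image, table_surface, noise_margin=0.01):
    """Return per-pixel height above the table top (metres), zeroing table pixels.

    Objects lying on the table are closer to the overhead camera than the
    empty table surface, so height = table distance - observed distance.
    """
    result = []
    for depth_row, table_row in zip(depth_image, table_surface):
        row = []
        for depth, table in zip(depth_row, table_row):
            height = table - depth
            # Suppress sensor noise and the table itself.
            row.append(height if height > noise_margin else 0.0)
        result.append(row)
    return result


# Hypothetical lookup of known table top surfaces, keyed by table type.
# Such a surface could come from a CAD model or an empty-table depth capture.
KNOWN_TABLES = {"table_a": [[2.0, 2.0, 2.0], [2.0, 2.0, 2.0]]}

observed = [[2.0, 1.8, 2.0], [1.995, 1.7, 2.0]]
heights = remove_table(observed, KNOWN_TABLES["table_a"])
```

Only the top surface enters the computation, which matches the rebuttal's point that a depth image of the empty table taken during calibration would be sufficient.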




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This work on patient height and weight estimation using depth cameras has seen reviews that were overall more favorable than not. A number of issues were raised by reviewers, such as the need to clarify the practical applicability and necessity of the method and to specify how well a method has to perform to be clinically relevant. Furthermore, dataset reproducibility issues and confusion about the performance for low/high BMI patients were mentioned. The authors addressed the concerns about clinical relevance and practical applicability very convincingly, by showing that the proposed method seems to be well suited to deployment. The lack of publication of the example dataset is a drawback; however, in my opinion the paper’s contribution is strong enough to be of interest to the MICCAI community.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    I agree with the AC and other reviewers that a major shortcoming is the lack of comparison, which makes it difficult to evaluate the value of this work. I do not think the current work is ready to be published in the MICCAI community.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Similar to R2 and R3, I think the paper is interesting for the MICCAI community and has merit, even though comparison to other approaches is obviously difficult due to the lack of data and the lack of work in this area. The authors have reasonably addressed the first meta-review and therefore my vote is acceptance.


