
Authors

Chaofan Wang, Kangning Yang, Weiwei Jiang, Jing Wei, Zhanna Sarsenbayeva, Jorge Goncalves, Vassilis Kostakos

Abstract

Hand hygiene can reduce the transmission of pathogens and prevent healthcare-associated infections. The ultraviolet (UV) test is an effective tool for evaluating and visualizing hand hygiene quality during medical training. However, due to various hand shapes, sizes, and positions, systematic documentation of UV test results to summarize frequently untreated areas and validate hand hygiene technique effectiveness is challenging. Previous studies often summarize errors within predefined hand regions, but this only provides low-resolution estimates of hand hygiene quality. Alternatively, previous studies manually translate errors to hand templates, but this lacks standardized observational practices. In this paper, we propose a novel automatic image-to-image translation framework to evaluate hand hygiene quality and document the results in a standardized manner. The framework consists of two models: an Attention U-Net model to segment hands from the background and simultaneously classify skin surfaces covered with hand disinfectant, and a U-Net-based generator to translate the segmented hands to hand templates. Moreover, due to the lack of publicly available datasets, we conducted a lab study to collect 1218 valid UV test images with varying skin coverage of hand disinfectant. The proposed framework was then evaluated on the collected dataset through five-fold cross-validation. Experimental results show that the proposed framework can accurately assess hand hygiene quality and document UV test results in a standardized manner. The benefit of our work is that it enables systematic documentation of hand hygiene practices, which in turn enables clearer communication and comparisons.
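The Attention U-Net named in the abstract augments U-Net skip connections with additive attention gates. As a rough illustration of the gating idea only (toy scalar features and weights in the style of Oktay et al.'s formulation, not the paper's model, which uses 1x1 convolutions over feature maps):

```python
import math

def attention_gate(x, g, w_x=1.0, w_g=1.0, w_psi=1.0):
    """Toy additive attention gate on scalar features.

    x: skip-connection feature from the encoder,
    g: gating signal from a coarser decoder layer.
    The weights are illustrative scalars; a real gate applies
    learned 1x1 convolutions before the ReLU and sigmoid.
    """
    q = max(0.0, w_x * x + w_g * g)             # ReLU of the combined features
    alpha = 1.0 / (1.0 + math.exp(-w_psi * q))  # sigmoid attention coefficient
    return alpha * x                            # rescale the skip feature
```

When the gating signal agrees with the skip feature, the attention coefficient approaches 1 and the feature passes through; when it disagrees, the coefficient shrinks toward 0.5 or below and the feature is suppressed.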

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16449-1_7

SharedIt: https://rdcu.be/cVRUN

Link to the code repository

https://github.com/chaofanqw/HandTranslation

Link to the dataset(s)

N/A


Reviews

Review #2

  • Please describe the contribution of the paper

    The authors propose an AI system to evaluate and document the effect of hand hygiene procedures, using fluorescent hand disinfectants and ultraviolet (UV) images of the hand. The aim of the proposed method is to generate standardized hand template images which display the amount and location of the hand disinfectant.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    To my knowledge, this is a novel application of AI methods to establish better processing, monitoring and documenting of hand hygiene. Using hand templates to document the outcome of UV tests for hand hygiene was previously proposed, although the templates were generated manually. The proposed method has the potential to measure, document and compare effects of different hand hygiene procedures with considerably more participants, in multiple centers and in different settings.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The proposed framework comprises two main steps: segmentation and shape transformation. Although the segmentation step was extensively tested, evaluation of the shape transformation step was only performed on simulated data.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The manuscript is well written and contains all information required to reproduce the work. The authors will also make the pre-trained models public, although not the dataset.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    The manuscript is well written and easy to read. I have two minor remarks for improving the manuscript: on page 5, please add a reference (website or publication) regarding MediaPipe and the finger-web. In the second-to-last paragraph of the Introduction, there is an extra “used”. Please see below two more general comments/suggestions:

    • Have you considered comparing your method with 2D statistical shape models for hand contours? Such shape models might also be useful for generating a test dataset for the hand translation model.
    • I believe the strength of the proposed method is in the improvement of the methods introduced by Kampf et al. to document, analyze and compare data from much larger populations and multiple settings. Therefore, I would have enjoyed seeing one or two of the statistical evaluation graphs as proposed by Kampf et al. For example, in your reference [5], Figures 1 and 2 show the frequency of untreated skin using a color map. Having a similar image (even just for one of your task categories) in your manuscript would show a potential application of your proposed method.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I think the work describes a novel application and would contribute an interesting research topic to the conference.

  • Number of papers in your stack

    4

  • What is the ranking of this paper in your review stack?

    2

  • Reviewer confidence

    Somewhat Confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper describes a deep learning-based segmentation and image-to-image translation approach to standardize hand hygiene documentation from images of hands with different amounts of skin coverage (with disinfectant) taken under standardized UV lighting conditions. The models in the paper are based on a U-net architecture augmented with attention gates in the generator path. The authors use a self-generated dataset to train and validate their models.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper describes an interesting application/problem domain and is well written in principle.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • I think the paper lacks convincing motivation and explanation of why the chosen methods are appropriate (or necessary). The authors have to make sufficiently clear why a deep learning-based pipeline is needed at all if the reference annotations were actually created with more traditional (non-learning) image processing methods (maybe I’ve missed something, but that’s what I took away from it).
    • I do not understand what the benefit would be of translating the results into a standardized template space using a U-net (or any other learning-based method). Wouldn’t a spatial registration to a template (which could be created on-the-fly from the available data) be more robust and standardized than an image-to-image translation model? I think this has to be clearly explained, and it would also be nice to see a comparison with a registration-based method.
    • I am also not convinced by the validation of the methods, in particular for the hand translation model. It is not obvious to me that the synthetic data are actually a good proxy for the real skin-coverage patterns. Also: when would you consider a result good or robust enough for medical documentation? Is the variability of the estimates (and their quality) an issue? This part would have to be clarified.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors do not explicitly state if they make the code or the dataset available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    • p. 4: Does the mapping function R^s -> R^t signify a mapping from s-dimensional real space into t-dimensional real space? I found this a bit confusing.
    • p. 4/5: The loss function is not well explained. Why is the chosen formulation more robust? This would be nice to explain briefly.
    • p. 5: The experiments with the standardized template space are not well explained and not motivated, from my perspective. Why is a registration-based approach not a good option here? This should be explained, I think. What was the idea behind the triangle/trapezoid mapping approach?
    • p. 6, Evaluation metrics: Why are U-Net and U-Net++ the baselines to compare against? Of course, it depends on what you want to compare. Here it seems you would like to argue that the attention gates are an important addition to the model. However, isn’t the more important question whether a deep learning-based approach is better than a baseline method? Is there maybe a more traditional method that you could compare against?
    • p. 7, Results: Are the differences in Dice and IOU between the models practically relevant? Can you comment on this?
    • p. 7: “Thus, to avoid overfitting, we chose to use the model trained with eight epochs for translating the lab study data to hand templates.” This argument is a bit unclear to me. Why eight epochs? Did you inspect the reasons for overfitting?
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I think it could be a quite interesting paper from the application point of view, but the description and validation of the methods are rather weak, I would say.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    4

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #4

  • Please describe the contribution of the paper

    The paper describes a deep learning approach for hand hygiene analysis. After applying a fluorescent hand disinfectant, pictures of hands are analyzed using a segmentation network and mapped onto a common template. Results are reported on synthetic and real images.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The key strength of the paper is the proposal to use a neural network (NN) for assessing hand hygiene quality. NNs overcome the difficulties of manual labelling or hand-crafted ML approaches on this task. Automatically analyzing fluorescence images and mapping them to a common template provides a relatively inexpensive and rapid way to analyze hand hygiene.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The approach is analyzed using a single model (U-Net), and it would be important to compare against more recent segmentation models (e.g., U-Net++, DeepLabV3, Transformers). Also, the details and limitations of the approach with respect to generalization to the type of fluorescent agent used, the demographics of the dataset, and any other confounding factors should be clearly discussed.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    All the details with respect to implementing the model have been clearly discussed. I encourage the authors to consider releasing the dataset, to facilitate future research in the area.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    Please address the issues discussed in the main weaknesses section.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper proposes an innovative application of NN to a medical analysis task. However, the analysis of the computational models and discussion of the biases and ethical considerations with respect to the dataset is very limited, given the proposed task.

  • Number of papers in your stack

    6

  • What is the ranking of this paper in your review stack?

    4

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The authors propose a deep learning-based segmentation and image-to-image translation approach to standardize and evaluate the effect of hand hygiene procedures from images of hands (taken under standardized UV lighting conditions). The methods are based on a U-net architecture and a self-generated dataset is used to train and validate the models.

    All reviewers agree that the paper is well written and organized and that the topic is quite interesting; however, the authors should address a number of issues in order for the paper to be accepted to MICCAI. Specifically: motivating the chosen methods, explaining how the hand translation model was used and the validity of the synthetic data, and adding details with respect to generalization. It will be important to address the reviewers’ comments in order for the paper to be accepted.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    8




Author Feedback

We thank the reviewers for their valuable and insightful comments. Below, we present a summary of our corrections in response to the reviewers’ comments.

Reviewer 2:

Main weakness: We only evaluate the shape transformation on simulated data because the ground truth for hand translation is difficult to obtain: traditional methods rely on human observers to manually mark uncovered areas, and therefore lack standardized observational practices and tend to introduce manual labeling errors. In future work, we plan to use Amazon Mechanical Turk and work with clinicians to further evaluate hand translation results.

Comments: a) We will add references and correct the grammatical errors; b) we will include the statistical shape models in our follow-up study; c) we summarize the frequently untreated areas of the WHO six-step hand hygiene technique (mainly located on the dorsal side of the hand around the edges and in the areas between the thumb and index finger), agreeing with the findings by Kampf et al.

Reviewer 3:

Main weakness: a) The reasons we propose an image translation algorithm to document the untreated areas are two-fold. For each individual, we can visualize their constantly untreated areas to improve hand hygiene quality. Also, by summarizing the coverage information of a specific hand hygiene technique across groups, we can evaluate the effectiveness of a proposed hand hygiene technique (e.g., frequently untreated areas and potentially redundant steps); b) we use image translation instead of registration because people’s hands come in diverse sizes and shapes, and their gestures and finger positions may differ across observations. This means that spatial registration could not simply translate a hand into a hand template; c) responded in R1’s Main weakness.

Comments: a) R^s -> R^t means translating the segmented hand image domain to the translated hand template domain (unrelated to dimension); b) the loss function combines the cross-entropy loss (left part) and the Dice loss (right part), as suggested by Chen et al. [2], to leverage the Dice loss’s robustness to class imbalance and the cross-entropy loss’s smoothing of the loss curve; c) responded in R2’s Main weakness b. The synthetic dataset samples triangles and trapezoids at the same relative positions in each segmented hand contour and hand template pair, and a U-Net is used to obtain the mapping information; d) we tested thresholding and the fuzzy c-means algorithm [15] for simultaneous hand segmentation and area classification, but both algorithms fail to segment hands from the noisy background (fluorescent concentrates on the observation pegboard); we therefore discarded their results and used U-Net and U-Net++ as the baselines; e) U-Net and its variants overall achieve good segmentation results, while U-Net++ and Attention U-Net perform better, especially around hand edges; f) we selected the eight-epoch model based on visual inspection. We think the model overfits because it tends to overfit the sampling patterns of triangles and trapezoids in the generated synthetic dataset; overfitting may be eased by updating the synthetic data generation algorithm.

Reviewer 4:

Main weakness / Comments: a) We examined the performance of three models for the segmentation task, including U-Net, U-Net++, and Attention U-Net, where Attention U-Net achieves the highest Dice coefficient and IOU score. We tested these three models for the hand translation task in a pilot study, where U-Net outperformed the other two. This could be because the self-attention gating operates at the global scale and lacks some of the inductive biases inherent to convolutional networks, such as translation equivariance and locality [3]; b) due to the page limit, we include the device and task details in the captions of Appendix Figures 1 and 2. We will include the demographic information if more text is allowed in the camera-ready version.
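The combined loss the author feedback attributes to Chen et al. [2] (cross-entropy plus Dice) can be sketched as follows. This is an illustrative, pure-Python version of the general technique on flattened binary masks, not the paper's actual implementation; the helper name `combined_loss` and the scalar formulation are ours:

```python
import math

def combined_loss(pred, target, eps=1e-7):
    """Illustrative combined segmentation loss: pixel-wise binary
    cross-entropy plus soft Dice loss.

    pred:   flat list of predicted foreground probabilities in [0, 1]
    target: flat list of binary ground-truth labels {0, 1}
    """
    n = len(pred)
    # Cross-entropy term: treats pixels independently and gives
    # smooth gradients throughout training.
    ce = -sum(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
              for p, t in zip(pred, target)) / n
    # Soft Dice term: overlap-based, so it is robust to class
    # imbalance (e.g., small covered areas on a large background).
    inter = sum(p * t for p, t in zip(pred, target))
    dice = 1 - (2 * inter + eps) / (sum(pred) + sum(target) + eps)
    return ce + dice
```

A perfect prediction drives both terms toward zero, while the Dice term keeps the loss informative even when foreground pixels are rare.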


