Authors

Haozheng Zhang, Edmond S. L. Ho, Xiatian Zhang, Hubert P. H. Shum

Abstract

Parkinson’s disease (PD) is a progressive neurodegenerative disorder that results in a variety of motor dysfunction symptoms, including tremors, bradykinesia, rigidity and postural instability. The diagnosis of PD mainly relies on clinical experience rather than a definite medical test, and the diagnostic accuracy is only about 73-84%, since it is affected by the subjective opinions or experiences of different medical experts. Therefore, an efficient and interpretable automatic PD diagnosis system is valuable for supporting clinicians with more robust diagnostic decision-making. To this end, we propose to classify Parkinson’s tremor, since it is one of the most predominant symptoms of PD with strong generalizability. Unlike other computer-aided Parkinson’s Tremor (PT) classification systems, which rely on wearable sensors and are time- and resource-consuming, we propose SPAPNet, which only requires consumer-grade, non-intrusive video recordings of camera-facing human movements as input to provide undiagnosed patients with low-cost PT classification results as a PD warning sign. For the first time, we propose to use a novel attention module with a lightweight pyramidal channel-squeezing-fusion architecture to extract relevant PT information and filter the noise efficiently. This design aids in improving both classification performance and system interpretability. Experimental results show that our system outperforms state-of-the-art methods, achieving a balanced accuracy of 90.9% and an F1-score of 90.6% in distinguishing PT from the non-PT class.
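
For readers unfamiliar with the two metrics reported in the abstract, the following is a minimal sketch of how balanced accuracy and a binary F1-score can be computed from confusion-matrix counts. The function names and the example counts are illustrative assumptions and are not taken from the paper; the paper may also average F1 differently (for example, per class).

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity (recall on PT) and specificity (recall on non-PT)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def f1_score(tp, fn, fp):
    """Binary F1 for the positive class: harmonic mean of precision and recall."""
    return 2 * tp / (2 * tp + fp + fn)


# Illustrative counts only (NOT results from the paper).
example_ba = balanced_accuracy(tp=9, fn=1, tn=8, fp=2)
example_f1 = f1_score(tp=9, fn=1, fp=2)
```

Balanced accuracy weights both classes equally, which is why it is often preferred over plain accuracy when the PT and non-PT classes are of different sizes.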

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16440-8_47

SharedIt: https://rdcu.be/cVRwy

Link to the code repository

https://github.com/mattz10966/SPAPNet

Link to the dataset(s)

https://data.4tu.nl/articles/dataset/Technology_in_Motion_Tremor_Dataset_TIM-Tremor/12694256

Reviews

Review #1

Please describe the contribution of the paper

This paper presents a method to classify Parkinson’s tremors in videos. To this end, the authors propose an attention module with a pyramidal channel-squeezing-fusion architecture (Spatial Pyramidal Attention Parkinson’s tremor classification Network, SPAPNet). The proposed system shows an accuracy of 90.9%, which is 3.2% higher than ST-GCN [1].

[1] Yan, S., Xiong, Y., Lin, D.: Spatial temporal graph convolutional networks for skeleton-based action recognition. In: AAAI Conference on Artificial Intelligence. (2018).
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

1) The proposed network is explained in detail and in an easy-to-understand manner. 2) The reasons for the design choices are thoroughly explained.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

1) The framework relies heavily on OpenPose performance. 2) It is hard to see why the system is an interpretable automatic PD diagnosis system. 3) The paper insists that using seven keypoints can be more beneficial than using all keypoints, but there is insufficient proof. 4) ST-GCN [1] was introduced in 2018, and it can be considered outdated.
Please rate the clarity and organization of this paper

Excellent
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Its reproducibility is very high.
The method and parameter explanations are excellent, and the data-related parts and learning environment are also well explained. I look forward to the code release of this paper.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

It would be nice to see the final results when OpenPose shows wrong results or the results of applying other recent pose estimation approaches.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

6
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The manuscript is clear and easy to follow, and the results are plausible.
Number of papers in your stack

4
What is the ranking of this paper in your review stack?

1
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #2

Please describe the contribution of the paper

The authors propose a strategy to classify different tremor classes using postural inputs and a graph neural network (GNN) representation. The proposed architecture also includes an attention mechanism to enhance relationships among joint distances. The authors validate the strategy on a public dataset that includes some tremor patients diagnosed with Parkinson’s disease, and report promising results of around 90.9% in classifying Parkinson’s tremor (PT) vs. non-PT.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The proposed methodology, based on graph neural networks to represent postural configurations, is promising and coherent. The attention modules proposed within the strategy avoid the loss of weak relationships among joints. The authors also show an interesting back-propagation of the output probability to support the explainability of the results.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The main limitation of the paper relates to its understanding of the clinical context of Parkinson’s disease and the associated motor and non-motor symptoms. From the title, the authors claim that the approach supports early diagnosis, but tremors mainly appear in advanced stages. Moreover, there is no validation showing that the proposed approach can deal with the tiny tremors that appear at early or prodromal stages. There are both resting and postural tremors, yet the recorded dataset uses only postural tremors, which magnify tremors but also introduce external tremors as a consequence of the muscle activity and movement involved in particular tasks. How does the proposed approach filter out Parkinsonian tremor? Moreover, the authors omit the rating scales and protocols used to stratify Parkinson’s disease, such as UPDRS and H&Y. In that case, how can the approach be assumed to perform early diagnosis? Finally, in the state of the art, the authors also omit works that amplify tremors using optical strategies, which also reveal natural discrimination among different types of tremors.

Regarding the methodological pipeline, the main drawback is the recovery of joint positions from OpenPose, which are highly noisy, with high-frequency variation between frames. How are such movements distinguished from real Parkinson’s tremor?
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

It is very difficult to reproduce the paper and the reported results from the description in the manuscript. The authors use a subset and crop the videos, but none of this information is reported.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

The authors should better identify the support that the proposed tool provides and how this characterization may impact Parkinson’s follow-up. Also, the authors should study the OpenPose outputs to determine the level of noise that resembles tremor.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

4
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The main limitation of the paper relates to its understanding of the clinical context of Parkinson’s disease and the associated motor and non-motor symptoms. The work is not useful for early diagnosis.
Number of papers in your stack

4
What is the ranking of this paper in your review stack?

1
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #3

Please describe the contribution of the paper

The authors propose a binary classification (and an extension to multi-class) framework to diagnose Parkinson’s disease in video recordings of subjects using the seven upper body joints of a 2D skeleton extracted with OpenPose as input to their framework. The body joints are used to build a graph with intra-skeleton and inter-frame connections which are fed to a GNN with Spatial Attention and a novel Pyramidal Channel-Squeezing Fusion Block. The presented results show that the proposed method consistently outperforms prior work.
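
To make the graph construction summarized above concrete, the following is a minimal sketch of a spatio-temporal skeleton graph with intra-skeleton and inter-frame connections. The joint names, the bone list, and the `build_adjacency` helper are assumptions made for illustration; the review does not specify which seven joints or which edges the authors use.

```python
import numpy as np

# Hypothetical set of seven upper-body joints (an assumption, not from the paper).
JOINTS = ["neck", "r_shoulder", "r_elbow", "r_wrist",
          "l_shoulder", "l_elbow", "l_wrist"]

# Assumed intra-skeleton edges (bones) as index pairs into JOINTS.
BONES = [(0, 1), (1, 2), (2, 3), (0, 4), (4, 5), (5, 6)]


def build_adjacency(num_frames):
    """Return a symmetric 0/1 adjacency matrix over (frame, joint) nodes.

    Node index = frame * len(JOINTS) + joint.
    """
    v = len(JOINTS)
    n = num_frames * v
    adj = np.zeros((n, n), dtype=np.float32)
    for t in range(num_frames):
        base = t * v
        # Intra-skeleton connections within frame t.
        for i, j in BONES:
            adj[base + i, base + j] = 1.0
            adj[base + j, base + i] = 1.0
        # Inter-frame connections: same joint in consecutive frames.
        if t + 1 < num_frames:
            for i in range(v):
                adj[base + i, base + v + i] = 1.0
                adj[base + v + i, base + i] = 1.0
    return adj


adjacency = build_adjacency(num_frames=3)  # 21 x 21 matrix
```

A graph network would then combine this adjacency with per-node features such as the 2D joint coordinates; that part is omitted here.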
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The solution is lightweight (only seven 2D body joints as input), low-cost, and only needs videos of the subject to give an indicator of a potential Parkinson’s diagnosis; the system is therefore very accessible, and the potential clinical impact is high.

The authors present an extensive evaluation, including a comparison against the state-of-the-art and an ablation study and show that the proposed modules improve the model’s performance.

By inspecting the attention weights, the network performance and also failure cases can be interpreted.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

It’s hard to assess if the system really has clinical significance and if tremor classification in video recordings is enough to make a diagnosis – maybe it would be better to describe the system to be an indicator for further patient examination.

The system heavily relies on the 2D joints computed by the OpenPose framework. A more thorough discussion (and comparison of different pose estimation frameworks and framework-specific advantages/disadvantages) would greatly improve the presented work.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

According to the authors, the code will be made public and the dataset is publicly available.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

The authors should specify in the abstract which type of movements / posture the videos should contain, since the method cannot take arbitrary video as input.

The authors should specify (or at least estimate) the detection accuracy of the OpenPose 2D human pose detection framework.

Noise in the predictions of the 2D joint locations could be interpreted as tremor by the proposed framework. The authors should discuss how this issue could be addressed.

The authors should specify the exact inputs and outputs (size and format) of the proposed model for reproducibility.

The authors should discuss the multi-class classification performance in more detail.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

6
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper presents a novel, low-cost and computationally efficient approach for a problem with great clinical relevance.
Number of papers in your stack

5
What is the ranking of this paper in your review stack?

1
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

The paper mainly received positive feedback; however, one of the reviewers raises concerns about claims in the paper, in particular that the video-based analysis may not be usable for early diagnosis. This is a major criticism, and the authors are STRONGLY encouraged to address this issue in the final version of the paper.
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

3

Author Feedback

We appreciate your time and valuable comments, and thank you for recognising the technical contribution of our work. We will update the final manuscript based on the following points.

Reviewer #2: PD early diagnosis. We propose an early monitoring system for supporting the diagnosis of relatively “early” PD motor symptoms in a non-hospital setting. We expect our system to provide undiagnosed patients with low-cost PT classification results as a warning sign. This can prompt patients and clinicians to participate in more timely PD management and control further progression. In our work, we refer to the definition of “early” PD in [1,2,3]. In these studies, tremor was diagnosed as one of the key motor symptoms of early PD. Thanks to R2, we have found the terms “early” and “early diagnosis” can be confusing, so we will change these words in the final release to make it clearer.

Reviewers #2 & #3: Detection accuracy of OpenPose. Although the OpenPose output is not perfectly accurate and could introduce noise, we propose to apply the Pyramidal Channel-Squeezing-Fusion mechanism to extract relevant PT information and filter noise to improve learning performance. In addition, we use the attention mechanism both to improve the classification performance and to provide model interpretation to clinicians (i.e., which joint is important to the system). Furthermore, our proposed system can be easily extended and improved with the future development of pose estimation algorithms. Reviewers may also be interested in the work [4] on PD detection using OpenPose 2D skeletons from finger tap videos.

Reviewer #1: Show the wrong results from OpenPose and the results of applying other recent pose estimation approaches. For the situation where OpenPose produces a significantly wrong result, as displayed in Fig 3 b2, our voting system can handle this problem and minimise the influence of these worst estimates. We ran some experiments with different pose estimation approaches, and OpenPose returned the best estimation results. We will carefully consider whether to show them in our final version, given the page limit.
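
As an illustration of how a voting step can limit the influence of a few badly estimated segments, the following is a minimal sketch that assumes each video is split into clips, each clip receives its own predicted label, and the video-level label is the majority vote. The rebuttal does not describe the voting scheme in detail, so the `majority_vote` helper, the clip-level setup, and the tie-breaking rule are all assumptions.

```python
from collections import Counter


def majority_vote(clip_predictions):
    """Return the most frequent clip-level label for one video.

    Ties are broken in favour of the label that appears first in the input
    (an arbitrary choice made for this sketch).
    """
    if not clip_predictions:
        raise ValueError("at least one clip prediction is required")
    counts = Counter(clip_predictions)
    top = max(counts.values())
    for label in clip_predictions:
        if counts[label] == top:
            return label


# One clip with a badly estimated pose flips to "non-PT",
# but the video-level label follows the majority.
video_label = majority_vote(["PT", "PT", "non-PT", "PT"])
```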

[1] Jankovic J. Parkinson’s disease: clinical features and diagnosis. J Neurol Neurosurg Psychiatry 2008;79:368–76.
[2] Jacopo Pasquini, Roberto Ceravolo, et al. Progression of tremor in early stages of Parkinson’s disease: a clinical and neuroimaging study. Brain, Volume 141, Issue 3, 2018.
[3] Heusinkveld Lauren E., Hacker Mallory L., et al. Impact of Tremor on Patients With Early Stage Parkinson’s Disease. Frontiers in Neurology, vol 9, 2018.
[4] Mandy Lu, Qingyu Zhao, et al. Quantifying Parkinson’s disease motor severity under uncertainty using MDS-UPDRS videos. Medical Image Analysis, Vol 73, 2021.

Pose-based Tremor Classification for Parkinson’s Disease Diagnosis from Video