
Authors

Yuki Hashimoto, Akira Furui, Koji Shimatani, Maura Casadio, Paolo Moretti, Pietro Morasso, Toshio Tsuji

Abstract

The assessment of general movements (GMs) in infants is a useful tool in the early diagnosis of neurodevelopmental disorders. However, its evaluation in clinical practice relies on visual inspection by experts, and an automated solution is eagerly awaited. Recently, video-based GMs classification has attracted attention, but this approach would be strongly affected by irrelevant information, such as background clutter in the video. Furthermore, for reliability, it is necessary to properly extract the spatiotemporal features of infants during GMs. In this study, we propose an automated GMs classification method, which consists of preprocessing networks that remove unnecessary background information from GMs videos and adjust the infant’s body position, and a subsequent motion classification network based on a two-stream structure. The proposed method can efficiently extract the essential spatiotemporal features for GMs classification while preventing overfitting to irrelevant information for different recording environments. We validated the proposed method using videos obtained from 100 infants. The experimental results demonstrate that the proposed method outperforms several baseline models and the existing methods.

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16434-7_72

SharedIt: https://rdcu.be/cVRsE

Link to the code repository

https://github.com/uoNuM/two-stream-gma

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper proposes an automated framework for GMs classification consisting of preprocessing networks and a motion classification network. The former removes background information and aligns the body orientation; the latter processes the spatiotemporal features. The proposed method outperforms the listed comparison methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The motivation is reasonable and interesting.
    2. The paper is well-written
    3. The GM classification data set was collected and labeled.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    In general, the contribution and novelty are limited. The proposed method is a combination of the preprocessing networks and the two-stream network. Is background removal a necessary step? For one thing, it is very difficult to remove the background completely. For another, recent research on human motion and video content analysis suggests that this step is unnecessary.

    The paper lacks details on the compared methods, such as [26] and [18], and does not analyze why these state-of-the-art (SOTA) methods perform much worse here. In Table 1, the preprocessing networks were applied to both “Baseline” and “Ours”, and all of these outperform [26] and [18] by a large margin, which suggests that preprocessing plays a major role. However, the ablation study in Table 2 shows that preprocessing is not so important. It is confusing that the single-stream network without preprocessing works so well. It would also be more convincing to test the proposed method on other public datasets.

    In Table 1, the comparison methods are [7], [18], [25] and [26]. However, these are not SOTA methods in human motion analysis, and two of them were published in 2015. The latest video analysis and human motion classification methods from the computer vision field should be considered and compared.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The code is released but the dataset is not.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    This is good work that tackles the GM task with an explicit and reasonable motivation, and the experiments demonstrate its effectiveness. However, it would be better to give the implementation details of the compared methods and more analysis of the experimental results. Experiments on more datasets are encouraged to make the conclusions sound. Some of the latest methods could also be tried in the motion classification network, replacing the two-stream network for processing spatiotemporal information.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The research motivation is clear and the paper is well-written. In general, the idea of the proposed method is not new and the contribution is limited. Also, the comparison and ablation study cannot fully show the effectiveness of the proposed method.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    5

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered



Review #2

  • Please describe the contribution of the paper

    The paper addresses the problem of Automated Classification of General Movements in Infants. This problem is known to be still difficult and “unsolved”. Authors propose an image based approach (using videos) to rate the GM category. In the proposed approach, first a focus is made on removing the background of the video and the pose of the infant is normalized. From these processed images, authors use a two-stream spatio-temporal network to predict the GM category. Authors evaluate on a proprietary dataset and make the code available. Results outperform current state of the art.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The proposed preprocessing networks, while not technically novel, are an effective way to normalize against background, pose and scale.

    • Authors leverage existing work on two-stream spatio-temporal architectures which seems particularly well suited for GM classification

    • The authors release the code for comparison. One major difficulty in comparing automated GMA methods is that the data (recordings of infants) are often “private”. The release of the code will allow the community to compare on other cohorts.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    There is no clear novel technical contribution. It is at the boundary of a “Methodological studies” and an “Application study”.

    The evaluation and comparisons of the method could be improved: Preprocessing (3.1) + final evaluation and comparisons.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The code is made available with all steps.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    • An evaluation of the preprocessing networks (3.1) is missing. One wonders what the quality of the output is. In Related Work, the authors write “The pose-based approach [5,18] … performance depends on the accuracy of the pose estimation algorithm.” to motivate the use of a video-based approach. However, the proposed process is based on OpenPose (and later on the Farneback optical flow method [10]), which may have errors. Could they be quantified? The underlying question is: is it worth improving these preprocessing steps to increase performance, or should one focus on the two-stream networks?

    • No evaluation of the computed flow (Farneback method [10]). How reliable is it? Are outputs noisy, temporally consistent? Same underlying question as before.

    • It is unclear if STAM [18] was retrained with the new dataset to have a fair comparison with the proposed method.

    • It is unclear how authors compare to [26]. To the best of my knowledge their code is not available. Did authors re-implement it? Did authors retrain on the new dataset to have a fair comparison with the proposed method?

    • Authors do not evaluate on the [26] dataset, which is available “upon reasonable request” (https://www.nature.com/articles/s41598-020-57580-z#data-availability). It would have been interesting to have a second comparison.
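
One possible way to put a number on the temporal-consistency question raised in the flow comments above, without ground-truth flow, is to measure how much the flow field changes between consecutive frame pairs. The sketch below is only an illustration and is not part of the paper or its repository: the function name `flow_temporal_inconsistency` and the array layout `(T, H, W, 2)` are assumptions. Dense flow fields with this per-frame layout can be obtained, for example, from OpenCV's `cv2.calcOpticalFlowFarneback`. Note that genuine changes in infant motion also raise this score, so it is a rough proxy for noise rather than a measure of accuracy.

```python
import numpy as np


def flow_temporal_inconsistency(flows):
    """Mean end-point difference between consecutive dense flow fields.

    flows: array-like of shape (T, H, W, 2), one (dx, dy) vector per pixel
           for each of T flow fields (hypothetical layout for this sketch).
    Returns an array of shape (T - 1,); larger values mean the flow changes
    more abruptly from one frame pair to the next.
    """
    flows = np.asarray(flows, dtype=float)
    if flows.ndim != 4 or flows.shape[-1] != 2:
        raise ValueError("expected an array of shape (T, H, W, 2)")
    if flows.shape[0] < 2:
        raise ValueError("need at least two flow fields")

    # Per-pixel difference vector between flow field t+1 and flow field t.
    diff = flows[1:] - flows[:-1]
    # Euclidean length of each difference vector, shape (T - 1, H, W).
    magnitude = np.linalg.norm(diff, axis=-1)
    # Average over all pixels of each consecutive pair.
    return magnitude.mean(axis=(1, 2))
```

Comparing this score with and without background removal on the same clips would be one hedged way to probe whether preprocessing stabilizes the flow computation.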

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The automated General Movement Assessment is a difficult and yet clinically very relevant problem. The proposed approach is sound, relatively simple (in the good sense of the word) and the obtained results (even with the mentioned evaluation weaknesses) improve on the state of the art. At this point the merits slightly outweigh the weaknesses.

  • Number of papers in your stack

    3

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered



Review #3

  • Please describe the contribution of the paper

    This work focuses on the infant general movement classification problem. A preprocessing network is introduced to remove unnecessary background from GM videos and adjust the infant’s body position. A two-stream structure for motion classification, based on both spatial and temporal information, is proposed.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Detailed ablation study to demonstrate the effectiveness of the proposed components including the two stream design and the preprocessing network.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    It is not clear whether the dataset will be publicly available. The meanings of WM, FM and PR should be clarified. The angle and scale preprocessing is similar to the idea in “In-Bed Pose Estimation: Deep Learning With Shallow Dataset”, which is not cited.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The code is available; the dataset is not.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    See weakness.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Detailed ablation study and technically sound.

  • Number of papers in your stack

    1

  • What is the ranking of this paper in your review stack?

    3

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The paper has received mixed reviews although the reviewers see merits in the methods and their applicability. Some of the major concerns include limited contributions and technical novelties and that the ablation study does not show the effectiveness of the proposed components.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    6




Author Feedback

We would like to thank all the reviewers for their insightful comments and positive evaluation. For instance, Reviewer 1 (R1) commented that the motivation of this work is reasonable and interesting, and experiments prove its effectiveness. R2 and R3 commented that the proposed approach is sound. We would like to address the concerns raised by the reviewers as follows.

  1. Novelty and contribution (Meta reviewer, R1, and R2): In general, the cost of GMs recording is very high, making it difficult to acquire a large dataset. Hence, the recording environment can often be biased depending on GMs types, which can lead to overfitting to irrelevant background information. In addition, it is necessary to learn the spatiotemporal features during GMs adequately. We developed a novel GMs classification framework that solves the former problem with preprocessing networks that remove irrelevant information and the latter problem with a two-stream network that uses explicitly computed motion information. We believe our approach will contribute to the MICCAI community and technological improvement in the relevant field.
  2. Preprocessing networks and their ablation (Meta reviewer, R1, and R2): [Body area extractor] As R1 stated, recent human action recognition research does not generally remove the background. However, background removal is reasonable in the domain of video-based GMs classification because the background does not provide meaningful information about GMs and can also cause overfitting to irrelevant information due to the limited dataset size. In fact, as shown in Fig. 2, the method without preprocessing focuses on regions unrelated to infants. [Body position adjuster] Unlike the conventional pose-based approach, which directly uses pose information as an input to the recognition model, our method only uses it to adjust body size and orientation. We also designed the adjuster to minimize the effect of instantaneous pose estimation errors by using the average of the quartile range of joint coordinates calculated over multiple frames. Therefore, the estimation error in each frame does not significantly affect the final classification performance. [Ablation] The reason why ostensibly good performance was obtained even when the preprocessing was fully removed is that the model overfits to non-essential elements other than the infant (see Fig. 2). The performance did not decrease but rather improved when the preprocessing networks were added, indicating that the components work effectively for GMs evaluation.
  3. Experimental setting (R1, R2, and R3): [GMs details] Description of each type of GMs is included in 2.1. [Method details] We would like to add more detailed descriptions of the comparative methods ([26] and [18]) to the final manuscript. [Code availability] Both implementations of [26] and [18] are available on GitHub. We carefully checked them for conformity with the original papers. [Fairness] All models, including [26] and [18], were retrained using our dataset. [Baselines] The motivation for adopting a simple two-stream structure is to efficiently learn infant’s temporal features from small data by explicitly calculating motions by optical flow. To evaluate this, we chose CNN+LSTM and C3D (basic 3DCNN) as two naïve but reasonable baselines without flow. Although there would be room for further improvement by simply replacing the classification network with a more modern multi-stream model, this is sort of an application task.
  4. Accuracy of optical flow (R2): A quantitative evaluation of the reliability of the calculated flows has not been conducted; thus, we will add this point to the limitations. However, since the OpenCV implementation is used for the flow calculation and the parameter settings are in accordance with previous two-stream works, we believe that the calculation results have a certain degree of reliability. The background removal in the preprocessing may also contribute to the stability of the flow calculation.
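
The robust averaging mentioned in point 2 can be illustrated with a short sketch. It assumes that “the average of the quartile range” means the mean of the values lying between the first and third quartiles (the interquartile mean); that reading, the function name `interquartile_mean`, and the `(frames, joints, 2)` array layout are all assumptions for illustration and are not taken from the authors' code.

```python
import numpy as np


def interquartile_mean(values, axis=0):
    """Mean of the values between the 25th and 75th percentiles along `axis`.

    Frames whose joint coordinate falls outside the interquartile range
    (for example, a momentary pose estimation error) are ignored.
    """
    values = np.asarray(values, dtype=float)
    if values.shape[axis] < 3:
        # With only two samples the interpolated quartiles can exclude both.
        raise ValueError("need at least three samples along the axis")

    # Quartiles keep the reduced axis so they broadcast against `values`.
    q1, q3 = np.percentile(values, [25, 75], axis=axis, keepdims=True)
    inside = (values >= q1) & (values <= q3)

    # Average only the samples inside the interquartile range.
    return np.sum(values * inside, axis=axis) / np.sum(inside, axis=axis)


# Hypothetical example: one joint tracked over nine frames, where the last
# frame contains a gross x-coordinate error.
x = [100, 101, 99, 100, 102, 98, 100, 100, 500]
poses = np.array([[[xi, 50.0]] for xi in x])  # shape (frames, joints, 2)
robust_position = interquartile_mean(poses, axis=0)
plain_mean = poses.mean(axis=0)
```

In this toy example the plain mean of the x coordinate is pulled far toward the erroneous frame, while the interquartile mean stays near 100, which is the kind of insensitivity to instantaneous errors that the rebuttal describes.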




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    In the post-rebuttal discussions, reviewers agreed that the rebuttal addressed concerns regarding evaluation of the method and that the paper provides insights about GMs classification. However, the paper lacks methodological novelty. Therefore, the paper should be rejected.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Reject

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    NR



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The strength is the proposed two-stream spatiotemporal fusion network for infant movement assessment. Most reviewers’ questions were addressed in the rebuttal. I suggest acceptance.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    4



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    I believe the authors’ responses are reasonable and address most of the issues. They have also pointed out that the study has limitations, and this will be included in the final version. The rebuttal is concise and detailed. Their response to Point 1 is, however, not that strong unless they refer to exactly where in the results this is achieved, which should be highlighted in the camera-ready version.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    6


