Authors

Zhi Cao, Weijing Zhang, Keke Chen, Di Zhao, Daoqiang Zhang, Hongen Liao, Fang Chen

Abstract

We explore the potential of deep convolutional neural network (CNN) models for differential diagnosis of gout from musculoskeletal ultrasound (MSKUS), as no prior study on this topic is known. Our exhaustive study of state-of-the-art (SOTA) CNN image classification models for this problem reveals that they often fail to learn the gouty MSKUS features, including the double contour sign, tophus, and snowstorm, which are essential for sonographers’ decisions. To address this issue, we establish a framework to adjust CNNs to “think like sonographers” for gout diagnosis, which consists of three novel components: (1) Where to adjust: Modeling sonographers’ gaze map to emphasize the region that needs adjustment; (2) What to adjust: Classifying instances to systematically detect predictions made based on unreasonable/biased reasoning and adjust them; (3) How to adjust: Developing a training mechanism to balance gout prediction accuracy and attention reasonability for improved CNNs. The experimental results on clinical MSKUS datasets demonstrate the superiority of our method over several SOTA CNNs.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43987-2_16

SharedIt: https://rdcu.be/dnwJy

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #2

  • Please describe the contribution of the paper

    This paper proposes a deep learning approach for gout diagnosis using musculoskeletal ultrasound (US) images. CNNs are used to perform binary classification of US images into gout or healthy images. Eye gaze data is collected along with the US images and is used to guide the attention of CNNs such that the class activation maps (CAMs) become similar to the recorded gaze map. CNN weights are adjusted to strike a balance between classification accuracy and reasonability of CAMs.
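
    The balance described above can be illustrated with a minimal sketch. It is not the authors' loss: it assumes a generic weighted sum of binary cross-entropy and a mean squared CAM-to-gaze mismatch, with a hypothetical weight `alpha` in [0, 1]. All function names are illustrative, and the role `alpha` plays in the paper may differ.

```python
import math

def binary_cross_entropy(p, y, eps=1e-7):
    """Cross-entropy for one predicted gout probability p and label y in {0, 1}."""
    p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def attention_mismatch(cam, gaze):
    """Mean squared difference between a CAM and a gaze map of equal shape."""
    diffs = [(c - g) ** 2 for row_c, row_g in zip(cam, gaze)
             for c, g in zip(row_c, row_g)]
    return sum(diffs) / len(diffs)

def combined_loss(p, y, cam, gaze, alpha):
    """Assumed form: (1 - alpha) * classification term + alpha * attention term."""
    return ((1 - alpha) * binary_cross_entropy(p, y)
            + alpha * attention_mismatch(cam, gaze))

# Toy 1x2 maps: the CAM fires on a pixel the sonographer did not look at.
cam = [[1.0, 0.0]]
gaze = [[0.0, 0.0]]
loss = combined_loss(0.9, 1, cam, gaze, alpha=0.2)
```

    Under this assumed form, `alpha = 0` reduces to plain classification training and `alpha = 1` optimizes attention alignment only.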

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Paper is easy to follow
    • It tries to incorporate the subjectivity of freehand ultrasound into the decision-making process of gout diagnosis using eye gaze maps.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Gaze information has been used by many existing ultrasound image analysis approaches to improve the performance of CNN predictors. The specific advantages of the proposed approach over the existing ones are not very clear.
    • Table 1 shows a comparison with standard CNN models. It should include a comparison with existing deep learning approaches in which CNNs utilise eye gaze information.
    • Several variables in Algorithm 1 are undefined, e.g. M_attn.
    • The effect of the alpha parameter requires more analysis, e.g. does a low value (< 0.2) degrade the classification performance of the CNN?
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Dataset used is not available publicly.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Please highlight the advantages of the proposed approach over the existing deep learning approaches in which CNNs utilise eye gaze information and perform a comparative study.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Please see the weaknesses

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper introduces a training strategy as well as a deep CNN model for diagnosing gout from musculoskeletal ultrasound. Specifically, the paper utilizes the gaze map from an eye tracker, which captures the sonographers’ regions of interest. An encoder-decoder structure was trained to learn the gaze map. After that, the gaze areas were compared with the saliency map from the original classification model. A loss function was further designed to reduce imprecise and unreasonable predictions from the saliency model. The proposed method shows SOTA performance on the MSKUS dataset.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper has several strengths. 1) The paper is the first to explore the potential application of deep learning techniques in the differential diagnosis of gout from musculoskeletal ultrasound, which is in accordance with the aim of MICCAI. 2) The proposed method is technically sound and shows impressive performance versus baselines. 3) The method improves explainability.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    My main concern is regarding the annotation cost of the method. Compared to baselines, the method also requires gaze data for training. What is the time cost for collecting gaze data? Would it be a problem hindering the method’s application in the future?

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Satisfactory.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    1) As mentioned in the weaknesses, the authors may want to consider the pros and cons of introducing gaze data in the method.
    2) Generally the paper is well-written, whereas Section 2.3 could be reorganized since it took me some time to understand it.
    3) The contribution would be easier to assess if a table presented the model complexity of the baselines and the proposed method. It could include the training time, model size, and inference time of the models.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    7

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    See strengths and weaknesses.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #1

  • Please describe the contribution of the paper

    The paper proposes a framework to adapt CNNs for diagnosing gout from musculoskeletal ultrasound (MSKUS). This work attempts to address the challenge that CNNs often fail to learn gouty MSKUS features. The authors design a mechanism of “think like sonographers” in three levels: where to adjust, what to adjust, and how to adjust. The proposed framework improves the performance of diagnosis over the baseline deep classification architectures.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    (1) A novel method. The paper proposes a novel method to address the failure of CNNs to learn gouty features when diagnosing gout from musculoskeletal ultrasound (MSKUS). The authors use the sonographer gaze map generated from eye movement data as supervision to train a model. This prompts the CNN to focus on the key regions of gout in MSKUS, as a sonographer would.

    (2) Good interpretability. The authors classify MSKUS images as reasonable or unreasonable based on whether the CNN focuses on the key regions of gout. The reasonableness and accuracy of the gout prediction are combined to classify MSKUS images into four classes. The authors apply different loss weights to the four classes of images to balance the accuracy and reasonableness of the predictions. This motivates the CNN to improve prediction accuracy by focusing on the key regions of gout. Therefore, the proposed strategy increases the interpretability of the CNN prediction results.

    (3) Strong practicality in clinical environments. The experimental results show that it was possible to use predicted gaze maps for both the training and testing phases of the classification models without any notable performance decrease. This means that there is no need to collect gaze maps from the sonographer while predicting gout with ultrasound images.

    (4) Reasonable experimental design. The paper compares the performance of different models trained with and without the TLS (Thinking like Sonographers) mechanism on MSKUS.

    (5) Logical and fluent expression.
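
    The four-class scheme in point (2) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: reasonableness is approximated here by the intersection-over-union (IoU) between a binarized CAM and a binarized gaze map, and the threshold `iou_threshold` and the values in `LOSS_WEIGHTS` are hypothetical.

```python
from typing import Dict, List

Mask = List[List[int]]  # binary mask, 1 = attended pixel

def iou(a: Mask, b: Mask) -> float:
    """Intersection-over-union of two binary masks of equal shape."""
    inter = sum(x & y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    union = sum(x | y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return inter / union if union else 0.0

def categorize(pred: int, label: int, cam_mask: Mask, gaze_mask: Mask,
               iou_threshold: float = 0.5) -> str:
    """Assign an instance to one of four classes (accuracy x reasonableness)."""
    accurate = "accurate" if pred == label else "inaccurate"
    overlap = iou(cam_mask, gaze_mask)
    reasonable = "reasonable" if overlap >= iou_threshold else "unreasonable"
    return f"{accurate}_{reasonable}"

# Hypothetical per-class loss weights; the paper's values are not given here.
LOSS_WEIGHTS: Dict[str, float] = {
    "accurate_reasonable": 1.0,
    "accurate_unreasonable": 2.0,
    "inaccurate_reasonable": 2.0,
    "inaccurate_unreasonable": 3.0,
}

cam = [[1, 1], [0, 0]]
gaze = [[1, 0], [0, 0]]
category = categorize(pred=1, label=1, cam_mask=cam, gaze_mask=gaze)
weight = LOSS_WEIGHTS[category]
```

    In this sketch, a correct prediction whose attention misses the gaze region falls into `accurate_unreasonable`, the kind of instance the review describes as being re-weighted during training.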

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Some details of the method are not very clear. For example, in the “What to adjust” section, does the input to the classification model include the predicted gaze map? Fig. 2 does not make this point clear either.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The paper is easy to reproduce because the authors provide a detailed methodological flow and sufficient network details.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    (1) In “Where to adjust”, the proposed model combines transformer encoder and CNN decoder to predict gaze maps. Why do you use the transformer encoder? What would be the result if you use a CNN encoder? I suggest adding a comparison experiment for both cases.

    (2) Fig. 2 needs a slight adjustment. Is the predicted gaze map fed into the classification network along with the ultrasound image? It should be consistent with the flow map in the supplementary. The directions of the precise axis in Fig. 2 and Fig. 3 should be consistent.

    (3) In “Where to adjust”, the authors evaluate the stability of the prediction under the predicted gaze map and the actual collected gaze map via t-test. Only the p-values of the t-test are published in the results. Where are the prediction results under different gaze maps?

    (4) There is an obvious difference in gout prediction performance across networks (ResNet18, ResNet34, ResNet50, VGG16, DenseNet121). Please briefly analyse the reasons for this difference.

    (5) The ultrasound image is resized to 224*224. What is the original size?

    (6) In “Gaze Data Collection”, a sonographer gaze map was generated for each binary map by convolving it with a truncated Gaussian Kernel G(\sigma_{x,y}), where G has 299 pixels along the x dimension and 119 pixels along the y dimension. Is it possible to provide a theoretical basis for doing this?

    (7) Limitations of the method should be mentioned. For example, the proposed method is still fully supervised learning rather than weakly supervised learning as in the paper [15].

    (8) Check typos. For example, in the last paragraph of the experiment section, “Therefore, our TSL mechanism”, should it be “TLS”?
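
    For reference on point (6), the operation in question can be sketched as below. This is a minimal pure-Python illustration, not the authors' code: the kernel is built on a small window for readability (the review quotes 299 x 119 pixels), the standard deviations `sigma_x` and `sigma_y` are assumed values, and the final rescaling to [0, 1] is an assumption.

```python
import math

def truncated_gaussian_kernel(width, height, sigma_x, sigma_y):
    """Anisotropic Gaussian sampled on a width x height window (odd sizes)."""
    cx, cy = (width - 1) / 2, (height - 1) / 2
    return [[math.exp(-((x - cx) ** 2 / (2 * sigma_x ** 2)
                        + (y - cy) ** 2 / (2 * sigma_y ** 2)))
             for x in range(width)]
            for y in range(height)]

def gaze_map(binary_map, kernel):
    """Convolve a binary fixation map with the kernel (zero padding)."""
    h, w = len(binary_map), len(binary_map[0])
    kh, kw = len(kernel), len(kernel[0])
    oy, ox = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not binary_map[y][x]:
                continue
            # Stamp the kernel around each fixated pixel.
            for ky in range(kh):
                for kx in range(kw):
                    ty, tx = y + ky - oy, x + kx - ox
                    if 0 <= ty < h and 0 <= tx < w:
                        out[ty][tx] += kernel[ky][kx]
    peak = max(max(row) for row in out)
    # Rescale so the strongest response is 1 (assumed normalization).
    return [[v / peak for v in row] for row in out] if peak > 0 else out

fixations = [[0] * 7 for _ in range(5)]
fixations[2][3] = 1  # a single fixation in the centre of a 5 x 7 map
kernel = truncated_gaussian_kernel(width=5, height=3, sigma_x=1.0, sigma_y=0.5)
heat = gaze_map(fixations, kernel)
```

    A kernel that is wider than it is tall spreads each fixation further horizontally than vertically, and truncation means that pixels outside the window around a fixation receive no weight at all.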

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    7

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper provides a novel method for predicting gout from musculoskeletal ultrasound. The paper has good clarity and organization. Given its high quality, it requires only minor revision before publication.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The proposed work incorporates the eye gaze maps, decision making, and attention mechanism in an integrated framework.




Author Feedback

N/A


