
Authors

Han Huang, Yijie Dong, Xiaohong Jia, Jianqiao Zhou, Dong Ni, Jun Cheng, Ruobing Huang

Abstract

Over the past decades, the incidence of thyroid cancer has been increasing globally. Accurate and early diagnosis allows timely treatment and helps to avoid over-diagnosis. Clinically, a nodule is commonly evaluated from both transverse and longitudinal views using thyroid ultrasound. However, the appearance of the thyroid gland and lesions can vary dramatically across individuals. Identifying key diagnostic information from both views requires specialized expertise. Furthermore, finding an optimal way to integrate multi-view information also relies on the experience of clinicians and adds further difficulty to accurate diagnosis. To address these, we propose a personalized diagnostic tool that can customize its decision-making process for different patients. It consists of a multi-view classification module for feature extraction and a personalized weighting allocation network that generates optimal weighting for different views. It is also equipped with a self-supervised view-aware contrastive loss to further improve the model robustness towards different patient groups. Experimental results show that the proposed framework can better utilize multi-view information and outperform the competing methods.

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16437-8_64

SharedIt: https://rdcu.be/cVRuP

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    The manuscript presents a new framework for thyroid nodule classification using dual-view US images.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. A novel approach with three main components towards personalized diagnosis.
    2. Application of multi-view US images for better diagnosis output.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. It is not clear how many patients are included in the study. Hence, it is difficult to evaluate the effect of the personalized aspect of the proposed approach.
    2. There is no discussion of how feasible it would be to deploy this tool in clinical practice.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The implementation steps of the three main contributions are very well explained. An in-house dataset is used for the evaluation. There is no information regarding the data acquisition process.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    1. Section 3: Please clarify how many patients are included in the dataset.
    2. Section 3: It is not mentioned how many images were included in the dataset after data augmentation. Please clarify this.
    3. Section 4, Table 1: Are the results in this table showing the performance of each approach on one common dataset, or each approach is tested on a different dataset?
    4. Section 4: The claim of a personalized diagnosis tool, one of the main contributions of this paper, needs more discussion. What are the main advantages of such a tool compared with other methods, given that the performance of the proposed approach is only slightly better than the state of the art?
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper presents an interesting approach for thyroid nodule classification. However, there is no discussion of how feasible it would be to run this tool in a real-time clinical scenario (in terms of processing time), given that multi-view images are involved.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    3

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered



Review #2

  • Please describe the contribution of the paper

    The authors propose a new multi-view thyroid tumor classification network. It is mainly composed of three parts: a Swin Transformer for feature extraction, a personalized weighting allocation network that customizes the multi-view weighting for different patients, and a self-supervised view-aware contrastive loss that accounts for intra-class variation within patient groups and further improves model performance.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The research topic is relatively novel. There are few works on multi-view ultrasound image classification at present.
    2. This work designs a view-weighted fusion module. Existing multi-view ultrasound image classification work treats different views without distinction; however, different views have different degrees of importance in different tasks. Based on this, the authors design a personalized weighting allocation network to dynamically fuse different views.
    3. This work designs a self-supervised view-aware contrastive loss. Building on the original contrastive loss, the authors design a contrastive loss from a multi-view perspective and experimentally verify that the new loss is more effective than the original.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The conducted experiments are quite limited.
    2. The reproduced AdaMML has poor performance, but no explanation is given.
    3. The description of the data collection process is incomplete; it lacks descriptions of the experimental setup, device(s) used, image acquisition parameters, subjects/objects involved, instructions to annotators, and methods for quality control. Were the 4529 sets of multi-view US images collected from 4529 patients?
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The description of the data collection process is incomplete; it lacks descriptions of the experimental setup, device(s) used, image acquisition parameters, subjects/objects involved, instructions to annotators, and methods for quality control. Were the 4529 sets of multi-view US images collected from 4529 patients?

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    1. The authors should conduct more experiments to demonstrate the effectiveness of the proposed method.
    2. The authors should explain why the reproduced AdaMML has poor performance.
    3. The description of the data collection process should be completed with descriptions of the experimental setup, device(s) used, image acquisition parameters, subjects/objects involved, instructions to annotators, and methods for quality control. The authors should clarify whether the 4529 sets of multi-view US images were collected from 4529 patients.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The research idea is interesting and relatively novel, but the conducted experiments are quite limited.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    2

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered



Review #3

  • Please describe the contribution of the paper

    This paper proposes a personalized diagnostic tool for thyroid cancer diagnosis, consisting of a multi-view classification module for feature extraction and a personalized weighting allocation network that generates optimal weighting for different views. Experimental results show that the trained model outperforms state-of-the-art approaches in thyroid cancer diagnosis.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The overall framework is clear and the proposed model has shown its effectiveness.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The personalized weighting generation module is functionally similar to the attention mechanism, which is more widely used.
    2. For the experimental implementation, λ is set to 0.01; the authors should explain this choice.
    3. It would be better if the authors provided the actual time cost.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The results in this paper are easily reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    The authors should provide more details of experimental implementation and computation complexity for the proposed method.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed method is simple and reasonable, yet both personalized weighting generation and the view-aware contrastive loss are frequently used techniques. Furthermore, the personalized weighting generation module is similar to the attention mechanism; why didn’t the authors choose it?

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    3

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper proposes a diagnostic tool for thyroid cancer diagnosis, consisting of a multi-view classification module for feature extraction and a personalized weighting allocation network that generates optimal weighting for different views. The research topic is relatively novel and the idea of a personalized weighting allocation network that dynamically fuses different views is very interesting. However, the experiments are quite limited and some parts are not sufficiently explained.

    Please address the points that reviewers raise on the description of the experimental setup and dataset, and why the reproduced AdaMML has poor performance. Please note that the purpose of the rebuttal is to provide clarification or to point out misunderstandings; it is not to promise additional experiments.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    NR




Author Feedback

Q1: Please include more details on the dataset and experimental setups. (R1,2,3) A: Our dataset was collected from 4529 different patients using the following devices: Philips IUElite L12-5, GE LogiqS7 ML6-15, Mindray Resona7 L14-5WU, with a frequency range of 5-12 MHz. A pair of multi-view (a transverse and a longitudinal view) US images was collected from each patient, resulting in 4529 sets. The data were randomly shuffled and all experiments were conducted using the same dataset and data splits (details explained in Para.1, Sect.3).

Online data augmentation was employed to reduce memory usage and increase randomness. The images were augmented randomly in each epoch, while the total amount of data remained the same. We applied the same augmentation strategy in all experiments, including intensity scaling (0.6~1.6), rotation (30), translation (75), etc. (explained in Para.3, Sect.3).
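A minimal sketch of the online augmentation described above, covering two of the listed transforms (intensity scaling in 0.6~1.6 and translation up to 75 pixels) with NumPy only; rotation and the exact pipeline used in the paper are omitted, and the function name and fill strategy are assumptions, not the authors' implementation.

```python
import numpy as np

def augment_online(image: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Apply a fresh random augmentation (hypothetical sketch).

    Implements intensity scaling in [0.6, 1.6] and a translation of up
    to 75 pixels per axis; vacated pixels are filled with zeros.
    """
    # Random intensity scaling, clipped back to the valid 8-bit range.
    scale = rng.uniform(0.6, 1.6)
    out = np.clip(image.astype(np.float32) * scale, 0.0, 255.0)

    # Random integer shift (dy, dx) in [-75, 75] along each axis.
    dy, dx = rng.integers(-75, 76, size=2)
    shifted = np.zeros_like(out)
    src_y = slice(max(0, -dy), out.shape[0] - max(0, dy))
    src_x = slice(max(0, -dx), out.shape[1] - max(0, dx))
    dst_y = slice(max(0, dy), out.shape[0] - max(0, -dy))
    dst_x = slice(max(0, dx), out.shape[1] - max(0, -dx))
    shifted[dst_y, dst_x] = out[src_y, src_x]
    return shifted
```

Because the transform is sampled anew in every epoch, the dataset size on disk stays fixed while the model effectively sees different images each pass.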

The value of λ was chosen by considering the orders of magnitude of the two losses and empirically set to 0.01 (details omitted due to limited space). Other experimental parameters (e.g. training epoch, optimizer, lr) are specified in Para.3, Sect.3.
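The role of λ described above can be illustrated with a trivial sketch; the additive form below is an assumption for illustration (the paper's exact combination of the classification and contrastive terms is not given here), chosen only to show how λ rescales a larger-magnitude term into the range of the smaller one.

```python
def total_loss(cls_loss: float, contrastive_loss: float, lam: float = 0.01) -> float:
    """Hypothetical combined objective: λ shrinks the contrastive term
    so that both terms contribute at a comparable order of magnitude."""
    return cls_loss + lam * contrastive_loss
```

For example, a classification loss near 0.7 and a contrastive loss near 50 yield comparable contributions (0.7 and 0.5) once λ = 0.01 is applied.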

Q2: How feasible is it to have this tool for clinical application and what are the main advantages of having it as opposed to other methods? (R1,2,3) A: In practice, transverse and longitudinal images of thyroid nodules are routinely collected. Being accurate, efficient, and portable, the proposed tool can easily be embedded into the current clinical workflow. Specifically, our model only requires 8.70 GFLOPs and 27.93M parameters and takes less than 0.04s on average to process one case. In comparison, the second-best SOTA model (MVMT) requires 12.98 GFLOPs and 58.64M parameters, while its ACC and F1-score are 2.29% and 2.09% lower than ours. Note that MVMT may not be favorable in a real clinical setting as it tends to make dangerous false-negative predictions (resulting in higher SPE and PRE). In contrast, our model obtained the best overall performance and can help to prevent misdiagnosis. Furthermore, it can also provide the quantitative importance of each view, which helps clinicians better understand and interpret the predictions, building their trust in AI-based CAD systems.

Q3: Why did the authors choose the PAWN rather than an attention mechanism? (R3) A: Attention is indeed a useful mechanism that can learn the weighting of feature elements. To better demonstrate the superiority of the PAWN, we performed the following experiments. For a fair comparison, the same network backbone was retained while the multi-view features were concatenated and fed to a SENet-based attention block [2020 Hu et al.]. It scored ACC=81.13%, SEN=87.25%, SPE=69.22%, PRE=84.64%, F1-score=85.93%, all lower than ours. We argue that naively applying attention might highlight some important feature elements, but fails to consider the overall importance of each view as the PAWN does. To further imitate the PAWN using attention, we devised a multi-view-oriented attention block by ‘squeezing and exciting’ the features of each view to the same channel to learn view-level attention. It scored ACC=81.31%, SEN=87.54%, SPE=69.22%, PRE=84.68%, F1-score=86.08%, but still underperformed ours. We conjecture that this design might overlook the competition between different views, which the PAWN explicitly models. Therefore, we opted for the PAWN instead of attention in this task.
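The view-level attention baseline described above (squeezing each view's features to a single score, then normalizing across views) could be sketched as follows; all shapes and names are hypothetical, and a single random projection stands in for a trained squeeze-and-excite network, so this illustrates the fusion mechanism only, not the authors' block.

```python
import numpy as np

def view_level_attention(view_feats: np.ndarray, w: np.ndarray):
    """Fuse per-view features with softmax view weights (hypothetical sketch).

    view_feats: (n_views, channels) globally pooled features, one row per view.
    w: (channels,) projection that 'squeezes' each view to a scalar score.
    Returns the fused feature vector and the per-view weights.
    """
    scores = view_feats @ w                          # one scalar per view
    scores = scores - scores.max()                   # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax across views
    fused = weights @ view_feats                     # weighted sum of views
    return fused, weights
```

Because the softmax is taken across views, the weights sum to 1 and one view can only gain importance at the expense of the others; the rebuttal's point is that even this view-level variant underperformed the PAWN.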

Q4: Please explain why AdaMML has poor performance. (R2) A: As explained in Para.1, Sect.4, this may be caused by the learn-to-eliminate strategy of AdaMML, which inevitably loses information. Its design may be better suited to natural-image datasets (e.g. the ones used in their work), where large information redundancy exists and AdaMML can help to improve accuracy and efficiency. However, in our task, all views contain important information and should not be discarded. The proposed model is therefore more suitable than AdaMML for such problems.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The paper presents an interesting approach for thyroid nodule classification. The rebuttal addresses the reviewers’ comments on the description of the experimental setup and dataset, why the reproduced AdaMML has poor performance, and why the PAWN was chosen rather than an attention mechanism.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    NR



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper proposes a new multi-view thyroid tumor classification network with three parts: a swin-Transformer for feature extraction, a personalized weighting allocation network that customizes the multi-view weighting for different patients, a self-supervised view-aware contrastive loss. In the rebuttal, the authors addressed the concerns raised by the reviewer. I would suggest to accept this paper.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    10



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Rebuttal successfully addressed all the major comments brought up by reviewers and AC on original submission. Dataset appears sufficiently powered, and analysis was rigorously conducted. Information about dataset and parameters should be included in the final paper. Differences between PAWN and Attention are well explained as well as understanding about AdaMML performance. Paper is acceptable.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    3


