
Authors

Luoyao Kang, Haifan Gong, Xiang Wan, Haofeng Li

Abstract

Deep learning (DL) has been used in the automatic diagnosis of Mild Cognitive Impairment (MCI) and Alzheimer’s Disease (AD) with brain imaging data. However, previous methods have not fully exploited the relation between brain images and the clinical information that is widely adopted by experts in practice. To exploit the heterogeneous features from imaging and tabular data simultaneously, we propose the Visual-Attribute Prompt Learning-based Transformer (VAP-Former), a transformer-based network that efficiently extracts and fuses the multi-modal features with prompt fine-tuning. Furthermore, we propose a Prompt fine-Tuning (PT) scheme to transfer the knowledge from the AD prediction task for progressive MCI (pMCI) diagnosis. In detail, we first pre-train the VAP-Former without prompts on the AD diagnosis task and then fine-tune the model on the pMCI detection task with PT, which only needs to optimize a small number of parameters while keeping the backbone frozen. Next, we propose a novel global prompt token for the visual prompts to provide global guidance to the multi-modal representations. Extensive experiments not only show the superiority of our method compared with the state-of-the-art methods in pMCI prediction but also demonstrate that the global prompt can make the prompt learning process more effective and stable. Interestingly, the proposed prompt learning model even outperforms the full fine-tuning baseline on transferring the knowledge from AD to pMCI.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43904-9_53

SharedIt: https://rdcu.be/dnwHZ

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #3

  • Please describe the contribution of the paper

    The authors propose an effective and computationally efficient prompt-tuning method for transfer learning to pMCI prediction, which outperforms other SOTA methods and full fine-tuning.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Transfer learning is a significant problem worth investigating.
    2. This work proposes an effective and efficient prompt-tuning framework for pMCI detection. Additionally, the global prompt is designed to adapt to high-dimensional MRI images.
    3. The paper is well written. The experiment is extensive and complete.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The statistical significance of the differences between methods is not reported.
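
The review does not say which test is expected. One common option for comparing two classifiers evaluated on the same test subjects is McNemar's exact test, which looks only at the cases where the two models disagree. The sketch below is a minimal illustration under that assumption; the function name and the toy labels are hypothetical and are not taken from the paper.

```python
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b):
    """Two-sided exact McNemar test for two classifiers on the same test cases.

    Returns (b, c, p_value), where b counts cases only model A got right
    and c counts cases only model B got right.
    """
    b = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a == t and p != t)
    c = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a != t and p == t)
    n = b + c
    if n == 0:
        # The models never disagree, so there is no evidence of a difference.
        return b, c, 1.0
    k = min(b, c)
    # Under the null hypothesis the discordant pairs follow Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Toy data (made up for illustration): model A is right on every case,
# model B misses the first six.
labels = [1] * 10
model_a = [1] * 10
model_b = [0] * 6 + [1] * 4
print(mcnemar_exact(labels, model_a, model_b))
```

With only a handful of discordant cases the exact form is preferable to the chi-squared approximation, which is relevant for a small MCI test split.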
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Good. The code will be released, and the dataset is public.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Results on an external test set, or on other datasets/tasks, would be expected.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Transfer learning and multimodal learning are significant and trending topics. The authors have proposed novel and effective prompt-learning methods that achieve competitive performance in this domain.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #4

  • Please describe the contribution of the paper

    This paper proposes a transformer-based framework for progressive MCI identification. Both imaging features and clinical features are fused together using transformer networks. Experiments have been conducted on the benchmark ADNI dataset, and the proposed method can achieve state-of-the-art performance.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Transformer has become a very popular technique in the computer vision and medical imaging community. Introducing it for multi-modality learning on AD-related tasks is an interesting topic.

    2. The overall framework design is appropriate. The authors have clearly presented their motivation and the overall workflow of the network.

    3. The result on the MCI task is good, which verifies the effectiveness of the method.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Details about the MR images are missing. For example, how many subjects are in the training and test sets? How are the MRI scans preprocessed (or is raw MRI used as input)? How are the clinical data encoded and fed into the transformer network?

    2. Some important metrics are missing. For example, progressive MCI subject detection is more important than sMCI, thus sensitivity should be reported accordingly.

    3. Some multi-modal methods (especially those based on CNNs) should be compared.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors promise to provide the source code. The reproducibility is okay.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. Add more details about the training and test data, including the number of subjects, the preprocessing procedure, and the encoding of the clinical data.

    2. Add results in terms of sensitivity or more metrics.
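
As a minimal sketch of the metrics requested above, the snippet below computes sensitivity, specificity, and balanced accuracy from binary predictions. It assumes pMCI is encoded as the positive class (label 1) and sMCI as 0; that encoding, the function name, and the toy labels are illustrative assumptions rather than details from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return sensitivity, specificity and balanced accuracy for a binary task."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    # Sensitivity: fraction of true pMCI subjects that are detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of true sMCI subjects that are correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Toy labels (made up): 4 pMCI subjects and 6 sMCI subjects.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
preds = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
print(binary_metrics(truth, preds))
```

Balanced accuracy is the mean of the two per-class rates, so reporting sensitivity alongside it shows whether a given score is driven by the pMCI class or by the sMCI class.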

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    1. The overall framework design is appropriate. Introducing transformer into AD-related medical image classification task is an interesting topic.

    2. The experiments are reasonably sufficient overall, although some details are missing.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    Proposed to use prompt tuning for improving the performance of a multimodal Transformer on imaging (i.e., MRI) and tabular data (i.e., clinical test results).

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Improves performance over other multimodal approaches for the task of differentiating stable MCI (sMCI) and progressive MCI (pMCI) subjects.

    Using the prompt to modulate the final visual representation is interesting. The authors should further explore how the information encoded in the prompts differs (i.e., global prompt vs. vanilla prompt).

    Extensive experiments and ablations are carried out.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    It is unclear why we need prompt tuning since we are not dealing with large models. The authors motivate the approach by stating the need to only add a small number of learnable parameters. In this case, how does this approach compare to other parameter-efficient fine-tuning techniques such as Adapters or LoRA?

    The comparisons with prior works could also be fairer. Given that the MCI dataset is quite small, fine-tuning all parameters (i.e., 12M to 70M parameters) could be problematic. In contrast, the authors’ proposed approach fine-tunes significantly fewer parameters (i.e., 0.59M parameters), which could be a reason for the better performance. The authors should freeze the weights of the prior works when fine-tuning. In addition, one important experiment the authors are missing is VA-Former (frozen), which is crucial in verifying the necessity of prompt tuning.

    Prompt tuning does not seem to work well for tabular data. In Table 1, Tabformer achieves a balanced accuracy of 78.05. In Table 2, Tabformer with prompts (i.e., TabPrompt) performs worse, with a balanced accuracy of 77.39. Can the authors elaborate on why we even need tabular prompts in the first place? Wouldn’t it be better to just use a frozen Tabformer in the final architecture?

    Although the authors position this work as “Multi-modal Prompt Learning”, separate prompts are fed to the Visual Transformer and Tabular Transformer. Therefore, the visual and tabular prompts do not interact and are unable to aid in learning multimodal features.
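
To make the scale of the fairness concern above concrete, the short sketch below computes how the 0.59M tuned parameters compare with the 12M to 70M parameters that are fully fine-tuned in the prior works, using only the figures quoted in this review. The function name is hypothetical, and the calculation says nothing about the size of the frozen VAP-Former backbone, which the review does not state.

```python
def tuned_ratio(prompt_params_m, baseline_params_m):
    """Ratio of parameters tuned by prompt tuning to those tuned by a baseline.

    Both arguments are parameter counts in millions.
    """
    if prompt_params_m <= 0 or baseline_params_m <= 0:
        raise ValueError("parameter counts must be positive")
    return prompt_params_m / baseline_params_m


PROMPT_PARAMS_M = 0.59  # tuned parameters of the proposed method, per the review
for baseline_m in (12.0, 70.0):  # range of fully fine-tuned prior works, per the review
    ratio = tuned_ratio(PROMPT_PARAMS_M, baseline_m)
    print(f"{baseline_m:.0f}M baseline: prompt tuning updates {ratio:.2%} as many parameters")
```

A gap of this size is why the reviewer asks for frozen-backbone baselines: with so few trainable parameters, reduced overfitting on a small dataset is a plausible alternative explanation for the gain.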

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Should not be difficult to reproduce if code is released.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Please refer to weaknesses.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The experiments conducted are comprehensive and show empirical improvement over prior works. However, the paper is lacking in methodological novelty, and several design decisions seem to be made in an arbitrary manner. The comparisons with prior works could also be fairer. Given that the MCI dataset is quite small, fine-tuning all parameters (i.e., 12M to 70M parameters) could be problematic. In contrast, the authors’ proposed approach fine-tunes significantly fewer parameters (i.e., 0.59M parameters), which could be a reason for the better performance.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #1

  • Please describe the contribution of the paper

    This paper proposes VAP-Former, a transformer-based network to exploit the heterogeneous features from imaging and tabular data simultaneously for progressive MCI (pMCI) prediction. In the proposed VAP-Former, a Prompt fine-Tuning (PT) scheme is proposed to transfer the knowledge from Alzheimer’s disease (AD) prediction task for pMCI diagnosis. Extensive experiments on the public ADNI dataset show the superiority of the proposed method in pMCI prediction.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. This paper aims to address an interesting problem, i.e., simultaneously utilizing MRI data and tabular data to improve the performance of pMCI diagnosis.
    2. The writing and organization of this paper are relatively good.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Experimental results are not convincing enough. Of all the competing methods, only one method [18] focuses on AD and pMCI diagnosis, which is not enough to reflect the current research status of diagnosis. A comparison with the SOTA methods of AD and pMCI diagnosis is highly recommended.

    2. In Table 2, it can be observed that compared with the VA-Former, solely utilizing either visual prompts (VisPrompt) or tabular prompts (TabPrompt) leads to a decrease in overall performance. This finding doesn’t seem very reasonable and the explanation of this finding is missing.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The reproducibility of this paper is good, as the authors provided a clear description of the proposed methods and used a publicly available dataset. Code will be made available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. A comparison with the SOTA methods of AD and pMCI diagnosis is highly recommended.
    2. The explanation of the finding that solely utilizing either visual prompts (VisPrompt) or tabular prompts (TabPrompt) leads to a decrease in overall performance should be provided.
    3. The rationale behind adding tabular prompts to each tabular transformer layer should be further demonstrated.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Although the performance evaluation is imperfect, this is an interesting paper that addresses an important problem.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The paper proposes a transformer-based network, VAP-Former, to simultaneously utilize imaging and tabular data for progressive MCI (pMCI) prediction. The paper introduces a Prompt fine-Tuning (PT) scheme to transfer knowledge from an Alzheimer’s disease (AD) prediction task for pMCI diagnosis. The paper received four reviews, all of which recommend accepting the paper.

    This paper has several key strengths. Firstly, it introduces a novel approach that combines imaging and tabular data for progressive MCI (pMCI) prediction. The experimental evaluation is extensive and demonstrates the superiority of the proposed method over existing methods. The Prompt fine-Tuning scheme and the use of a global prompt token are novel contributions that enhance the learning process and lead to improved performance.

    However, there are also some weaknesses that were identified by the reviewers. The paper lacks sufficient comparisons with state-of-the-art methods for Alzheimer’s Disease (AD) and pMCI diagnosis, which could provide a better understanding of the performance of the proposed method. Furthermore, the decrease in overall performance when solely utilizing visual prompts or tabular prompts needs to be explained more clearly. Additional details regarding the MR image data, such as the number of subjects and preprocessing procedures, as well as the encoding of clinical data, should be provided. Lastly, important metrics such as sensitivity should be reported to provide a more comprehensive evaluation. Overall, considering the strengths, positive evaluations, and the potential for addressing the weaknesses, the paper is recommended for acceptance.




Author Feedback

N/A


