Authors

Yiming Qian, Liangzhi Li, Huazhu Fu, Meng Wang, Qingsheng Peng, Yih Chung Tham, Chingyu Cheng, Yong Liu, Rick Siow Mong Goh, Xinxing Xu

Abstract

Visual explanations have the potential to improve our understanding of deep learning models and their decision-making process, which is critical for building transparent, reliable, and trustworthy AI systems. However, existing visualization methods have limitations, including their reliance on categorical labels to identify regions of interest, which may be inaccessible during model deployment and lead to incorrect diagnoses if an incorrect label is provided. To address this issue, we propose a novel category-independent visual explanation method called Hessian-CIAM. Our algorithm uses the Hessian matrix, which is the second-order derivative of the activation function, to weigh the activation weight in the last convolutional layer and generate a region of interest heatmap at inference time. We then apply an SVD-based post-process to create a smoothed version of the heatmap. By doing so, our algorithm eliminates the need for categorical labels and modifications to the deep learning model. To evaluate the effectiveness of our proposed method, we compared it to seven state-of-the-art algorithms using the Chestx-ray8 dataset. Our approach achieved a 55% higher IoU measurement than classical GradCAM and a 17% higher IoU measurement than EigenCAM. Moreover, our algorithm obtained a Judd AUC score of 0.70 on the glaucoma retinal image database, demonstrating its potential applicability in various medical applications. In summary, our category-independent visual explanation method, Hessian-CIAM, can generate high-quality region of interest heatmaps that are not dependent on categorical labels, making it a promising tool for improving our understanding of deep learning models and their decision-making process, particularly in medical applications.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43895-0_17

SharedIt: https://rdcu.be/dnwxZ

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This work focuses on a limitation of many visualization algorithms: they require categorical labels to generate visual explanations. This requirement limits the usage of such algorithms to the training stage, where the ground truth category label is available. The authors address the challenge by proposing a novel category-independent visual explanation method. The proposed methodology is called Hessian-Category Independent Activation Maps (Hessian-CIAM), which utilizes the Hessian matrix as an activation weighting function to eliminate the need for categorical labels to compute the ROI heatmap. A polarity checking process is then added to the post-processing, which corrects the polarity error from the SVD-based smoothing function. The method is compared against seven state-of-the-art algorithms on the Chestx-ray8 dataset, with results that demonstrate its superior performance. Additionally, the work demonstrates a clinical use case in glaucoma detection from retinal images.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The main strength of this work lies in the fact that the proposed methodology, called Hessian-Category Independent Activation Maps (Hessian-CIAM), utilizes the Hessian matrix as an activation weighting function and is thus able to eliminate the need for categorical labels to compute the ROI heatmap. The results obtained show that the proposed method outperforms the compared methods. Additionally, the work demonstrates a clinical use case in glaucoma detection from retinal images to stress the flexibility of the proposed algorithm.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The main weakness of the proposed work lies in the fact that the flexibility of the method is not illustrated in the main article.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    In my view, the experiments performed in this paper are reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    This is a well-written, readable, and scientifically sound paper that proposes a novel methodology. Please take into account the following minor comments:

    • SVG is not defined
    • Revise the paper for typos such as “an” instead of “a”, etc.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    7

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This work focuses on a limitation of many visualization algorithms: they require categorical labels to generate visual explanations. This requirement limits the usage of such algorithms to the training stage, where the ground truth category label is available. The authors address the challenge by proposing a novel category-independent visual explanation method. The proposed methodology is called Hessian-Category Independent Activation Maps (Hessian-CIAM), which utilizes the Hessian matrix as an activation weighting function to eliminate the need for categorical labels to compute the ROI heatmap. A polarity checking process is then added to the post-processing, which corrects the polarity error from the SVD-based smoothing function. The method is compared against seven state-of-the-art algorithms on the Chestx-ray8 dataset, with results that demonstrate its superior performance. Additionally, the work demonstrates a clinical use case in glaucoma detection from retinal images.

  • Reviewer confidence

    Somewhat confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The paper proposes to use the Hessian matrix as a means to derive CAM-like heatmaps without requiring the selection of the predicted class logits. The method seems sound and outperforms other heatmapping approaches.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The method could be relevant for applications where the ground truth is uncertain, for example for some lesion types where expert disagreement is high. The formulation is novel and the results are strong and provided on multiple datasets.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The paper organisation needs improvement; in particular, the rationale for each step in the methods is not clearly stated. Standard performance metrics of the models are neglected, which is a major issue.

  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The hyperparameters are specified, but there is no mention of a public repository anywhere. The model for the eye fundus data is not described.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    The paper organisation should be improved. I found the methods difficult to follow, as little space is given to justifying each step adopted by the authors. Why is SVD necessary, for example? The related work takes up considerable space, leaving little room to explain the rationale for some parts of the methods. Similarly, the eye fundus dataset and models should be described in Sec. 4.1.

    Even if the focus of the paper is on explainability, the models should be always evaluated on a testing set first. There is no point in explaining a model that was not evaluated before. I would have liked to see the performance of the model on the eye fundus data.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I think the paper organisation is my main concern. The authors should have stated more clearly their approach and the model performances and conventional evaluations before moving to the explainability methods.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    Thanks to the authors for the feedback. My opinion on the paper has not changed, though. I think the authors could have summarised the related work to add an ablation study for SVD. Similarly, the base performance of their model was not stated in the rebuttal. Even if only for demonstration purposes, it is important to determine if the model is worth interpreting, otherwise all the visualisations have little meaning. Moreover, if the final goal is to provide heatmaps to clinicians, a user-based evaluation with clinicians should have been included in the paper.



Review #3

  • Please describe the contribution of the paper

    The main contribution of the paper is a novel way to generate the region of interest (ROI) heatmap based on the Hessian matrix. Compared with other existing techniques of generating ROI heatmaps, the main advantage of this approach is that it doesn’t rely on the categorical labels of the training data. An SVD-based post-process approach is further proposed to smooth the heatmap.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1. The novelty lies in generating the Hessian matrix-based ROI heatmap, which gets rid of the categorical label information.
    2. The paper is well organized and easy to follow.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1. My major concern is that the evaluation of the proposed approach doesn’t fully reflect the visual explainability that the authors claim. While the authors state at the beginning of the paper that the ROI heatmap can be used as a technique to enhance a model’s explainability, the main evaluation in the experiment section focuses on how well the ROI heatmaps generated by the proposed approach highlight the ground truth area of the diseases compared with other existing techniques. What is missing is how this observation helps explain the model performance visually. Also, there is no description of how the model performance affects the quality of the heatmap.

    2. Another concern is the insufficient motivation for utilizing the Hessian matrix. What are the mathematical justifications/explanations that support the effectiveness of the Hessian matrix-based ROI heatmap in highlighting the ground truth disease and improving the model’s explainability?

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Even though the code is not provided, I consider this paper reproducible, as the model architecture, datasets, parameters, and evaluation metrics are clearly listed.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    1. First of all, I would suggest more assessments and justifications on how the ROI heatmaps generated by the proposed approach better help to explain the model performance and the decision-making process. For example, demonstrating how the ROI heatmaps vary for models with different levels of accuracy, or illustrating how the heatmaps correlate with the ground truth for accurate predictions and do not correlate for inaccurate predictions. In addition, more justifications can be added to describe how such observations are unique in comparison to the results generated from other existing techniques.

    2. It would be great to add further explanation regarding the motivation for using the Hessian matrix to generate the ROI heatmap and clarifying why such an approach is anticipated to be effective.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    3

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The main reason for the decision is that the evaluation doesn’t reflect the contributions claimed by the authors.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    4

  • [Post rebuttal] Please justify your decision

    First, I agree that the authors clearly address the motivation for using the Hessian matrix in the rebuttal. However, there is still not enough justification of how the proposed approach contributes to the visual explanation of DNNs. Although the authors have demonstrated that the proposed visualization outperforms the existing approaches on both wrong and correct predictions, the fundamental question remains: how can this visualization be effectively used to aid in model explanation? Regrettably, I feel this gap has not been addressed sufficiently, and therefore I cannot accept the paper for presentation at the conference.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper introduces a novel method, Hessian-category independent activation maps, for ROI heatmap generation. The authors show the effectiveness of the proposed method on two clinical datasets. However, some reviewers have concerns with the experiments. There are gaps between the main claims and the evaluation. The proposed method is good for generating category-independent ROIs, but it is not an explanation of the model prediction. The current claims and writing are confusing due to the gaps between the method/results and the claims. The motivation is also weak. It would be great to address these issues during the rebuttal.




Author Feedback

We thank the reviewers for their helpful comments. We have already updated the paper to discuss the points below, including the new suggested analyses.

R1-3: Improve Writing. Thanks for your comments on improving the writing and organization of the paper. We have updated our manuscript to address the flexibility of our method and the rationale for each step in the method. We have added a description of the dataset and model for the fundus images to the manuscript. We have included all the hyperparameters in the updated version.

R2-3: Standard performance metrics and evaluation. Our method is a training-free algorithm that can be used as a ready-to-use plugin on any pre-trained CNN-based model to generate an ROI heatmap. The purpose of this ROI heatmap is to show the region that the model looks at during the prediction process. It is designed as a tool that helps doctors gain confidence in AI-assisted products. For example, when doctors see that a model makes a correct prediction and at the same time highlights the right ROI, the model gains more of their trust. Our objective is to create a tool that can be used in product deployment, where no categorical label is available; this led to our current algorithm design.

State-of-the-art methods in this field use the performance of weakly supervised localization algorithms as a benchmark, which is not the goal of our method. Our main focus is providing high-quality model visualization at deployment, when no categorical label is available. We therefore designed a deployment-specific experiment to evaluate our algorithm as well as state-of-the-art approaches in terms of IoU with ground-truth bounding boxes.
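As an illustration of the IoU measure mentioned above, the following is a minimal sketch rather than the authors' evaluation code. It assumes the heatmap is converted to a box by thresholding at a fraction of its maximum; the function names `heatmap_to_box` and `box_iou`, the 0.5 threshold, and the `(x_min, y_min, x_max, y_max)` box convention are all hypothetical choices made for this example.

```python
import numpy as np


def heatmap_to_box(heatmap, threshold=0.5):
    """Tight box (x_min, y_min, x_max, y_max) around pixels >= threshold * max.

    The thresholding rule is an assumption for illustration only.
    """
    mask = heatmap >= threshold * heatmap.max()
    ys, xs = np.nonzero(mask)
    # x_max and y_max are exclusive so that width = x_max - x_min.
    return int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1


def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Toy example: a 4x4 hot region compared with a partially overlapping box.
toy_heatmap = np.zeros((8, 8))
toy_heatmap[2:6, 2:6] = 1.0
predicted_box = heatmap_to_box(toy_heatmap)
ground_truth_box = (4, 4, 8, 8)
iou = box_iou(predicted_box, ground_truth_box)
```

Other conventions, such as computing IoU directly on the thresholded mask, are equally plausible; the source does not say which one was used.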

R2: Impact of SVD. Originally, we had a comparison of our approach with and without SVD-based post-processing. Due to the page limit, we removed it from the paper. In our paper, the principal components from SVD are used to remove noise from the ROI heatmap. This idea originated from [17], and it is included as a standard setting in the official GradCAM GitHub implementation.
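The SVD-based post-processing described above can be sketched roughly as follows. This is not the authors' implementation: it assumes the weighted activations of the last convolutional layer are available as a `(C, H, W)` NumPy array and projects them onto the first principal component. The polarity heuristic (flipping the sign so the map agrees with the channel-summed activations) is an assumption introduced for this example, since the sign of a singular vector is arbitrary; the paper's actual polarity check may differ. The function name `svd_smooth` is hypothetical.

```python
import numpy as np


def svd_smooth(weighted_activations):
    """Project (C, H, W) activations onto their first principal component.

    Returns an (H, W) heatmap scaled to [0, 1].
    """
    c, h, w = weighted_activations.shape
    # One row per spatial location, one column per channel.
    flat = weighted_activations.reshape(c, h * w).T
    flat = flat - flat.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    heat = (flat @ vt[0]).reshape(h, w)

    # Assumed polarity heuristic: the sign of vt[0] is arbitrary, so flip the
    # map if it is anti-correlated with the channel-summed activations.
    reference = weighted_activations.sum(axis=0)
    if np.sum(heat * (reference - reference.mean())) < 0:
        heat = -heat

    # Keep positive evidence only and normalise to [0, 1].
    heat = np.maximum(heat, 0.0)
    if heat.max() > 0:
        heat = heat / heat.max()
    return heat


# Toy example: four channels that respond to the same 2x2 region.
toy = np.zeros((4, 6, 6))
for k in range(4):
    toy[k, 1:3, 1:3] = k + 1.0
toy_heat = svd_smooth(toy)
```

Keeping only the first component discards channel-specific variation that is not shared across the layer, which is one way to read the noise-removal role that the rebuttal attributes to SVD.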

R3: Impact of model accuracy on the algorithm. Our algorithm is used as a plugin that is attached directly to a pre-trained model. In the paper, the left panel of Fig. 3 compares the performance of different ROI methods on the same pre-trained model. This figure lacked a legend; we have revised it to make it clearer. In the figure, the dark gray bars show the IoU for samples on which the pre-trained model makes wrong predictions, while the light gray bars show the IoU for samples on which it makes correct predictions. Our method consistently outperforms the existing algorithms in both the wrong and correct prediction categories.

Your suggestion is valuable; we will conduct additional experiments to provide a more detailed evaluation of how different levels of pre-trained model accuracy affect our method as well as existing techniques.

R3: Motivation of the Hessian matrix. The motivation for this approach came from GradCAM++ [3], which includes second-order derivatives in the method to produce the visualization. In their evaluation, they found that second-order derivatives provide better performance in ROI generation. The downside of their method is that it requires the first-order derivative to obtain the second-order derivative. So in a model deployment scenario, when the ground-truth label is not present, the first-order derivative may be wrong due to a wrong prediction from the pre-trained model (shown in Fig. 3), which leads to a wrong second-order derivative.

The Hessian matrix consists of the second-order partial derivatives of the activation function, so we used the method proposed in [19] as a fast way to approximate it. Based on our experiments, our algorithm delivers a more robust ROI estimation when the pre-trained model makes wrong predictions and much better performance when the model makes correct predictions.




Post-rebuttal Meta-Reviews

Meta-review #1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors have addressed concerns regarding the motivation for using the Hessian matrix. However, I agree with the concerns of Reviewer 3, namely the misinterpretation of visual explanations, which might confuse the community. In previous studies, visual explanation has been used to understand the model’s behavior or the underlying reason for a prediction. “Category-independent visual explanation” is a confusing term. How can this visualization be effectively used for model explanation if it is not sensitive to the model prediction? The proposed method looks more like a localization tool than a visual explanation of deep networks. For that reason, this paper needs to clarify the definition of visual explanation and properly position the proposed method. If this is a localization method, then comparative experiments with weakly-labeled localization methods would make the paper stronger.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The work presented in this paper on using the Hessian matrix to help visualize ROIs is very interesting. Although it may not directly explain the model, it falls into the category of saliency methods, which belong to interpretability methods. The rebuttal reduced the concerns raised in the original reviews. The paper can be a very interesting presentation at the conference.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The paper is well-written and the topic is interesting and very relevant for the MICCAI community. I agree with the reviewers that knowing the accuracy achieved by the model being interpreted is critical to be able to evaluate the capacity of the explanations. For example, if a model’s classification accuracy is random, then the visualizations shouldn’t be trusted. However, that’s beyond the scope of this work, which I think should be accepted to the conference. As the authors mentioned in their rebuttal: “Your suggestion is valuable we will conduct additional experiences to provide a more detailed evaluation on different level of pre-trained model accuracy on our method as well as existing techniques”. In my opinion, the authors should simply mention the accuracy of the pre-trained models used in the camera-ready version and, in a future extension, show how the visualization changes for models of different accuracy levels.
