
Authors

Yijun Yang, Huazhu Fu, Angelica I. Aviles-Rivero, Carola-Bibiane Schönlieb, Lei Zhu

Abstract

Diffusion Probabilistic Models have recently shown remarkable performance in generative image modeling, attracting significant attention in the computer vision community. However, while a substantial amount of diffusion-based research has focused on generative tasks, few studies have applied diffusion models to general medical image classification. In this paper, we propose the first diffusion-based model (named DiffMIC) to address general medical image classification by eliminating unexpected noise and perturbations in medical images and robustly capturing semantic representation. To achieve this goal, we devise a dual conditional guidance strategy that conditions each diffusion step with multiple granularities to improve step-wise regional attention. Furthermore, we propose learning the mutual information in each granularity by enforcing Maximum-Mean Discrepancy regularization during the diffusion forward process. We evaluate the effectiveness of our DiffMIC on three medical classification tasks with different image modalities, including placental maturity grading on ultrasound images, skin lesion classification using dermatoscopic images, and diabetic retinopathy grading using fundus images. Our experimental results demonstrate that DiffMIC outperforms state-of-the-art methods by a significant margin, indicating the universality and effectiveness of the proposed model. Our code is publicly available at https://github.com/scott-yjyang/DiffMIC.
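For readers unfamiliar with the Maximum-Mean Discrepancy term mentioned in the abstract, the following is a minimal sketch of the standard biased kernel estimate of squared MMD between two sample sets. It is not taken from the DiffMIC code: the Gaussian (RBF) kernel, the bandwidth of 1.0, and the names `rbf_kernel`, `mmd_squared`, and `sample_gaussian` are assumptions chosen for illustration, and the abstract does not specify how the paper applies the term at each granularity.

```python
import math
import random


def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel between two vectors (bandwidth is an assumed value)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD between sample sets xs and ys."""

    def mean_kernel(us, vs):
        total = sum(rbf_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


rng = random.Random(0)


def sample_gaussian(n, dim, mean):
    """Draw n vectors of length dim from N(mean, 1) per coordinate."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


# Two draws from the same distribution versus a shifted distribution.
noise_a = sample_gaussian(200, 3, 0.0)
noise_b = sample_gaussian(200, 3, 0.0)
shifted = sample_gaussian(200, 3, 2.0)

mmd_same = mmd_squared(noise_a, noise_b)
mmd_shifted = mmd_squared(noise_a, shifted)
print(f"same distribution: {mmd_same:.4f}, shifted: {mmd_shifted:.4f}")
```

The estimate stays near zero when both sample sets come from the same distribution and grows as the distributions diverge, which is the property a regularizer of this kind relies on.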

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43987-2_10

SharedIt: https://rdcu.be/dnwJs

Link to the code repository

https://github.com/scott-yjyang/DiffMIC

Link to the dataset(s)

https://challenge.isic-archive.com/data/#2018

https://www.kaggle.com/competitions/aptos2019-blindness-detection/data


Reviews

Review #5

  • Please describe the contribution of the paper

    This paper describes the use of a diffusion-based model to denoise medical images, towards improving classification performance by reducing irrelevant noise and capturing the true semantic representation. In particular, a dual conditional guidance strategy is implemented. Experiments are performed on several image modalities (placental ultrasound, skin lesion dermatoscopic, diabetic retinopathy fundus) against various other models, suggesting improved performance.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Explores inference capabilities of diffusion models
    • Encouraging performance against various other methods
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • DiffMIC framework might have been elaborated in greater detail, especially as to (the specifics and significance of) the various component input/outputs.
    • Inconsistent comparisons of the various methods across the datasets
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Code will be provided upon acceptance.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. For the DiffMIC framework, it might greatly aid understanding were the various inputs/outputs explained (possibly with representative figures), since the common understanding of a (generative) diffusion model involves the iterative addition/removal of noise in the forward and reverse process respectively. In particular, do the square inputs to the Denoising U-Net indicate (recognizable) images?

    2. Section 2.3 refers to Figure 1 (b), but there does not appear to be a (b) subfigure for Figure 1. Also, it might be more useful to provide examples of the images before and after denoising, if DiffMIC indeed undertakes denoising.

    3. For the evaluations in Table 1, the sets of methods used for PMG2000 and HAM10000/APTOS2019 are different. Is there any particular reason for this? Moreover, given that the motivating task is classification, it might be more appropriate to benchmark against classification-specific models similar to ResNet (e.g. EfficientNet, Inception, DenseNet).

    4. For the ablation results as reported in Table 2, the frameworks corresponding to C1 and C2 might be included in supplementary material as figures.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    More clarity for the framework description would be ideal, together with more complete evaluations.

  • Reviewer confidence

    Somewhat confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #1

  • Please describe the contribution of the paper

    The author innovatively designed a general medical image classification model based on the Diffusion model, and achieved SOTA experimental results on three datasets.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The author proposes a Diffusion-based model for general medical image classification, which is the first time that the Diffusion model has been applied to medical image classification tasks.
    2. In the denoising procedure, the author introduces a Dual-granularity Conditional Guidance strategy to guide the denoising process.
    3. The denoising model not only uses an L2 loss to estimate general noise, but also designs an MMD loss to estimate global and local information.
    4. The model achieves SOTA results on 3 datasets.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Some details of the model were not explained clearly, such as how to crop saliency maps and the number of saliency maps in the DCG module.
    2. The pipeline of the model is to add noise to low-dimensional labels and use a projection layer to project the low-dimensional labels onto the high-dimensional feature map for denoising, in order to match the input of the U-Net. Is it feasible to use a one-dimensional model for denoising?
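To make the "add noise to low-dimensional labels" step in point 2 concrete, here is a minimal sketch of a generic DDPM-style forward step applied to a one-hot label vector. It illustrates the general technique only and is not the paper's exact formulation: the linear beta schedule, the 1000 steps, the 7-class label, and the names `linear_beta_schedule`, `cumulative_alpha_bar`, and `diffuse_label` are all assumptions. The projection layer and the denoising U-Net are omitted.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (an assumed, commonly used schedule)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alpha_bar(betas):
    """Running product of (1 - beta_s) for s = 0..t."""
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def diffuse_label(y0, t, alpha_bars, rng):
    """Sample y_t = sqrt(alpha_bar_t) * y0 + sqrt(1 - alpha_bar_t) * eps."""
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    eps = [rng.gauss(0.0, 1.0) for _ in y0]
    y_t = [signal_scale * y + noise_scale * e for y, e in zip(y0, eps)]
    return y_t, eps


rng = random.Random(0)
alpha_bars = cumulative_alpha_bar(linear_beta_schedule(1000))

# Hypothetical 7-class one-hot label with class index 2 as ground truth.
y0 = [0.0] * 7
y0[2] = 1.0

y_t, eps = diffuse_label(y0, 500, alpha_bars, rng)
print([round(v, 3) for v in y_t])
```

In a setup like this, a denoiser would be trained to predict `eps` from `y_t` and the conditioning features. The noisy label is only a short vector, which is what motivates the reviewer's question about whether a one-dimensional denoiser would suffice.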
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Good reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. Try to use a one-dimensional model for denoising.
    2. Some hyperparameters should be stated in the paper for better understanding. For example, the number of saliency maps in DCG, the output dimension of the projection layer, etc.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    7

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The author innovatively applied the Diffusion model to medical image classification and achieved the SOTA results.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The paper describes a diffusion-based model with a dual-granularity conditional guidance (DCG) module for medical image classification tasks. The proposed DCG module is the main novelty factor of the paper. The authors report good results on three distinct classification tasks, with images from different modalities.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Strong experimental validation.
    • The reported results are interesting, particularly given the applicability to different imaging modalities, suggesting potential clinical impact.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Some implementation details should be made more clear.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors used three datasets, two of them public and with adequate citations. The other dataset is proprietary, with the authors confirming approval by the IRB. The authors also claim the code will be made available upon acceptance.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    • Section 2: the authors should provide more details on the projection of the noisy variables, as well as the UNet used for denoising.

    • Section 2.1: what is the size of the cropped ROIs? Is it static?

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    7

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The reported results are promising, especially considering the application to different modalities and classification tasks. The DCG module provides an interesting novelty factor.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #4

  • Please describe the contribution of the paper

    Dual-guidance diffusion network was studied for medical image classification, and its performance was compared with some commonly used models using three medical image datasets with different image modalities. The authors also used a Dual-granularity Conditional Guidance (DCG) strategy to guide the denoising procedure, and Condition-specific Maximum-Mean Discrepancy (MMD) regularization to learn the mutual information in the latent space for each granularity, enabling the network to model a robust feature representation shared by the whole image and patches.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Dual-guidance diffusion network was studied for medical image classification, and its performance was compared with some commonly used models using three medical image datasets with different image modalities. The authors also used a Dual-granularity Conditional Guidance (DCG) strategy to guide the denoising procedure, and Condition-specific Maximum-Mean Discrepancy (MMD) regularization to learn the mutual information in the latent space for each granularity, enabling the network to model a robust feature representation shared by the whole image and patches.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    More datasets are needed to confirm the performance of the proposed method. The method is still based on a supervised training strategy. Recent progress in self-supervised and weakly supervised strategies could be incorporated to improve the training efficiency.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The proposed method is reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    More datasets are needed to confirm the performance of the proposed method. The method is still based on a supervised training strategy. Recent progress in self-supervised and weakly supervised strategies could be incorporated to improve the training efficiency.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Dual-guidance diffusion network was studied for medical image classification, and its performance was compared with some commonly used models using three medical image datasets with different image modalities. The authors also used a Dual-granularity Conditional Guidance (DCG) strategy to guide the denoising procedure, and Condition-specific Maximum-Mean Discrepancy (MMD) regularization to learn the mutual information in the latent space for each granularity, enabling the network to model a robust feature representation shared by the whole image and patches.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    Some details of the proposed approach and its implementation should be explained more clearly. The effectiveness of the proposed method would be clearer if it were tested on more datasets and compared with self-supervised and weakly supervised methods. The performance comparison between the existing and proposed methods should be conducted under a fair setting.




Author Feedback

N/A


