
Authors

Jianfeng Zhao, Shuo Li

Abstract

Medical image segmentation with denoising diffusion probabilistic models (DDPM) remains challenging because such models lack the ability to parse the reliability of multi-modality medical images. In this paper, we propose a novel evidence-identified DDPM (EI-DDPM) with contextual discounting for tumor segmentation by integrating multi-modality medical images. Advancing beyond previous work, the EI-DDPM deploys the DDPM-based framework for segmentation tasks under the condition of multi-modality medical images and parses the reliability of multi-modality medical images through contextual discounted evidence theory. We apply EI-DDPM to the BraTS 2021 dataset with 1251 subjects and a liver MRI dataset with 238 subjects. Extensive experiments demonstrate the superiority of EI-DDPM, which outperforms the state-of-the-art methods.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43901-8_65

SharedIt: https://rdcu.be/dnwEi

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    A multi-modal medical image segmentation approach (EI-DDPM) is proposed in this submission, in which a denoising diffusion probabilistic model (DDPM) is equipped with evidence learning. As a result, the proposed approach is able to perform segmentation and predict the reliability coefficients for each modality used. Ablation studies are provided to validate the major components of the proposed EI-DDPM.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Multi-modal medical image segmentation inherently involves assessing the reliability of each modality. The idea proposed in this paper, which leverages evidence theory (DST) and integrates it into DDPM, is interesting.

    • The proposed approach is able to produce both segmentation and reliability estimation for each modality, as shown in Tables 3 and 4.

    • The ablation studies (Table 2) validate two major components in the proposed approach on two datasets (BraTS 2021 and Liver MRI).

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • As a diffusion model-based image segmentation approach, the quantitative comparison should include the recent diffusion model-based image segmentation methods. However, there is only one diffusion model-based method [20] included in the comparison (Table 1), which makes it difficult to assess the effectiveness of the proposed approach.

    • Section 2.2 states that the hypothesis of the proposed approach is to regard multi-modality images as independent and different sources of knowledge. However, there is no further discussion or clarification on whether the hypothesis holds.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The lack of code or pre-trained models as supplemental material is a concern for the reproducibility of the results. Based on the reproducibility response, the authors agree to release the related code if this work is accepted.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Given the fast development in the area of diffusion models for medical image segmentation, it is important for the authors to be aware of the latest research and acknowledge the recent works in this area. While it may not be feasible to include all recent papers in the comparison, more diffusion model-based image segmentation methods should be included to validate the effectiveness of the proposed approach. The current quantitative comparison (Table 1) only includes one diffusion model-based method [20], which raises concerns about the authors’ familiarity with the latest developments in this area. Therefore, it is recommended to include more related works in the quantitative evaluation, such as [a-d], to provide a more convincing comparison of the proposed approach.

    The hypothesis regarding multi-modality images as independent and different sources of knowledge serves as the foundation for the proposed approach, and it is important to evaluate its validity. Without further discussion or clarification on the hypothesis, it is difficult to assess the validity of the proposed approach and the reported results. Thus, it is expected to provide a detailed discussion on the hypothesis (the evidence supporting it).

    Other comments:

    • A reference is missing in page 2: … have shown remarkable performance [9,?].
    • One space before the beginning of this sentence is missing (page 2): … knowledge [10].Some …

    [a] Wu, Junde, et al. “MedSegDiff: Medical Image Segmentation with Diffusion Probabilistic Model.” arXiv preprint arXiv:2211.00611 (2022).
    [b] Rahman, Aimon, et al. “Ambiguous Medical Image Segmentation using Diffusion Models.” CVPR 2023.
    [c] Wu, Junde, et al. “MedSegDiff-V2: Diffusion based Medical Image Segmentation with Transformer.” arXiv preprint arXiv:2301.11798 (2023).
    [d] Xing, Zhaohu, et al. “Diff-UNet: A Diffusion Embedded Network for Volumetric Segmentation.” arXiv preprint arXiv:2303.10326 (2023).

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The following are the major factors that led to the current score.

    • The idea proposed in this submission is interesting for multi-modal medical image segmentation.
    • Few diffusion model-based image segmentation methods are included in the quantitative comparisons.
    • The hypothesis regarding multi-modality images as independent and different sources is not validated.
  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    This paper proposes a novel EI-DDPM for multi-modal image segmentation, which combines DDPM and contextual discounting. This allows the network to evaluate the reliability of different modalities when fusing the outputs. Experiments on the BraTS 2021 dataset demonstrate the effectiveness of the proposed method.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The paper is well-organized and the idea is clearly introduced.
    2. The experimental results are promising.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The proposed method combines several known techniques, i.e., DDPM and contextual discounting, and therefore lacks technical impact to some extent. Besides, the method only considers uncertainty fusion at the output level, which may not be sufficient. It may be better to use a latent DDPM to consider uncertainty at the feature level, which would also significantly reduce memory usage.
    2. The proposed method employs four DDPM models in total, but it does not provide any strategy for accelerating image sampling. I am concerned about the training duration and sampling speed of the whole network.
    3. The manuscript requires careful proofreading. There are several grammar errors and multiple ‘?’ throughout the main body.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The idea is clearly introduced, but no code is provided.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    See weaknesses above.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    See strengths and weaknesses above.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper utilizes denoising diffusion probabilistic models (DDPM) for tumor segmentation by integrating multi-modality medical images. Advancing beyond previous DDPM work, the authors propose a novel evidence-identified DDPM (EI-DDPM), which parses the reliability of multi-modality medical images through contextual discounted evidence theory. Experimental results also demonstrate the superiority of EI-DDPM.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper is of certain novelty and easy to follow.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Please refer to the comments.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Will be reproducible if the code is released.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Why not consider using a simpler uncertainty-learning based method instead of DDPM to measure the confidence of different modalities?

    1. Evaluating segmentation performance solely based on the Dice score is insufficient; it should be complemented by distance metrics such as HD95 or ASSD.
    2. The nnUNet baseline should be included in the comparison. Since DDPM involves multiple iterations, please also compare the model complexities of the different methods.
    3. The paper still has some formatting errors, such as incorrect referencing.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The motivation is clear and the designed method targets the shortcomings described. DDPM is quite popular these days and it’s nice to see it work in different fields.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    All reviewers recognize the merits of the proposed method. More details (e.g., computational complexity and training time) are needed for description clarity. More importantly, the method needs to be evaluated more thoroughly (e.g., by using complementary metrics). The method was compared with limited DDPM-based segmentation approaches.




Author Feedback

We sincerely thank all ACs and reviewers for their recognition and valuable suggestions, and especially the ACs for acknowledging that “All reviewers recognize the merits of the proposed method”, such as the sound idea (“idea is interesting -R1”, “idea is clearly introduced -R2”, “motivation is clear -R3”); novel method (“propose a novel EI-DDPM -R2”, “the paper is of certain novelty -R3”); good organization (“the paper is well-organized -R2”, “the paper is easy to follow -R3”); and promising experiments (“ablation studies validate two major components -R1”, “experimental results are promising -R2”).

Q1: Model complexities of the different methods (R3)
A1: Our EI-DDPM is less complex than TransU-Net and comparable to U-Net and SegDDPM. Params and FLOPs are used to evaluate model complexity. The values of Params(M)/FLOPs(G) for the proposed EI-DDPM and the compared methods are as follows: EI-DDPM (32.3/17.9), U-Net (31.1/16.6), nnUNet (19.07/17.01), TransU-Net (96.1/48.3), and SegDDPM (33.8/18.4).

Q2: The training and sampling time of the proposed method (R2)
A2: Training time: ~15 days on BraTS2021 and ~6 days on liver MRI. Sampling time: ~13 seconds per image. Training and inference are relatively slow due to the inherent nature of the diffusion process. However, compared to other DDPM-based segmentation methods such as SegDDPM (T = 1000 steps), our EI-DDPM (T = 300 steps) achieves better performance with fewer steps and is about 3 times faster. The sampling time of EI-DDPM is also discussed in the conclusion.
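
As a rough reading of the figures in A2, the following worked arithmetic relates the step counts to the quoted speed-up and sampling time. It assumes (this is not stated in the rebuttal) that sampling time scales linearly with the number of diffusion steps and that the per-step cost of the two models is similar, which the FLOPs in A1 loosely suggest.

```latex
\[
\frac{T_{\text{SegDDPM}}}{T_{\text{EI-DDPM}}} = \frac{1000}{300} \approx 3.3,
\qquad
\frac{13\ \text{s per image}}{300\ \text{steps}} \approx 43\ \text{ms per step}
\]
```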

Q3: Recommended evaluation metric HD95 (R3)
A3: The quantitative results for the recommended HD95 (mm) indicate that our EI-DDPM outperforms the other compared methods. On BraTS2021, the average HD95 (mm) for U-Net, TransU-Net, SegDDPM, and our EI-DDPM is 8.85, 7.83, 7.41, and 5.87, respectively. On liver MRI, the average HD95 (mm) for U-Net, TransU-Net, SegDDPM, and our EI-DDPM is 5.42, 4.96, 3.88, and 2.70, respectively.
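
For readers less familiar with the two metrics discussed in the reviews and in this rebuttal, here is a minimal NumPy sketch of Dice and HD95 on 2D binary masks. It is an illustration only, not the authors' evaluation code: the function names, the 4-neighbour boundary definition, and the choice of taking the larger of the two directed 95th percentiles are assumptions, and published toolkits differ in these details.

```python
import numpy as np

def dice(pred, gt):
    """Dice coefficient of two binary masks (1.0 if both are empty)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    denom = pred.sum() + gt.sum()
    if denom == 0:
        return 1.0
    return float(2.0 * np.logical_and(pred, gt).sum() / denom)

def boundary_points(mask):
    """Coordinates of foreground pixels with at least one background 4-neighbour."""
    mask = mask.astype(bool)
    p = np.pad(mask, 1, constant_values=False)
    interior = (p[1:-1, 1:-1] & p[:-2, 1:-1] & p[2:, 1:-1]
                & p[1:-1, :-2] & p[1:-1, 2:])
    return np.argwhere(mask & ~interior)

def hd95(pred, gt, spacing=(1.0, 1.0)):
    """95th-percentile Hausdorff distance between mask boundaries.

    Assumed convention: max of the two directed 95th percentiles.
    Returns NaN if either mask is empty.
    """
    a = boundary_points(pred) * np.asarray(spacing)
    b = boundary_points(gt) * np.asarray(spacing)
    if len(a) == 0 or len(b) == 0:
        return float("nan")
    # Brute-force pairwise distances; fine for small 2D masks.
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return float(max(np.percentile(d.min(axis=1), 95),
                     np.percentile(d.min(axis=0), 95)))

# Usage: a 4x4 square and a copy shifted two pixels to the right.
gt = np.zeros((10, 10), dtype=bool)
gt[2:6, 2:6] = True
pred = np.zeros((10, 10), dtype=bool)
pred[2:6, 4:8] = True
print(dice(pred, gt), hd95(pred, gt))
```

The `spacing` argument converts pixel offsets to physical units, which is what makes a millimetre figure such as those in A3 meaningful.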

Q4: Recommended comparison: nnUNet (R3) and related works [a-d] (R1)
A4: The recommended nnUNet achieves an average Dice(%)/HD95(mm) of 87.03/8.44 on BraTS2021 and 88.14/5.07 on liver MRI. Our EI-DDPM outperforms nnUNet by 3.14%/2.57 mm on BraTS2021 and by 2.33%/2.37 mm on liver MRI with respect to Dice/HD95. Compared to the recommended DDPM-based works [a-d], our EI-DDPM is the first work to perform segmentation and predict the reliability coefficients for each modality. Due to the training time required by MedSegDiff [a] and MedSegDiff-V2 [c], we can only compare against the brain tumor segmentation results reported in their papers on the same BraTS2021 dataset. The Dice score of our EI-DDPM is 4.62% higher than MedSegDiff and 2.7% higher than MedSegDiff-V2. For [b] and [d], our dataset does not conform to the setting of work [b], which contains 4 expert segmentation labels, and Diff-UNet [d] focuses on 3D segmentation.
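
The margins over nnUNet in A4 can be cross-checked against the EI-DDPM HD95 values in A3. The short sketch below does only that arithmetic. The dictionary and function names are hypothetical, and the EI-DDPM Dice values it prints are implied by the quoted margins rather than stated anywhere in this feedback.

```python
# Numbers copied from A3 and A4 of the rebuttal.
NNUNET = {"BraTS2021": (87.03, 8.44), "liver MRI": (88.14, 5.07)}  # Dice %, HD95 mm
MARGIN = {"BraTS2021": (3.14, 2.57), "liver MRI": (2.33, 2.37)}    # Dice gain, HD95 reduction
EI_HD95_A3 = {"BraTS2021": 5.87, "liver MRI": 2.70}                # HD95 mm from A3

def implied_ei_ddpm(dataset):
    """EI-DDPM Dice/HD95 implied by the nnUNet scores and the quoted margins."""
    dice, hd95 = NNUNET[dataset]
    d_gain, h_drop = MARGIN[dataset]
    return round(dice + d_gain, 2), round(hd95 - h_drop, 2)

for name in NNUNET:
    dice, hd95 = implied_ei_ddpm(name)
    consistent = abs(hd95 - EI_HD95_A3[name]) < 1e-9
    print(f"{name}: implied Dice {dice:.2f}%, implied HD95 {hd95:.2f} mm, "
          f"matches A3: {consistent}")
```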

Q5: Discussion of the hypothesis regarding multi-modality images as independent and different sources of knowledge for DST (R1)
A5: The literature [8,10] cited in the manuscript supports our hypothesis. Works [8,10] indicate that the prerequisite for treating sources as independent and different in DST is that there is no interaction between the different modalities. That is why our EI-DDPM uses four independent DDPM paths, each learning the segmentation features from a single modality image separately.
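
To make the role of the independence assumption concrete, the sketch below shows generic Dempster-Shafer fusion of two sources: each source's mass function is first discounted by a reliability coefficient and the results are then combined with Dempster's rule, which presumes independent sources. This is a textbook illustration using classical (Shafer) discounting with a single coefficient per source. It is not the contextual discounting used in EI-DDPM, and the two-class frame, function names, and numbers are all hypothetical.

```python
from itertools import product

def discount(m, frame, reliability):
    """Classical discounting: keep `reliability` of each mass, move the rest to the frame."""
    out = {A: reliability * v for A, v in m.items() if A != frame}
    out[frame] = reliability * m.get(frame, 0.0) + (1.0 - reliability)
    return out

def dempster(m1, m2):
    """Dempster's rule of combination for two mass functions over frozensets."""
    combined, conflict = {}, 0.0
    for (A, a), (B, b) in product(m1.items(), m2.items()):
        C = A & B
        if C:
            combined[C] = combined.get(C, 0.0) + a * b
        else:
            conflict += a * b  # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {C: v / (1.0 - conflict) for C, v in combined.items()}

# Hypothetical two-class frame and two single-modality sources.
TUMOR, BACKGROUND = frozenset({"tumor"}), frozenset({"background"})
FRAME = TUMOR | BACKGROUND
m_a = {TUMOR: 0.8, FRAME: 0.2}
m_b = {BACKGROUND: 0.6, FRAME: 0.4}

fused = dempster(m_a, m_b)
fused_discounted = dempster(m_a, discount(m_b, FRAME, 0.5))
print(fused[TUMOR], fused_discounted[TUMOR])
```

Lowering the reliability of the second source shifts its mass toward the whole frame (ignorance), so it contradicts the first source less and the fused belief in "tumor" rises.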

Q6: It may be better to consider uncertainty fusion at the feature level. (R2)
A6: We did consider this. However, it is unreasonable for reliability learning because it violates the principle of independence explained in A5. The main idea of this work is to leverage evidence theory for reliability learning. Thus, we consider uncertainty fusion at the output level rather than the feature level.

Thank you again for your careful reading and valuable comments. All answers are based on the original manuscript and are consistent with the conclusion.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    We thank the authors for their efforts, which properly addressed the major concerns regarding comparisons and quantitative evaluations. Overall, this is a good paper with clear motivation and technical novelty.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper proposes a segmentation method that integrates multimodal images with an evidence-identified denoising diffusion model. The reviewers raised concerns about the limited evaluation, the lack of comparison with recent diffusion models, the absence of a baseline using a simpler uncertainty-based method, statistical significance, the chosen evaluation metrics, and readability issues. The rebuttal provided additional results, which were disregarded in the decision since this is against the MICCAI guidelines and unfair to the other submissions that were completed before the deadline. The work as is would need a stronger validation and comparison with state-of-the-art methods in diffusion models and uncertainty-based methods. For all these reasons, and situating this work with respect to the other submissions, the recommendation is towards Rejection.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    All major concerns are successfully addressed, such as computational cost as well as technical impact of the new diffusion model.


