
Authors

Peiang Zhao, Han Li, Ruiyang Jin, S. Kevin Zhou

Abstract

Universal Lesion Detection (ULD) in computed tomography (CT) plays an essential role in computer-aided diagnosis. Promising ULD results have been reported by anchor-based detection designs, but they have inherent drawbacks due to the use of anchors: i) Insufficient training target and ii) Difficulties in anchor design. Diffusion probability models (DPM) have demonstrated outstanding capabilities in many vision tasks. Many DPM-based approaches achieve great success in natural image object detection without using anchors. But they are still ineffective in ULD due to the insufficient training targets.
In this paper, we propose a novel ULD method, DiffULD, which utilizes DPM for lesion detection. To tackle the negative effect triggered by insufficient targets, we introduce a novel Center-aligned bounding box (BBox) padding strategy that provides additional high-quality training targets yet avoids significant performance deterioration. DiffULD is inherently advanced in locating lesions with diverse sizes and shapes since it can predict with arbitrary boxes. Experiments on the benchmark dataset DeepLesion show the superiority of DiffULD when compared to state-of-the-art ULD approaches.
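
To make the idea of center-aligned BBox padding concrete, the following is a minimal sketch, not the authors' implementation. It assumes that padding means filling a fixed-size target set with copies of ground-truth boxes that keep the same center while their width and height are randomly rescaled. The function name `center_aligned_padding`, the `(cx, cy, w, h)` box format, and the `scale_range` parameter are hypothetical and chosen only for illustration.

```python
import random
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (cx, cy, w, h); format is an assumption


def center_aligned_padding(
    gt_boxes: List[Box],
    num_targets: int,
    scale_range: float = 0.2,  # hypothetical value, not taken from the paper
    seed: Optional[int] = None,
) -> List[Box]:
    """Pad ground-truth boxes up to num_targets with center-aligned copies.

    Each padded box shares its center with a randomly chosen ground-truth
    box; only its width and height are perturbed.
    """
    if not gt_boxes:
        raise ValueError("at least one ground-truth box is required")
    rng = random.Random(seed)
    padded = list(gt_boxes[:num_targets])
    while len(padded) < num_targets:
        cx, cy, w, h = rng.choice(gt_boxes)
        scale_w = 1.0 + rng.uniform(-scale_range, scale_range)
        scale_h = 1.0 + rng.uniform(-scale_range, scale_range)
        padded.append((cx, cy, w * scale_w, h * scale_h))
    return padded


# Usage: pad a single lesion box to five training targets.
targets = center_aligned_padding([(50.0, 60.0, 20.0, 10.0)], num_targets=5, seed=0)
```

Under these assumptions every padded target overlaps the lesion it was derived from, which is one way to read the abstract's claim that the extra targets are "high-quality".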

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43904-9_10

SharedIt: https://rdcu.be/dnwGN

Link to the code repository

https://github.com/momopusheen/DiffULD

Link to the dataset(s)

https://nihcc.app.box.com/v/DeepLesion


Reviews

Review #1

  • Please describe the contribution of the paper

    This work proposes a Universal Lesion Detection (ULD) method, named DiffULD, based on deep learning and diffusion probability models (DPM). The proposed method can address the inherent drawbacks due to the use of anchors: i) Insufficient training target and ii) Difficulties in anchor design.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper is very well written, and differs from previous methods which rely on a large number of high-quality training samples. The proposed center-aligned bounding box padding strategy can provide additional high-quality training targets yet avoids significant performance deterioration. The proposed model also outperforms the state of the art.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Some problems must be resolved before the paper is considered for publication. ① In Section 2.4 “Backbone Design”, multi-window input is proposed to focus on organ-specific information. The strategy for setting window widths and window levels should be stated. Should the window size be adjusted for different scenarios? ② Comparing the results in Table 1, SATr shows better performance at lower numbers of FPs. It may be helpful to show some comparative visualization of the results to demonstrate the advantages of the proposed method. ③ The limitations of the proposed method are not discussed. ④ Another minor mistake is in Section 3.1 “Settings”: one citation is missing, “we also evaluate the performance of 3 methods based on a revised test set from []”.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Yes.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Please refer to my comments in “Weaknesses”.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed method is novel and the paper is well written. Some details of the method are not well explained. In addition, it would be better to provide some visualization results.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The contribution of this paper is the introduction of a center-aligned bounding box padding strategy to improve detection performance in Universal Lesion Detection (ULD). The method utilizes a diffusion probability model (DPM) and introduces the new padding strategy to provide additional high-quality training targets, addressing the issue of insufficient training targets in ULD. The proposed method is evaluated on the DeepLesion benchmark dataset, and the results demonstrate its effectiveness in improving detection performance.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Innovative approach of adapting the DiffusionDet method with a newly proposed bounding box padding strategy for lesion detection.
    2. Well-written manuscript that effectively communicates research findings. Thorough experimental validation of the proposed method, which further strengthens the credibility of the study.
    3. Proposed center-aligned bounding box padding strategy generates additional high-quality training targets for lesion detection, resulting in improved detection performance.
    4. Demonstration of the superiority of DiffULD over state-of-the-art ULD approaches through experiments on the benchmark dataset DeepLesion.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The paper mentions difficulties in anchor design for ULD, which is redundant as the proposed method only addresses insufficient training targets and does not propose a new anchor design or anchor-free detection framework.
    2. The novelty of the proposed method is limited as it mainly relies on adding a new bounding box padding strategy to the previous method DiffusionDet and using a different backbone.
    3. Section 3.1 lacks a reference to support the evaluation of three methods based on a revised test set.
    4. The paper needs to clarify the reason for not reporting the results of DiffULD under slice 27 and 9 in Table 1.
    5. In Table 1, the performance gain of DiffULD against SATr and DKA-ULD is not clearly evident, and the paper should include a fair comparison in terms of model size, inference time, and FLOPs to determine which model is more efficient.
    6. Additional visual results are needed to support the comparison of DiffULD with other methods listed in Table 1.
    7. The overall novelty mainly lies in the proposed bounding box padding strategy, which may be insufficient. Furthermore, the implementation of DiffULD is similar to DiffusionDet [40].
    8. The comparative experiments are not well-designed or explained, and more information is needed to determine how the proposed strategy works with other anchor-free methods.
    9. The results in Table 2 lack interpretation, and the significance of the experiment conducted in that table is unclear.
    10. The ablation study is insufficient, and other designs, such as 3D context feature fusion, need to be examined to understand their role, not just the bounding box padding strategy.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The code is not currently available. The used dataset (DeepLesion) is publicly available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    The name of the method mentioned in the paper as “DKA-ULD [42]” should be corrected to “DKMA-ULD”. It would also be helpful to cite the paper “DKMA-ULD: Domain Knowledge augmented Multi-head Attention based Robust Universal Lesion Detection” to avoid any confusion.

    There is an empty square bracket at the end of the first paragraph in section 3.1 that should be filled with the appropriate citation. This will help readers to access the reference and understand the context of the evaluation.

    The paper would benefit from a brief explanation of how specific parameters are determined, such as λ_scale, λ_conf, and the weights in the loss function. It is recommended to include this information in the main text or supplementary materials to help readers understand the decision-making process and replicate the experiments.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Overall, the technical novelty of the proposed method is not enough, and its performance improvement compared to SATr, DKA-ULD, and DiffusionDet is not significant.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    4

  • [Post rebuttal] Please justify your decision

    After reading the other reviews and rebuttal, I tend to reject this paper as there are still concerns that have not been addressed well enough for it to be accepted in its current form.



Review #3

  • Please describe the contribution of the paper

    The paper proposed a diffusion-based framework, DiffULD, for universal lesion detection. The novelty of this method is that it develops a center-aligned bounding box padding strategy for bounding box augmentation to solve the insufficient training target issue. In the experiment section, the authors comprehensively compared the proposed method with other state-of-the-art approaches and provided an ablation study.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The main advantage of this paper is that the author explored the diffusion-based model for lesion detection and provided a novel bounding box augmentation strategy. The comparison with other methods is comprehensive.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    However, there are several issues with this paper:

    (1) The paper writing and organization is not very clear. For example, it is unclear which backbone is used to predict the bounding box coordinates (for the denoising part).

    (2) Though the paper compared the proposed method with [1] in the supplementary, why not report the results of [1] in Table 1 or 2? It would be better to have a comparison with other diffusion-based models.

    (3) The performance improvement compared with other methods is insignificant (<1%). It is difficult to assess whether the performance improvement comes from the center-aligned bounding box augmentation strategy or from the normal data augmentation strategy (e.g., random horizontal flipping, rotation, and random brightness adjustment) used in the experiment.

    (4) How to choose the number of slices in Table 1 and 2 (“Slices” column)?

    There are several minor issues: (1) it is better to provide some explanation for Fig.1 in the caption; (2) can you provide more explanation or reference to the loss function formulation?; (3) missing reference at end of 1st paragraph in Section 3.1: “on a revised test set from []”; (4) limitation of the proposed method is not discussed; (5) Fig. 1 in the supplementary materials is very difficult to see, there are many overlaps between texts.

    [1] Chen, Shoufa, et al. “Diffusiondet: Diffusion model for object detection.” arXiv preprint arXiv:2211.09788 (2022).

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The dataset used in this paper is publicly available, but the code implementation is unavailable.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Please see the weakness section.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    3

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Although using a diffusion-based model to perform lesion detection is novel, it is difficult to establish that the proposed center-aligned bounding box padding is what drives the performance improvement. In addition, the performance improvement is not significant.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    The authors addressed my comments, so I update the score.



Review #4

  • Please describe the contribution of the paper

    The authors propose to use a diffusion model for object detection tasks such as universal lesion detection.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. novelty in using diffusion model for bounding box detection.
    2. coherent writing and logic.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    More experimental data is favorable.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    reproducible

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The idea is novel.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This study focuses on using diffusion probability models for Universal Lesion Detection in CT images, using the public dataset DeepLesion.

    Strengths include

    • novel and innovative use of the diffusion models to learn the box coordinates (as opposed to the more traditional image generation)
    • use of a large public dataset (~4.5k patients, 32K lesions, 10K studies).
    • use of 3 different windowing levels, to have more specific organ of interest
    • the approach works a lot better than the other anchor free methods

    Weaknesses include

    • Performance is not much better than that of existing methods (up to 0.5% overall increase in sensitivity). While better, the performance is only slightly increased (is the increase statistically significant?). What is the nature of those additionally extracted lesions? Is it worth the trouble of a new technique if the other ones were already working well enough?

    • (Minor) Organization of Table 1 renders the interpretation more difficult.

    Please address the shortcomings mentioned above, especially the moderate increase in sensitivity.




Author Feedback

Thanks for the comments. We will address all of them:

Q1. Moderate performance increase (Meta, R2)
Apologies for any misunderstanding; paper space restrictions prevented us from providing more results. The slight performance improvement primarily stems from the large training data. When using 25% of the DeepLesion training data, DiffULD surpasses SATr by a large margin (SATr/ours): 59.99/62.13 (+2.14, FP=0.5); 68.05/70.01 (+1.96, FP=1); 74.67/76.42 (+1.75, FP=2); 79.09/80.78 (+1.89, Avg.). Besides, we’d like to further underscore several notable benefits of DiffULD:

1. Improved generalization: While anchor-based detection methods (e.g. SATr) have achieved success, anchors require careful manual design, which poses a significant challenge that impedes their generalization. Anchor-free designs demonstrate strong generalization, but current anchor-free methods are still GRAPPLING WITH ISSUES OF ACCURACY. Our DiffULD also adopts an anchor-free design to avoid the performance influence of anchor design, yet exhibits COMPARABLE ACCURACY to anchor-based methods. This affirms its ability to effectively tackle a wide range of lesions in clinical applications.
2. Faster convergence & lower overfitting risk: In contrast to SATr, which requires 20 epochs to achieve optimal performance and risks significant performance degradation due to overfitting beyond 24 epochs, DiffULD converges faster. It matches the performance of SATr in merely 16 epochs and shows NO SIGNS OF PERFORMANCE DETERIORATION during prolonged training. We invite reviewers to refer to the convergence curves in the supplementary material for further detail.
3. Superior stability: Our experiments suggest that SATr’s performance is unstable, fluctuating by ~1%. Conversely, DiffULD exhibits better stability, with variations kept <0.3%.
4. Untapped potential: While the majority of prior research concentrates on anchor-based frameworks, DiffULD, as a novel paradigm, is only beginning to reveal its potential in the domain of medical object detection.

Q2. Additionally extracted lesions (Meta)
These lesions are harvested by [45] from DeepLesion [1]. They fall into two categories: mislabeled lesions from previously labeled CT slices and newly annotated lesions from unlabeled CT slices.

Q3. Novelty & extendability (R2)
While the design is lightweight, there are several notable benefits and contributions as mentioned in Q1. Additionally, our modifications, although minimal, are highly effective and can be seamlessly integrated with other anchor-free diffusion methods for medical object detection.

Q4. Backbone (R3)
The overall backbone design is identical to [8]: multi slices -> multi-window pre-processing -> ConvNeXt-T + FPN -> multi-window feature -> fusion module (A3D [16]) -> diffusion-based detector.

Q5. Selection of slice numbers (R2, R3)
The number of slices is selected based on the lesions’ structural characteristics and the BALANCE between performance and GPU cost. Most lesions in CT scans can be covered by a few adjacent slices; excessive 3D context (e.g., 3 to 9 slices) doesn’t bring a substantial performance boost (<0.5%) yet greatly increases the computational cost (e.g., 7 slices need 21 GB while 3 slices need 16 GB). Theoretically, 27 slices need 46 GB and the performance boost can be further diminished, thus it’s a SUB-OPTIMAL choice. Besides, all SOTA methods experiment with 3 or 7 slices, so we adopt the same settings for fair comparison.

Q6. HU windows (R1)
The selected windows are commonly used in clinical diagnosis and cover the most common organs of interest.

Q7. Comparison with DiffusionDet & assessing the performance gain (R3)
Sorry for the misunderstanding; the result of DiffusionDet is only reported in the ablation study (baseline in Tab. 3). We have added it to Tab. 1. Besides, the methods in Tab. 1 all used identical data augmentation as detailed in Sec. 3.1. Thus, according to the ablation study, the gain clearly comes from our method.

Q8. Minor suggestions
We’ve fixed them.
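
The multi-window pre-processing mentioned in the rebuttal (Q4, Q6) can be illustrated with a short sketch. It assumes the standard CT windowing operation: clip Hounsfield-unit (HU) values to a window defined by a level and a width, then rescale to [0, 1]. The three (level, width) pairs below are typical textbook values and are assumptions, not the windows used in the paper; the function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Illustrative (level, width) pairs in HU. These are assumed values,
# not the settings reported by the authors.
WINDOWS: Dict[str, Tuple[float, float]] = {
    "lung": (-600.0, 1500.0),
    "soft_tissue": (40.0, 400.0),
    "bone": (400.0, 1800.0),
}


def apply_window(hu_values: List[float], level: float, width: float) -> List[float]:
    """Clip HU values to [level - width/2, level + width/2] and rescale to [0, 1]."""
    low = level - width / 2.0
    high = level + width / 2.0
    return [(min(max(v, low), high) - low) / (high - low) for v in hu_values]


def multi_window(hu_values: List[float]) -> Dict[str, List[float]]:
    """Return one windowed copy of the input per entry in WINDOWS."""
    return {
        name: apply_window(hu_values, level, width)
        for name, (level, width) in WINDOWS.items()
    }


# Usage: air (-1000 HU), soft tissue (40 HU), dense bone (1000 HU).
channels = multi_window([-1000.0, 40.0, 1000.0])
```

Each window stretches a different HU range over the full intensity scale, so the same slice yields several inputs that each emphasize different tissue.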




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    While I do think this work is both novel and useful clinically, unfortunately in my opinion the study has not reached the bar for acceptance on account of incremental result improvement (I understand that with a low amount of data, your approach works better than other approaches, but with the large amount of data that you already have, it’s barely better). I encourage the authors to further develop their study and focus on improving the existing approach with all the data they already have.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The rebuttal does not fully address the concerns raised by reviewers.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors provided a good rebuttal, and one of the reviewers increased the score. As a result, the final score became among the ones on the higher-side in my pool.


