
Authors

Ibrahim Ethem Hamamci, Sezgin Er, Enis Simsar, Anjany Sekuboyina, Mustafa Gundogar, Bernd Stadlinger, Albert Mehl, Bjoern Menze

Abstract

Due to the necessity for precise treatment planning, the use of panoramic X-rays to identify different dental diseases has tremendously increased. Although numerous ML models have been developed for the interpretation of panoramic X-rays, there has not been an end-to-end model developed that can identify problematic teeth with dental enumeration and associated diagnoses at the same time. To develop such a model, we structure the three distinct types of annotated data hierarchically following the FDI system, the first labeled with only quadrant, the second labeled with quadrant-enumeration, and the third fully labeled with quadrant-enumeration-diagnosis. To learn from all three hierarchies jointly, we introduce a novel diffusion-based hierarchical multi-label object detection framework by adapting a diffusion-based method that formulates object detection as a denoising diffusion process from noisy boxes to object boxes. Specifically, to take advantage of the hierarchically annotated data, our method utilizes a novel noisy box manipulation technique by adapting the denoising process in the diffusion network with the inference from the previously trained model in hierarchical order. We also utilize a multi-label object detection method to learn efficiently from partial annotations and to give all the needed information about each abnormal tooth for treatment planning. Experimental results show that our method significantly outperforms state-of-the-art object detection methods, including RetinaNet, Faster R-CNN, DETR, and DiffusionDet for the analysis of panoramic X-rays, demonstrating the great potential of our method for hierarchically and partially annotated datasets. The code and the datasets are available at https://github.com/ibrahimethemhamamci/HierarchicalDet.
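To make the three annotation levels concrete, the sketch below shows one possible way to represent a hierarchically and partially labeled tooth. It relies only on the standard two-digit FDI notation (first digit is the quadrant, 1 to 4 for permanent teeth; second digit is the tooth position, 1 to 8). The class name `ToothLabel`, its methods, and the placeholder diagnosis string are hypothetical and are not taken from the authors' code.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ToothLabel:
    """One annotated tooth; deeper hierarchy levels may be missing (None)."""
    quadrant: int                      # FDI quadrant, 1-4 (permanent teeth)
    enumeration: Optional[int] = None  # position within the quadrant, 1-8
    diagnosis: Optional[str] = None    # class name; placeholder values only

    def __post_init__(self) -> None:
        if not 1 <= self.quadrant <= 4:
            raise ValueError("quadrant must be in 1..4")
        if self.enumeration is not None and not 1 <= self.enumeration <= 8:
            raise ValueError("enumeration must be in 1..8")

    def fdi_code(self) -> Optional[int]:
        """Two-digit FDI code, or None when the enumeration is not annotated."""
        if self.enumeration is None:
            return None
        return self.quadrant * 10 + self.enumeration

    def label_mask(self) -> Tuple[bool, bool, bool]:
        """Which of (quadrant, enumeration, diagnosis) are annotated.

        A multi-label loss could use this mask to skip unannotated heads;
        that usage is an assumption, not the paper's stated loss.
        """
        return (True, self.enumeration is not None, self.diagnosis is not None)


# Third hierarchy level: quadrant-enumeration-diagnosis
full = ToothLabel(quadrant=4, enumeration=8, diagnosis="some_diagnosis")
# First hierarchy level: quadrant only
coarse = ToothLabel(quadrant=2)
print(full.fdi_code(), coarse.fdi_code(), coarse.label_mask())
```

A mask like this is one simple way to let a single detector consume all three data types without treating a missing label as a negative.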

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43987-2_38

SharedIt: https://rdcu.be/dnwJT

Link to the code repository

https://github.com/ibrahimethemhamamci/HierarchicalDet

Link to the dataset(s)

https://github.com/ibrahimethemhamamci/DENTEX


Reviews

Review #2

  • Please describe the contribution of the paper

    In this paper, a diffusion-based hierarchical multi-label object detection model is introduced for the analysis of panoramic dental X-rays. The annotated data for the multi-labels follow the FDI system and are divided into three distinct types. The proposed model uses hierarchical learning to understand these labels and employs a diffusion-based object detection model to tackle the challenge of training detectors with partially annotated data.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    In this paper, a new object detection model is suggested, taking into consideration two important features of the panoramic dental X-ray dataset: partial annotations and object categories hierarchy. Experiments show the proposed method achieved the highest performance compared to baselines.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The baseline models are outdated. Cascade R-CNN would be a good multi-stage detector to include alongside Faster R-CNN, and FCOS or recent YOLO models would be good one-stage options.

    Furthermore, those models could also be modified for hierarchical and multi-label learning in a similar way, but they seem to have been tested without modifications.

    There are better comparisons available. The proposed method tries to solve sparsely annotated object detection problems, and one of the main approaches to such problems is semi-supervised learning [R1]. Another is anchor-free object detectors such as FCOS, which are robust to sparsely annotated datasets [R2].

    More information on the annotation sparsity of the dataset would also be helpful for analysis and further research.

    Why does (Manipulation+Transfer+Multilabel) underperform on enumeration and diagnosis in Table 1?

    References

    [R1] Rambhatla, Sai Saketh et al. “Sparsely Annotated Object Detection: A Region-based Semi-supervised Approach.” ArXiv abs/2201.04620 (2022): n. pag.

    [R2] Yoon, Jihun et al. “Semi-Supervised Object Detection With Sparsely Annotated Dataset.” 2021 IEEE International Conference on Image Processing (ICIP) (2020): 719-723.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Based on the checklist, this research seems to be reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    The additional experiments mentioned above would address the shortcomings of this research.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    While the baseline methods seemed relatively weak, the proposed method demonstrated a novel and appropriate approach that took into account the characteristics of the dataset. As a result, I am inclined to give a weak acceptance.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The contribution of the paper is two-fold. On the methodological side, it proposes a novel end-to-end diffusion-based hierarchical multi-label object detection approach that can cope with partially annotated data sets. This also includes a novel noisy box manipulation technique in diffusion models for hierarchical data. For clinical application, this model is the first to concurrently detect abnormal teeth with the associated diagnosis and dental enumeration. Experiments show significant improvement compared to other state-of-the-art methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Very interesting approach with clinical relevance. I especially like the idea of how to combine the different data sets with partially unlabeled data.
    • Good results, ~10% improvement against multiple state-of-the-art methods.
    • I consider it a big plus that data and code will be made available.
    • Paper well written, nice to read, well structured.
    • The information in the appendix was helpful.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    I do not see any severe weaknesses.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The method is clearly explained and data as well as code will be made available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Interestingly, for enumeration and diagnosis, but not for quadrant, the combination Manipulation+Transfer+Multilabel does not seem to be the best approach. Can you comment on this a bit more? What is the advantage of transfer, then, given that “w/o Transfer” shows good results? Are there any significant differences in training/inference time between the different combinations?

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    8

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper presents an interesting approach with clinical relevance and the achieved prediction results are a significant improvement to the state-of-the-art. Given that code and data will be made available, the reproducibility and usability for the scientific but also medical community is very high. Moreover, the paper is very well written.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #1

  • Please describe the contribution of the paper

    The paper proposes a diffusion-based hierarchical multi-task object detection framework for analyzing panoramic X-rays to identify problematic teeth with dental enumeration and associated diagnoses at the same time. The proposed approach adapts DiffusionDet as its backbone, which formulates object detection as a denoising diffusion process from noisy boxes to object boxes. Experimental results demonstrate that the proposed method works on a domain-specific dataset of panoramic X-rays.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. A new dataset for panoramic X-rays. It is very interesting to me, but clinical reviewers would need to provide a more detailed review of it.
    2. The architecture is simple to follow and the reproducibility looks fine.
    3. Shows that a diffusion-based model can be helpful for panoramic dental X-ray images. I have not seen related papers before.
    4. Good visualization.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The backbone is too similar to DiffusionDet, and the only difference is the multi-label output head for domain specific data. The novelty is limited.
    2. The authors should provide the pre-training hyperparameters and pre-training schemes.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Good.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    A good multi-task domain specific improvement to DiffusionDet, but the novelty is limited.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    A good multi-task domain specific improvement to DiffusionDet, but the novelty is limited.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper received three positive reviews. The essential technical novelty and significance are still limited and quite incremental (as evidenced by two “weak accept” ratings). The 3rd reviewer is an outlier. This work may be a useful multi-object detection network for a special clinical application.




Author Feedback

Dear Area Chair and Reviewers,

We would like to express our appreciation for your thoughtful reviews, positive comments, and constructive feedback. Your insights will undoubtedly enhance the quality of our work. In response to the concerns and queries raised by the reviewers, we provide the following clarifications:

Reviewer #1:

  • Novelty & Similarity to DiffusionDet: Indeed, we concur that our model builds upon the strengths of DiffusionDet, particularly in handling object detection as a denoising diffusion process. The addition of a multilabel output introduces a new dimension to our model that sets it apart from its foundation. The cornerstone of our contribution, however, is the introduction of a unique bounding box manipulation technique, meticulously tailored for hierarchical dataset settings. This innovation significantly extends the potential of the DiffusionDet framework.

  • Hyperparameters and Pre-training schemes: Comprehensive details about the hyperparameters and pre-training schemes are included in the supplementary materials. To maintain total transparency and facilitate reproducibility, we will disclose the remaining details in our GitHub repository, along with our code and data.

Reviewer #2:

  • Selection of Baseline Models: We appreciate your concern regarding the choice of baseline models. These models were chosen to contextualize our method against established object detection models that have made a significant impact in the field. Though more recent models like Cascade R-CNN, FCOS, and the latest YOLO models provide an intriguing comparison, our backbone, DiffusionDet, exhibits already proven state-of-the-art performance. Our method serves to elevate this performance, with a special emphasis on our cutting-edge bounding box manipulation technique.

  • Adapting Other Models Similarly: We value the perspective on the possibility of modifying other models for hierarchical and multi-label learning. However, the main novelty of our work lies in the development of a unique bounding box manipulation technique. As demonstrated in our ablation study, this technique proves to be highly efficient. This manipulation heavily relies on the denoising process of the diffusion network. By adapting this process with the inference from the previously trained model in a hierarchical order, we have created an innovation unique to our method. Consequently, while we acknowledge that other object detection models can be adapted for multi-label learning, our bounding box manipulation technique, rooted in the diffusion network’s denoising process, may not be readily adaptable to those models.
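For readers unfamiliar with the idea, the following is a minimal sketch of one plausible reading of the box manipulation described in this response: the noisy boxes fed to the denoising process are seeded with boxes inferred by the previously trained, coarser-level model, and only the remaining slots are filled with random boxes. The function name `manipulated_noisy_boxes`, the normalized (cx, cy, w, h) box format, and the parameters `noise_std` and `seed` are assumptions made for illustration; this is not the authors' implementation.

```python
import random
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # normalized (cx, cy, w, h), assumed format


def _clamp(value: float) -> float:
    """Keep a coordinate inside the normalized [0, 1] range."""
    return min(1.0, max(0.0, value))


def manipulated_noisy_boxes(
    prior_boxes: Sequence[Box],
    num_boxes: int,
    noise_std: float = 0.1,
    seed: int = 0,
) -> List[Box]:
    """Build the initial noisy box set for the denoising process.

    prior_boxes: boxes inferred by the previous hierarchy level's model
                 (for example, quadrant boxes when training enumeration).
    Slots beyond the available prior boxes are filled with random boxes,
    as a plain diffusion-based detector would do for every slot.
    """
    rng = random.Random(seed)
    boxes: List[Box] = []
    # Seed with perturbed prior boxes instead of pure Gaussian noise.
    for box in prior_boxes[:num_boxes]:
        boxes.append(tuple(_clamp(v + rng.gauss(0.0, noise_std)) for v in box))
    # Fill any remaining slots with purely random boxes.
    while len(boxes) < num_boxes:
        boxes.append(tuple(_clamp(rng.gauss(0.5, 0.25)) for _ in range(4)))
    return boxes


# Example: one prior box from the coarser model, four boxes requested.
initial = manipulated_noisy_boxes([(0.5, 0.5, 0.2, 0.2)], num_boxes=4)
print(len(initial))
```

Because the seeding step acts on the noisy box input itself, it has no direct counterpart in detectors that do not start from sampled boxes, which is consistent with the point made above about adaptability.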

Reviewer #2 and #3:

  • Performance of (Manipulation+Transfer+Multilabel) and Advantage of Transfer: We acknowledge the suboptimal performance of (Manipulation+Transfer+Multilabel) in enumeration and diagnosis tasks. We appreciate your discerning observations and share your interest in this unexpected outcome. Our working hypothesis is that the three hierarchical data levels, specifically for quadrants, contain potent information. Transferring such information appears to enhance accuracy. This does not appear to be the case for enumeration and diagnosis classes, however. These findings accentuate the effectiveness of our novel bounding box manipulation technique, which consistently delivers superior performance across all classes, as substantiated in our ablation study.

Reviewer #3:

  • Training/Inference Time: We appreciate your recommendation to disclose this information. Although we can confirm that there are no notable time discrepancies, we commit to including this information in our GitHub repository for thorough documentation and ease of reference.


