Authors

Xinquan Yang, Jinheng Xie, Xuguang Li, Xuechen Li, Xin Li, Linlin Shen, Yongqiang Deng

Abstract

Although deep neural networks have been proposed to assist dentists in designing the location of dental implants, most of them target simple cases with only one missing tooth. As a result, existing methods do not work well when there are multiple missing teeth and easily generate false predictions when the teeth are sparsely distributed. In this paper, we integrate a weak supervision text, the target region, into the implant position regression network to address these issues. We propose a text condition embedded implant position regression network (TCEIP), which embeds the text condition into the encoder-decoder framework to improve regression performance. A cross-modal interaction consisting of a cross-modal attention (CMA) module and a knowledge alignment module (KAM) is proposed to facilitate the interaction between image and text features. The CMA module performs cross-attention between the image feature and the text condition, and the KAM mitigates the knowledge gap between the image feature and the image encoder of CLIP. Extensive experiments on a dental implant dataset with five-fold cross-validation demonstrate that the proposed TCEIP outperforms existing methods.
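To illustrate the cross-attention idea that the abstract attributes to the CMA module, here is a minimal sketch in plain Python. It is not the authors' implementation: it assumes that image tokens act as queries and text tokens as keys and values, it omits the learned projection matrices a real attention layer would have, and the names `cross_attention`, `image_tokens`, and `text_tokens` are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(r) for r in zip(*m)]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(image_tokens, text_tokens):
    """Scaled dot-product cross-attention without learned projections.

    Assumption: image tokens are the queries, text tokens are both keys
    and values. Returns the text-conditioned features (one per image
    token) and the attention weights.
    """
    dim = len(image_tokens[0])
    scores = matmul(image_tokens, transpose(text_tokens))
    weights = [softmax([s / math.sqrt(dim) for s in row]) for row in scores]
    attended = matmul(weights, text_tokens)
    return attended, weights


# Toy example: three image tokens attend over two text tokens.
image_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text_tokens = [[1.0, 0.0], [0.0, 1.0]]
attended, weights = cross_attention(image_tokens, text_tokens)
for w in weights:
    print([round(v, 3) for v in w])
```

Each row of `weights` sums to one, so every image token receives a convex combination of the text tokens; in a trained model the projections would decide which text tokens each image location attends to.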

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43987-2_31

SharedIt: https://rdcu.be/dnwJN

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #3

  • Please describe the contribution of the paper

    The paper proposes a novel dental implant position regression network that integrates text conditions from the Contrastive Language-Image Pretraining (CLIP) model to assist the implant position regression. The proposed network, called Text Condition Embedded Implant Position Regression Network (TCEIP), consists of an encoder-decoder framework and a cross-modal interaction that consists of cross-modal attention (CMA) and knowledge alignment module (KAM) to facilitate the interaction between features of images and texts. Extensive experiments on a dental implant dataset demonstrated that TCEIP outperformed baselines.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper proposed a new method for predicting implant position that integrates text conditions from the CLIP model. It addresses the limitation of previous methods that only target simple cases with one missing tooth. The proposed method uses cross-modal interaction, which consists of CMA and KAM, to facilitate the interaction between image and text features. The method has been evaluated and outperformed the baselines.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    It might be difficult for researchers to reproduce the experiments presented in the paper without access to the dataset. Although the paper references the study that collected the dataset, there is no mention of how to access the dataset. It would be helpful if the authors could provide details on how to access the dataset, or if they could release the dataset along with the code.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The paper is well written, but I suggest that the code and dataset be released to support reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    It would be beneficial if the authors release the code and dataset to enhance reproducibility. Additionally, providing details on the computational complexity of the approach would be helpful for understanding its practical implementation.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper proposes a novel dental implant position regression network that integrates text conditions from the Contrastive Language-Image Pre-training (CLIP) model to assist the implant position regression and achieves performance above baselines. The proposed approach has clinical relevance. However, for reproducibility, it would be beneficial if the code and dataset were released. Additionally, including details about the computational complexity of the approach would be helpful.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #4

  • Please describe the contribution of the paper

    The authors propose a text-guided dental implant position prediction method, which improves accuracy over image-only methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. CLIP is a powerful vision-language pretraining model, and the authors claim that they integrated CLIP into the text-guided network for the first time to improve the performance of dental implant position prediction.
    2. The authors propose a novel network for performing dental implant position detection, which includes CMA and KAM modules.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The authors need to clearly state the clinical significance of introducing text information. The proposed approach is interesting, but the authors need to explain and discuss it in a clinical sense.
    2. The authors could add a direct comparison with text-guided detection methods.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors released some training parameter information, but did not release the code.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. The authors need to clearly state the clinical significance of introducing text information. The proposed approach is interesting, but the authors need to explain and discuss it in a clinical sense.
    2. The authors could add a direct comparison with text-guided detection methods.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    As this is a medical imaging conference, the authors should add clinical discussion. In addition, more experiments would make the manuscript more persuasive.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    6

  • [Post rebuttal] Please justify your decision

    The authors have addressed my concerns about the additional comparison experiments and the discussion of clinical significance, so I am raising my rating.



Review #5

  • Please describe the contribution of the paper

    In this paper, the authors propose a text condition embedded regression network to predict the location of dental implants, which involves conditional text embedding, an encoder-decoder, cross-modal interaction, and final regression. Experiments on a dental implant dataset validate the proposed network against several existing methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper is generally easy to follow. The experiments include both quantitative results and subjective visualizations.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    See the comments.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    none

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    (1) The backbones are all ResNet50 or ViT for ResNet50. Can the performance be improved with other advanced backbones? (2) Besides AP75, can other AP values be reported, and why was 75% chosen? Please clarify this. (3) The ablation results in Table 1 are incomplete; more scenarios should be included. Additionally, in Table 2, it would be better to provide the category of TCEIP. (4) The presentation of the paper can be improved. For example, the figures are of low quality.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    See the comments.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    Thanks for the rebuttal.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The paper integrates text conditions and image information to assist implant position regression. There are some weaknesses in the paper. (1) It might be difficult for researchers to reproduce the experiments presented in the paper without access to the dataset and code. It would be helpful if the authors could provide details on how to access the dataset. (2) More backbones should be added. The backbones are all ResNet50 or ViT for ResNet50. Can the performance be improved with other advanced backbones? More transformer-based approaches could be compared in the experiments. (3) Besides AP75, can other AP values be reported, and why was 75% chosen? Please clarify this. I would also like to see more evaluation criteria, such as precision, recall, and AUC. At present, the performance of the model is unclear.




Author Feedback

We sincerely thank all of the reviewers for their comments and their acknowledgment of the novelty and methodology of our work. We address the key concerns below and will further improve our paper.

R3,R4 - The reproducibility. The dental implant dataset is currently undergoing hospital ethical review and will be released soon. The code will be released once the paper is accepted.

R3 - The computational complexity. On a PC with an Nvidia V100 GPU, the ITPI (inference time per image) and FLOPs of our TCEIP with a ResNet50 backbone are 29 ms and 67.48G, respectively; those of TCEIP-ResNet18 are 15 ms and 43.87G, respectively. We will include these data in the final version.
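To put the reported latencies in more familiar terms, the short sketch below converts inference time per image into throughput. It assumes images are processed one at a time with no batching, and it reads the "G" figures as GFLOPs per image; both are assumptions, and the helper name `throughput_images_per_second` is hypothetical.

```python
def throughput_images_per_second(inference_time_ms):
    """Convert a per-image latency in milliseconds to images per second."""
    if inference_time_ms <= 0:
        raise ValueError("inference time must be positive")
    return 1000.0 / inference_time_ms


# Figures quoted in the rebuttal (V100 GPU): (ITPI in ms, FLOPs in G).
reported = {
    "TCEIP-ResNet50": (29.0, 67.48),
    "TCEIP-ResNet18": (15.0, 43.87),
}

for name, (itpi_ms, gflops) in reported.items():
    fps = throughput_images_per_second(itpi_ms)
    print(f"{name}: {fps:.1f} images/s at {gflops} GFLOPs per image")
```

Under these assumptions, the ResNet18 variant processes roughly twice as many images per second as the ResNet50 variant.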

R4 - The clinical significance of text information. When teeth are missing on both sides of the alveolar bone, the doctor usually chooses one side for treatment, so that the patient can use the other side for chewing. Moreover, the anterior alveolar bone is much narrower than the bilateral alveolar bone, which needs to be considered separately when selecting implants. Therefore, it is necessary to divide the teeth regions into three categories, i.e., left, middle, and right, and to use the target region as prior information to help dentists decide the implant position. Although such information is not otherwise available to the network, it can be integrated by inputting the text provided by dentists when using the network for implant position regression.

R4 - Comparison with text-guided detection methods. Thanks for your suggestion. We have included three text-guided detection methods for comparison, namely TransVG (CVPR 2021), VLTVG (CVPR 2022), and JointNLT (CVPR 2023). Experimental results show that their AP75 values are 13.2%, 14.1%, and 15.3%, respectively, which are significantly lower than that of our proposed TCEIP (17.8%).

R5 - Advanced backbones. The main contribution of TCEIP is a framework that introduces the text condition to guide implant position prediction (the CMA and KAM modules), in which the backbone can be replaced by any available one. ResNet50 was adopted mainly to make a fair comparison with the compared detectors. In addition to ResNet50, we have also evaluated a SegFormer backbone; the AP75 values of TCEIP-SegFormer and SegFormer are 18.5% and 14.2%, respectively. The results suggest that advanced backbones could improve the performance of TCEIP.

R5 - Clarification of AP75. The diameter of the implant is 3.5 to 5 mm, and clinically the mean error between the predicted and ideal implant position is required to be less than 1 mm, i.e., around 25% of the size of the implant. Therefore, AP75 is used as the evaluation criterion. Following the suggestion, we will include more criteria, e.g., precision, recall, and F1 score, in the final version. The preliminary results show that TCEIP achieves 19.26% precision, 16.14% recall, and 17.56% F1 score, which is the best among the benchmarks.
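The arithmetic behind this paragraph can be checked with a short sketch. It assumes the standard definition of F1 as the harmonic mean of precision and recall, and it treats the tolerance as the allowed error divided by the implant diameter; the function names are hypothetical, and the rebuttal does not state exactly how the "around 25%" figure was derived.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def tolerance_fraction(max_error_mm, implant_diameter_mm):
    """Allowed position error as a fraction of the implant diameter."""
    return max_error_mm / implant_diameter_mm


# Precision and recall reported in the rebuttal, in percent.
f1 = f1_from_precision_recall(19.26, 16.14)
print(f"F1 = {f1:.2f}%")

# A 1 mm tolerance relative to the quoted range of implant diameters.
for diameter_mm in (3.5, 5.0):
    share = tolerance_fraction(1.0, diameter_mm)
    print(f"{diameter_mm} mm implant: {share:.1%} of diameter")
```

The computed F1 agrees with the reported 17.56%, and the 1 mm tolerance corresponds to between 20% and about 29% of the diameter across the quoted range, which brackets the "around 25%" figure.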

R5 - Ablation studies of Table 1. We will include more ablation studies in Table 1. When the feature fusion or CMA is applied individually without KAM, the AP75 values are 15.1% and 15.7%, respectively, which are much lower than the results with KAM (17.1% and 16.5%, respectively). These results demonstrate the effectiveness of the above three modules. We also provided more ablation experiments, e.g., different locations of the CMA module and different text prompts, in the supplementary material.

R5 - The presentation. We will improve the presentation and the quality of figures in the final version.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors have addressed the reviewers’ concerns by conducting additional comparison experiments and providing a discussion on the clinical significance.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    After rebuttal, all three reviewers agree to accept this paper.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors provided a good rebuttal, and one of the reviewers increased the score. As a result, the final score is among the higher ones in my pool.


