
Authors

Yubo Fan, Jianing Wang, Yiyuan Zhao, Rui Li, Han Liu, Robert F. Labadie, Jack H. Noble, Benoit M. Dawant

Abstract

Cochlear implants (CIs) are neuroprosthetics that can provide a sense of sound to people with severe-to-profound hearing loss. A CI contains an electrode array (EA) that is threaded into the cochlea during surgery. Recent studies have shown that hearing outcomes are correlated with EA placement. An image-guided cochlear implant programming technique is based on this correlation and utilizes the EA location with respect to the intracochlear anatomy to help audiologists adjust the CI settings to improve hearing. Automated methods to localize EAs in postoperative CT images are of great interest for large-scale studies and for translation into the clinical workflow. In this work, we propose a unified deep-learning-based framework for automated EA localization. It consists of a multi-task network and a series of postprocessing algorithms to localize various types of EAs. Evaluation on a dataset with 27 cadaveric samples shows that its localization error is slightly smaller than that of the state-of-the-art method. Another evaluation on a large-scale clinical dataset containing 561 cases across two institutions demonstrates a significant improvement in robustness compared to the state-of-the-art method. This suggests that this technique could be integrated into the clinical workflow and provide audiologists with information that facilitates the programming of the implant, leading to improved patient care.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43996-4_36

SharedIt: https://rdcu.be/dnwPf

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper proposes a two-stage (DL + post-processing) pipeline for electrode array localisation of cochlear implants on two (CB)CT datasets.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The unified approach shows better results than the SOTA in the qualitative assessment by experts and in the selected images in Fig. 4.
    • Use of different datasets that vary in image acquisition and EA types.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The authors relied on adjusted predictions from previous methods to generate ground truth for training. However, it is unclear why they did not follow this approach for validation/testing and instead asked 3 experts to evaluate the results against the previous SOTA. Moreover, it is unclear why experts were presented with the SOTA and proposed approach side by side, since experts are not comparing them but rather rating each method's EA localisation as accepted or failed. It is also unclear whether experts evaluated these in 3D or on a particular plane. In summary, a Euclidean distance would have been a better metric, given that ground truth was available for training. Finally, regarding the max P2PE between methods: what if both methods are wrong and predict similar locations?
    • The authors argue that it is important to know the localisation of the electrodes to find optimal settings of CIs. However, the authors do not mention or cover anything related to identifying the electrodes in the context of the anatomy or how the anatomy could be identified of patient-specific cases
    • Moreover, it is difficult to assess the performance of the first stage relative to that of the entire framework. A study is needed comparing their DL approach for electrode identification as a first stage with other image-processing methods. In addition, the second stage consists of ad-hoc rules that are claimed to be simple, but there are no ablation studies providing evidence that they generalise to other electrode types.
    • It is confusing and unclear why another DL model is used for those images whose intensities are truncated to 3071 HU.
    • Results are slightly better but with no statistical significance for distantly-spaced EAs. This is even more difficult to assess for closely-spaced EAs.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    No dataset, no code. It might be difficult to reproduce.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    • Re-write the datasets section, since it is confusing.
    • A table in the supplemental material is needed to understand the specs of the EAs (i.e., spacing and contact size) as well as the medical image spacing.
    • Explain the rationale for training the model on CBCT but testing on conventional CT, rather than mixing them or vice versa.
    • Add more details on why dataset 2 has gold-standard ground truth via paired micro-CT; show an example as a figure.
    • When the ground truth is dilated, the authors mention that 3 voxels are used. However, there is no investigation of the effect of this choice with regard to, perhaps, different image spacings.
    • Define clDice.
    • Be more specific when referring to images being registered to an atlas: which atlas?
    • Indicate why the previous method has a case as an outlier; no reason is given for it.
    • Explain why there is variability among experts in Fig. 3 with the previous method.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed method seems to achieve better results than the SOTA (qualitatively), but the differences are minimal.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The paper describes a unified framework to detect the electrode arrays of cochlear implants by localizing the electrodes and finding the centerline. The approach is evaluated on a substantial clinical dataset.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper builds on a previously published approach and extends it to train for the geometries of different vendors’ cochlear implants. The approach seems to work well and is evaluated in a nicely defined “clinical” study with 27 patients as training and a few hundred test cases. The assessment is designed well and shows the practicability of the approach.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The main weakness of the paper is that two central references (8 & 12) are most likely from the same group. Without these papers it is hard to assess the methodology and the underlying approach. Basically, this means the pertinent information should be included here, too. However, that would make it evident that the present contribution is not a significant improvement over the state of the art. I am doubtful whether this can be fixed.

    p. 4, second paragraph: electrodes are localized with a heatmap regression that should be referenced and explained, as this is (one of) the central themes of this work.

    p. 5, postprocessing: what will happen if the implanted electrode has a “fold-over”, which can happen clinically? Did you have such a case, and why would or wouldn’t it be a problem?

    no page: with 27 data sets as the basis for training, was data augmentation used? How was it done? How much data was available at the end?

    p. 6, evaluation: as the system outputs numbers for the localization of the electrode arrays, I wonder whether all the data were aligned to the same coordinate frame before any DL work started. How can the predictions be transferred to a specific data set?

    p. 7, 2nd paragraph: what is a “projection image”? 3rd paragraph: as R1 evaluated the small-error results only, how did you make sure that this single observer performs like the other three? Why is no radar plot presented for the small-error data subset?

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    will be ok

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    see above

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The central foundations of the paper cannot be assessed, and so the advancement over the state of the art can hardly be judged.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    7

  • [Post rebuttal] Please justify your decision

    My issues have been answered satisfactorily and the paper can now be accepted.



Review #3

  • Please describe the contribution of the paper

    This paper presents a unified automatic deep learning framework for the localization and ordering of the electrode array in postoperative CT and CBCT images following cochlear implant surgery. It combines a multi-task U-Net with several geometric localization and image processing algorithms. A large dataset composed of several types of electrode arrays with both closely and distantly spaced electrodes has been used to train, validate, and test the model. Evaluation on a cadaver dataset as well as on real CT/CBCTs illustrates the good performance of the proposed method compared to the SOTA.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • A unified framework can support large-scale quantitative studies in a consistent manner, instead of relying on different company-specific software with algorithms adapted to each company’s electrode array portfolio.
    • The heterogeneous training and validation datasets, including 8 types of electrode arrays with both closely and distantly spaced contacts from 2 institutes and different imaging modalities and acquisition configurations, address a major challenge in CI image analysis given the very small size of the electrodes and the cochlea.
    • Using the MONAI framework strengthens the work, as it is the fruit of consensus work from the MICCAI scientific community.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • No major weaknesses could be identified.
  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    While none of the datasets is accessible, reproducing the methods is possible thanks to the clear explanation of the implementation details of the presented algorithms.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    The paper is well-organized and illustrated with good figures. The literature summary allows a good grasp of the topic. However, anonymizing several related references hinders the comprehension of landmark works, in particular references [8,12], which are used as the SOTA for comparison. In addition, the cornerstone publication which uncovered the tonotopic characteristics of the cochlea is missing. Please add [Greenwood 1990].

    • “The images are preprocessed by being registered to an atlas,” -> Please provide brief details about the utilized atlas.
    • Are the training and validation data (from Dataset #1A) balanced wrt. the two categories of distantly/closely spaced electrodes, and the different types?
    • What is the percentage of closely to distantly spaced electrodes in the testing dataset from Dataset #1 (the 561 cases)?

    The evaluation results show that in the subjective evaluation, which is based on the experts’ ratings on Dataset #1B (Fig. 3), the proposed method significantly outperforms the SOTA, in contrast to the slight improvement captured in the objective evaluation (results on Dataset #2, Fig. 2). How can this be interpreted?

    [Greenwood 1990] Greenwood, D. D. (1990). “A cochlear frequency-position function for several species—29 years later.” The Journal of the Acoustical Society of America 87(6): 2592-2605.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    7

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    A clinically relevant work that addresses a highly challenging problem in the field of cochlear implant surgery. The presented method has a solid basis and the evaluation results are promising. In addition, the dataset is presumed to be the largest and most heterogeneous in the related literature.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    Overall, most reviewers thought this was a well-defined study on a diverse dataset. R3 was very positive about the work overall and really liked the clinical applicability.

    The main critiques from R1 and R2 were that most of the work appears to be built on a previously established approach. R2 felt the methodology was not clearly explained without independently referencing previous work. R1 felt the improvements over the existing state of the art were only incremental.

    There were many suggestions for additional detail. I found especially important the suggestions to explain how the training and testing sets were defined (R1 mentioned different modalities being used for different parts of the development; R2 mentioned going from a standard template space for training to patient-specific space for testing).




Author Feedback

We thank the reviewers and AC for their insightful and positive comments and for recognizing the value of our study. Below are succinct clarifications/responses to the major concerns. The clarifications discussed will be included in the manuscript.

Proposed vs. previous work [R1,R2,AC]: [8,12] use traditional image processing (not deep learning (DL)) to localize closely- and distantly-spaced electrode arrays (EAs), respectively. [8] uses image intensity and the Frangi vesselness filter to extract centerline segment candidates. [12] detects the electrode candidates by a hand-crafted blob filter. Complex cost functions were designed in [8,12] for subsequent graph-search-based optimizations. In contrast, we use data-driven learning to produce the first unified DL method. It not only alleviates the need for manually crafted feature extractors and search mechanisms but also leads to a more robust method than the SOTA and is applicable to EAs from all three main manufacturers.
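
For illustration only, the sketch below mimics the kind of hand-crafted step attributed to [8]: a Frangi vesselness filter highlights the bright, wire-like EA in CT, and a threshold turns the response into centerline-segment candidates. This is a minimal sketch assuming scikit-image is available; the parameter values and the centerline_candidates helper are illustrative, not the settings of the cited work.

```python
import numpy as np
from skimage.filters import frangi

def centerline_candidates(ct_volume: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Return a boolean mask of candidate EA-centerline voxels."""
    # The EA is bright (high HU), so look for bright tubular ridges.
    vesselness = frangi(ct_volume, sigmas=(1, 2, 3), black_ridges=False)
    vesselness /= vesselness.max() + 1e-8  # normalize response to [0, 1]
    return vesselness > threshold
```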

Performance [R1,AC]: The proposed method (PROP) outperforms the SOTA in 2 aspects: (1) Robustness, which is key for clinical deployment as failed localization results require manual intervention; a 9% reduction in failure rate on 561 clinical cases is substantial. (2) Reduced median errors in all 5 P2PE statistics on dataset #2 (DS2) by a large margin (Max: 6.0%; Median: 10.4%; Mean: 2.5%; Std: 9.4%; Min: 20.8%). Although statistical significance has not been reached, we note that DS2 is relatively small (27 cadaveric cases with paired CT-microCTs). We are increasing it and hypothesize that we will reach significance.
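
For concreteness, the P2PE statistics above are per-case summaries of the Euclidean distances between corresponding predicted and ground-truth contact positions. A minimal sketch, assuming contacts are given as matched (n_contacts, 3) coordinate arrays in mm (the p2pe_stats helper is hypothetical):

```python
import numpy as np

def p2pe_stats(pred: np.ndarray, gt: np.ndarray) -> dict:
    """Per-case point-to-point error (P2PE) summary in mm.

    pred, gt: (n_contacts, 3) arrays of corresponding contact positions.
    """
    d = np.linalg.norm(pred - gt, axis=1)  # Euclidean distance per contact
    return {"Max": float(d.max()), "Median": float(np.median(d)),
            "Mean": float(d.mean()), "Std": float(d.std()),
            "Min": float(d.min())}
```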

Training/test sets [R1,AC]: The CBCTs come from our institution and most CTs from our collaborators. Training with mixed data is explored, but training with CBCTs alone can (1) produce a model specific to our site and (2) permit testing its robustness to scanner, acquisition, and site differences. The clinical test set contains both CTs and CBCTs.

Atlas registration [R1,R2,R3,AC]: All training/test images are rigidly registered to the left ear of a template volume (atlas); mirroring is performed if a case is a right ear. All images thus have a similar orientation.
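
A minimal sketch of this preprocessing, assuming SimpleITK; the mutual-information metric and gradient-descent settings are illustrative assumptions, not the exact registration recipe used in the paper:

```python
import SimpleITK as sitk

def align_to_atlas(image: sitk.Image, atlas: sitk.Image, right_ear: bool) -> sitk.Image:
    """Rigidly align an ear CT/CBCT to a left-ear atlas volume."""
    if right_ear:  # mirror so all cases match the atlas's left-ear orientation
        image = sitk.Flip(image, [True, False, False])
    reg = sitk.ImageRegistrationMethod()
    reg.SetMetricAsMattesMutualInformation(numberOfHistogramBins=32)
    reg.SetOptimizerAsRegularStepGradientDescent(
        learningRate=1.0, minStep=1e-4, numberOfIterations=200)
    reg.SetInitialTransform(sitk.CenteredTransformInitializer(
        atlas, image, sitk.Euler3DTransform(),
        sitk.CenteredTransformInitializerFilter.GEOMETRY))
    reg.SetInterpolator(sitk.sitkLinear)
    rigid = reg.Execute(sitk.Cast(atlas, sitk.sitkFloat32),
                        sitk.Cast(image, sitk.sitkFloat32))
    # Resample the image into atlas space so all cases share one frame.
    return sitk.Resample(image, atlas, rigid, sitk.sitkLinear, 0.0)
```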

Evaluation metric [R1]: Visual evaluation is used because (1) the “ground truth (GT)” used for training (on DS1) is the output of the SOTA, and only large errors (roughly > 0.3 mm) were manually corrected; correcting smaller ones on >700 training volumes would be too taxing. The Euclidean distance between this “GT” and the results on the test set would thus be zero for the SOTA for all contacts that were not corrected, which would unfairly favor it. (2) In current clinical practice at our institution, expert visual evaluation is used to determine whether results are acceptable.

Side-by-side comparison [R1]: When both methods are acceptable, users rank the quality of the results (e.g., centered vs. slightly off-center contact localization). Average preferences across the three raters were: [No preference = 49.5%; SOTA preferred = 20.2%; PROP preferred = 30.3%], giving PROP an edge.

Projection images [R1,R2]: We generate MIP (Maximum Intensity Projection) images in three orthogonal directions within a bounding box surrounding the implant, as shown in Fig. 4. One of the axes is aligned with the mid-modiolar axis, which permits visualizing the entire cochlea in a single image (largest panels in Fig. 4).
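
A minimal numpy sketch of such projections, assuming the implant bounding box (and the axis alignment to the mid-modiolar axis) is already given; mip_projections is a hypothetical helper:

```python
import numpy as np

def mip_projections(volume: np.ndarray, bbox: tuple) -> list:
    """Return three orthogonal 2D MIPs of the implant bounding box.

    bbox: a tuple of three slice objects cropping the implant region.
    """
    crop = volume[bbox]
    # Maximum Intensity Projection along each of the three axes.
    return [crop.max(axis=ax) for ax in range(3)]
```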

Outlier in Fig.2 [R1]: The SOTA behaved poorly on one low-quality image.

Rater1 evaluation [R2]: Rater1 is the same for the large- and small-error evaluations. We believe a radar plot for the whole test set better shows the overall performance.

Result interpretation [R3]: Small Euclidean differences (the median of the mean P2PE between PROP and SOTA on the clinical test set is 0.1mm) may lead to different expert ratings. This reinforces the value of a visual evaluation.

Size of training set [R2]: We actually use 763 volumes from DS1 for training+validation.

Missing citations and details on EA specs will be provided [R1,R3].




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    After the rebuttal, two reviewers now believe this is a manuscript of sufficient quality for acceptance at MICCAI. The rebuttal overall did a good job of addressing concerns; as reflected by R3, these changes DO need to make it into the final manuscript. I think providing the context in terms of algorithm performance failures is especially important.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper proposes a two-stage framework for the localization and ordering of the electrode array in postoperative CT and CBCT images. The overall pipeline is easy to follow and the writing is also good. There is no significant weakness in the manuscript. According to the rebuttal, most of the issues have been carefully addressed by the authors. Therefore, I recommend the acceptance of this submission.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors have addressed major concerns raised. The information provided in the rebuttal must be incorporated in the camera ready.


