
Authors

Adrito Das, Danyal Z. Khan, Simon C. Williams, John G. Hanrahan, Anouk Borg, Neil L. Dorward, Sophia Bano, Hani J. Marcus, Danail Stoyanov

Abstract

Pituitary tumours are in an anatomically dense region of the body, and often distort or encase the surrounding critical structures. This, in combination with anatomical variations and limitations imposed by endoscope technology, makes intra-operative identification and protection of these structures challenging. Advances in machine learning have created the opportunity to automatically identify these anatomical structures within operative videos. However, to the best of the authors’ knowledge, this remains an unaddressed problem in the sellar phase of endoscopic pituitary surgery. In this paper, PAINet (Pituitary Anatomy Identification Network), a multi-task network capable of identifying the ten critical anatomical structures, is proposed. PAINet jointly learns: (1) the semantic segmentation of the two most prominent, largest, and most frequently occurring structures (sella and clival recess); and (2) the centroid detection of the remaining eight less prominent, smaller, and less frequently occurring structures. PAINet utilises an EfficientNetB3 encoder and a U-Net++ decoder with a convolution layer for segmentation and a pooling layer for detection. A dataset of 64 videos (635 images) was recorded and annotated for anatomical structures through multi-round expert consensus. Using 5-fold cross-validation, PAINet achieved 66.1% and 54.1% IoU for sella and clival recess semantic segmentation respectively, and 53.2% MPCK-20% for centroid detection of the remaining eight structures, improving on single-task performances. This demonstrates that automated identification of critical anatomical structures in the sellar phase of endoscopic pituitary surgery is possible.
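The abstract describes the core design: a shared encoder-decoder with a convolutional head for dense segmentation and a pooled head for centroid regression, trained with two losses. The sketch below only illustrates that multi-task pattern; it is not the authors' implementation. The tiny encoder/decoder stand in for EfficientNetB3 / U-Net++, and the choice of BCE + MSE losses and their weighting are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyMultiTaskNet(nn.Module):
    """Shared encoder-decoder with a convolutional segmentation head and a
    pooled centroid-regression head. The small encoder/decoder here are
    placeholders for EfficientNetB3 / U-Net++ (which could be taken from a
    library such as segmentation_models_pytorch)."""

    def __init__(self, n_seg_classes: int = 2, n_centroids: int = 8):
        super().__init__()
        self.n_centroids = n_centroids
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
        )
        self.seg_head = nn.Conv2d(16, n_seg_classes, kernel_size=1)  # convolution layer -> masks
        self.det_head = nn.Linear(64, n_centroids * 2)               # after pooling -> (x, y) pairs

    def forward(self, x):
        feats = self.encoder(x)
        seg_logits = self.seg_head(self.decoder(feats))
        pooled = F.adaptive_avg_pool2d(feats, 1).flatten(1)          # pooling layer for detection
        centroids = torch.sigmoid(self.det_head(pooled)).view(-1, self.n_centroids, 2)
        return seg_logits, centroids


def joint_loss(seg_logits, seg_target, centroids, centroid_target, w_det: float = 1.0):
    """Two losses learned jointly; the specific losses and weight are assumptions."""
    seg_loss = F.binary_cross_entropy_with_logits(seg_logits, seg_target)
    det_loss = F.mse_loss(centroids, centroid_target)
    return seg_loss + w_det * det_loss


# Smoke test on a dummy 224x224 frame.
model = ToyMultiTaskNet()
masks, points = model(torch.rand(1, 3, 224, 224))
print(masks.shape, points.shape)  # torch.Size([1, 2, 224, 224]) torch.Size([1, 8, 2])
```

In the paper itself, the encoder and decoder are EfficientNetB3 and U-Net++, and the two losses correspond to the segmentation and centroid-detection tasks summarised above.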

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43996-4_45

SharedIt: https://rdcu.be/dnwPq

Link to the code repository

https://github.com/dreets/pitnet-anat-public

Link to the dataset(s)

N/A


Reviews

Review #2

  • Please describe the contribution of the paper

    The manuscript presents a multi-task DL-based method for anatomical region segmentation applied in endoscopic pituitary surgery.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Development of a multi-task model with two output layers and two loss functions to perform simultaneous segmentation and detection tasks.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    No comparison is provided to the conventional/standard way of segmenting the target regions in pituitary surgery.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The data was collected at the authors’ institution and is not publicly available. The code will be publicly available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Section 4 - there appears to be overlap between the training and testing sets. I recommend explaining more convincingly how this issue does not affect the predicted results.

    Section 6 - This section only focuses on the proposed approach’s results. A discussion of the results and a comparison with related works are missing. Please consider these points.

    Section 6 - Although the results are very well explained, an explanation is needed that highlights the value of the proposed approach and shows how it can solve a clinical issue. One way to address this point is to compare the proposed method with the current conventional/standard techniques used during pituitary surgery.

    Section 6 - No information about the computation time is provided, although this would add value to the proposed approach by showing its feasibility for real-time application.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The manuscript presents a new method that addresses a known clinical issue in pituitary surgery. However, the value of the proposed method needs to be highlighted by showing how the model is capable of solving this issue. Moreover, the steps needed to bring this system into current clinical settings should be discussed.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    Summary: In this paper, the authors describe a new application of segmentation and object detection techniques for identifying 10 critical structures during pituitary tumor removal. They combine the use of two existing model structures to create a multi-task model that segments the two most critical structures and detects the centroid of 8 other important structures. They achieve promising results and are the first to combine all 10 of these structures into one model.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Strengths:

    • Novel Work: the ability to identify all the important features from one model makes this work novel and important for eventual clinical translation.
    • The problem description and the paper’s achievements are very clearly written.
    • In related work: the current state of the art is very clearly described and referenced, making it easy to comprehend the improvements proposed by the authors.
    • Reproducibility: the authors provide a lot of information on how to reproduce this model (model architecture and hyperparameters).
    • Clinical Feasibility: this work seems to be transferable to clinical practice and to have a great impact on outcomes.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Weakness:

    • Reproducibility: There is no mention of providing the trained model or the codebase through an open repository. I would recommend considering this.
    • Train/test division: there is no description of how many of the 640 images were left out for testing. Are the performance values from the 5-fold cross-validation? Is there a completely held-out test set used to confirm the performance values achieved?
  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    • Reproducibility: The authors provide a lot of information on how to reproduce this model (model architecture and hyperparameters). However, there is no mention of providing the trained model or the codebase through an open repository. I would recommend considering this.
  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Specific Comments:

    • Abstract: “make intra-operatively identification ” typo
    • Figure 3: The description says the top bar is the frequency (which I read as a percentage) and the bottom bar is the relative total area - I believe the legend in the figure itself has flipped the two (assuming you mean to say that the Sella structure is in 100% of the images).
    • It would be nice to see a table that outlines the performance of each of the 8 centroid-based structures. The granularity of addressing the small structures is one of the main contributions of this paper, and should therefore be supported by displaying the results. It would be interesting to see whether the performance of each structure is correlated with its occurrence in the dataset, or whether some structures are more affected (covered/hidden) than others by the tools used during surgery and, in consequence, have lower performance.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper is concise and well written. It makes great use of the existing literature by applying it to a complex problem and providing an encompassing solution that addresses multi-task detection/segmentation.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    6

  • [Post rebuttal] Please justify your decision

    I believe this paper was well organized, researched, and thorough. The authors addressed most issues raised by the reviewers and made good moves towards true reproducibility and a novel application. If they include the changes and clarifications addressed in the review in the final paper, I believe this will make a great submission.



Review #5

  • Please describe the contribution of the paper

    This paper focuses on a novel clinical task of identifying anatomical structures in endoscopic surgical images using computer vision techniques. The authors have demonstrated a commendable effort by collecting their own dataset for the segmentation problem. The idea of using machine learning to tackle clinical challenges is exciting and promising. However, the technical contribution of this paper is limited. The proposed network, PAINet, is merely a modification of the existing UNet++ architecture by replacing the encoder with the more powerful EfficientNet. While the authors have evaluated PAINet on their dataset and compared it with other state-of-the-art models, they did not propose any novel or groundbreaking ideas to advance the field. Nonetheless, the paper provides a valuable benchmark for researchers to evaluate different models for the target task. Overall, the paper has more clinical contributions than technical ones.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Strength: The paper is trying to solve a novel clinical problem. The authors have collected their own dataset for the segmentation problem, which I can imagine is a lot of hard work.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Weakness: The technical contribution of the paper is limited, as the proposed PAINet is simply a combination of existing EfficientNet and Unet++ architectures, without proposing any new models. The paper rather benchmarks different architectures for the target task. Additionally, some technical descriptions in the paper are not accurate; for example, “8-encoders (pre-trained convolutional neural networks) and 15-decoders” should read “an encoder with 8 convolutional layers and a decoder with 15 convolutional layers.”

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    I’m not sure; the dataset is not public and the code is not publicly available. My decision is not based upon the reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    This paper focuses on a novel clinical task of identifying anatomical structures in endoscopic surgical images using computer vision techniques. The authors have demonstrated a commendable effort by collecting their own dataset for the segmentation problem. The idea of using machine learning to tackle clinical challenges is exciting and promising. However, the technical contribution of this paper is limited. The proposed network, PAINet, is merely a modification of the existing UNet++ architecture by replacing the encoder with the more powerful EfficientNet. While the authors have evaluated PAINet on their dataset and compared it with other state-of-the-art models, they did not propose any novel or groundbreaking ideas. Nonetheless, the paper provides a valuable benchmark for researchers to evaluate different models for the target task. Overall, the paper has more clinical contributions than technical ones.

    Strength: The paper is trying to solve a novel clinical problem. The authors have collected their own dataset for the segmentation problem, which I can imagine is a lot of hard work.

    Weakness: The technical contribution of the paper is limited, as the proposed PAINet is simply a combination of existing EfficientNet and Unet++ architectures, without proposing any new models. The paper rather benchmarks different architectures for the target task. Additionally, some technical descriptions in the paper are not accurate; for example, “8-encoders (pre-trained convolutional neural networks) and 15-decoders” should read “an encoder with 8 convolutional layers and a decoder with 15 convolutional layers.”

    Other comments: In Section 3.2, “Cnvolution” should be corrected to “convolution,” and “trialed” should be changed to “applied.” Moreover, it would be helpful if the authors provided a reference for the Hausdorff distance loss used in the paper. Overall, the paper presents an approach to a novel clinical problem, but as MICCAI is a technical conference, more significant technical contributions are required.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The idea of using machine learning to tackle clinical challenges is exciting and promising. However, the technical contribution of this paper is very limited. The proposed network, PAINet, is merely a modification of the existing UNet++ architecture by replacing the encoder with the more powerful EfficientNet. While the authors have evaluated PAINet on their dataset and compared it with other state-of-the-art models, they did not propose any novel or groundbreaking ideas.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The authors present their work in developing PAINet - a multi-task encoder-decoder network that offers both segmentation and centroid detection of anatomical structures in endoscopic pituitary surgery.

    The authors utilize an EfficientNetB3 encoder that is pretrained on ImageNet and a U-Net++ decoder.

    Strengths of the paper include: 1) A description of the annotation procedures used and the use of consultant clinicians to verify annotations performed by (presumably) trainee neurosurgeons. 2) The authors provide extensive reporting of the different models attempted and their resulting performance.

    Perhaps the main criticism of the work is that the authors are exploring/benchmarking different encoder-decoder architectures to assess performance on a newly proposed task/dataset, limiting technical novelty. That being said, the exploration of benchmarks is thorough.

    Other weaknesses that are not necessarily outweighed by the strengths and merit further clarification by the authors:

    1) As reviewer #2 highlighted, the results are very well explained, but there is minimal discussion in the manuscript. The authors’ interpretation/explanation of the value of the proposed approach and how it can solve a clinical issue would be helpful for context. Some of the lack of context regarding their approach also comes from a lack of comparison to other methods of achieving segmentation. While the explanation of annotation is a strength of the paper, it could be made more concise to allow for greater interpretation of the results in the discussion.

    2) Two reviewers did suggest that images may have been in both the training and testing set. The authors seem to note that the train/test splits were performed at the video level (the authors note: “Images from a singular video were present in either the training or validation dataset”), suggesting images from one video were either in the training or the test set for any given fold but not both. However, this seems to suggest that at some point all images were in the training set and there was not a held-out test set. Additional clarification on this would help in interpreting the results.

    3) Given the authors highlight the granular nature of the task with regard to centroid identification, it would be helpful to see data on whether performance on centroids was affected by representation in the data (e.g. was a poorly represented structure like the right optic carotid recess more severely impacted than the planum sphenoidale?).

    4) In Figure 3, there is a discrepancy between the figure legend and the figure caption which raises some confusion regarding the representation of the data and should be clarified.




Author Feedback

The authors thank the reviewers for their constructive feedback. All reviewers acknowledged the clarity of the manuscript; the novelty of the clinical problem; the difficulty of obtaining the annotated dataset; and the thoroughness of the experimental exploration. Responses to specific comments are found below, and these clarifications will be added to and elaborated on in the camera-ready version.

(1) Technical novelty: MR; R2; R5. PAINet is the first multi-task model designed for simultaneously performing anatomy segmentation and landmark detection in surgery. The network mimics the thought process of surgeons during pituitary surgery, where the surgeons visually identify the sella (the largest structure) and weakly localise the smaller structures with respect to it [10.1016/j.otc.2015.09.001]. To achieve this, PAINet uniquely utilised two loss functions for improved performance over single-task models, due to the increased information gain from the complementary task. To the best of our knowledge, there is no model of any kind that automatically identifies these critical structures, as elaborated on in (2). Therefore, establishing a baseline architecture is important before further research advancements can be made.

(2) Discussion: MR; R2. Potentially fatal damage to the carotid arteries is observed in approximately 4% of endonasal skull-base surgeries [10.3390/brainsci11010099], highlighting the importance of the problem, as acknowledged by all reviewers. No models for this task currently exist. During surgery, neurosurgeons identify the critical structures using visual clues and with the aid of two instruments: (i) a stealth pointer, which is aligned with a pre-operative MRI scan, allowing the surgeon to point at a specific anatomy and identify it from the MRI scan; and (ii) a micro-doppler, which detects surface vibrations, allowing the surgeon to hear the blood pulses of the carotid arteries. The primary issue with both approaches is that once the instrument is removed, identification is lost upon re-entry with another instrument, and can only be pegged to more visible anatomical landmarks. Secondly, utilising these instruments interrupts the surgical workflow. The proposed method solves these issues without the need for additional instruments.

(3) Dataset split: MR; R2; R3. 5-fold cross-validation was used due to the relatively small dataset size. For each fold the validation dataset was not used in training. Average results across all folds are presented. There is no separate hold-out testing dataset.

(4) Results: MR; R2; R3. MPCK-20%: 28% planum sphenoidale; 37% optic carotid recesses; 62% carotids; 63% optic protuberances; 74% tuberculum sellae. The results indicate performance is positively correlated with the number of images in which a structure is present. This implies the limiting factor is the number of images rather than the architectural design. Incorporating semi-supervised techniques such as teacher-student learning on unannotated images may further improve performance. The most important structures to identify and avoid, the carotids and optic protuberances, have high performance, and therefore demonstrate the success of PAINet. This performance is higher than similar studies in endoscopic pituitary surgery for different structures [10.1093/ons/opab187] but lower than anatomical detection in other surgeries [10.1007/s11548-021-02431-z], due to fewer visual features. As a segmentation task, the performance is better than in other surgeries [10.1097/sla.0000000000004594].

(5) Clinical translation: R2. Evaluation runtime is under 0.1 seconds per image on an NVIDIA Tesla V100 Tensor Core 32-GB GPU, so a real-time overlay on top of the endoscope video feed is feasible intra-operatively using a local device (e.g. NVIDIA Clara AGX). (6) Reproducibility: R3; R5. The code will be publicly released. The dataset remains private due to current ethical approvals but will be released, as done for the PitVis EndoVis MICCAI-2023 challenge.
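
A video-level split of the kind described in (3) can be expressed with scikit-learn's GroupKFold, which keeps all frames from a given video on the same side of each fold. This is only a sketch, not the authors' data-loading code; the image paths and video-ID grouping below are placeholders standing in for the annotation metadata.

```python
from sklearn.model_selection import GroupKFold

# Placeholder image list and per-image video IDs (635 images from 64 videos);
# in practice these would come from the dataset's annotation metadata.
image_paths = [f"frames/img_{i:04d}.png" for i in range(635)]
video_ids = [i % 64 for i in range(635)]

gkf = GroupKFold(n_splits=5)
for fold, (train_idx, val_idx) in enumerate(gkf.split(image_paths, groups=video_ids)):
    train_videos = {video_ids[i] for i in train_idx}
    val_videos = {video_ids[i] for i in val_idx}
    assert train_videos.isdisjoint(val_videos)  # no video contributes to both splits
    print(f"fold {fold}: {len(train_idx)} train / {len(val_idx)} val images")
```

Grouping by video ID guarantees that images from a single video appear in either the training or the validation fold, never both, which is the property the rebuttal points to.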




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors have addressed the primary concerns and clarifications requested in the initial meta-review. The responses were convincing that the strengths do outweigh the weaknesses of the paper, with the hope that these clarifications and discussion will be more fully explored in the camera-ready version (space permitting). I think that this exploration of this approach would be of interest to the MICCAI community and presents a project that has considered clinical translation, at least with regard to an approach that mirrors or approximates clinical thinking, as noted by the authors in the rebuttal.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors responded adequately to the reviewers’ comments.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper aims to simultaneously segment and detect the centroids of many critical anatomical structures in the sellar phase of endoscopic pituitary surgery. This is a new and interesting task in CAI. After the rebuttal, two out of three reviewers are positive. The remaining negative reviewer comments on the technical novelty of the method, because the paper combines existing network models. This concern is understandable; however, the meta-reviewer believes the main contribution of this work is the new task formulation in CAI, rather than a deep-learning technical contribution. Overall, this is decent work and well presented. Therefore, the meta-reviewer is inclined to accept it.


