Authors

Negin Ghamsarian, Mario Taschwer, Raphael Sznitman, Klaus Schoeffmann

Abstract

Semantic segmentation in cataract surgery has a wide range of applications contributing to surgical outcome enhancement and clinical risk reduction. However, the varying issues in segmenting the different relevant structures in these surgeries make the designation of a unique network quite challenging. This paper proposes a semantic segmentation network, termed DeepPyramid, that can deal with these challenges using three novelties: (1) a Pyramid View Fusion module which provides a varying-angle global view of the surrounding region centering at each pixel position in the input convolutional feature map; (2) a Deformable Pyramid Reception module which enables a wide deformable receptive field that can adapt to geometric transformations in the object of interest; and (3) a dedicated Pyramid Loss that adaptively supervises multi-scale semantic feature maps. Combined, we show that these modules can effectively boost semantic segmentation performance, especially in the case of transparency, deformability, scalability, and blunt edges in objects. We demonstrate that our approach performs at a state-of-the-art level and outperforms a number of existing methods with a large margin (3.66% overall improvement in intersection over union compared to the best rival approach).

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16443-9_27

SharedIt: https://rdcu.be/cVRyF

Link to the code repository

https://github.com/Negin-Ghamsarian/DeepPyramid_MICCAI2022

Link to the dataset(s)

http://ftp.itec.aau.at/datasets/ovid/DeepPyram/

Reviews

Review #1

Please describe the contribution of the paper

The authors proposed a U-Net based model for semantic segmentation in cataract surgery called DeepPyram. The main contribution is the inclusion of two blocks: Pyramid View Fusion and Deformable Pyramid Reception. The former is in charge of recognizing the relative information between the object and its surroundings, whereas the latter performs shape-wise feature extraction. Ablation studies and comparisons to twelve models are also presented.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The paper is well written and clear in general. The use of deformable convolutions gives the model the ability to learn geometric distortions that are common in cataract surgery. The additional blocks appear to improve performance.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The authors use data from the CaDIS dataset, but they do not compare their results to DeepLab v3+ and UPerNet, which are the two networks used in the original CaDIS article [1].

There is no discussion about the impact of the imbalanced class Instruments. Cornea (78:84), Instruments (3190:459), Lens (141:48), and Pupil (141:48).

The authors provided an 8-page supplementary material in which they further discussed their proposal and presented additional experiments. However, according to the MICCAI guidelines, only images, tables, and proofs of equations are allowed, and these materials must not exceed two pages.

[1] Grammatikopoulou, M. et al. “CaDIS: Cataract dataset for surgical RGB-image segmentation,” Medical Image Analysis, Volume 71, 2021.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The authors claim the dataset will be available after acceptance of the paper.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

What are the criteria for choosing the networks to be assessed?

I noticed that the authors did not include the best results that have been reported on the CaDIS dataset for a fair comparison, even though they used an extended dataset.

I recommend that the authors show cases where their proposal failed and discuss the reasons for such mis-segmentation.

Minor corrections:

In Abstract “Compbined”

In Fig. 1 “Challenges in semantic segmentation for different in cataract surgery” ??

On page 2, the order of citations: “Several network architectures for cataract surgery semantic segmentation have been proposed or have been used in the recent past [15,13,14,1,21].” I recommend that the authors sort the numbers for a better presentation: “… have been used in the recent past [1,13,14,15,21].”

In Section 4 and Table 1, are PSPNet+ and PSPNet the same model?
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper looks technically correct, but there is no reason to include experiments and further discussions in supplementary material.
Number of papers in your stack

4
What is the ranking of this paper in your review stack?

2
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

6
[Post rebuttal] Please justify your decision

Authors have addressed my comments and included UPerNet for comparison purposes.

Review #2

Please describe the contribution of the paper

This paper proposes a semantic segmentation network that contains three modules. The main novelties of this method are a varying-angle surrounding view, shape-wise feature extraction, and multi-scale semantic feature maps.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The main strengths of this paper are two-fold: 1) it designs a varying-angle surrounding view to extract features for each pixel. 2) It extracts shape-wise features, which is very useful for segmentation of complex objects.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The weaknesses of the paper are two-fold: 1) The three proposed modules are all combinations of existing techniques, so the novelty of this paper is very limited. 2) The experimental analysis is not sufficient. The ablation study does not explain some details. For example, when the PL loss is removed, which loss is used instead?
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The reproducibility of the paper is ok.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

1. Explain the motivation for using existing techniques and provide more analysis of why they are effective.
2. Provide more details about the experiments.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

4
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The novelty of this paper is limited and the experiments are not sufficient.
Number of papers in your stack

4
What is the ranking of this paper in your review stack?

2
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

Not Answered
[Post rebuttal] Please justify your decision

Not Answered

Review #3

Please describe the contribution of the paper

In this paper, a network, called DeepPyram, was proposed. DeepPyram can deal with challenges posed by transparency, deformability, scalability, and blunt edges in objects. It has three novelties: a Pyramid View Fusion module, a Deformable Pyramid Reception module and a dedicated Pyramid Loss module. The authors showed that the proposed approach outperforms existing methods without imposing additional parameters.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The structure and the text of the paper are good and clear, and the paper contributes to the body of knowledge.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The paper needs more organization, a detailed description and more experiments.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The technical contribution is limited.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

1) Abstract: The authors should mention the performance improvement achieved by their model compared to the state of the art.

2) The main concern is that the authors need to justify why they include PVF in the decoder network. Most networks use pyramid view fusion at the end of the encoder network. In a typical autoencoder architecture, the encoder extracts global context information from the input image, including the adjacent and class characteristics of the object. However, transmitting information to shallower layers will weaken the extraction of the context information due to the downsampling processes. Thus, many works have proposed using PVF to generate multi-scale features for each split using a reduced pyramid pooling module. But here, the authors insert this layer in all layers of the decoder. The authors should explain their architecture in more detail.

3) The Deformable Pyramid Reception (DPR) network needs to be described in more detail. The proposed DPR is very similar to the network proposed in “DefED-Net: Deformable Encoder-Decoder Network for Liver and Liver Tumor Segmentation”. Also, most papers that proposed DPR networks used them in the encoder network to learn better context information from the input images than atrous spatial pyramid pooling does.

4) It is very important for any medical method to show the limitations of the work. Doing so is valuable because it lets readers glimpse ways to get around these limitations.

5) The experimental section is very limited. There is no ablation study to address the effects of each module on the proposed segmentation model. The authors gave an overview of the proposed model, but they did not mention anything related to each submodule’s architecture. The article lacks a detailed description.

6) Missing references:
- Nandi, A., Lei, T., Wang, R., Zhang, Y., Wang, Y., & Liu, C. (2021). DefED-Net: Deformable Encoder-Decoder Network for Liver and Liver Tumor Segmentation.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

4
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The proposed model is with limited novelty.
Number of papers in your stack

3
What is the ranking of this paper in your review stack?

1
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

6
[Post rebuttal] Please justify your decision

The rebuttal letter is clear, and the authors answered my doubts. The paper can be accepted, but only after some updating. Some parts of the supplementary file could be moved to the article, keeping only images, tables, and proofs of equations in the supplementary material. For example, in Experimental Results (Section 4), the authors need to start with the ablation study and show each module’s effect on the performance of the proposed model (i.e., Table 2). Then the authors need, for example, to add Table IV from the supplementary file to the ablation study part. I would also like to see Figure 5 from the supplementary file added to the paper. Then the authors need to summarise the best combination of the modules and backbones for the proposed model. Afterwards, a comparison to the SoTA methods should be provided (i.e., Table 1). For the Methodology section, the authors need to add an explanation of why they selected the position of the PVF module in the decoder of the proposed model.

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

The paper is reviewed by three experts in the field. The reviews of this work are quite divergent. The authors are suggested to provide a rebuttal to clarify main issues raised by reviewers, including: 1) Incremental novelty. 2) More explanation for architecture. 3) Lacking clinical applicability. 4) Insufficient experiments, e.g., more baselines, ablation study, discussion for imbalanced data, and limitation.
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

8

Author Feedback

We would like to thank all reviewers for their constructive feedback.

All Reviewers) Criteria for choosing rival networks: BARNet, PAANet, and RAUNet are tailored for instrument segmentation in surgical videos. Other methods are SoTA for medical image segmentation recently published in top journals or conferences (e.g., TMI).

Although the proposed modules are based on operations such as deformable and dilated convolution and pooling, their novelty lies in how to efficiently exploit and combine these operations to optimally guide the computational resources towards the relevant and determinative content. The superiority of the proposed network compared to a large number of SoTA methods by a large margin as well as the ablation study results confirm the importance of our work.

R3 and Meta-Reviewer) Ablation study: Table 2 in the paper already provides an ablation study, and the last paragraph on page 6 analyzes the results.

R2, R3) Due to page limitations, we kindly ask the reviewers to refer to the supplementary document, section I, Figures 4-5, and Tables III-V for more details and experiments.

R1)

Due to time constraints, we evaluated only UPerNet, which has been shown to outperform DeepLabV3+ by a large margin in the CaDIS paper. The results, obtained using configurations identical to those of the other methods, are listed below:

Object        IoU (%)    Dice (%)
Lens          77.78      86.93
Pupil         93.34      96.52
Cornea        86.62      92.67
Instrument    68.51      78.68
Mean          81.56      88.70

Accordingly, DeepPyram shows more than 3% improvement over UPerNet per object.

Our evaluations are based on binary segmentation per relevant object, so we do not have the imbalance problem. In the case of multi-class classification, methods such as oversampling can mitigate the imbalance problem. Addressing the class imbalance in cataract surgery was previously explored in a MICCAI 2021 paper and is beyond the scope of this paper.
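
For readers unfamiliar with the per-object evaluation described above, the following is a minimal sketch of how IoU and Dice can be computed for one binary mask at a time. It is not the authors' evaluation code: the function names, the flat list-of-labels mask representation, and the convention for two empty masks are all assumptions made for illustration. The last lines only check that the "Mean" row of the UPerNet results above is the plain average of the four object rows.

```python
def binary_iou_dice(pred, target):
    """Return (IoU, Dice) for two flat binary masks given as sequences of 0/1."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    union = pred_area + target_area - inter
    # Assumed convention: two empty masks count as a perfect match.
    if union == 0:
        return 1.0, 1.0
    iou = inter / union
    dice = 2 * inter / (pred_area + target_area)
    return iou, dice


def per_object_scores(pred_labels, target_labels, object_ids):
    """Score each object as its own binary problem (object vs. everything else).

    object_ids is a hypothetical mapping such as {"Lens": 1, "Pupil": 2}.
    """
    scores = {}
    for name, label in object_ids.items():
        pred = [int(p == label) for p in pred_labels]
        target = [int(t == label) for t in target_labels]
        scores[name] = binary_iou_dice(pred, target)
    return scores


# UPerNet results quoted in the rebuttal: (IoU %, Dice %) per object.
upernet = {
    "Lens": (77.78, 86.93),
    "Pupil": (93.34, 96.52),
    "Cornea": (86.62, 92.67),
    "Instrument": (68.51, 78.68),
}
mean_iou = sum(v[0] for v in upernet.values()) / len(upernet)
mean_dice = sum(v[1] for v in upernet.values()) / len(upernet)
```

Because each object is scored against its own background, a rare class such as instruments does not compete with frequent classes inside a single multi-class prediction, which is the point the authors make about imbalance.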

We unfortunately were not aware of this important regulation and apologize for this mistake.

Since we have improved the instrument annotations of CaDIS, the results cannot be directly compared.

R2)

When PL is removed, the loss of the auxiliary branches is not considered, and PL is replaced by the general loss (the segmentation loss of the network’s output mask). PL is explained in detail in the supplementary document, section I and Fig. 1.
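
The ablation described here can be illustrated with a small sketch of a loss that supervises auxiliary branches in addition to the output mask. This is only an illustration under stated assumptions, not the Pyramid Loss defined in the paper: binary cross-entropy stands in for the actual segmentation loss, the fixed `aux_weight` replaces whatever adaptive weighting PL uses, and the names `total_loss`, `aux_preds`, and `aux_targets` are hypothetical.

```python
import math


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and 0/1 targets."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(pred)


def total_loss(main_pred, target, aux_preds, aux_targets,
               use_aux=True, aux_weight=1.0):
    """Output-mask loss, optionally plus the auxiliary-branch losses.

    aux_preds / aux_targets are hypothetical lists of lower-resolution
    predictions and matching downsampled targets. aux_weight is an assumed
    constant, not a value taken from the paper.
    """
    loss = bce(main_pred, target)  # the "general loss" on the output mask
    if use_aux:
        for pred, tgt in zip(aux_preds, aux_targets):
            loss += aux_weight * bce(pred, tgt)
    return loss
```

Calling `total_loss(..., use_aux=False)` corresponds to the ablated setting in which only the output mask is supervised.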

R3)

We will add performance improvement to the abstract.

The first PVF module is added at the end of the encoder network as demonstrated in Fig. 2, and as the reviewer clearly stated, the PVF module in this location has the highest impact on the segmentation accuracy. However, using the PVF module for other feature resolutions can especially provide complementary information to improve segmentation accuracy in the region of the edges for objects with blurry or blunt edges such as instruments and Cornea.

While the Ladder-ASPP module aims to effectively encode global contextual information centering around each pixel, the DPR module aims to track the information most relevant to the central pixel, taking into account the fine-grained edge information that the former module cannot capture. Besides, Ladder-ASPP is not able to handle object deformations. Hence, these modules have different applications. We will add this missing reference upon acceptance.

We fully agree that using these modules on top of the encoder’s layers can improve the performance. However, it is always preferred to use a pre-trained backbone, especially for medical image segmentation due to the dearth of annotations. Changing the encoder network entails pretraining on a large dataset (such as ImageNet), which in turn imposes more computational costs. Hence, most of the methods (including nine competitors listed in Table 1) only add their proposed modules after the bottleneck. Nevertheless, since these modules are applied to concatenated features coming from the encoder network via skip connections, the encoder features can be effectively guided.

Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The authors have clearly addressed the issues raised by the reviewers. The explanations in the rebuttal look reasonable and correct to me.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

7

Meta-review #2

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The authors responded adequately to the reviewers’ comments.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

3

Meta-review #3

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

This paper presents a cataract surgery segmentation network with two new modules as the main contributions: pyramid view fusion and deformable pyramid reception. The reviewers raised major issues regarding the innovative, interpretable, and experimental aspects of the paper. After rebuttal, two of the reviewers revised their scores and indicated that the feedback letter and additional materials addressed the main issues raised. AC believes that the feedback letter and additional material provide effective answers to the questions raised and would like the authors to improve the final version based on the additional material.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

4

DeepPyramid: Enabling Pyramid View and Deformable Pyramid Reception for Semantic Segmentation in Cataract Surgery Videos