Authors

Tal Shaharabany, Lior Wolf

Abstract

The leading medical image segmentation methods represent the output map as a pixel grid. We present an alternative in which the object edges are modeled, per image patch, as a polygon with $k$ vertices that is coupled with per-patch label probabilities. The vertices are optimized by employing a differentiable neural renderer to create a raster image. The delineated region is then compared with the ground truth segmentation. Our method obtains multiple state-of-the-art results for the Gland segmentation dataset (Glas), the Nucleus challenges (MoNuSeg), and multiple polyp segmentation datasets, as well as for non-medical benchmarks, including Cityscapes, CUB, and Vaihingen. Our code for training and reproducing these results is attached as a supplement.

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16443-9_30

SharedIt: https://rdcu.be/cVRyI

Link to the code repository

N/A

Link to the dataset(s)

N/A

Reviews

Review #1

Please describe the contribution of the paper

The paper proposes using patch-wise polygons and neural renderers as the decoding branch of encoder-decoder deep architectures for segmentation. This allows working at an arbitrary resolution before rendering segmentation masks at the initial input size. The approach is evaluated on several medical and non-medical benchmark datasets and achieves top results.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

• Using neural renderers in a segmentation task is novel.
• The method is evaluated on several challenging benchmark datasets and achieves top performance.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

• The paper needs fundamental improvements in structure and writing (see the detailed and constructive comments). In its current form, it is very difficult to understand why the proposed approach should provide better segmentation performance than classical approaches. In particular, the authors state that a second type of output representation is provided, which does not seem correct: the final output is based on a rasterized image, which is similar to classical approaches. Therefore, it remains unclear why a classical approach could not learn an equivalent mapping. The related work section is an enumeration of segmentation approaches and draws no conclusions about what is lacking in the literature or how the proposed approach could solve current issues. Fig. 1 uses many concepts that are introduced much later in the paper, making it difficult to understand.
• The performance reported in Section 4 is impressive, but it is not clear how the internal parameters (in particular s and k) were optimized for the various datasets. Only one parameter sensitivity analysis is reported, in Fig. 6 of the appendix, and it suggests high performance variation (well above the inter-algorithm variations in Table 4) and overfitting. The mIoU reported in Table 4 corresponds to cherry-picking the top performance among all parameters tested in Fig. 6 of the appendix (i.e. 90.92 for k=5 and patch size=2^3).
Please rate the clarity and organization of this paper

Poor
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The paper uses publicly accessible data sources and provides its code as supplementary material (not tested). The reproducibility is very good.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

• Provide a section on the selection of the parameters k and s for each benchmark dataset.
• Report the average and standard deviation of the performance over parameter choices in every table if no clear strategy is available to fix k and s beforehand.
• Improve the structure of the paper as follows:
1. The first paragraph of the introduction states that a second type of output representation is provided, which does not seem correct: the final output is based on a rasterized image, which is similar to classical approaches. Therefore, it must be clarified why a classical approach could not learn an equivalent mapping.
2. The introduction mixes related work and methods, without really introducing the problem that is addressed. The sentence “Active contour methods” is unclear and seems misplaced.
3. The caption of Fig. 1 uses many elements that are only introduced on page 4. Please reformulate.
4. Section 2: “In another contribution, an attention maps to each feature map in the encoder-decoder block…” This sentence is incorrect.
5. Section 2 is an enumeration of segmentation approaches and draws no conclusions about what is lacking in the literature or how the proposed approach could solve current issues.
6. In Section 3, the output domain of $f_1$ is $\mathbb{R}^{C \times H/32 \times W/32}$; please justify why “32”.
7. Section 3 states that “The decoder f_2 contains two upsampling blocks to obtain an output resolution of s = 8”, but s is a free parameter with various values used in the experiments. Please clarify.
• Typos are present; please verify the entire document (e.g. “patche”).
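The request to report averages and standard deviations over parameter choices could be sketched as follows. This is only an illustration of the bookkeeping: the function name `summarize_over_parameters` is hypothetical, and the scores in `PLACEHOLDER_SCORES` are made-up placeholder values, not results from the paper.

```python
from statistics import mean, stdev
from typing import Dict, Tuple

# Key: (k, s) = (number of polygon vertices, patch size).
# PLACEHOLDER values only, chosen for easy arithmetic; not taken from the paper.
PLACEHOLDER_SCORES: Dict[Tuple[int, int], float] = {
    (3, 4): 80.0,
    (3, 8): 82.0,
    (5, 4): 84.0,
    (5, 8): 86.0,
}


def summarize_over_parameters(
    scores: Dict[Tuple[int, int], float]
) -> Tuple[float, float, Tuple[int, int]]:
    """Return (mean, sample standard deviation, best setting) over all (k, s)."""
    values = list(scores.values())
    best_setting = max(scores, key=scores.get)
    return mean(values), stdev(values), best_setting


if __name__ == "__main__":
    avg, std, best = summarize_over_parameters(PLACEHOLDER_SCORES)
    print(f"mean={avg:.2f}, std={std:.2f}, best (k, s)={best}")
```

Reporting the mean and spread next to the best setting makes it visible when the headline number depends strongly on the choice of k and s.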
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

4
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

• Novel approach for segmentation using neural renderers.
• Unsatisfactory paper organization.
• Unclear optimization of internal parameters and risk of overestimation of the performance.
Number of papers in your stack

5
What is the ranking of this paper in your review stack?

4
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #3

Please describe the contribution of the paper

This paper presents an image segmentation method in which the object edges are modeled as a polygon with vertices. The method obtains multiple state-of-the-art results on several public datasets.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

1. This paper adopts a neural renderer to translate the polygons into binary raster-graphics masks for the purpose of optimizing current active contour methods.
2. The method achieves considerable results on both medical image datasets and a non-medical dataset.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

1. According to the training script in the code, the proposed method is a single-class segmentation method. The paper makes no such statement, which will confuse readers.
2. There is no figure presenting the CNN model structure. Although this is not a hard requirement, it would greatly help the reader understand the design of the techniques.
3. The references are limited.
Please rate the clarity and organization of this paper

Poor
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

1. Please state in the paper that the method is a single-class segmentation method.
2. It would be better to show a figure of the CNN model.
3. There are several excellent papers you may cite to enhance the references: “AFP-Mask: Anchor-free Polyp Instance Segmentation in Colonoscopy” and “Colorectal polyp segmentation by u-net with dilation convolution”.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
1. Multi-label segmentation is common in medical images. The technique would be more useful if the method presented in the paper were extended to multi-class tasks.
2. Contour methods are effective if the segmentation of the object can be represented by one closed polyline. What if more than one polyline is needed, for example for a tire, which should be represented by two circles?
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The technique is well designed. The results are considerable. The clarity and organization of the manuscript are poor.
Number of papers in your stack

5
What is the ranking of this paper in your review stack?

3
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #2

Please describe the contribution of the paper

The main contribution of the paper is the novel way to model the edge of segmentation boundaries using polygons with multiple vertices. These polygons will then generate a raster image via neural renderer.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The proposed method for medical image segmentation is novel in that it does not directly output a pixel-wise segmentation but a geometric representation instead. To make this formulation trainable, the authors use a neural renderer to convert the representation into a pixel map to which cross-entropy and Dice losses can be applied.
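A minimal sketch of how a cross-entropy term and a Dice term can be combined on a rendered soft mask is given here for orientation. It is written in plain Python on flattened masks; the function names, the `dice_weight` parameter, and the equal default weighting are assumptions made for illustration and are not taken from the paper or its code.

```python
import math
from typing import Sequence


def bce_loss(pred: Sequence[float], target: Sequence[float], eps: float = 1e-7) -> float:
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(pred)


def dice_loss(pred: Sequence[float], target: Sequence[float], eps: float = 1e-7) -> float:
    """Soft Dice loss: 1 - 2|P.T| / (|P| + |T|)."""
    intersection = sum(p * t for p, t in zip(pred, target))
    denominator = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


def combined_loss(
    pred: Sequence[float], target: Sequence[float], dice_weight: float = 1.0
) -> float:
    """Sum of the two terms; the weighting is an assumption of this sketch."""
    return bce_loss(pred, target) + dice_weight * dice_loss(pred, target)
```

Both terms are differentiable with respect to the predicted probabilities, which is why a differentiable renderer is needed to pass gradients back to the polygon vertices.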
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The description of the network architecture is a bit hard to follow. It is better to have an image illustrating the overall structure and flow of the network.
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
Some details are missing. The authors seem to have simply checked ‘yes’ for everything in the checklist.
1. The definition of one of the loss functions, i.e. L_{BCE}, is missing. The software framework and version are also missing.
2. No new dataset proposed.
3. The training time for the proposed network is missing.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
1. The representation of a polygon with k vertices is not clearly described. The output is 2k + 1 values for each polygon; what do the 2k and the 1 mean? Does it also mean that all polygons have k vertices? What is the value of k, then?
2. Where does the map M come from? How should a suitable size for the map M be chosen?
3. The paper discusses the advantage that polygons can be rasterized at any resolution. Could you further explain this point? Why is it important for the image segmentation task?
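One plausible reading of the questions above, consistent with the abstract's description of a polygon with k vertices coupled with a per-patch label probability, is that the 2k values are (x, y) vertex coordinates and the remaining value is the patch probability. The sketch below assumes that layout, `[x1, y1, ..., xk, yk, p]`, with coordinates normalized to the unit square. It uses an ordinary (non-differentiable) point-in-polygon test in place of the neural renderer, purely to show that the same vertices can be rasterized at different resolutions. All function names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def decode_patch_output(values: List[float], k: int) -> Tuple[List[Point], float]:
    """Split a (2k + 1)-vector into k vertices and one probability.

    ASSUMED layout: [x1, y1, ..., xk, yk, p]; the paper may order it differently.
    """
    if len(values) != 2 * k + 1:
        raise ValueError(f"expected {2 * k + 1} values, got {len(values)}")
    vertices = [(values[2 * i], values[2 * i + 1]) for i in range(k)]
    return vertices, values[-1]


def point_in_polygon(x: float, y: float, polygon: List[Point]) -> bool:
    """Even-odd ray-casting test for a point against a closed polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(polygon: List[Point], size: int) -> List[List[int]]:
    """Binary size x size mask, sampling each pixel at its center in [0, 1]^2."""
    return [
        [
            1 if point_in_polygon((col + 0.5) / size, (row + 0.5) / size, polygon) else 0
            for col in range(size)
        ]
        for row in range(size)
    ]


if __name__ == "__main__":
    # A square patch polygon (k = 4) followed by a placeholder probability.
    output = [0.25, 0.25, 0.75, 0.25, 0.75, 0.75, 0.25, 0.75, 0.9]
    vertices, prob = decode_patch_output(output, k=4)
    for size in (4, 8):
        mask = rasterize(vertices, size)
        print(size, sum(map(sum, mask)) / (size * size))
```

Because the vertices are continuous coordinates, the covered area fraction is the same at both resolutions in this example, whereas a fixed pixel grid would have to be resampled.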
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

In general, the presentation lacks illustrations to help readers follow it. The proposed segmentation method is novel and the experimental results demonstrate its advantage. My main concern is that the presentation of the paper is only slightly above average. Therefore, my initial rating is weak accept.
Number of papers in your stack

4
What is the ranking of this paper in your review stack?

2
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #4

Please describe the contribution of the paper

This paper introduces a 3D render method to improve the label of the medical images for segmentation, which achieves promising improvements on six medical image segmentation datasets.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The code is open source, which is beneficial for researchers in the community to follow. The experimental results demonstrate that the proposed methods achieve state-of-the-art performance on a variety of biomedical image datasets.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

This article seems to use the combination of the result label after 3D render and the original label. As can be seen from Figure 1, the 3D rendered label contains many mistakes, and the original label is used to correct the 3D rendered label. But how to evaluate the accuracy of the 3D rendered label or the combined label?

The improvements compared to the state-of-the-art results are relatively limited. It improved just 0.1% on five of the eight metrics on three polyp benchmarks.

This paper contains some typos. For example, ‘the Citycapes dataset’ should be corrected to ‘the Cityscapes dataset’.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The datasets are available. The code is available.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

The authors are encouraged to add an evaluation of the 3D-rendered label and an analysis of the efficiency of the 3D rendering method.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The experimental results demonstrate that the proposed methods achieve state-of-the-art performance on six biomedical image datasets.
Number of papers in your stack

5
What is the ranking of this paper in your review stack?

1
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

All reviewers express a favorable opinion of the paper. They appreciate the novelty of the approach and the quality of the results. They also concur on criticizing the lack of clarity in writing and manuscript organization, and R1 even recommends rejecting the paper on those grounds. The AC agrees with the reviewers’ majority on the value of the contribution to the MICCAI community and recommends acceptance. However, the authors should carefully include all reviewer comments and suggestions in the final version. In particular: (1) Detailed writing and organization suggestions by R1, (2) An overview figure for the method (R2, R3), (3) Missing experimental details (R1, R2).
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

3

Author Feedback

We thank the reviewers and the area chair for the supportive feedback. All requests for elucidation will be fully and carefully addressed, including (1) incorporating the detailed writing and organization suggestions by R1, (2) adding an overview figure for the method, (3) adding the suggested references, and (4) providing the experimental details in full.

End-to-End Segmentation of Medical Images via Patch-wise Polygons Prediction