
Authors

Jiazhen Zhang, Rajesh Venkataraman, Lawrence H. Staib, John A. Onofrey

Abstract

Segmentation of the prostate into specific anatomical zones is important for radiological assessment of prostate cancer in magnetic resonance imaging (MRI). Of particular interest is segmenting the prostate into two regions of interest: the central gland (CG) and peripheral zone (PZ). In this paper, we propose to integrate an anatomical atlas of prostate zone shape into a deep learning semantic segmentation framework to segment the CG and PZ in T2-weighted MRI. Our approach incorporates anatomical information in the form of a probabilistic prostate zone atlas and utilizes a dynamically controlled hyperparameter to combine the atlas with the semantic segmentation result. In addition to providing significantly improved segmentation performance, this hyperparameter is capable of being dynamically adjusted during the inference stage to provide users with a mechanism to refine the segmentation. We validate our approach using an external test dataset and demonstrate Dice similarity coefficient values (mean±SD) of 0.91±0.05 for the CG and 0.77±0.16 for the PZ, which significantly improve upon the baseline segmentation results without the atlas. All code is publicly available on GitHub: https://github.com/OnofreyLab/prostate_atlas_segm_miccai2022.
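
The abstract does not state the exact rule used to combine the atlas with the network output, so the following is only a minimal sketch under an assumed rule: a convex combination of the network's CG probability and the atlas's CG probability, weighted by λ, followed by a 0.5 threshold. The function names `blend_with_atlas` and `dice`, the threshold, and the toy arrays are illustrative assumptions and are not taken from the paper or its repository. The only property borrowed from the source is that λ=0 corresponds to segmentation without the atlas.

```python
import numpy as np


def blend_with_atlas(p_net, p_atlas, lam):
    """Blend network and atlas CG probabilities.

    ASSUMPTION: a convex combination is used here for illustration;
    the paper's actual combination rule may differ.
    lam = 0 ignores the atlas, lam = 1 ignores the network.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return (1.0 - lam) * np.asarray(p_net) + lam * np.asarray(p_atlas)


def dice(a, b):
    """Dice similarity coefficient between two binary masks."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / denom


# Toy five-voxel "volume" with made-up probabilities (not real data).
p_net = np.array([0.9, 0.7, 0.4, 0.2, 0.1])
p_atlas = np.array([0.8, 0.8, 0.8, 0.3, 0.0])
truth = np.array([1, 1, 1, 0, 0], dtype=bool)

# Because lam only enters after the network forward pass, it can be
# changed at inference time without re-running the network.
for lam in (0.0, 0.4):
    cg_mask = blend_with_atlas(p_net, p_atlas, lam) >= 0.5
    print(f"lambda={lam}: Dice={dice(cg_mask, truth):.2f}")
```

In this toy case the atlas pulls one borderline voxel over the threshold, which is the kind of refinement the adjustable hyperparameter is meant to offer; real behaviour depends on the trained network and the atlas.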



Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16443-9_55

SharedIt: https://rdcu.be/cVRy9

Link to the code repository

https://github.com/OnofreyLab/prostate_atlas_segm_miccai2022

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper proposes a novel segmentation approach for prostate zones, by integrating the anatomical prior into an atlas map. Besides, the weight of fusing with the atlas could be adjusted in the testing phase.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The idea of integrating the spatial probabilistic prior into an atlas to help the segmentation framework is interesting and novel.
    2. The writing and organization is clear, making the whole process easy to understand.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. As the author mentioned, the assumption of having the available WG mask is very strong, and the impact of employing the WG mask as input is non-negligible, so this may harm the transferability of this method in practice.
    2. In addition, the authors evaluate the method on an external dataset, while the atlas is built from the source dataset. If the data bias is large, will this strategy still work?
    3. There are some atlas-based methods for prostate segmentation that are not included in the reference list, such as [Jia, Haozhe, et al. “Atlas registration and ensemble deep convolutional neural network-based prostate segmentation using magnetic resonance imaging.” Neurocomputing 275 (2018): 1358-1369.], [Ma, Ling, et al. “Automatic segmentation of the prostate on CT images using deep learning and multi-atlas fusion.” Medical Imaging 2017: Image Processing. Vol. 10133. International Society for Optics and Photonics, 2017.], [Padgett, Kyle R., et al. “Towards a universal MRI atlas of the prostate and prostate zones.” Strahlentherapie und Onkologie 195.2 (2019): 121-130.], [Singh, Dharmesh, et al. “Segmentation of prostate zones using the probabilistic atlas-based method with diffusion-weighted MR images.” Computer Methods and Programs in Biomedicine 196 (2020): 105572.]. Similar methods should be at least taken into discussion to better address the main contribution of this work.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The reproducibility of this paper is good.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    The major concern has been listed in my comments to question 5. Also, the comparison with other similar methods (section 4) is not specific enough; more study is needed in this respect.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The idea of building the voxel-wise probabilistic atlas to help the network capture anatomy prior is interesting, and the author has explained their approach clearly. The experimental results demonstrate superior performance against the baselines as well as some other methods. However, as mentioned in Question 5, some weaknesses affect the judgment of the contribution with this work.

  • Number of papers in your stack

    3

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    No change in decision.



Review #2

  • Please describe the contribution of the paper

    The authors propose a new deep learning approach for automated semantic segmentation of prostate zones by including prior shape information from an anatomical atlas.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    In general, this is a well organized and clearly written paper. I like the incorporation of a probabilistic atlas to provide semantic context.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Minor problems with English grammar. Please edit. Setting different weights of the atlas during testing means “human in the loop” and might make the algorithm operator dependent.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors provide specifics and point to the data they used.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    As mentioned above, lambda could be viewed as human in the loop. Fig 3 does not clarify for me why lambda=0.4 was chosen.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Good paper, relevant topic but a concern that needs to be addressed.

  • Number of papers in your stack

    2

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Somewhat Confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered



Review #3

  • Please describe the contribution of the paper

    Multizonal prostate segmentation is performed on T2w MR images using a combined 3D anatomical atlas and deep learning (U-Net based) segmentation algorithm. The impact of each method on the final segmentation may be adjusted in real-time to optimize results by tuning a hyperparameter. This approach requires a whole-gland prostate segmentation as input, then predicts the central gland segmentation, and through elimination identifies the peripheral zone segmentation.
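
The elimination step described above can be sketched as follows. This is an illustrative reading of the reviewer's summary, not the authors' code: the function name `zones_from_cg`, the 0.5 threshold, and the choice to clip the CG prediction to the whole-gland (WG) mask are all assumptions.

```python
import numpy as np


def zones_from_cg(wg_mask, cg_prob, threshold=0.5):
    """Derive CG and PZ masks from a WG mask and a CG probability map.

    ASSUMPTIONS: the CG prediction is clipped to the WG mask, and the
    PZ is whatever remains of the WG once the CG is removed.
    """
    wg = np.asarray(wg_mask, dtype=bool)
    cg = np.logical_and(wg, np.asarray(cg_prob) >= threshold)
    pz = np.logical_and(wg, np.logical_not(cg))
    return cg, pz


# Toy six-voxel example with made-up values (not real data).
wg = np.array([0, 1, 1, 1, 1, 0], dtype=bool)
cg_prob = np.array([0.9, 0.8, 0.6, 0.3, 0.1, 0.7])

cg, pz = zones_from_cg(wg, cg_prob)
print(cg.astype(int), pz.astype(int))
```

By construction the two zones never overlap and together cover the WG mask exactly, which is also why any error in the WG input carries over directly into the PZ.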

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This work integrates two segmentation approaches to perform two-zone prostate segmentation. Novelty lies in allowing the user to adjust each method’s influence to improve segmentation by adjusting a hyperparameter in real-time.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Whole gland prostate segmentation is required as one of the inputs to the model, limiting its application to only cases in which this is already available.

    The anatomical atlas was developed by registering images with an affine transform, but a deformable registration may achieve better results.

    Figure 2 compares the Dice of the baseline method with and without mask, in addition to the proposed methods in which the hyperparameter that determines the influence of the semantic segmentation vs. the atlas is varied. Clearly, adding a mask improves the model performance, but in the absence of a mask the model performs significantly worse than typical two-zone prostate segmentation methods in which no mask is used. It would be interesting to see how this U-Net model performs on two-zone prostate segmentation when trained without a mask on this dataset.

    Only a couple comparisons of their results to other work are included, but multi-zonal prostate segmentation is a well-studied topic, and inclusion of comparison to more recent work would be beneficial.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    This work utilizes publicly available datasets, PROSTATEx and Prostate-3T. The methods are clearly described and the code is publicly available, making this work reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    This paper is clearly structured, with the motivation for the segmentation work well defined. Fig. 2 depicts the Dice for segmentations produced at baseline and with different lambda values for the dataset as a whole. However, it is unclear from this figure if there are some types of cases in which small lambda values yield better results and others in which large lambda values yield better results. For example, does prostate size, presence of BPH, or location of cancer influence which lambda value would be best for use in a specific prostate segmentation? While classifying what types of cases perform best at given lambda values is beyond the scope of this work, it would be interesting to see how the ability to fine-tune results by adjusting lambda on a case-by-case basis, rather than setting it to a fixed value and evaluating on the entire dataset, improves results. If someone who had not seen the ground truth segmentations were to adjust lambda to optimize segmentation accuracy on a case-by-case basis, how would these results compare to others in Fig. 2? One advantage of automated segmentation is that it minimizes the bias of an individual in performing segmentation. Allowing users to modify the influence of each segmentation technique re-introduces user bias while also possibly enabling more accurate segmentations. It would be beneficial to mention inter-reader variability in this work, including DSC for two-zone prostate segmentation.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    While the approach has merit, the fact that the prostate segmentation is needed in order to segment the central gland reduces the enthusiasm for this work.

  • Number of papers in your stack

    1

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    A useful application with seemingly sound methodology and validation. Please focus on addressing the third reviewer’s comment during rebuttal.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    nr




Author Feedback

We thank the reviewers for their thoughtful feedback and valuable comments. Our approach was praised for its novelty in integrating a probabilistic atlas prior directly into the segmentation model (R1, R2) and for the ability to adjust the atlas’s influence in real-time (R3).

  1. Table S1: We mistakenly underreported the numerical Dice values in Table S1 by averaging over the base, midgland, and apex ROIs instead of the whole gland (WG). The corrected Dice values are higher by ~0.05 in all cases, e.g., mean CG Dice 0.86 -> 0.91 for λ=0.4, which outperforms the SoTA (mean Dice ~0.85) by a large margin. All other results (Fig. 2) were reported correctly and the conclusions remain unchanged.

  2. Requirement for the WG segmentation mask (R1, R3): We note this limitation in the Conclusion. Our approach was designed to work within the commercial ProFuseCAD system that uses semi-automated WG segmentation as part of the routine workflow. WG segmentation methods are an active research area, but clinical workflows rely on manual user intervention to verify and revise segmentation results. In this paper, we assume that radiologists are included in the loop to verify any automated or semi-automated WG segmentations, and instead focus on the novel incorporation of the shape atlas and real-time refinement. In future work, we envision incorporating the WG segmentation step within an end-to-end training or a cascaded segmentation approach.

  3. Deformable atlas (R3): Our decision to utilize an affine transformation to build the anatomical atlas derived from concerns for both simplicity and reduced computational time. We agree that an atlas based on deformable registration has the potential to yield a better representation of zonal anatomy; however, we hypothesized that this approach would be more sensitive to errors in the WG segmentation. In contrast, our affine approach relies less on the WG segmentation being perfect and allows for spatial normalization to the gross shape. Furthermore, non-rigid registration approaches are more computationally demanding than linear registration and may be subject to poor optimization due to the presence of many local minima (even in the case of binary or distance map-based registration approaches). We also envision that the affine transformation could be implemented as a spatial transformer in an end-to-end trained network in future work.

  4. Segmentation without mask as second input channel (R3): We concatenated the available WG mask in the network’s first layer as a hard constraint. We ran additional experiments without the WG mask input channel and observed no significant differences in segmentation performance.

  5. Optimal selection of λ (R3) and human-in-the-loop performance (R2): The ability to adjust the segmentation in real-time is a major innovation of this work. Our results demonstrated that λ=0.4 performed best on average over the test set (mean±SD CG Dice of 0.91±0.05), which shows that operator fine-tuning is not compulsory to achieve improved segmentation results. However, as R3 notes, different subjects may have different optimal λ values. Because our model allows for λ to be selected in real-time at inference, we performed an experiment to select the maximum Dice over all λ values for each subject and noted the λ at which that maximum occurred. This strategy results in a small improvement (0.92±0.05) for the central gland (CG), where the optimal λ was 0, 0.4, 0.6, and 0.8 for 16, 8, 5, and 1 subjects, respectively. Standard segmentation without the atlas (λ=0) was used in 16 of 30 test subjects and the remaining 14 subjects utilized the atlas to different degrees. These results simulate best-case human-in-the-loop performance, where the radiologist could select the optimal λ. As R2 suggested, this would be an interesting user study to quantify inter-rater variance and how our method’s real-time refinement affects segmentation performance.

  6. Additional references (R1, R3): We will include these references in the final submission.
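
The per-subject selection experiment in item 5 can be sketched as below. This is a hypothetical illustration rather than the authors' evaluation script: the function name `fixed_vs_oracle`, the layout of `dice_table`, and all Dice values are made up. Because the per-subject choice is made with access to the ground-truth Dice, it represents an upper bound on what a user adjusting λ could achieve.

```python
def fixed_vs_oracle(dice_table):
    """Compare the best single lambda with a per-subject best lambda.

    dice_table maps subject id -> {lambda: Dice}; every subject is
    assumed to have been evaluated at the same set of lambda values.
    """
    if not dice_table:
        raise ValueError("dice_table must contain at least one subject")
    lambdas = sorted(next(iter(dice_table.values())))
    n = len(dice_table)

    # Best fixed lambda: highest mean Dice across all subjects.
    mean_by_lam = {
        lam: sum(scores[lam] for scores in dice_table.values()) / n
        for lam in lambdas
    }
    best_fixed = max(lambdas, key=lambda lam: mean_by_lam[lam])

    # Oracle: per subject, the lambda with the highest Dice.
    # Ties resolve to the smallest lambda because lambdas is sorted.
    oracle_choice = {
        sid: max(lambdas, key=lambda lam: scores[lam])
        for sid, scores in dice_table.items()
    }
    oracle_mean = sum(dice_table[s][lam] for s, lam in oracle_choice.items()) / n
    return best_fixed, mean_by_lam[best_fixed], oracle_choice, oracle_mean


# Made-up Dice values for three hypothetical subjects.
dice_table = {
    "s1": {0.0: 0.90, 0.4: 0.88, 0.8: 0.80},
    "s2": {0.0: 0.70, 0.4: 0.85, 0.8: 0.82},
    "s3": {0.0: 0.80, 0.4: 0.86, 0.8: 0.90},
}

best_fixed, fixed_mean, oracle_choice, oracle_mean = fixed_vs_oracle(dice_table)
print(best_fixed, round(fixed_mean, 3), oracle_choice, round(oracle_mean, 3))
```

The oracle mean can never fall below the best fixed-λ mean, so the size of the gap between the two indicates how much room case-by-case tuning leaves over a single default value.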




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This is a borderline paper. I agree with the review that requiring an additional segmentation as input to improve the zonal segmentation needs to be considered a limitation of this work at the application level. Otherwise, the work seems sound and moderately novel, though with prior presentational problems. I would encourage the authors to clearly identify the value of the work in its future development.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Reject

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    NR



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The rebuttal adequately answered the key questions on the requirement of prostate segmentation for the proposed method to be applied, human-in-the-loop use, and sensitivity to the choice of lambda. Overall, it is an interesting and solid work.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    7



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors were responsive to all the reviewers’ critiques. Some of the issues highlighted by R3 could impact the accuracy of the approach, although the authors indicated that the experiments performed post-review suggested otherwise. Nevertheless, it would be a good idea to discuss the limitations and the potential for worsening consistency with respect to inter-rater delineations with this approach. Otherwise, this is a well-written paper with merits outweighing the limitations.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    6


