Authors

Chengyin Li, Yao Qiang, Rafi Ibn Sultan, Hassan Bagher-Ebadian, Prashant Khanduri, Indrin J. Chetty, Dongxiao Zhu

Abstract

Computed Tomography (CT) based precise prostate segmentation for treatment planning is challenging due to (1) the unclear boundary of the prostate derived from CT’s poor soft tissue contrast and (2) the limitation of convolutional neural network-based models in capturing long-range global context. Here we propose a novel focal transformer-based image segmentation architecture to effectively and efficiently extract local visual features and global context from CT images. Additionally, we design an auxiliary boundary-induced label regression task coupled with the main prostate segmentation task to address the issue of unclear boundaries from poor soft tissue contrast CT images. We demonstrate that these designs can significantly improve the quality of the CT-based prostate segmentation task over other competing methods, resulting in substantially improved Dice Similarity Coefficient and reduced Hausdorff Distance and Average Symmetric Surface Distance on both private and public CT image datasets.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43898-1_57

SharedIt: https://rdcu.be/dnwBR

Link to the code repository

N/A

Link to the dataset(s)

N/A

Reviews

Review #1

Please describe the contribution of the paper

This paper proposes a novel model for prostate segmentation from CT images, a relevant and common problem in medical imaging. The main contributions are a Focal UNet Transformer architecture for prostate segmentation and an auxiliary loss that compares the boundaries predicted by the network with an approximation of the ground-truth boundary obtained by applying a Gaussian kernel. In the experiments, the method outperforms state-of-the-art approaches, and an ablation study validates the performance improvement from each contribution.
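To make the auxiliary target described above concrete, the following is a minimal sketch of how a soft boundary label could be derived from a binary ground-truth mask: extract the contour, blur it with a Gaussian kernel, and rescale to [0, 1]. The function names, the 4-connected contour definition, and `sigma=1.5` are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def gaussian_kernel_1d(sigma, radius):
    """Normalized 1D Gaussian kernel of length 2 * radius + 1."""
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def binary_contour(mask):
    """Foreground pixels that have at least one 4-connected background neighbour."""
    m = mask.astype(bool)
    p = np.pad(m, 1, mode="constant", constant_values=False)
    # A pixel is interior if it and its four neighbours are all foreground.
    interior = m & p[:-2, 1:-1] & p[2:, 1:-1] & p[1:-1, :-2] & p[1:-1, 2:]
    return m & ~interior

def soft_boundary_label(mask, sigma=1.5):
    """Blur the contour with a separable Gaussian and rescale the peak to 1.

    Assumes the image is at least as wide and tall as the kernel
    (2 * radius + 1 pixels), so that np.convolve(..., mode="same")
    keeps the original row and column lengths.
    """
    contour = binary_contour(mask).astype(np.float64)
    radius = int(3 * sigma + 0.5)
    k = gaussian_kernel_1d(sigma, radius)
    blurred = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, contour)
    blurred = np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, blurred)
    peak = blurred.max()
    return blurred / peak if peak > 0 else blurred

# Toy example: a 16x16 square "organ" inside a 32x32 slice.
mask = np.zeros((32, 32), dtype=np.uint8)
mask[8:24, 8:24] = 1
label = soft_boundary_label(mask)
```

A regression head could then be trained against `label` with a pixel-wise loss, while the main head is trained against `mask`; how the two losses are weighted is not specified in the text above.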
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Simple ideas that seem to work well in the experiments with the private dataset
- Comprehensive experiments evaluating different models with different architectures on two datasets.
- Ablation study to verify the performance of each proposed component independently.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

1) I believe the novelty of the technical contributions is limited. The proposed focal self-attention mechanism is very similar to the one proposed in [22], but integrated with a U-Net decoder that is also introduced in [1]. The auxiliary loss may be novel, but it should be compared with existing losses for improving segmentation boundaries, such as the work “Boundary loss for highly unbalanced segmentation”.

2) The evaluation on the AMOS dataset does not seem to use a validation split, which is problematic since many of the training parameters should be chosen on such a split for an unbiased evaluation.

3) The results achieved by the proposed method are only marginally above the baselines on the AMOS dataset (e.g., relative improvements below 1%). What is the reason for that?

4) Other loss terms for encouraging segmentation boundary quality exist and were not considered in the experiments.
Please rate the clarity and organization of this paper

Excellent
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

I saw in the reproducibility checklist that the authors do not plan to release trained models. I think the weights of the models trained on public datasets should be released to ease comparison with future work. In addition, the statistical significance of the reported performance differences between the evaluated methods, the average runtime, and the memory footprint of each approach should also be reported in the paper.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

See sections 6 and 8.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

4
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

I think this paper has limited technical novelty and some flaws in the evaluation. More specifically, I think the proposed method is an application of existing techniques to the prostate segmentation problem. I want to hear from the author how their self-attention formulation differs from the existing ones. Also, how do their auxiliary loss formulation and performance compare to existing boundary loss terms? Finally, some explanations about the setup and performance of their method on the AMOS dataset are also necessary. Due to these reasons, I do not recommend the acceptance of this paper.
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #2

Please describe the contribution of the paper

In this paper, the authors propose a deep learning system (FocalUNETR) for prostate segmentation from CT images. The system is based on a focal transformer and has an auxiliary contour regression task for clearer segmentation boundaries.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. The manuscript is quite well written and easy to follow.
2. The experimental part is comprehensive in terms of comparison with other state-of-the-art systems, ablation study, and visualization of results.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
1. The study is complete and comprehensive in its analysis, but the novelty is generally limited, as the focal transformer and the use of an auxiliary task are not new. However, I thank the authors for transplanting good existing techniques to other fields and tasks.
Please rate the clarity and organization of this paper

Excellent
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The description of the system is detailed and all important training parameters are provided. It should be easy to reproduce the experiments.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
1. Does “3 runs with different seeds” mean different random seeds for splitting the dataset? Please also clarify whether all models are trained and tested on the same three dataset splits.
2. Could you please explain why there is no 3D version of this model? I believe it would be easy to build a 3D focal transformer. Is there any specific reason why you have not tried that?
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The novelty is generally limited, as the focal transformer and the use of an auxiliary task are not new. However, I thank the authors for transplanting good existing techniques to other fields and tasks.
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

5
[Post rebuttal] Please justify your decision

I believe all reviewers agree that this work has limited novelty, as its components are not new. However, the combination of techniques is reasonable and the designed system is a good attempt. More importantly, good technique transplantation and efficient application systems are welcome at MICCAI. I still go for “weak accept”.

Review #3

Please describe the contribution of the paper

The paper presents a novel approach to CT image segmentation by incorporating a focal transformer into a U-Net-like architecture. This new architecture, named FocalUNETR, seeks to improve on previous transformer-based models and convolutional networks by providing a more effective mechanism for capturing local visual features and global contexts in medical images. In addition to the main segmentation task, FocalUNETR is also trained with an auxiliary boundary-aware regression task to improve results for images with unclear boundaries.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

FocalUNETR introduces a novel combination of a focal transformer with a U-Net-like architecture, which successfully captures local visual features and global contexts in CT images. FocalUNETR outperforms several state-of-the-art models, demonstrating superior performance particularly in challenging cases with unclear boundaries or irregular shapes. The multi-task learning approach, including an auxiliary boundary-aware regression task, is a strong point of the proposed model, as it helps to improve segmentation performance in complex cases.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
In general, this manuscript is well written and interesting. However, it also has some issues, which are commented on below.
1. The model is designed with a limited Hounsfield Unit (HU) range of [-50,150]. This could potentially limit the model’s ability to highlight different levels of organ textures. Using a broader or multiple HU ranges might enable the model to capture a wider range of textural information, thereby potentially improving segmentation performance.
2. The model is evaluated only on prostate gland segmentation. The performance of FocalUNETR on other organs or tissue types is not studied, limiting the scope of the results.
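As a point of reference for comment 1 above, the sketch below shows what a fixed HU window of [-50, 150] amounts to as a preprocessing step, and how several windows could be stacked as input channels, as the reviewer suggests. Rescaling to [0, 1], the function names, and the second window (-200, 300) are assumptions made for illustration; the review does not say how the authors normalize intensities.

```python
import numpy as np

def window_hu(ct_hu, hu_min=-50.0, hu_max=150.0):
    """Clip CT intensities to [hu_min, hu_max] and rescale to [0, 1]."""
    clipped = np.clip(np.asarray(ct_hu, dtype=np.float32), hu_min, hu_max)
    return (clipped - hu_min) / (hu_max - hu_min)

def multi_window(ct_hu, windows):
    """Stack several HU windows as channels: output shape (len(windows), ...)."""
    return np.stack([window_hu(ct_hu, lo, hi) for lo, hi in windows], axis=0)

# Values below -50 HU collapse to 0 and values above 150 HU collapse to 1,
# which is the loss of textural information the reviewer is pointing at.
single = window_hu(np.array([-1000, -50, 50, 150, 2000]))

# Hypothetical two-window input; the second range is not from the paper.
stacked = multi_window(np.zeros((4, 4)), [(-50, 150), (-200, 300)])
```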
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

If the authors provide the GitHub repository as mentioned in the paper, reproducibility should not be an issue.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

6
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

I recommend accepting the paper based on the novelty of the method and the comprehensive experimental results.
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

6
[Post rebuttal] Please justify your decision

I’m inclined to keep the same rating for this paper in light of the authors’ rebuttal and the other reviewers’ comments.

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

The proposed FocalUNETR aims to improve prostate segmentation from CT images, especially around the boundary regions. The paper is well written and easy to follow. The method is clearly described. The authors also conduct experiments comparing with the SOTA methods and report superior performance. However, the novelty of this paper is incremental: for example, the focal SA is proposed in [22], and this paper combines that module with the UNet decoder without great insight. Also, the boundary-aware regression auxiliary task is not new; many works have used similar techniques, including some CT prostate segmentation papers. I have another concern: I’m not sure whether the comparison in Table 1 is fair, because FocalUNETR involves much more computational complexity. It would be better to provide the # of parameters and FLOPs in Table 1. Also, the visualization in Fig. 3 does not convincingly support the claimed advantages in the boundary region.

Author Feedback

We thank all reviewers for their time and insightful comments. They found our work interesting (R2, R3), innovative (R2, R3), and effective (R1, R2, R3), but also pointed out some issues. We clarify the main points below:

Q1 (MR, R1) Novelty (technical): The adaptation of the focal self-attention mechanism does not diminish the novelty of this paper. 1) As R2 stated, our work effectively adapts good existing techniques to the challenging CT-based prostate segmentation task in the medical imaging field (also noted by R1 and R3). The superior performance is verified on by far the two largest prostate CT datasets we can access, one from a collaborating hospital and one from a public benchmark. This kind of adaptation is welcomed by the MICCAI guidelines for Application Studies. 2) Due to the large gap between natural images and medical images, we carefully designed an effective focal transformer with a U-Net-like architecture specifically for the CT-based prostate segmentation task. We also employ a boundary-aware regression auxiliary task to further enhance the performance. We have not found any other works that utilize focal self-attention in this manner.

Q2 (MR, R1) Parameters, FLOPs, and average inference time: We have already listed the # of Params. in Table 1 (appendix). FLOPs (G) and average inference time (s) per case are as follows: U-Net (9.3, 3.12), UNet++ (60.4, 4.31), AttUNet (25.5, 3.53), TransUNet (29.3, 4.87), Swin-UNet (9.0, 3.58), U-Net3D (285, 6.51), V-Net3D (58, 6.72), UNETR3D (75.4, 6.49), SwinUNETR(3D) (350, 7.23), FocalUNETR-S (15.7, 4.36), and FocalUNETR-B (27.5, 5.35). nnUNet has 19.3M parameters, 389 GFLOPs, and a 9.65 s average inference time. FocalUNETR shows a comparable model size, relatively small FLOPs, and fast inference speed compared with most of the SOTAs.

Q3 (MR) Fig. 3 does not well support the advantages of the boundary region: We believe that, based on Fig. 3, our FocalUNETR-based models demonstrate better boundary predictions, particularly for challenging cases such as rows 2 and 4. We also included additional comparisons in the appendix (Fig. 1) to provide a clearer comparison of the boundary regions.

Q4 (R1) No validation split for AMOS dataset: We tried using 10% of the AMOS training set as validation data to find a better training parameter setting, and then re-trained the model on the full training set. However, this did not improve performance compared with directly applying the training parameters tuned on the private dataset. We will add this information in the updated version.

Q5 (R1) Marginally above the baselines in the AMOS dataset: The AMOS dataset mixes prostate (male) and uterus (female, a relatively small portion) cases under one label. The morphology of the prostate and the uterus is significantly different. Consequently, the models may struggle to provide accurate predictions for the uterus portion of the data. The overall performance of FocalUNETR is therefore overshadowed by this challenge, resulting in only moderate improvement over the baselines on the AMOS dataset. On the real-world (private) dataset, however, the performance margin is much larger.

Q6(R1) Other methods for improving boundary quality: Our focus and contribution lie in the design and integration of the auxiliary boundary regression task to augment the performance of FocalUNETR.

Q7(R2) 3 runs with different seeds: The private dataset has a fixed random split performed by collaborating doctors from the hospital. The AMOS dataset also has official pre-defined splits. We used three different random seeds to train three versions of the model and averaged the evaluation results on the same testing split for each dataset.
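The protocol described in Q7 (fixed test split, several training seeds, averaged metrics) can be sketched as follows. `train_and_evaluate` is a hypothetical stand-in for a full training run, and the score it returns is a placeholder, not a result from the paper.

```python
import random
import statistics

def train_and_evaluate(seed):
    """Hypothetical stand-in for training with `seed` and scoring on the fixed test split.

    Returns a placeholder Dice-like score; a real run would train the model
    and evaluate it on the same held-out test cases every time.
    """
    rng = random.Random(seed)
    return 0.85 + rng.uniform(-0.01, 0.01)

def summarize_runs(seeds):
    """Mean and sample standard deviation of the metric across seeds."""
    scores = [train_and_evaluate(s) for s in seeds]
    return statistics.mean(scores), statistics.stdev(scores)

# Only the training seed changes between runs; the data split stays fixed.
mean_score, std_score = summarize_runs([0, 1, 2])
```

Reporting the standard deviation alongside the mean also gives a first indication of whether small differences between methods exceed seed-to-seed variation, which relates to the statistical significance point raised by R1.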

Q8 (R2) 3D version of FocalUNETR: We did not build a 3D version because 1) the 2D version already outperforms both 2D and 3D SOTAs, and 2) a 3D version of FocalUNETR would require an efficient voxel splitting strategy to handle multiple focal levels. Therefore, we decided not to pursue a 3D version in this particular work.

Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

I am not convinced by the rebuttal and am inclined to reject the paper. The novelty of this paper is really incremental. I agree that this paper shows a performance improvement and a careful combination of focal SA and UNet, but these are not insightful. I’m actually fine with transplanting techniques from other fields to medical image analysis, but it should not be just a direct transplant without any insight into, or consideration of, the characteristics of the task or of medical images themselves. Also, how the auxiliary loss compares with existing ones remains unknown.

Meta-review #2

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

I like the rebuttal that addresses the novelty of the focal self-attention. This strategy is indeed showing good effects on the challenging CT data. Performance details are also reported in the rebuttal.

Meta-review #3

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

Although the final rating scores are ‘4. weak reject’, ‘5. weak accept’, and ‘6. accept’, all three reviewers point to limited novelty, as stated in the final justifications of R2 and R1. After checking the rebuttal and the manuscript, I also agree that the technical novelties of this work are not strong enough for publication in MICCAI 2023.

FocalUNETR: A Focal Transformer for Boundary-aware Prostate Segmentation using CT Images