
Authors

Feng Chang, Chaoyi Wu, Yanfeng Wang, Ya Zhang, Xin Chen, Qi Tian

Abstract

To alleviate the demand for large amounts of annotated data by deep learning methods, this paper explores self-supervised learning (SSL) for brain structure segmentation. Most SSL methods treat all pixels equally, failing to emphasize the boundaries that are important clues for segmentation. We propose Boundary-Enhanced Self-Supervised Learning (BE-SSL), leveraging supervoxel segmentation and registration as two related proxy tasks. The former task captures boundary information by reconstructing the distance transform map derived from supervoxels. The latter task further enhances the boundary with semantics by aligning tissues and organs in registration. Experiments on the CANDI and LPBA40 datasets demonstrate that our method outperforms current SOTA methods by 0.89% and 0.47%, respectively. Our code is available at https://github.com/changfeng3168/BE-SSL.

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16431-6_2

SharedIt: https://rdcu.be/cVD4J

Link to the code repository

https://github.com/changfeng3168/BE-SSL

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper proposes Boundary-Enhanced Self-Supervised Learning (BE-SSL), leveraging supervoxel segmentation and registration as two related proxy tasks. The former task captures boundary information by reconstructing the distance transform map derived from supervoxels. The latter task further enhances the boundary with semantics by aligning tissues and organs in registration. Experiments on the CANDI and LPBA40 datasets demonstrate that the proposed method outperforms current SOTA methods by 0.89% and 0.47%, respectively.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The motivation is clear and convincing. How to enhance the boundary segmentation result using self-supervised learning is a promising direction in medical image analysis.

    2. The idea of employing the distance transform map (DTM) based on supervoxels to emphasize the edges and boundaries sounds reasonable to me. Also, applying self-supervised learning to predict DTMs is novel.

    3. Learning the registration from each volume to the mean volume is an interesting way to incorporate the semantic information, where I suppose the mean volume incorporates the semantics of the whole dataset.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The major weakness lies in the experiment section.

    1. I do not understand why the remaining five baselines (3D-Rot, 3D-Jig, 3D-Cpc, PCRL, Genesis) perform worse (most of the time) than training from scratch in the fine-tuning stage, especially when the labeling proportion is 10%, because self-supervised learning has been shown to be more effective when the amount of annotations is quite limited.

    2. The authors should conduct more analyses of why RubikCube performs better than other baselines, because the effectiveness of Genesis and PCRL has been validated on other challenging tasks.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The proposed method is reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    Please address the problems in the weakness section.

    Overall, this is a good paper with promising technical novelty.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    1. The motivation is clear and intuitive.

    2. The proposed method sounds reasonable to me.

    3. The experimental results are satisfactory.

  • Number of papers in your stack

    6

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered



Review #2

  • Please describe the contribution of the paper

    This manuscript proposes two pretext tasks for self-supervised pretraining for the downstream task of brain structure segmentation, i.e., regressing unsigned distance maps defined with respect to supervoxel over-segmentation, and volumetric registration. The proposed pretext tasks are evaluated on two public datasets and demonstrate superior performance to several SOTA methods.
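
To make the first pretext target concrete, the following is a minimal sketch of an unsigned distance map computed from an over-segmentation label map. It is an illustration under stated assumptions, not the authors' implementation: the 2D toy grid, the 4-neighbour boundary rule, the brute-force Euclidean distance, and the function names `boundary_mask` and `distance_transform_map` are all hypothetical choices made here.

```python
import math


def boundary_mask(labels):
    """Mark pixels that touch a differently labelled 4-neighbour (assumed boundary rule)."""
    h, w = len(labels), len(labels[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and labels[ny][nx] != labels[y][x]:
                    mask[y][x] = True
                    break
    return mask


def distance_transform_map(labels):
    """Unsigned Euclidean distance from each pixel to the nearest boundary pixel.

    Brute force, for illustration only; returns math.inf everywhere
    when the label map contains a single region (no boundary).
    """
    mask = boundary_mask(labels)
    h, w = len(labels), len(labels[0])
    boundary = [(y, x) for y in range(h) for x in range(w) if mask[y][x]]
    dtm = [[math.inf] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for by, bx in boundary:
                dtm[y][x] = min(dtm[y][x], math.hypot(y - by, x - bx))
    return dtm


# Toy label map with two "supervoxels" (labels 0 and 1).
toy_labels = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]
toy_dtm = distance_transform_map(toy_labels)
```

In this toy case the map is zero on the two columns adjacent to the label change and grows with distance from it, which is the kind of regression target that weights a network's attention toward region borders.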

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The proposed pretext tasks are straightforward.
    • The pretext task of regressing unsigned distance maps defined by supervoxels seems novel.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    I have three major concerns, elaborated below.

    • One of the main claims of contributions that the registration pretext task matches and enhances boundaries concerns me. According to Eqn. (3), the authors employ voxel-wise similarity loss for the registration, which does not emphasize boundaries by definition.
    • Several aspects affect the reproducibility of the paper. First, the hyperparameters used to generate the supervoxels are not given, nor is the impact of different hyperparameters on performance investigated. Second, it is unclear how the supervoxels of interest (and background) are identified. Third, there is inconsistency regarding the evaluation setting (train/val/test split versus five-fold cross-validation). Fourth, it is unclear how many epochs are trained for both pretraining and fine-tuning. Lastly, no code is submitted, nor do the authors promise to publish codes.
    • The improvements upon existing SOTA seem minor.
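
A toy 1D case may help frame the first concern and the authors' later reply. The sketch assumes the similarity term in Eqn. (3) behaves like a mean squared error between the moving and fixed images; the equation is not reproduced on this page, so that form, the step-edge profiles, and the helper names are assumptions made for illustration.

```python
def step_profile(edge, length=10, low=0.0, high=1.0):
    """1D intensity profile: `low` before index `edge`, `high` from `edge` onward."""
    return [low if i < edge else high for i in range(length)]


def squared_errors(moving, fixed):
    """Per-voxel squared difference; every voxel carries the same weight."""
    return [(m - f) ** 2 for m, f in zip(moving, fixed)]


fixed = step_profile(edge=5)   # tissue boundary at index 5
moving = step_profile(edge=3)  # same boundary, misaligned by two voxels

errors = squared_errors(moving, fixed)
mismatch = [i for i, e in enumerate(errors) if e > 0]  # voxels that contribute to the loss
mse = sum(errors) / len(errors)
```

The loss weights all voxels equally, as the reviewer notes, yet for piecewise-constant intensities the only voxels with nonzero error lie between the two edge positions. Real MR volumes are not piecewise constant, so this only illustrates the idealised case.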
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Reproducibility is doubtful. Please see my response to Q5.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    • Eqn. (1): what is the “inf”?
    • Table 1 and Table 2: (1) how do you obtain the numbers for methods in comparison? (2) please add a space between “Dice” and “(%)”, and (3) “3d” -> “3D”
    • Page 7: “…, for CANDI and LPB40, Suggesting that …” -> “… , suggesting that …”
    • Page 7: “…, one enhancing the fundamental boundaries and the other enhancing the semantic boundaries.” Please differentiate the fundamental and semantic boundaries.
    • Ref. [12]: “cnns” -> “CNNs”; please also check other references for similar cases.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The idea of regressing distance maps defined by supervoxels is somewhat interesting. However, considering the major weaknesses mentioned in Q5, I rate it weak reject.

  • Number of papers in your stack

    6

  • What is the ranking of this paper in your review stack?

    2

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered



Review #5

  • Please describe the contribution of the paper

    This paper proposes a boundary-enhanced self-supervision method that is able to learn from supervoxel segmentation and registration tasks. The supervoxel branch is refined for the main task to get the final segmentation. The experiments on CANDI and LPBA40 datasets show the efficiency of the proposed method.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper proposes a new self-supervision method that consists of supervoxel segmentation and image registration as proxy tasks to enhance boundary segmentation for the main task. The paper is well written and easy to follow.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The main weakness of the paper is the limited technical contribution and experiment performance. The authors claim they are the first to introduce registration as a proxy task for self-supervised learning, which is true to my knowledge. However, image registration has already been used in a self-supervision manner in [1], which limits the technical contribution of the paper. In experiments, the improvement compared to other self-supervision methods is very limited (less than 1% in most cases) and no information on the significance test is provided. This makes the method less convincing on boundary segmentation.

    [1] Li, Hongming, and Yong Fan. “Non-rigid image registration using self-supervised fully convolutional networks without training data.” 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018). IEEE, 2018.
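
For reference, one common form of the significance test requested here is an independent two-sample t-test, which the author feedback later names. The sketch below uses the pooled-variance (Student) variant; whether the authors used this variant or Welch's is not stated, and the per-fold Dice values are hypothetical numbers invented for the example, not results from the paper.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Student's independent two-sample t statistic with pooled variance."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Hypothetical per-fold Dice scores (%), NOT taken from the paper.
method_a = [84.9, 85.2, 85.0, 85.4, 85.1]
method_b = [84.1, 84.4, 84.0, 84.5, 84.2]

t_stat, dof = pooled_t_statistic(method_a, method_b)

# Two-sided critical value of the t distribution at alpha = 0.05 for 8 degrees of freedom.
T_CRIT_DF8 = 2.306
significant = dof == 8 and abs(t_stat) > T_CRIT_DF8
```

With five scores per method the test has 8 degrees of freedom, so a sub-1% mean gap can still be significant when the fold-to-fold variance is small, and insignificant when it is not.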

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors would like to make all codes publicly available, which ensures good reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    • Why not use one decoder and two output channels to train the two proxy tasks? Is there any technical limitation to doing that? I think this can save model capacity and learn more fused task-relevant features.
    • Significance test should be conducted for the results.
    • Report the final convergence points in Figure 4 and 5. They seem quite close to each other in later epochs.
    • Highlight the boundary improvements if there are any in Figure 6, such as over-segmentation and under-segmentation.
    • Add necessary citations that use supervoxel and registration in self-supervision. The literature review is not complete enough.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    3

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The limited technical contribution and performance are the main weaknesses of the paper. The method should be further refined to enhance boundary segmentation and more solid results should be provided.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    5

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    The rebuttal resolves most of my concerns. The paper would be more convincing if more solid validation on real testing data from different challenges were provided.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    Three knowledgeable reviewers read the paper and provided mixed reviews. The reviewers appreciated the proposed method and the clear motivation and presentation; however, they raised some weaknesses, including concerns about the experimental setup and reproducibility [R1, R2], limited technical contribution [R3], and very small improvement compared to SOTA [R2, R3]. The authors should try to properly address these points in the rebuttal.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    NR




Author Feedback

We thank all reviewers and AC for their constructive comments and provide responses below.

  1. Experimental Setup: 1) 2000 supervoxels are generated for each dataset, with compactness values of 0.3 and 0.2 for CANDI and LPBA40, respectively. 2) The background is identified by intensity, as its intensity is relatively fixed. 3) 75% of each dataset is used for training, and the remaining 25% for testing. Five-fold cross-validation is performed on the training set. For each cross-validation fold, the best-performing model is evaluated on the test set. The average Dice score is reported. 4) Pre-training and fine-tuning run for 200 epochs each on CANDI and 100 epochs each on LPBA40. We will release code once the paper is accepted.
  2. Limited technical contribution: Our contribution is introducing supervoxel segmentation and registration as two coupled proxy tasks for self-supervised segmentation, where the former captures boundary information, and the latter enhances the boundary with semantics. Both R1 and R2 consider the supervoxel task novel (“predict DTMs is novel”; “… by supervoxels seems novel”). R3 recognizes our paper as “the first to introduce registration …”. It is true that self-supervised solutions have been proposed for registration, which actually motivated us to adopt registration as a proxy task. Our contribution is not in how to perform self-supervised registration, but in showing the effectiveness of self-supervised registration as a proxy task for segmentation.
  3. Small improvement compared to SOTA: The improvement of BE-SSL over SOTA is small but statistically significant according to an independent two-sample t-test. All tests yield p < 0.05 except for one (BE-SSL vs. RubikCube on LPBA40). Further per-class analysis reveals that BE-SSL tends to gain more on classes with clear visual boundaries (~3% gains) and less on classes with fuzzy boundaries (~0.1%-1% gains), suggesting that BE-SSL is good at capturing the inherent boundaries.
R1: 1) The location and shape of the brain structures can be easily learned, as they are quite similar among instances. Refining the fine-grained boundaries, where most errors occur, is thus critical. 3D-Rot, 3D-Jig, and 3D-Cpc, as discriminative SSL methods, tend to capture more global semantics and ignore fine boundaries. While Genesis and PCRL, as reconstruction SSL methods, do extract more low-level features, they treat pixels equally without focusing on the boundaries. We therefore do not expect these SSL methods to be beneficial to segmentation tasks, and are not surprised that they sometimes perform worse than Scratch. 2) Edge matching is an important clue when playing RubikCube, which enables RubikCube to capture the boundary and perform better.
R2: 1) Boundary matching is necessary for minimizing Eq 3. The voxel-wise similarity loss is optimized to push the moving boundary to the fixed one. Moreover, Bhalodia et al. (2021) also show that adopting such a loss discovers landmarks on boundaries. So registration is expected to emphasize boundaries. 2) Official open-source codes are obtained for Genesis and PCRL, and we implement the remaining methods ourselves. 4) Fundamental boundaries are the visual boundaries generated by intensity gradients. Semantic boundaries are those that need to be identified with semantics. For example, when there are discontinuities at the boundary or unexpected deformation, we can correct them with semantic boundaries.
R5: 1) One encoder with multiple decoders is typical for multi-task learning. Here, the two proxy tasks are quite different and thus require different features. A shared decoder will make the two sets of feature representations interfere with each other. In fact, the two decoders are only necessary for pre-training. Only the decoder of supervoxel branch is kept after pre-training. At the fine-tuning and inference stages, BE-SSL has the same model capacity as other SSL methods. 2) Convergence points in Fig 4 are listed in the 1st column of Tab 1 and 2.
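
The evaluation protocol described in point 1 of the feedback (75% train, 25% held-out test, five-fold cross-validation inside the training portion) can be sketched as follows. This is a minimal illustration under assumptions: the shuffling seed, the interleaved fold assignment, the function name `split_and_folds`, and the example size of 40 cases are choices made here, not details confirmed by the authors.

```python
import random


def split_and_folds(n_cases, train_frac=0.75, n_folds=5, seed=0):
    """Hold out a test set, then build cross-validation folds on the training set."""
    ids = list(range(n_cases))
    random.Random(seed).shuffle(ids)

    n_train = round(n_cases * train_frac)
    train, test = ids[:n_train], ids[n_train:]

    folds = []
    for k in range(n_folds):
        val = train[k::n_folds]          # every n_folds-th training case
        val_set = set(val)
        fit = [i for i in train if i not in val_set]
        folds.append((fit, val))
    return train, test, folds


# Example size only; each fold's best model would then be scored on `test`
# and the resulting Dice scores averaged.
train_ids, test_ids, cv_folds = split_and_folds(n_cases=40)
```

Under this reading the test cases never enter any fold, so the reported average is over five models evaluated on the same held-out set, which reconciles the train/test split with the five-fold cross-validation that Review #2 found inconsistent.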




Post-rebuttal Meta-Reviews

Meta-review #1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The rebuttal properly addresses most of the concerns raised by the reviewers. I think the paper would be an interesting contribution to MICCAI 2022.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    1



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The key concerns have been addressed

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    3



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    It is interesting to introduce the supervoxel segmentation and registration as two coupled proxy tasks for self-supervised segmentation, which could enhance the boundary-related semantic segmentation. The rebuttal addressed most of the concerns from my view. Thus I lean to accept.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    6


