Authors

Jia Fu, Tao Lu, Shaoting Zhang, Guotai Wang

Abstract

Accurate segmentation of the fetal brain from Magnetic Resonance Imaging (MRI) is important for prenatal assessment of fetal development. Although deep learning has shown the potential to achieve this task, it requires a large, finely annotated dataset that is difficult to collect. To address this issue, weakly-supervised segmentation methods with image-level labels have gained attention, which are commonly based on class activation maps from a classification network trained with image-level labels. However, most of these methods suffer from incomplete activation regions, due to the low-resolution localization without detailed boundary cues. To this end, we propose a novel weakly-supervised method with image-level labels based on semantic features and context information exploration. We first propose an Uncertainty-weighted Multi-resolution Class Activation Map (UM-CAM) to generate high-quality pixel-level supervision. Then, we design a Geodesic distance-based Seed Expansion (GSE) method to provide context information for rectifying the ambiguous boundaries of UM-CAM. Extensive experiments on a fetal brain dataset show that our UM-CAM can provide more accurate activation regions with fewer false positive regions than existing CAM variants, and our proposed method outperforms state-of-the-art weakly-supervised segmentation methods learning from image-level labels.
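
The abstract names an uncertainty-weighted fusion of multi-resolution CAMs but does not give the formula on this page. As a rough illustration only, the sketch below shows one plausible reading: each CAM is weighted per pixel by one minus its normalized binary entropy. It assumes the CAMs have already been resized to a common shape and scaled to [0, 1]. The function names `binary_entropy` and `fuse_cams` are hypothetical and are not taken from the paper.

```python
import numpy as np


def binary_entropy(p, eps=1e-7):
    """Per-pixel binary entropy of a probability-like map (natural log)."""
    p = np.clip(p, eps, 1.0 - eps)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))


def fuse_cams(cams, eps=1e-7):
    """Fuse CAMs with per-pixel weights that decrease with entropy.

    Assumption (not from the paper): weight = 1 - entropy / ln(2),
    normalized across the M maps at every pixel.
    """
    stack = np.stack(cams, axis=0)  # shape (M, H, W)
    # Confidence is 0 at p = 0.5 and 1 at p = 0 or p = 1; eps avoids 0/0.
    conf = np.maximum(1.0 - binary_entropy(stack) / np.log(2.0), 0.0) + eps
    weights = conf / conf.sum(axis=0, keepdims=True)
    return (weights * stack).sum(axis=0)
```

With this weighting, a map that is maximally uncertain at a pixel (value 0.5) contributes almost nothing there, and the fused value follows the more confident maps.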

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43990-2_30

SharedIt: https://rdcu.be/dnwLF

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #2

  • Please describe the contribution of the paper

    In this work, the authors develop a method for fetal brain segmentation that uses only image-level labels. The method uses two different types of pseudo labels for segmentation. The first type is the multi-scale CAM generated from a classification network; multiple CAMs are merged based on uncertainty. The second type is the Seed-derived Pseudo Label (SPL), generated from the centroid and the corner points of the bounding box of the UM-CAM. A segmentation network is then trained on these pseudo masks.
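
The seed step described above can be sketched as follows. This is a minimal illustration, assuming the CAM is binarized at a fixed threshold, the "centroid" is the center of the resulting bounding box, and the four box corners serve as the other seed set. The threshold value and the name `seeds_from_cam` are assumptions, not details confirmed by the paper.

```python
import numpy as np


def seeds_from_cam(cam, threshold=0.5):
    """Return (center, corners) of the bounding box of the thresholded CAM.

    Returns (None, []) when no pixel reaches the threshold.
    """
    mask = cam >= threshold
    if not mask.any():
        return None, []
    rows, cols = np.nonzero(mask)
    r0, r1 = int(rows.min()), int(rows.max())
    c0, c1 = int(cols.min()), int(cols.max())
    center = ((r0 + r1) // 2, (c0 + c1) // 2)
    corners = [(r0, c0), (r0, c1), (r1, c0), (r1, c1)]
    return center, corners
```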

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The proposed method increases the robustness of the pseudo masks by combining multi-scale CAM and SPL.

    2. The paper has a very good structure.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Experiments on some important hyperparameters are missing, such as the lambda for combining the two types of pseudo masks.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors plan to release the source code after review.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    1. Could the authors provide more details about the classification task in the first stage? Is it to classify whether an image contains the brain or not? What is the percentage of each class in the training data?

    2. How do the hyperparameters lambda and M affect the performance of the model? How were these hyperparameters selected in the experiments?

    3. What is the range of gestational ages in the data? Can the trained model generalize to data from different gestational ages?

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The authors present a weakly supervised method for fetal brain segmentation. Extensive experiments are performed to demonstrate the efficacy of the method. Some minor questions need to be addressed in the paper.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper proposed a semisupervised method for fetal brain segmentation. It only requires image-level labels. The results showed superior performance compared to current methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The present work proposed two new methods: (1) UM-CAM, in which the activation maps are combined using entropy weights to provide a better initial segmentation of the target; (2) the Geodesic distance-based Seed Expansion (GSE) method, which generates labels based on geodesic distance.
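
To make the geodesic-distance idea concrete, here is a minimal sketch that computes an intensity-based geodesic distance from a set of seed pixels with Dijkstra's algorithm on a 4-connected grid. It is an illustration under stated assumptions, not the authors' GSE implementation: the step cost (absolute intensity difference plus an optional spatial term) and the names `geodesic_distance` and `spatial_weight` are hypothetical.

```python
import heapq

import numpy as np


def geodesic_distance(image, seeds, spatial_weight=0.0):
    """Shortest-path distance from any seed, where crossing an intensity
    edge is expensive and moving through a flat region is cheap."""
    h, w = image.shape
    dist = np.full((h, w), np.inf)
    heap = []
    for r, c in seeds:
        dist[r, c] = 0.0
        heapq.heappush(heap, (0.0, r, c))
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > dist[r, c]:
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w:
                step = abs(float(image[nr, nc]) - float(image[r, c]))
                nd = d + step + spatial_weight
                if nd < dist[nr, nc]:
                    dist[nr, nc] = nd
                    heapq.heappush(heap, (nd, nr, nc))
    return dist
```

One simple way to turn such maps into a label, again as an assumption, is to compute one map from the foreground seed and one from the background seeds and mark a pixel as foreground where the first distance is smaller.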

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. It is not possible to acquire fetal MRI at a resolution of 0.5mm or 0.6mm. But I suppose the method is not specific to a particular resolution.
    2. It is not clear what the image-level label was. I suppose it is whether a slice contains the fetal brain.
    3. It is not clear how the ground truth was defined. For example, is it the skull or the brain, since the mask has a smooth edge? It is also not clear whether the proposed method requires a smooth foreground.
    4. Because 2D fetal images are not used directly, I suppose the major purpose of the paper was to provide a step that helps slice-to-volume reconstruction. If the localization of the brain cannot provide details, it may differ little from a YOLO localization in the 3D reconstruction. Otherwise, it needs to be evaluated further.
    5. It is not clear how activation maps were converted to segmentation masks for the calculation of Dice.
    6. It is not clear whether the proposed method can be extended to 3D.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Although the data could not be provided, it should be feasible to reproduce the method with the authors' code and model.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. "Brain segmentation" may be misleading. "Brain extraction" would be more accurate for the title and main content.
    2. The method requires further evaluation in the subsequent slice-to-volume reconstruction, since it is only an intermediate step of the fetal processing pipeline. Otherwise, it may not show advantages in practice.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Locating the fetal brain is in high demand in the current fetal 3D reconstruction pipeline. Although there have been a few methods, it is good to have a semisupervised approach, especially given its technical innovations and potential usage in other applications.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #1

  • Please describe the contribution of the paper

    The paper describes a new approach to fetal brain segmentation using only image-wise labels. The method is based on a grad-cam identification of the relevant region and a refinement step. The authors propose extensions to both steps and evaluate the proposed solution on an in-house dataset.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The proposed method seems to be clearly ahead of other state-of-the-art approaches. Furthermore, the evaluation of the proposed approach is quite comprehensive. Although it is based on a single dataset, it covers most of the aspects I would like to see in the experiments for a MICCAI paper. In addition, they addressed a relevant medical problem and the paper is well structured overall.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The main weakness is the writing of the paper. The paper is unnecessarily complex. For example, one sentence extends over 8 lines plus additional equations. Furthermore, the evaluation is based on a single in-house dataset, which limits the generalisability and reproducibility of the results. However, none of these points go beyond a typical MICCAI paper.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Although it is stated that the code for the method and experiments is available, I did not find this information in the paper. The methods described in the paper seem sufficient to reproduce the results. The reproducibility points given are mostly valid.

    The following items were marked in the reproducibility response, but I missed them in the paper: I) Hyperparameter tuning II) Sensitivity to parameter changes III) Number of runs IV) Baseline implementation V) Failure situations.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    In general, the paper is well structured and interesting. The evaluation is good and covers all the important aspects I expected. However, there are still a few points that could be improved (roughly in order of importance)

    1. Improve and simplify the writing: The writing is too complex and could be improved. For example, on page 4, first paragraph, a single sentence takes up 8 lines plus two formulas. Shorten sentences, avoid inline sentences etc… This would improve the paper considerably.
    2. This may be out of focus for this MICCAI paper, but would improve the paper significantly: Include more than one dataset. I don’t see any reason why it shouldn’t work on different datasets, but at the moment it’s only being tested on a fetal brain dataset. I suggest either using a second similar dataset or another dataset that is suitable for the task, such as organ segmentation.
    3. The choice of entropy-based fusion of CAM information from different levels is explained but not justified.
    4. Table 1 gives a dice score obtained from the GradCAM information. However, it is unclear to me how the activation map was converted into a segmentation. Was a fixed threshold used (e.g. at 0.5)? This should be mentioned in the description of the table.
    5. In tables 1 and 2 an uncertainty is given (+/- something). Is this a confidence interval, a standard deviation from several runs, or something else? Please mention this in the description of the table.
    6. I missed a discussion or at least a mention of the shortcomings of the proposed method. Are there any disadvantages? Were there any cases where things went wrong?
    7. GradCAM-based methods are mentioned as the main method for learning segmentation from images, but there are several other approaches. Please mention them in at least one sentence before focusing only on CAM-based approaches.
    8. In Table 1 the authors report the results for SPL alone. I am not sure how this was achieved. Was it done by using the baseline and then applying SPL? Or how were the midpoints obtained?
    9. Related to the point above: If Baseline+SPL was not reported, I would find it really interesting to know how this performed.
    10. I suggest updating the order of the images in the figures. Usually the suggested approach is the most left or the most right. It’s to the authors’ credit that they didn’t want to push their approach too far, but I think it would make the figures easier to read (especially Fig. 2).
    11. In subsection 2.1 “Multi-resolution exploration and integration” - last line: L_m was not introduced before, what is it?
    12. Subsection 2.2 - after “cue map P_SPL”: Parenthesis is closed that was not open before.

    More points after MICCAI: Do a cross-validation and not just a fixed train/val/test split. Include multiple classification backbones like ResNet, DenseNet etc. in addition to VGG-16.
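
Point 4 above asks how an activation map becomes a segmentation before a Dice score is computed. The sketch below shows the fixed-threshold option the reviewer mentions. It is an assumption for illustration: the paper's actual conversion is not stated on this page, and 0.5 is only the reviewer's example value. The name `dice_from_cam` is hypothetical.

```python
import numpy as np


def dice_from_cam(cam, ground_truth, threshold=0.5):
    """Binarize a CAM at a fixed threshold and compute Dice against a mask."""
    pred = cam >= threshold
    gt = ground_truth.astype(bool)
    denom = pred.sum() + gt.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, gt).sum() / denom
```

Because the score depends on the chosen threshold, reporting that value alongside the table would answer the reviewer's question directly.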
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is interesting, addresses a relevant challenge and proposes a well-performing solution. There are some shortcomings in the writing and the use of only a single dataset, which prevents me from going straight for a strong acceptance, given the strong competition at MICCAI.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The reviewers agree that the paper has novel elements and addresses an important problem in the field of weakly supervised learning. Please consider the suggestions for improvement, especially the writing revisions and clarification of parts of the paper.




Author Feedback

N/A


