
Authors

Jiameng Liu, Feihong Liu, Kaicong Sun, Mianxin Liu, Yuhang Sun, Yuyan Ge, Dinggang Shen

Abstract

Precise brain tissue segmentation is crucial for infant development tracking and early brain disorder diagnosis. However, it remains challenging to automatically segment the brain tissues of a 6-month-old infant (isointense phase), which is difficult even for manual labeling, due to the inherent ongoing myelination during the first postnatal year: the intensity contrast between gray matter and white matter is extremely low in isointense MRI data. To resolve this problem, in this study, we propose a novel network with multi-phase data and multi-scale assistance to accurately segment the brain tissues of the isointense phase. Specifically, our framework consists of two main modules, i.e., a semantics-preserved generative adversarial network (SPGAN) and a Transformer-based multi-scale segmentation network (TMSN). SPGAN bi-directionally transfers the brain appearance between the isointense phase and the adult-like phase. On the one hand, the synthesized isointense-phase data augment the isointense dataset. On the other hand, the synthesized adult-like images provide prior knowledge about the ambiguous tissue boundaries in the paired isointense-phase data. TMSN integrates features of multi-phase image pairs in a multi-scale manner, exploiting both the adult-like phase data, with much clearer boundaries as a structural prior, and the surrounding tissues, with a larger receptive field, to assist tissue segmentation of the isointense data. Extensive experiments on a public dataset show that our proposed framework achieves significant improvement over state-of-the-art methods, both quantitatively and qualitatively.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43901-8_6

SharedIt: https://rdcu.be/dnwCG

Link to the code repository

https://github.com/SaberPRC/IsointenseBrainTissueSeg.git

Link to the dataset(s)

https://www.nitrc.org/projects/ndarportal/


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper tries to leverage adult-like images to enhance tissue segmentation for the 6-month-old brain, whose tissues show isointense patterns. It uses paired inputs from the same subject to first train a generative adversarial network that generates an adult-like image for the isointense image, and then uses both the isointense image and the generated adult-like image to perform tissue segmentation.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Using high-tissue-contrast images to guide the segmentation of low-tissue-contrast images introduces prior knowledge into the network that can improve segmentation accuracy.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Fig. 1: are the isointense and adult-like image slices showing the same brain location? It seems that the brain changes a lot after the mapping. In Fig. 2(a), I_{RAB} and I_{GIB} do not seem to be the same slice, which leaves the impression that the brain changes a lot after SPGAN.
    2. What is the difference between the two challenges mentioned in paragraph 2 of the introduction?
    3. Compared to [1], the main difference is that this paper introduces a TMSN, while [1] uses a 3D-DenseNet. It is unclear why the fancy transformer network needs to be introduced. What is the motivation for involving it? It would be difficult for readers to follow the method without a clear motivation.
    4. The paper is not self-contained: it is not clear why SPADE can maintain structural consistency, which is very important for the generative network.
    5. It is not clear what ‘adult-like image’ refers to in the paper. Does the paper use the 1-year-old or 2-year-old images in the NDAR dataset as the adult-like images?
    6. In Table 1, what do data augmentation and structural enhancement refer to? Which one corresponds to using the SPGAN results?
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    It is reproducible given the code and data

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. Please check the figures to make sure they show corresponding slices, which would make them clearer.
    2. Segmenting with the help of higher-contrast images is a common strategy; what is the unique novelty of the proposed method? Please add some discussion.
    3. Some important dataset information is missing; see weakness 5 above.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The novelty of the paper is not strong.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    The authors have addressed the unclear sections in the rebuttal. Given that the method itself is sound and that data augmentation is important for segmentation, the proposed method can provide moderate clues for improving isointense segmentation.



Review #2

  • Please describe the contribution of the paper

    This paper presents a Transformer-based framework to segment brain tissues in the isointense infant brain. The aim is to better segment white and gray matter during this phase in order to improve the diagnosis of brain disorders. The framework is composed of two stages: the first one is used to synthesize data (for data augmentation and to guide the segmentation), and the second one is the Transformer-based segmentation network. The authors evaluate their method on a public dataset, the National Database for Autism Research (NDAR).

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • the paper is clear and well motivated
    • the proposed method is interesting, and the results are good and seem significant
    • the results report not only the mean but also the standard deviation for each metric
    • the evaluation is performed on a public dataset
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Figure 2 is dense and difficult to read
    • the link to the source code is not provided
    • the ablation study is not really convincing as the results are not really significant (especially for the Dice: I am not sure we can conclude that DA is really a huge improvement…)
    • there is no mention of run time or number of parameters
    • I don’t understand why the authors do not test their method on the images from iSeg (and the Baby Connectome Project), as these challenges address the problem of isointense brain segmentation.
    • linked to the previous point, the winning methods of iSeg should be included among the SOTA methods.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The reproducibility of this paper is good, except that the link to the code is not provided.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    To improve this paper, here are my suggestions:

    • the number of parameters and computation time of all the studied methods should be added
    • the link to the code should be added
    • a mention of iSeg / the Baby Connectome Project / the winning teams should be added
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Even if the paper suffers from some weaknesses, I find it interesting, with a compelling application case and convincing results.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper proposes a segmentation framework to better segment the brain in isointense-phase (6–9 months) infant structural T1w MRI, with assistance from a GAN for data augmentation (DA) and structural enhancement (SE), as well as a multi-resolution fusion segmentation strategy.
    In detail, the framework consists of two components: 1) a semantics-preserved GAN (SPGAN) for bi-directional image synthesis between the isointense phase and the adult-like phase (>9 months); 2) a multi-resolution fusion segmentation strategy which employs high-resolution (HR) and low-resolution (LR) branches that take patches with different fields of view and at different resolutions. The authors also propose using transformer-based cross-branch fusion (TCF) instead of direct concatenation to fuse the features from the two branches.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The authors conducted ablation studies on DA, SE, multi-resolution fusion, and TCF. Multi-resolution fusion is the most effective component in terms of improving segmentation performance, DA and SE demonstrate moderate improvement, and TCF provides only a mild improvement. The framework also outperforms 3D UNet, the dual-attention network, and Swin UNETR.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Although I appreciate the extensive ablation studies conducted in the paper, the comparison with other strong segmentation baselines is limited. Many strong segmentation frameworks were not included and compared, such as nnUNet, TransUNet, and SegFormer; 3D UNet, DualAtt Net, and SwinUNETR cannot fully represent the SOTA.
    2. The idea of multi-scale segmentation has already been proposed and studied; the authors should acknowledge previous works such as [1,2].

    [1] Chen, Wuyang, et al. “Collaborative global-local networks for memory-efficient segmentation of ultra-high resolution images.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019.
    [2] Hoyer, Lukas, Dengxin Dai, and Luc Van Gool. “HRDA: Context-aware high-resolution domain-adaptive semantic segmentation.” Computer Vision–ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XXX. Cham: Springer Nature Switzerland, 2022.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Although some technical details are missing in the current version of the manuscript, the authors committed in the checklist that they will provide code. The data are public. The work is reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. Please correct the typos: (1) In Page 4, Section Isointense Phase Synthesis: in the last sentence, it should be “discriminator DA2I to provide the adversarial loss to train the generator GA2I”. The current version uses the subscript ‘I2A’, which indicates adult-like phase synthesis and is not discussed in this section. (2) In Page 4, Section Adult-like Phase Synthesis: it should not be ‘(synthesized by the segmentation model SA)’; it should be either ‘(synthesized by the generator GI2A)’ or ‘(segmented by the segmentation model SA)’, depending on whether the authors mean the tissue probability maps or the synthesized brain images mentioned earlier.
    2. Lack of technical details: in the multi-resolution fusion segmentation, Fig. 2c suggests that the authors apply different channel numbers in the HR and LR branches. However, the authors do not clearly state whether two separate models are used for the HR and LR branches. From Fig. 2c and the text, the authors also do not mention how the final segmentation is derived during inference, given that there are two predictions from the two branches. Did the authors use only the prediction from the HR branch, or perform some kind of majority voting / weighted summation to ensemble the predictions from the two branches?

    3. Future directions: ablation study for SPGAN components. In addition to the classical cycle-consistency loss, SPGAN features two components, SPADE and a structure loss. What is the performance gain of adding these two components? If the improvement is not significant, directly applying a 3D CycleGAN or StarGAN might be more straightforward. It would also be helpful if the authors could compare with more strong segmentation baselines such as nnUNet, SegFormer, and TransUNet.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I appreciate the complete ablation studies and the significant improvement over the baseline UNet in this work. However, there are several typos/errors/confusing points in the current version of the manuscript, and the comparison with strong segmentation baselines is insufficient. I wish the authors could correct the typos and errors in the final submission. Overall, I would recommend acceptance of this work.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    I will maintain my previous grade. I think this paper is really borderline; in the first round of review, there were two main flaws: clarity and comparisons. The authors claimed that they will improve clarity, and that they will address the comparison by adding more methods, but the new results (especially TransUNet and SegFormer) look weird to me. A 2D network has some intrinsic limitations because it can’t access 3D information; however, severe slice artifacts and degraded performance should be alleviated if it is sufficiently trained. For example, with sufficient training, a 2D UNet would have some slice artifacts compared to a 3D UNet, but the Dice difference won’t be too large (say 10%). The huge performance drop of TransUNet (84.17% avg Dice) and SegFormer (78.21% avg Dice) is strange and can’t be well justified by the 2D nature of the models alone. Without seeing the training curves, I can’t judge whether the models were sufficiently trained with enough slices and training iterations.

    That’s my major concern after reviewing their rebuttal. Because of that concern, I won’t increase the score nor decrease it. I’m inclined to maintain my original grade.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The authors propose a transformer-based method for isointense infant brain tissue segmentation, utilizing a generative model to synthesize isointense brains from adult-like brains (and vice versa) for data augmentation, together with a multi-resolution fusion strategy. The reviewers agree that the idea of using high-tissue-contrast images to guide the segmentation is interesting, and they appreciate the ablation study. However, they expressed concern that the comparison to state-of-the-art methods is limited. In the rebuttal, the authors are expected to better motivate and justify the chosen baseline methods and discuss other SOTA approaches, as indicated by the reviewers. Additionally, the results of SPGAN (R#1) should be clarified, the figures improved, and details about the implementation and dataset added.




Author Feedback

We thank reviewers (R1, R2, & R3) and meta-reviewer (MR) for constructive comments. We appreciate their efforts and address their concerns below.

  • Comparison to SOTA methods (R1,R2,R3,&MR) We have now added nnUNet, TransUNet, and SegFormer as comparison methods. nnUNet achieves Dice (%) of 95.22 in CSF, 94.58 in GM, and 95.06 in WM. TransUNet achieves Dice (%) of 86.58 in CSF, 83.77 in GM, and 82.16 in WM. SegFormer achieves Dice (%) of 81.18 in CSF, 76.25 in GM, and 77.01 in WM. The relatively lower Dice scores of TransUNet and SegFormer arise because they are 2D methods and produce apparent slice artifacts. These results further demonstrate the superiority of our proposed TMSN, and we will add them to the final paper.

  • Figure readability and description clarity (R1,R2,R3,&MR) We appreciate the feedback. We will thoroughly resolve these issues by carefully checking all figures for potential mistakes and clarifying the descriptions for better understanding.

  • Motivation of the transformer network used in TMSN (R1&MR) TMSN is designed to fuse features from two branches with different fields of view (FOVs), where the transformer can better capture voxel correspondences across the two branches (with local and global features). This produces better feature fusion of the two branches than simple concatenation, as supported by the ablation results in Table 1.
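    For intuition, here is a minimal, hypothetical PyTorch sketch of such cross-branch attention. The class name, single attention layer, and head count are illustrative assumptions, not the paper's exact TMSN architecture; it only shows the idea of HR-branch tokens querying LR-branch tokens instead of being concatenated with them.

    import torch
    import torch.nn as nn

    class CrossBranchFusion(nn.Module):
        """Illustrative cross-branch fusion: HR features attend to LR features."""
        def __init__(self, dim, num_heads=4):
            super().__init__()
            self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
            self.norm = nn.LayerNorm(dim)

        def forward(self, hr_feat, lr_feat):
            # hr_feat: (B, C, D, H, W) local, high-resolution branch features
            # lr_feat: (B, C, d, h, w) global, low-resolution branch features
            q = hr_feat.flatten(2).transpose(1, 2)   # (B, N_hr, C) queries
            kv = lr_feat.flatten(2).transpose(1, 2)  # (B, N_lr, C) keys/values
            fused, _ = self.attn(q, kv, kv)          # each HR voxel queries the LR context
            fused = self.norm(q + fused)             # residual connection + layer norm
            return fused.transpose(1, 2).reshape(hr_feat.shape)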

  • How SPADE maintains structural consistency (R1&MR) SPADE directly utilizes tissue maps to perform feature normalization, thereby incorporating structural information into the synthesis process. In our work, SPADE takes advantage of the adult-like tissue maps, so the generated isointense brain images have tissue structures consistent with the adult-like brain images. We will include these explanations in the final paper.
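    For readers unfamiliar with SPADE (spatially-adaptive denormalization, Park et al., CVPR 2019), a minimal 3D sketch of the mechanism follows; the hidden width, instance normalization, and three-class tissue map are illustrative assumptions, not the authors' exact implementation.

    import torch.nn as nn
    import torch.nn.functional as F

    class SPADE3D(nn.Module):
        """Tissue maps predict per-voxel scale/shift for normalized features."""
        def __init__(self, feat_channels, label_channels=3, hidden=64):
            super().__init__()
            self.norm = nn.InstanceNorm3d(feat_channels, affine=False)
            self.shared = nn.Sequential(
                nn.Conv3d(label_channels, hidden, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
            )
            self.gamma = nn.Conv3d(hidden, feat_channels, kernel_size=3, padding=1)
            self.beta = nn.Conv3d(hidden, feat_channels, kernel_size=3, padding=1)

        def forward(self, x, tissue_map):
            # x: (B, C, D, H, W) generator features; tissue_map: (B, 3, D, H, W) CSF/GM/WM
            tissue_map = F.interpolate(tissue_map, size=x.shape[2:], mode='nearest')
            h = self.shared(tissue_map)
            # structural information enters through the spatially varying scale and shift
            return self.norm(x) * (1 + self.gamma(h)) + self.beta(h)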

  • Data augmentation and structural enhancement of SPGAN (R1&MR) Thanks for pointing out our unclear description. In our work, TMSN exploits both the data augmentation (DA) and structural enhancement (SE) results, i.e., the outputs of SPGAN. In practice, SPGAN is a cycle-GAN-based method that can synthesize both isointense data and adult-like data. For DA, the synthesized isointense data are used to increase the size of the training dataset, thus alleviating the lack of isointense data. For SE, the synthesized adult-like data are used to enhance the ambiguous tissue boundaries in the isointense data. We will make all these points clearer in the final paper.
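    To make the two roles concrete, here is a hypothetical sketch of how the SPGAN outputs could be assembled into TMSN training samples; all names (G_I2A, G_A2I, real_iso, real_adult) are illustrative placeholders rather than the authors' code, and it assumes labels transfer across phases because SPGAN preserves tissue structure.

    def build_training_samples(real_iso, real_adult, G_I2A, G_A2I):
        """Return (isointense image, adult-like prior, label) triplets."""
        samples = []
        # SE: pair each real isointense image with its synthesized adult-like
        # counterpart, whose clearer boundaries serve as a structural prior.
        for iso, label in real_iso:
            samples.append((iso, G_I2A(iso), label))
        # DA: convert real adult-like images into synthetic isointense images,
        # reusing the adult-like labels to enlarge the isointense training set.
        for adult, label in real_adult:
            samples.append((G_A2I(adult), adult, label))
        return samples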

  • Difference between the two challenges in the introduction (R1&MR) Both challenges come from the same issue, i.e., the overlapping intensity distributions of tissues in the isointense phase. In particular, the first challenge emphasizes the lack of high-quality annotations, whereas the second emphasizes the difficulty of segmenting isointense data even with enough high-quality annotations as training data. We will clarify both challenges in the final paper.

  • Interpretation of ‘adult-like’ used in the experiments (R1&MR) We used the term ‘adult-like’ to denote 1-year-old infant data.

  • The ablation study is not really convincing (DA) (R2&MR) Table 1 shows the ablation study, which demonstrates the effectiveness of TMSN. Although DA exerts a relatively large effect on SegNet, its benefit is not significant when employed in TMSN. Nevertheless, in this study we only augmented the isointense data twice, although more augmentation rounds (e.g., 10) could bring further benefit. We will add the performance change with respect to the number of augmentations to the paper.

  • Results on the iSeg and BCP datasets (R2&MR) We performed experiments on iSeg and BCP, and the results further demonstrate the superiority of our proposed framework. Due to the page limit, we will include only brief results in the final paper.

  • The link for the source code and model parameters are not provided (R2&MR) We will make the code publicly available. Inference takes 8.82 s, and the model has 44.51M parameters.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Some concerns were adequately addressed, others were not. For example, the authors did not discuss the missing comparison to SOTA methods but provided new experiments, which they most probably won’t be able to include in the final manuscript. Also, one reviewer pointed out issues with these new results. However, even with weaknesses in the experimental setup, I think the method is interesting for the MICCAI community and recommend acceptance.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors’ effort to address the reviewers’ concerns is appreciated. In the camera-ready version, the authors should include the additional details and clarifications promised in the rebuttal. In particular, the post-rebuttal comments made by one of the reviewers, related to the comparisons with SOTA methods, should be addressed. Additional results for iSeg and BCP should be included in the supplementary material if they cannot (due to lack of space) be included in the main paper.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    While this paper received mixed scores during the first phase of the reviews, it seems that after the rebuttal period there was a consensus towards acceptance of this work. While I will not go against the general consensus, after reading all the reviews, the rebuttal, and the paper, I believe the authors should have better positioned their work relative to existing approaches (especially [1], as requested by R1). To my eyes, the idea is very similar to that work (applied to the same task), with the exception of the transformer-based model for fusing different modalities. Similarly, I found the empirical validation unconvincing. In particular, the authors compared their method to general segmentation methods, which do not even consider multi-modality fusion, and disregarded a whole body of literature specifically designed for this task (e.g., isointense brain segmentation and the iSeg challenge, among others). Thus, I consider that the value of this work is hampered by these two weaknesses. Having said this, I recommend the authors consider all the criticism raised during the review process to improve their manuscript.


