
Authors

Malo Alefsen, Eugene Vorontsov, Samuel Kadoury

Abstract

Automated medical image segmentation using deep neural networks typically requires substantial supervised training. However, these models fail to generalize well across different imaging modalities. This shortcoming, amplified by the limited availability of expert annotated data, has been hampering the deployment of such methods at a larger scale across modalities. To address these issues, we propose M-GenSeg, a new semi-supervised generative training strategy for cross-modality tumor segmentation on unpaired bi-modal datasets. With the addition of known healthy images, an unsupervised objective encourages the model to disentangle tumors from the background, which parallels the segmentation task. Then, by teaching the model to convert images across modalities, we leverage available pixel-level annotations from the source modality to enable segmentation in the unannotated target modality. We evaluated the performance on a brain tumor segmentation dataset composed of four different contrast sequences from the public BraTS 2020 challenge data. We report consistent improvement in Dice scores over state-of-the-art domain-adaptive baselines on the unannotated target modality. Unlike the prior art, M-GenSeg also introduces the ability to train with a partially annotated source modality.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43901-8_14

SharedIt: https://rdcu.be/dnwCY

Link to the code repository

https://github.com/MaloADBA/MGenSeg_2D

Link to the dataset(s)

https://www.med.upenn.edu/cbica/brats2020/registration.html


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper proposes a new semi-supervised generative training strategy (M-GenSeg) for cross-modality tumor segmentation on unpaired bi-modal datasets. The method combines healthy-diseased translation with modality translation, which achieves domain adaptation from a partially annotated source modality to an unannotated target modality. The experiments demonstrate the effectiveness of the method.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1) This paper introduces a novel way of training with a partially annotated source modality, using healthy-diseased translation for target-modality tumor segmentation. 2) The experiments are thorough. The attention maps shown in Fig. 2 are helpful for understanding the architecture. The performance looks good when 1% of the source modality is annotated. 3) The writing is good.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) The novelty is limited, because the method is a combination of GenSeg and CycleGAN. 2) Unlike UAGAN, the method is only applicable to unpaired bi-modal datasets and cannot be used when multi-modal datasets are available simultaneously.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    I believe the paper is reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    1) Why is the method called M-GenSeg? Please explain the meaning of ‘M’. 2) When performing absence-to-presence translation, why do you sample a code from a normal distribution to replace the unique code encoded from the original image? There is no evidence that the unique code follows a normal distribution. 3) Why does the common decoder focus on the outside of the brain in Fig. 2a? Please explain. 4) What is the annotation rate for the experiments in Table 1? 5) Why does the performance of the supervised U-Net decrease as the annotation rate increases? Please explain.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    A new semi-supervised generative training strategy for cross-modality tumor segmentation on unpaired bi-modal datasets. The experiments are thorough and convincing.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    Authors propose a model based on image translation and domain adaptation to conduct cross-modality image segmentation with only weakly-supervised source domain labels. The method merges modality translation with healthy/disease translation in order to improve label efficiency, in the hope that by learning to hallucinate tumors the segmentation model will improve the segmentation performance in the target modality. Experiments are conducted on the BraTS 2020 dataset across the T1, T2, T1ce and FLAIR MRI contrasts, with strong baselines and ablation. Results show that the model improved the performance in both the source and target modalities.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Overall this is a well written paper, with no major grammar/spelling/conceptual errors, relatively good readability, well motivated methodology, well designed experimental setup and strong results in comparison to competitive baselines.

    Historically, CycleGAN approaches have been unsuccessful in segmenting structures such as tumors, mainly due to their large intra-class variability and lack of uniformity across samples. The authors propose a cyclic network capable of dealing with that by learning to hallucinate novel tumors, while at the same time performing cross-modality translation. That is an interesting approach.

    The methodology is sound and intuitive, even if its presentation can be improved.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The major weakness of this work is the lack of reproducibility, mainly because no code or pretrained models are provided for a framework as complex as this one. The reviewer strongly suggests that the authors publish their work in an online repository.

    Additionally, authors should further clarify the distinctions between this work and the reference [13], as the differences are not explicit.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The reproducibility of this work is compromised, as the architecture is rather complex and not enough details are provided in the manuscript to replicate it. Also, neither code nor pretrained models are made available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    “Recently Billot et al. [2] proposed a domain randomisation strategy to segment images from a wide range of target contrasts without any fine-tuning. The method demonstrated great generalization capability, but is not suitable for tumor segmentation.”

    • Why is this method not suitable for tumor segmentation?

    • Authors should fix the title in odd pages: “Title Suppressed Due to Excessive Length”.

    • I strongly suggest that the authors improve the mathematical notation in Section 2. In its current form, this section is rather difficult to follow, even with good knowledge of Cycle-Consistent GANs for image translation.

    • Equations (6) through (9) lack explanation in the text. The translation process is key to this work’s methodology and, thus, should be carefully described for readers not familiar with CycleGANs. Also, authors should make sure that all mathematical notation in these equations is clear and unambiguous (e.g. the operation ◦ is not explained).

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I’d say this paper is an accept, but I cannot argue higher than a “weak accept” with reproducibility concerns as the ones from this work. If the authors fix this issue by the rebuttal phase, I’ll strongly argue for the acceptance of this paper.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper presents a domain adaptation method for cross-modality brain tumor segmentation. The method is based on an existing method GenSeg to locate the tumor on source and target images by learning to translate between a diseased image and a healthy image. A modality translation component is further added to deal with the domain shift across modalities. Experimental evaluations are conducted on BraTS 2020 dataset, showing some improvements.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The method is designed specifically for cross-modality tumor segmentation by combining the tumor/healthy translation and modality translation.

    • The method shows superior performance to previous methods.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The technical contributions are not sufficiently significant since a major part of the framework is based on the existing method GenSeg. The added image translation is also a commonly used domain adaptation technique.

    • The entire framework is complicated and difficult to follow. It is unclear how the absence to presence translation is achieved. There is no loss function to supervise the translation. Since there are no annotations on the target images, how to train the tumor “presence” to “absence” translation which requires image-level labels?

    • Ablation study lacks clarity on how to implement each variant. Since the presented framework is complicated consisting of multiple modules and loss functions, it is difficult to understand how each ablation experiment is conducted and how each component contributes to the overall framework.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Code will be released.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    • The authors need to justify their technical contributions since their framework is largely based on existing works.

    • The overall framework is difficult to understand. It lacks clarity on how each component is realized and supervised and how each component is related to the cross-modality segmentation.

    • Fig. 3 only provides the relative Dice scores compared to the baseline. It is important to provide the absolute Dice scores, which show whether the cross-modality tumor segmentation can be of practical value.
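
To illustrate the distinction raised in the last comment, here is a minimal sketch of absolute versus relative Dice on toy binary masks. The masks, the helper name `dice_score`, and the definition of "relative" as a simple ratio to the baseline score are assumptions made for illustration; the paper may normalize differently.

```python
import numpy as np

def dice_score(pred, target):
    """Absolute Dice overlap between two binary masks (1.0 if both are empty)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0
    intersection = np.logical_and(pred, target).sum()
    return float(2.0 * intersection / denom)

# Toy ground-truth tumor mask: a 2x2 block inside a 4x4 image.
target = np.zeros((4, 4), dtype=int)
target[1:3, 1:3] = 1

# Hypothetical baseline prediction: finds only the top row of the tumor.
baseline_pred = np.zeros((4, 4), dtype=int)
baseline_pred[1, 1:3] = 1

# Hypothetical adapted prediction: finds 3 of 4 tumor pixels plus 1 false positive.
adapted_pred = np.zeros((4, 4), dtype=int)
adapted_pred[1:3, 1:3] = 1
adapted_pred[2, 2] = 0
adapted_pred[0, 0] = 1

dice_baseline = dice_score(baseline_pred, target)
dice_adapted = dice_score(adapted_pred, target)

# Assumed definition of "relative Dice": ratio to the baseline score.
relative_dice = dice_adapted / dice_baseline

print(f"absolute Dice (baseline): {dice_baseline:.3f}")
print(f"absolute Dice (adapted):  {dice_adapted:.3f}")
print(f"relative Dice:            {relative_dice:.3f}")
```

A relative score alone hides whether the underlying overlap is high or low, which is why the absolute values matter for judging practical usefulness.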

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Combining GenSeg with modality translation is a reasonable extension of GenSeg, but the technical contributions are not significant enough. Moreover, the overall framework lacks clarity, making it difficult to understand how each translation component is achieved.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    Although not significantly novel, the healthy-diseased translation is an interesting application for target modality tumor segmentation and the authors presented their contributions compared to prior works in the rebuttal. The entire framework is complex and difficult to follow, but the provided code and model can help reproduce the work. Based on these reasons, the reviewer has adjusted the score to a “weak accept” but would not feel regret if the paper were ultimately rejected.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    Assessment: The idea of using semi-supervised generative training for segmenting multi-sequence MRI, particularly by synthesizing tumors into healthy brain regions is potentially interesting. However, the technical and conceptual novelty is highly limited given that such an approach has previously been explored by several prior works (e.g. https://aapm.onlinelibrary.wiley.com/doi/full/10.1002/mp.14701, there are several others). The proposed framework is also rather difficult to follow. The figures could be improved. Also, a point to consider is that the presented approach is not cross-modality augmentation like some of the referenced works did (e.g. CT to MRI or vice versa) but deals with single modality MRI images. Of note, typically, the multi-parametric MRIs are acquired during a single session and especially for brain, they can be trivially coregistered using standard registration methods. What then is the use of learning to segment on the other sequences? This should be better motivated. It’s also important to consider that the different MR sequences capture slightly different characteristics of the tumors (e.g. edema vs. tumor core vs. necrosis etc). As a result, the tumor definition itself can be different from sequence to sequence. How is this handled by the proposed approach?

    To address in rebuttal: Please clarify the method details better as it is hard to follow. The contributions should be clearly differentiated from prior works that specifically focused on brain glioma augmentation. Also, what is the clinical relevance of this task, in light of the fact that the images can be trivially coregistered when they are acquired in a single imaging session. Reviewers noted other concerns and clarification questions such as the rationale for the sampling method used for synthesis. The authors are strongly encouraged to address all of the reviewers’ concerns.




Author Feedback

1) Novelty

While our work shares the idea of generating diseased samples from healthy ones for data augmentation as in [A], there are significant differences:

  • Our model aims primarily at tackling cross-modality lesion segmentation tasks. Using [A] would involve training the generation of masks with the pixel-level annotations available in the source modality. Diseased target samples could then be generated from healthy target images via the inpainting process. However, our proposed model also synthesizes diseased target samples from diseased source samples, therefore we do not need to generate additional tumor masks.
  • Inpainting methods like [A] are limited to data augmentation and do not incorporate any unannotated diseased samples when training the segmentation network.
  • These methods were not tested with limited pixel-level annotations in a weak-supervision setting. It is likely that such a deficit would hinder the diversity of the generated masks and the resulting segmentation performance as well.
  • [A] assumes that there is only one instance to generate, which would be limiting for multiple lesions (e.g. HCC in livers). GenSeg was shown to be able to handle several instances, and this feature would extend to our model.

[A] Kim et al. (2021) Synthesis of brain tumor multicontrast MR images for improved data augmentation. Med Phys.

2) Clinical relevance

  • We agree that it is not clinically useful to learn cross-sequence segmentation if multi-parametric acquisitions are performed. However, our experiments were designed to select only one specific contrast to segment per patient, which differs from cross-sequence segmentation. We believe our modified version of BraTS provides an excellent study case to assess the actual performance of any modality adaptation method for tumor segmentation.
  • Although we didn’t present MR/CT adaptation, this is ongoing work. While cross-modality organ segmentation is common, there are few public datasets for MR/CT tumor segmentation.
  • Conditions such as Vestibular Schwannoma, where new hrT2 sequences are set to replace ceT1 for diagnosis to mitigate the use of contrast agents, are a sample use case for this method.
  • We perform unsupervised tumor delineation via GenSeg, which allows us to adapt segmentations to the specific tumor features in each modality.

3) Details

  • Notations will be made more explicit. S_L and T_L will refer to source and target modality images, with label L ∈ {H,D} indicating whether it is Healthy or Diseased. S_L^T/S_L^TS and T_L^S/T_L^ST will refer to the cross-modality translations and cyclic reconstructions of the latter. GenSeg notations and Fig.1 will be harmonized accordingly.
  • Regarding R3’s questions, we specify that all the images (S and T) are provided with healthy/diseased weak labels, distinct from the pixel-level annotations that we provide only for a subset of the data.
  • To improve clarity of the proposed framework, we will specify that Fig.1 represents one of the two branches of the whole model. Source images are passed through the first branch to train the source GenSeg module and the S → T* → S cyclic translation. Symmetrically, target images are treated in the second branch to train the target GenSeg module and the T → S → T cyclic translation. Target GenSeg segmentation decoder is also trained on annotated fake T* images.
  • We emphasize that we train a GenSeg framework on the source modality so that the model is aware of the tumor appearances in the source images, even with limited source pixel-level annotations. This preserves tumor structures during the generation of pseudo-target samples, an improvement over traditional GANs.
  • We enforce the distribution of unique codes to match the prior N(0,I) by making u_AP match u, where u_AP is obtained by encoding the fake diseased sample x_AP, produced with random sample u.
  • Fig.3 will display absolute Dice scores, rather than in Table 1.
  • Code/pre-trained models : https://github.com/MaloADBA/MGenSeg_2D
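
The constraint described above, making u_AP match the sampled code u, can be sketched as a latent regression loss. This is a toy illustration, not the authors' implementation: the orthogonal linear map standing in for the encoder and decoder, the code dimensions, the function names, and the choice of an L1 distance are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
COMMON_DIM, UNIQUE_DIM = 4, 8  # assumed code sizes, for illustration only
N = COMMON_DIM + UNIQUE_DIM

# Toy stand-in for the real networks (assumption): an orthogonal linear map,
# so that encoding exactly inverts decoding.
Q, _ = np.linalg.qr(rng.standard_normal((N, N)))

def toy_decode(common, unique):
    """Build a fake 'diseased' sample x_AP from a common code and a unique code."""
    return np.concatenate([common, unique], axis=-1) @ Q

def toy_encode(image):
    """Recover (common, unique) codes from a sample; Q.T is the inverse of Q."""
    codes = image @ Q.T
    return codes[..., :COMMON_DIM], codes[..., COMMON_DIM:]

def latent_regression_loss(u, u_ap):
    """L1 distance between the sampled code u and the re-encoded code u_AP."""
    return float(np.mean(np.abs(u - u_ap)))

common = rng.standard_normal((16, COMMON_DIM))  # codes from healthy images
u = rng.standard_normal((16, UNIQUE_DIM))       # u sampled from the prior N(0, I)

x_ap = toy_decode(common, u)   # absence-to-presence translation
_, u_ap = toy_encode(x_ap)     # re-encode the fake diseased sample

loss_matched = latent_regression_loss(u, u_ap)           # encoder recovers u
loss_mismatched = latent_regression_loss(u, u_ap + 0.5)  # encoder is biased

print(loss_matched, loss_mismatched)
```

Minimizing such a loss pushes the encoder to map generated diseased samples back to the code that produced them, which ties the unique-code distribution seen by the decoder to the N(0, I) prior it is sampled from.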




Post-rebuttal Meta-Reviews

Meta-review #1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The idea of using semi-supervised generative training for segmenting multi-sequence MRI, particularly by synthesizing tumors into healthy brain regions is potentially interesting. The authors clarified the differences of their work from one representative prior work, which it is hoped the authors will update in the paper’s discussion. The paper remains difficult to follow, but assuming the code is provided following acceptance, the paper might be easier to replicate. The authors should also clarify in the discussion that multi-modality refers to different MRI acquisitions as opposed to different imaging modalities.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper presents an interesting approach to domain adaptation by employing a generative model to leverage partially provided labels for training the segmentation model across various modalities. However, during the initial review phase, several concerns were raised, such as unclear method description and a perceived weak contribution compared to prior work. Nevertheless, the authors have adeptly addressed most of these issues during the rebuttal, resulting in a change of rating from weak reject to weak accept by R3. In conclusion, considering the paper’s focus on clinically relevant problems and its notable merits, I recommend accepting this submission.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors addressed most of the concerns in the rebuttal. In the camera-ready version, please include the details and discussion presented in the rebuttal. Also, revise the paper for clarity and ensure that your work is reproducible.


