Authors

Xinyi Yu, Guanbin Li, Wei Lou, Siqi Liu, Xiang Wan, Yan Chen, Haofeng Li

Abstract

Nuclei segmentation is a fundamental but challenging task in the quantitative analysis of histopathology images. Although fully-supervised deep learning-based methods have made significant progress, a large number of labeled images are required to achieve great segmentation performance. Considering that manually labeling all nuclei instances for a dataset is inefficient, obtaining a large-scale human-annotated dataset is time-consuming and labor-intensive. Therefore, augmenting a dataset with only a few labeled images to improve the segmentation performance is of significant research and application value. In this paper, we introduce the first diffusion-based augmentation method for nuclei segmentation. The idea is to synthesize a large number of labeled images to facilitate training the segmentation model. To achieve this, we propose a two-step strategy. In the first step, we train an unconditional diffusion model to synthesize the Nuclei Structure that is defined as the representation of pixel-level semantic and distance transform. Each synthetic nuclei structure will serve as a constraint on histopathology image synthesis and is further post-processed to be an instance map. In the second step, we train a conditioned diffusion model to synthesize histopathology images based on nuclei structures. The synthetic histopathology images paired with synthetic instance maps will be added to the real dataset for training the segmentation model. The experimental results show that by augmenting 10% labeled real dataset with synthetic samples, one can achieve comparable segmentation results with the fully-supervised baseline.
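
As a rough illustration of the "Nuclei Structure" mentioned in the abstract, the sketch below builds a three-channel array (semantic mask, horizontal map, vertical map) from an instance map. It assumes a simplified HoVer-Net-style encoding in which each nucleus pixel stores its signed offset from the nucleus centroid, scaled to [-1, 1]; the function name `nuclei_structure` and the exact normalization are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def nuclei_structure(instance_map):
    """Return a (3, H, W) array: semantic mask, horizontal map, vertical map.

    Assumption: offsets are measured from each instance centroid and scaled
    per instance to [-1, 1]; the paper may normalize differently.
    """
    inst = np.asarray(instance_map)
    semantic = (inst > 0).astype(np.float32)
    h_map = np.zeros(inst.shape, dtype=np.float32)
    v_map = np.zeros(inst.shape, dtype=np.float32)
    for label in np.unique(inst):
        if label == 0:  # 0 is background
            continue
        rows, cols = np.nonzero(inst == label)
        # Signed offsets of every pixel from the instance centroid.
        dx = cols - cols.mean()
        dy = rows - rows.mean()
        # Scale to [-1, 1]; the floor avoids division by zero for thin instances.
        dx = dx / max(np.abs(dx).max(), 1e-8)
        dy = dy / max(np.abs(dy).max(), 1e-8)
        h_map[rows, cols] = dx
        v_map[rows, cols] = dy
    return np.stack([semantic, h_map, v_map], axis=0)


# Usage: a single 3x3 nucleus in a 5x5 patch.
toy = np.zeros((5, 5), dtype=int)
toy[1:4, 1:4] = 1
structure = nuclei_structure(toy)
```

Going in the other direction (structure to instance map) would need a post-processing step such as a watershed on the distance channels, which is not sketched here.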

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43993-3_57

SharedIt: https://rdcu.be/dnwN2

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper introduces a novel approach to augment nuclei instance segmentation training by synthesizing a large set of labeled nuclei images. The authors propose a two-stage process, in which an unconditional model is first trained to generate instance maps, followed by a conditional diffusion model to synthesize corresponding histopathology images. This ensures that the generated images are accurately paired with the instance maps. The proposed method provides an interesting solution for generating a diverse range of training data for nuclei segmentation models.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper presents a novel and well-organized approach to data augmentation for nuclei segmentation training, using diffusion models to synthesize a large number of labeled images. Overall, the paper is well-written and provides an interesting contribution to the field:

    1. The proposed method provides a novel and effective approach to data augmentation, using diffusion models instead of traditional techniques like flipping, rotation, and cropping.

    2. The two-stage pipeline presented in the paper ensures that the generated images are accurately paired with instance maps, resulting in high-quality training data for nuclei segmentation models.

    3. The paper’s well-organized structure and clear presentation make it easy to follow and understand the proposed method.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    This paper presents a novel approach to data augmentation for nuclei segmentation model training. However, the experiment section is relatively weak, and there seem to be some rushed aspects, such as the identical captions for Table 1 and 2 without detailed information.

    1. The paper lacks validation on a large-scale dataset, such as Lizard, which is a significant dataset for nuclei segmentation and classification. Limiting the validation to relatively small datasets is inadequate.

    Graham, Simon, et al. “Lizard: A large-scale dataset for colonic nuclear instance segmentation and classification.” Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021.

    2. There is no comparison of the proposed augmentation method with traditional methods like rotation, flipping, cropping, and color jittering. A table comparing these methods with the proposed one is necessary to demonstrate the effectiveness of the proposed method.

    3. Tables 1 and 2 have identical captions and lack detailed information about the data. This issue gives the impression of a rushed paper.

    4. The generalization problem of the proposed augmentation method on PFF Net is presented in both Table 2 of the main paper (Kumar) and Table 2 of the supplementary material (MoNuSeg). Additionally, in Table 1, both these datasets are shown in the same table, further highlighting the rushed nature of the paper.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors have indicated that they will release both the training code and evaluation code, which is a positive step towards ensuring reproducibility of the results presented in the paper.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Here are some suggestions that may assist the authors in polishing their paper:

    1. The authors may consider conducting more experiments on larger datasets, such as the Lizard dataset, to better demonstrate the effectiveness of their proposed data augmentation method. This would also highlight the importance of their approach in addressing the challenges of nuclei segmentation.

    2. It would be beneficial for the authors to include a comparison of their augmentation method with traditional data augmentation techniques such as rotation and flipping. This would help validate the effectiveness of their proposed approach and provide a more comprehensive evaluation of the method.

    3. The manuscript should be revised carefully, with particular attention given to the captions for the tables. It may be helpful to re-organize the tables according to their data and ensure that all captions provide detailed information about the contents of the table. This would improve the clarity of the paper and avoid giving the impression of a rushed publication.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    While the proposed approach to data augmentation for nuclei segmentation model training is novel and has potential, there are some areas of weakness in the paper that need to be addressed. Specifically, the experiment section should be strengthened with more diverse experiments on large datasets and a comparison with traditional methods of data augmentation. Additionally, a careful revision of the manuscript is needed to clarify the tables and captions, which currently give the impression of a rushed publication.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    The authors have provided valuable feedback addressing my major concern, which was the lack of comparison with other methods, including simple naive augmentations such as flipping or rotation. Their response has alleviated my concerns, and based on this, I have decided to revise my rating for the paper.



Review #2

  • Please describe the contribution of the paper

    This paper proposes a method of data augmentation for nuclei segmentation in histopathology, based on diffusion models. A first diffusion model is trained to generate nuclei structure maps in terms of a binary semantic mask and horizontal and vertical distance maps. A second diffusion model is trained interchangeably in conditional and unconditional modes to denoise H&E patches, conditional on nuclei structure maps. Generated pairs of image patches and instance masks are used to augment training data on two standard datasets for two standard segmentation networks.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper is well written and well argued. The results are visually impressive. The performance improvement with this type of augmentation is uniform and significant. The paper appears to be a novel demonstration of the potential for diffusion models in histopathology analysis and, in particular, the strength of diffusion models as conditional generators offering diverse samples (see supplementary Figure 1), a known shortcoming of GANs. GAN-based methods have also previously relied on ad hoc mask generators to produce samples, whereas this pipeline learns the generator from the annotated masks directly.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Major:

    • There is no comparison to other data augmentation techniques or prior art. A comparison should be made at least with standard augmentation techniques. Another option would be to compare with previous work with pix2pix[1] or cycleGAN[2], which require a mask generator (this could be the first diffusion model).

    Minor:

    • w in Equation 6 is not defined.

    • There are various typos throughout the paper that could be addressed with a proof read or automatic spell checker.

    [1] Hollandi, Reka, et al. “A deep learning framework for nucleus segmentation using image style transfer.” bioRxiv (2019): 580605.

    [2] Mahmood, Faisal, et al. “Deep adversarial training for multi-organ nuclei segmentation in histopathology images.” IEEE Transactions on Medical Imaging 39.11 (2019): 3257-3267.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The code should be released. The datasets studied are public. There don’t appear to be any obstacles to making the results fully reproducible apart from the computational burden of training the diffusion models.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    See above weaknesses. The only real flaw is a lack of benchmark comparisons.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The lack of benchmarking is scientifically important and would help position the work, but this is outweighed by the results, which are impressive in and of themselves.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    After consulting the rebuttal and my fellow reviewers, my review is unchanged. I agree with R1 and R3 that a lack of comparisons is a notable weakness, and I do not think this is fully addressed in the rebuttal, but I still think this is outweighed by the strengths of the paper, which is furthermore topical enough to warrant publication.



Review #3

  • Please describe the contribution of the paper

    The paper presents a diffusion-based data augmentation method for nuclei segmentation in histopathology images, consisting of an unconditional nuclei structure synthesis model and a conditional histopathology image synthesis model that takes the synthesized nuclei instance map as input. Experiments on the MoNuSeg and Kumar datasets showed that the proposed method consistently improved segmentation performance compared with a segmentation model trained without any augmentation.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1) The paper addresses the important problem of label-efficient training in medical image segmentation, presenting a clear motivation and an interesting idea.

    2) The proposed two-step synthetic augmentation method consistently improves segmentation performance across different labeling proportions on both the MoNuSeg and Kumar datasets compared with a model trained without any augmentation. The method also generalizes well to two different segmentation models, Hover-Net and PFF-Net.

    3) The synthesized histopathology images appear realistic, and the alignment between nuclei structures and images can be clearly observed.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) The paper’s main weakness is the lack of comparison to other augmentation methods. It would be beneficial to compare the proposed method with conventional augmentations, nuclei instance segmentation methods such as [1], and synthetic augmentation methods like [2]. Proper comparisons are essential for evaluating the effectiveness of the proposed synthetic augmentation method.

    [1] InsMix: Towards Realistic Generative Data Augmentation for Nuclei Instance Segmentation. Lin et al. MICCAI 2022

    [2] Self-ensembling with GAN-based data augmentation for domain adaptation in semantic segmentation. Choi et al. CVPR 2019

    2) The paper claims that concatenating nuclei structure and image through a cross-attention module would degrade image fidelity and yield unclear correspondence. However, no comparison or evaluation is provided in the experiments to support this claim. For the adopted SPADE model, unlike in the original paper, which used a discriminator to improve the alignment between the label map and image, an explicit guarantee of SPADE alignment is missing in the proposed method. Therefore, an ablation study on the adopted SPADE model is necessary to strengthen the paper’s findings.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The description of the method is clear, the paper seems reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    An ablation study on the SPADE module and more evaluations of the alignment between the nuclei structure and the synthesized images could better demonstrate the contribution of the SPADE module. More comparisons to other SOTA augmentation methods are necessary yet lacking.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Although the idea of this paper is interesting, the evaluations of the proposed method are insufficient. Both ablation study on the

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    I appreciate that the authors include additional comparisons to CycleGAN and an ablation study on the SPADE module in the rebuttal. However, I believe CycleGAN should no longer be considered a SOTA for this task. Overall I think this paper is interesting and I tend to keep my original rating.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper introduces a two-stage diffusion model-based data augmentation method for nuclei segmentation in histopathology images. It first uses an unconditional diffusion model to synthesize nuclei structure images, and then adopts a conditioned diffusion model to synthesize histopathology images based on the previously generated nuclei structures. With only a small amount of labeled real data, the method can produce realistic nuclei images, which can be used to train state-of-the-art deep models for nuclei segmentation.

    Despite impressive image synthesis results, the reviewers raised several concerns as follows:

    1. All the reviewers (R1, R2 and R3) commented that a major weakness is a lack of comparison with other data augmentation and/or nuclei segmentation methods.

    2. R1 commented that both datasets used in the experiments are small, and evaluation on larger datasets should be considered. Although the authors are not encouraged to add new experiments in the rebuttal, they should consider clarifying why using the current datasets for model evaluation is sufficient.

    3. R1 pointed out that the clarity and organization of the manuscript need to be improved. R2 also commented that there are a lot of typos in the paper, and a revision is needed.

    4. R3 commented that the motivation and the justification of the SPADE module are insufficient.

    Please consider addressing these comments in the rebuttal.




Author Feedback

We thank the reviewers for their support: R1: “…a novel and effective approach to data augmentation”; R2: “this pipeline learns the generator from the annotated masks directly”; R3: “the alignment between nuclei structures and images can be clearly observed”. We address the comments below.

Q1 (R1/R2/R3): No comparison between our method and traditional or other data augmentation methods. A1: Naive augmentation (NaiveAug), such as flipping, rotation, and color jittering, involves image transformations, while our method focuses on image generation. The two methods are compatible, as each generated image can be further transformed to create augmented samples. All results in the paper are obtained with NaiveAug. In Tab 1&2 of the draft, “labeled” indicates the use of NaiveAug while “augmented” indicates the use of NaiveAug and our method. Using our method with NaiveAug surpasses using NaiveAug alone. For the 10% labeling proportion of MoNuSeg (Tab 1), using our method with NaiveAug improves Dice by 3.2% and AJI by 4.4% compared to only using NaiveAug. Besides, we compare our method to cycleGAN[a] on MoNuSeg with HoverNet (our Step1 model as the mask generator). Tab A shows that our method surpasses cycleGAN for any of the 4 labeling proportions. For the 10% labeled dataset, our method exceeds cycleGAN by 1.7% in Dice and 2.7% in AJI.

Prop.   Method     Dice     AJI
10%     cycleGAN   0.8124   0.6511
10%     Ours       0.8291   0.6785
20%     cycleGAN   0.8179   0.6605
20%     Ours       0.8219   0.6657
50%     cycleGAN   0.8275   0.6745
50%     Ours       0.8291   0.6764
100%    cycleGAN   0.8138   0.6557
100%    Ours       0.8336   0.6810

Tab A. CycleGAN vs. Ours with HoverNet on MoNuSeg.

[a] Unpaired image-to-image translation using cycle-consistent adversarial networks. ICCV 2017.
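
For readers unfamiliar with the Dice column in these tables, the following minimal sketch computes the standard Dice coefficient between a predicted and a reference foreground mask. The function name `dice` and the convention of returning 1.0 for two empty masks are assumptions for illustration; the evaluation code behind the reported numbers may differ, and AJI (which additionally matches individual nucleus instances) is not sketched here.

```python
import numpy as np


def dice(pred, target):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two foreground masks."""
    pred = np.asarray(pred) > 0
    target = np.asarray(target) > 0
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * np.logical_and(pred, target).sum() / denom


# Usage: prediction covers two pixels, reference covers one of them.
score = dice([[1, 1], [0, 0]], [[1, 0], [0, 0]])
```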

Q2 (R1): Identical captions of Tab 1 & 2. A2: The captions of Tab 1 & 2 are different. Tab 1 shows the effectiveness of our method with HoverNet, and Tab 2 shows its generalization to PFFNet. The details are reported in the 2nd & 3rd paragraphs of Sec 3.2. Thanks for the comment; we will add more explanations in the revision.

Q3 (R1): Lack of validation on a larger dataset. A3: MoNuSeg and Kumar are two popular benchmarks for nuclei segmentation. We aim to show that datasets with fewer samples can benefit from our proposed augmentation. Here we report the results on the large Lizard dataset. In Tab B, our method consistently enhances segmentation metrics across the 4 labeling proportions. Notably, augmenting the 10% labeled dataset with our method yields increases of 1.9% in Dice and 5.2% in AJI. Even with the fully-labeled data, using our method improves the baseline (denoted as ‘–’) by 0.6% in Dice and 0.5% in AJI.

Prop.   Aug.   Dice     AJI
10%     –      0.7371   0.4951
10%     Ours   0.7562   0.5475
20%     –      0.7579   0.5328
20%     Ours   0.7588   0.5516
50%     –      0.7605   0.5595
50%     Ours   0.7630   0.5626
100%    –      0.7791   0.5775
100%    Ours   0.7847   0.5826

Tab B. Results with HoverNet on Lizard.

Q4 (R2): w in Eq. 6 and several typos. A4: In Eq 6, w is a scalar controlling the strength of classifier-free guidance. We will correct all the typos.
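
To make the role of w concrete, here is a minimal sketch of how a guidance scalar is commonly used in classifier-free guidance: the conditional and unconditional noise predictions are combined as (1 + w) * eps_cond - w * eps_uncond. This is the widely used formulation, and it is an assumption that the paper's Eq. 6 uses the same parameterization; `guided_noise` and its argument names are hypothetical.

```python
import numpy as np


def guided_noise(eps_cond, eps_uncond, w):
    """Classifier-free guidance: push the prediction toward the condition.

    w = 0 recovers the purely conditional prediction; larger w strengthens
    the influence of the condition (here, the nuclei structure).
    """
    eps_cond = np.asarray(eps_cond, dtype=np.float64)
    eps_uncond = np.asarray(eps_uncond, dtype=np.float64)
    return (1.0 + w) * eps_cond - w * eps_uncond


# Usage with toy predictions of the same shape.
eps_c = np.ones((2, 2))
eps_u = np.zeros((2, 2))
eps_guided = guided_noise(eps_c, eps_u, w=1.0)
```

Some implementations instead write the same idea as eps_uncond + s * (eps_cond - eps_uncond) with s = 1 + w, so reported guidance values are only comparable once the parameterization is known.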

Q5 (R3): An ablation study on the SPADE module. A5: To compare SPADE with the cross-attention module, we replace SPADEs with cross-attention modules and keep the other parts of U-Net unchanged. A cross-attention module takes a feature map and the nuclei structure as input. Query is computed from the nuclei structure and Key & Value are computed from the feature map. In Tab C, SPADE consistently exceeds cross-attention by 0.3%-1.2% Dice and 0.4%-2.1% AJI for all labeling proportions. The results affirm that the SPADE module is the preferred choice for our Step2 model.

Prop.   Module       Dice     AJI
10%     cross-attn   0.8175   0.6579
10%     SPADE        0.8291   0.6785
20%     cross-attn   0.8198   0.6597
20%     SPADE        0.8219   0.6657
50%     cross-attn   0.8247   0.6723
50%     SPADE        0.8291   0.6764
100%    cross-attn   0.8292   0.6773
100%    SPADE        0.8336   0.6810

Tab C. SPADE vs. cross-attention with HoverNet on MoNuSeg.
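
The idea behind SPADE-style conditioning can be sketched as follows: the feature map is normalized without learned parameters, then rescaled and shifted by per-pixel values predicted from the nuclei structure. This is a simplified illustration, not the authors' Step2 model: the per-pixel linear maps `w_gamma` and `w_beta` stand in for the convolutional layers used in the original SPADE, and the per-channel normalization is an assumed choice.

```python
import numpy as np


def spade(features, structure, w_gamma, w_beta, eps=1e-5):
    """Spatially-adaptive modulation of a feature map by a structure map.

    features:  (C, H, W) feature map inside the denoising network
    structure: (S, H, W) nuclei structure at the same resolution
    w_gamma, w_beta: (C, S) hypothetical 1x1 projection weights
    """
    # Parameter-free normalization of each channel over spatial positions.
    mean = features.mean(axis=(1, 2), keepdims=True)
    std = features.std(axis=(1, 2), keepdims=True)
    normalized = (features - mean) / (std + eps)
    # Per-pixel scale and shift predicted from the structure (1x1 projection).
    gamma = np.einsum("cs,shw->chw", w_gamma, structure)
    beta = np.einsum("cs,shw->chw", w_beta, structure)
    return normalized * (1.0 + gamma) + beta


# Usage with random toy tensors.
rng = np.random.default_rng(0)
feat = rng.normal(size=(4, 8, 8))
struct = rng.normal(size=(3, 8, 8))
out = spade(feat, struct, rng.normal(size=(4, 3)), rng.normal(size=(4, 3)))
```

Because gamma and beta vary per pixel, the structure constrains every spatial location directly, which is one plausible reading of why it aligns images with nuclei structures more tightly than attention-based mixing.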




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The reviewers found merit in the proposed study, such as novel algorithm design or a new application of diffusion models for data augmentation, and impressive experimental results. The rebuttal (partially) addressed the major concerns from the reviewers including comparison with other data augmentation methods, and all reviewers are in favor of (weak) acceptance after the rebuttal.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors have adequately addressed the concerns raised by the reviewers, and the paper is quite meritorious. The reviewers agree on the paper’s impact, and after the rebuttal there is consensus to accept the paper.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Reviewers agree on the decision of acceptance.


