Authors

Hyun-Jic Oh, Won-Ki Jeong

Abstract

Nuclei segmentation and classification is a significant process in pathology image analysis. Deep learning-based approaches have greatly contributed to the higher accuracy of this task. However, those approaches suffer from the imbalanced nuclei data composition, which shows lower classification performance on the rare nuclei class. In this paper, we propose a realistic data synthesis method using a diffusion model. We generate two types of virtual patches to enlarge the training data distribution, which is for balancing the nuclei class variance and for enlarging the chance to look at various nuclei. After that, we use a semantic-label-conditioned diffusion model to generate realistic and high-quality image samples. We demonstrate the efficacy of our method by experiment results on two imbalanced nuclei datasets, improving the state-of-the-art networks. The experimental results suggest that the proposed method improves the classification performance of the rare type nuclei classification, while showing superior segmentation and classification performance in imbalanced pathology nuclei datasets.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43898-1_33

SharedIt: https://rdcu.be/dnwBt

Link to the code repository

https://github.com/hvcl/DiffMix

Link to the dataset(s)

CoNSeP: https://warwick.ac.uk/fac/cross_fac/tia/data/hovernet/

GLySAC: https://drive.google.com/drive/folders/1p0Yt2w8MTcaZJU3bdh0fAtTrPWin1-zb


Reviews

Review #1

  • Please describe the contribution of the paper

    A new workflow to create new annotated H&E images for semantic segmentation is proposed. The authors propose a way to generate synthetic semantic segmentation masks of cell nuclei by balancing the label distribution or enlarging the existing labels. These are used by a synthesis module based on a diffusion model to generate the corresponding new virtual H&E image.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The proposed workflow efficiently combines existing approaches to class imbalance (balancing and enlarging) for data augmentation with recent diffusion models, achieving accurate and realistic results. The authors tested their approach against state-of-the-art methods for semantic segmentation of cell nuclei in histopathology images, showing part of the potential of smart data augmentation.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • While the need for data augmentation in almost any DL model training is straightforward, it is not clear from the description of the results whether this smart approach is indeed much more beneficial than a smart data sampling strategy that considers class imbalance during training (and which is also, by far, easier to implement). The authors use HoVer-Net and SONNET as reference approaches for the current segmentation task; however, there is no clear information about the input data sampling, if any, used in these networks.

    • The authors focused their work on the semantic data imbalance. However, there is also the case of having a biased distribution of the images being actually generated. So how much does the model depend on the distribution of the training data?
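The image-level sampling strategy this reviewer has in mind can be sketched as follows. This is only an illustrative baseline, not something taken from the paper, HoVer-Net, or SONNET: the function name `image_sampling_weights` and the per-image count matrix are hypothetical, and inverse class frequency is just one of several possible weighting schemes.

```python
import numpy as np

def image_sampling_weights(class_counts):
    """Return one sampling probability per image.

    class_counts[i, c] is the number of nuclei of class c in image i
    (a hypothetical input; the paper does not define such a matrix).
    """
    class_counts = np.asarray(class_counts, dtype=float)
    total_per_class = class_counts.sum(axis=0)
    # Inverse-frequency weight per class; classes absent from the data get 0.
    class_weight = np.zeros_like(total_per_class)
    present = total_per_class > 0
    class_weight[present] = 1.0 / total_per_class[present]
    # An image's score is the summed weight of the nuclei it contains,
    # so images holding rare-class nuclei are drawn more often.
    scores = class_counts @ class_weight
    return scores / scores.sum()  # assumes at least one nucleus overall

# Placeholder counts (common class, rare class) for three images.
counts = np.array([[90, 0], [80, 2], [30, 8]])
weights = image_sampling_weights(counts)
rng = np.random.default_rng(0)
batch_indices = rng.choice(len(counts), size=4, p=weights)
```

Such a sampler can only repeat whole images, so rare nuclei always arrive together with the common nuclei around them. That is the image-level versus instance-level distinction the authors draw in their feedback.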

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors stated that the code will be provided upon acceptance.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Page 3, Typo: with the experimentAL results… Page 5, Section 2.4. The text says “with pretrained SDM conditionaed…”. Please review whether this sentence reads well. Additionally, I would recommend that the authors cite or mention who contributed this pretrained model.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The results obtained by the authors are quite promising. However, the task and the proposed workflow for data imbalance are less innovative.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The authors proposed to solve the class imbalance problem in cell segmentation via generating cell instances of the rare classes using conditional diffusion models. Their method adopted a recently proposed design to generate images conditioned on semantic masks. The authors validated their design on two public datasets and demonstrated improved results with most of the metrics and cell classes.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • This paper adopted a recent SOTA diffusion model and assessed its feasibility in addressing the class imbalance problem in histopathology cell nucleus segmentation, which is of great interest and could be a very good inspiration to the community.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The novelty of the proposed method was limited to the design of masks to include rare classes.

    • The listed performance measures for the baseline GradMix (MICCAI, 2022) were different from those reported in the GradMix paper. The results from the original GradMix paper (Tables 2 and 3) were better than those listed in this paper.

    • Some descriptions of the key aspects of the methodology were ambiguous: what do the authors refer to by “enlarging” (Method section and Figure 2) for one of the mask designs?

  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    In general, it seems feasible to reproduce the results. However,

    • Some mask design details are missing.
    • Code is not published
    • Training time is not reported
  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    • It is unfair to train the baseline models under the same conditions as the proposed method and report only the reproduced results. It is standard practice at least to list the original results from the baseline references.

    • Would the additional computational overhead of training/inferencing with diffusion models outweigh the practical benefits? The authors are encouraged to discuss and report the computational overhead.

    • Since GANs like Pix2Pix have been reported to perform well for histopathology image generation, it will be very helpful if authors include such models as baseline.

    • A question regarding the synthetic data is whether the designed masks with rare classes as well as the generated images are biologically plausible. The authors are encouraged to review these images with pathologists.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Although the topic of addressing the class imbalance problem is of great interest to the community, the novelty was limited to adopting a recent conditional diffusion model for histopathology image generation. The reported model evaluation results were not convincing: the baseline performance listed in this paper was different from what was reported in the original baseline reference (GradMix).

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The authors proposed a novel data synthesis method based on a diffusion model. The experiments and results showed the method is significantly improved over the state-of-the-art on two datasets.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper is well written and presented. The formulas and equations are clearly defined and expressed. The experiments and results are self-explanatory and support the paper's goals.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Some of the paragraphs are a bit long; for instance, the one before the conclusion section could be broken into two.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The description of the method is clear and the implementation should be straightforward. It would be better if the authors could publish the code somewhere.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    The experiments and results would be even stronger if the authors could include another, much larger dataset, for instance one with thousands of H&E images, so that they can demonstrate the scalability of their method.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    7

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is well written and the method is clearly explained. It has introduced something novel for the field of nuclei segmentation and classification. The experiments and results are strong and support the paper's goals.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #4

  • Please describe the contribution of the paper

    An image augmentation strategy, named DiffMix, is proposed for nuclei segmentation and classification. In the strategy, the authors contribute a label maps generation method to highlight rare classes and introduce shifted nuclei. These generated labels will be mapped to nuclei images using an off-the-shelf diffusion model [18] to enlarge the training set. Performances are validated on two publicly available datasets.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper is easy to follow and clearly organized. Performances are achieved using the same training iterations for a fair comparison. Table 1 shows the strategy could improve performance.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Performance gain seems limited, especially on segmentation tasks where almost all improvements are less than 0.5%. The proposed method even achieves lower results compared to its baseline (HoVer-Net for GLySAC segmentation measured by AJI and DQ).

    To demonstrate the superiority of using the diffusion model for augmentation, the proposed DiffMix could be compared with other existing methods, such as CutMix, CutOut, and GAN-based methods. Existing results seem not strong enough to support the analysis in the introduction section regarding the limitation of existing methods.

    More careful proofreading is required. I think there are wrong results in Table 1. The SQ and PQ values of HoVer-Net for CoNSeP segmentation seem to be swapped.
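The suspicion about swapped SQ and PQ values can be checked against the metric definitions. Assuming Table 1 uses the standard panoptic quality decomposition (the one commonly used to evaluate HoVer-Net; the review does not restate it), PQ is the product of detection quality and segmentation quality, so it can never exceed SQ:

```latex
\mathrm{PQ}
  = \underbrace{\frac{|TP|}{|TP| + \tfrac{1}{2}|FP| + \tfrac{1}{2}|FN|}}_{\mathrm{DQ}}
    \times
    \underbrace{\frac{\sum_{(p,g)\in TP} \mathrm{IoU}(p,g)}{|TP|}}_{\mathrm{SQ}},
\qquad
0 \le \mathrm{DQ} \le 1 \;\Longrightarrow\; \mathrm{PQ} \le \mathrm{SQ}.
```

Under that assumption, any table row that reports PQ larger than SQ is internally inconsistent, which is presumably what prompted this comment.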

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    I think the necessary details are included.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    This work has a good motivation: it explores whether a new generative model could contribute to medical image augmentation. However, the insufficient comparison with other related models, especially GAN-based methods, weakens its claim of superiority and leaves readers unclear why the diffusion model was chosen over other options.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I feel concerned about insufficient comparison with existing related works and the limited performance gain. So, I suggest a weak reject and would like to raise it if more numerical evidence could be provided.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    4

  • [Post rebuttal] Please justify your decision

    The insufficient comparison with other related work still concerns me, and I would like to keep my rating.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The authors propose a method to solve the class imbalance problem in nuclei segmentation by generating nuclei instances of the rare classes using conditional diffusion models. By synthesizing new data, the authors demonstrate that their method can improve segmentation and classification performance over the state-of-the-art methods on two datasets. This work adopted a recent SOTA diffusion model and assessed its feasibility in addressing the class imbalance problem in histopathology cell nucleus segmentation and classification, which is of great interest and could be a very good inspiration to the community. The paper is well written and presented. The formulas and equations are clearly defined and expressed. However, there are some concerns about the methods, the experimental setup, and the results. The introduction of a diffusion model for nuclei image synthesis is interesting, but the novelty of the proposed method is somewhat limited to the design of masks that include rare classes. Some descriptions of the key aspects of the methodology are a bit ambiguous; the authors may improve them for better clarity. The authors discussed several related works but provided no direct comparison with them. Though additional experiments may not be feasible, the authors may provide further explanation of and insight into the existing methods and how the proposed work tackles their limitations. Some of the experimental results are not consistent, so the authors may provide further explanation of them.




Author Feedback

  1. [R2, R4] Need comparison with more related works, such as CutMix, CutOut, and GAN-based methods: We mainly compared our method to one of the most recent SOTA methods, GradMix [MICCAI22]. [CutMix and CutOut] These are rather outdated methods and did not perform better than the baseline (HoVer-Net). CutMix showed lower scores (-2.5% to -0.5% per metric), while CutOut performed similarly in GLySAC and slightly better in CoNSeP (-0.3% to +1% per metric). Due to space limitations, we excluded CutMix and CutOut from the text. [GAN-based methods] SDM generates more realistic images than SOTA GAN methods. We also confirmed that the Pix2Pix method performed worse in CoNSeP (-3% to -0.6% per metric and much lower F_M). Therefore, we did not directly compare to GAN approaches.

  2. [R1, R2, R3] Need detailed description of data sampling and mask design: “Enlarging maps” perturb the positions of nuclei instances to diversify the dataset. As for the comparison with a smart sampling strategy, our method handles imbalance more precisely, at the instance level, by removing or adding individual nuclei in the balancing map, whereas a sampling strategy operates at the image level by selecting more or fewer images to reduce class imbalance in the training data. We will clarify the methodology description in the text.

  3. [R1, R2] Limited novelty: We introduce a unique combination of semantic diffusion models and virtual masks to synthesize realistic data, offering a novel solution for addressing data imbalance in pathology nuclei analysis. Moreover, our approach achieves superior performance across SOTA networks and significantly improves the classification score for the rarest nuclei type (F_M) in CoNSeP (Table 1). To the best of our knowledge, this is the first attempt to employ a diffusion model to address semantic data imbalance in pathology.

  4. [R2] Discrepancies in GradMix results from the original paper: Replicating the exact results in the GradMix paper was impossible due to randomness in the GradMix algorithm, the hyperparameters of SONNET, and random splits of the training set. We believe a direct comparison between our results and the numbers from the GradMix paper is not an accurate way to assess the performance difference. Instead, we conducted all the experiments under the same environment: we used the official codes (HoVer-Net, SONNET, and GradMix), conducted cross-validation, maintained consistent hyperparameters for the baseline networks, and carefully split the data to include all nuclei types in each fold. The only difference between the methods was the type of synthesized data (by GradMix or DiffMix) in the training set. The results demonstrated that DiffMix consistently outperforms GradMix across SOTA baselines, particularly on the CoNSeP dataset (Table 1).

  5. [R1] Model dependency on training data distribution: Diffusion models, unlike GANs that directly generate images from noise, progressively refine a noise vector into an image. Because our method generates a new image from a noise-added input image, the model's dependency on the training data distribution can be mitigated to a certain degree.

  6. [R4] Limited performance gain on segmentation: Because the baseline networks have separate branches for segmentation and classification, our semantic balancing strategy had a minor effect on segmentation (there is no semantic difference in the foreground labels) but significantly improved classification.

  7. [R2, R3] Source code: We will release the source code upon acceptance.

  8. [R1] Reference of pretrained SDM: We pretrained the SDM ourselves (Sec. 3.2) and will clarify this in the text.

  9. [R2] Training time: Training the SDM took around 13 hours. Synthesizing a 256x256 image took approximately 20 seconds.

  10. [R2] Trade-off between computational overhead and practical benefits: The major overhead is in training, so we believe usability at inference is not degraded much, while the high-quality images from diffusion models are practically beneficial.
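The "enlarging map" idea in item 2 of the feedback (perturbing the positions of nuclei instances) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `shift_instances`, the uniform random offsets, and the rule that earlier instances win on overlap are all assumptions made for illustration, and a matching class map would have to be moved in the same way.

```python
import numpy as np

def shift_instances(inst_map, max_shift, rng):
    """Randomly translate each labelled instance in an integer instance map.

    inst_map: 2D integer array, 0 = background (assumed convention).
    max_shift: maximum absolute offset in pixels along each axis.
    """
    h, w = inst_map.shape
    out = np.zeros_like(inst_map)
    for inst_id in np.unique(inst_map):
        if inst_id == 0:
            continue  # skip background
        ys, xs = np.nonzero(inst_map == inst_id)
        dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
        # Clamp the offset so the whole instance stays inside the patch.
        dy = int(np.clip(dy, -ys.min(), h - 1 - ys.max()))
        dx = int(np.clip(dx, -xs.min(), w - 1 - xs.max()))
        new_ys, new_xs = ys + dy, xs + dx
        # Write only onto background so earlier instances are not overwritten.
        free = out[new_ys, new_xs] == 0
        out[new_ys[free], new_xs[free]] = inst_id
    return out

# Toy instance map with two 3x3 nuclei.
rng = np.random.default_rng(0)
toy_map = np.zeros((16, 16), dtype=np.int32)
toy_map[2:5, 2:5] = 1
toy_map[10:13, 10:13] = 2
perturbed_map = shift_instances(toy_map, max_shift=2, rng=rng)
```

In crowded patches a shifted nucleus may be partly or fully hidden by one written earlier, so a practical version would need an explicit overlap policy.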
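Item 5 of the feedback refers to generating from a "noise-added input image". In a standard DDPM-style formulation this corresponds to the closed-form forward step x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε. The sketch below shows only that step. The linear beta schedule and its endpoint values are common defaults and are assumptions here, not settings confirmed by the authors.

```python
import numpy as np

def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def q_sample(x0, t, alpha_bar, rng):
    """Add Gaussian noise to x0 as in the forward diffusion process at step t."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise

rng = np.random.default_rng(0)
alpha_bar = linear_alpha_bar(1000)
image = rng.uniform(-1.0, 1.0, size=(3, 8, 8))    # placeholder "image"
slightly_noisy = q_sample(image, 50, alpha_bar, rng)
mostly_noise = q_sample(image, 900, alpha_bar, rng)
```

The smaller t is, the more of the input image survives in x_t, so the choice of starting step controls how far a synthesized sample can move away from the real image it started from.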




Post-rebuttal Meta-Reviews

Meta-review #1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors propose a diffusion model-based method to synthesize nuclei images and use them to improve the performance of nuclei segmentation and classification. The rebuttal addressed most of the major concerns raised by the reviewers; however, the paper can be further improved by detailing its technical novelty and the authors' insights into the method and the results.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    One of the major concerns about this work is the lack of technical novelty. The rebuttal has highlighted that this work is ‘the first attempt to employ the diffusion model to address the semantic data imbalance in pathology’. Though the attempt at a new application is appreciated, the concern about its technical novelty remains: generative model-based image synthesis is a common idea, so the novelty of using a recent diffusion technique for that purpose seems limited. Though this could be acceptable to some extent, the limited improvement in the segmentation experiments does not show much benefit of the proposed method in addressing the imbalance problem. Taken together, these points weaken this work.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The new method has not been compared with simple data augmentation methods that are widely used and that are, by far, simpler to implement and deploy than the complex proposed method. On that note, the description of the proposed method is often hard to follow and can be substantially improved.

    Moreover, many of the results seem almost indistinguishable between different methods (Table 1), and no statistical significance tests are reported. In addition, these results are not properly discussed. Why, for example, does the method work well in some experiments and perform even worse than the baseline network in others?
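One way to run the kind of significance test requested here on cross-validation results is an exact paired sign-flip permutation test over per-fold scores. The sketch below is only an illustration: the function name `paired_sign_flip_test` is hypothetical and the scores are placeholders, not numbers from the paper.

```python
from itertools import product

def paired_sign_flip_test(scores_a, scores_b):
    """Exact two-sided paired permutation test on per-fold score differences.

    Under the null hypothesis the sign of each paired difference is
    arbitrary, so all 2**n sign assignments are enumerated.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        if stat >= observed - 1e-12:  # tolerance for floating-point ties
            extreme += 1
    return extreme / total

# Placeholder per-fold scores for two methods (NOT taken from Table 1).
method_a = [2, 3, 4, 5, 6]
method_b = [1, 2, 3, 4, 5]
p_value = paired_sign_flip_test(method_a, method_b)
```

With n paired folds the smallest attainable two-sided p-value is 2/2^n, so a handful of folds cannot establish significance at the usual 0.05 level even when one method wins on every fold. That is a reason to report per-fold scores or more repetitions alongside the means.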


