
Authors

Mariia Sidulova, Xudong Sun, Alexej Gossmann

Abstract

Consideration of subgroups or domains within medical image datasets is crucial for the development and evaluation of robust and generalizable machine learning systems. To tackle the domain identification problem, we examine deep unsupervised generative clustering approaches for representation learning and clustering. The Variational Deep Embedding (VaDE) model is trained to learn lower-dimensional representations of images based on a Mixture-of-Gaussians latent space prior distribution while optimizing cluster assignments. We propose the Conditionally Decoded Variational Deep Embedding (CDVaDE) model which incorporates additional variables of choice, such as the class labels, as conditioning factors to guide the clustering towards subgroup structures in the data which have not been known or recognized previously. We analyze the behavior of CDVaDE on multiple datasets and compare it to other deep clustering algorithms. Our experimental results demonstrate that the considered models are capable of separating digital pathology images into meaningful subgroups. We provide a general-purpose implementation of all considered deep clustering methods as part of the open source Python package DomId (https://github.com/DIDSR/DomId).
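To make the conditioning mechanism concrete, below is a minimal PyTorch-style sketch of a conditionally decoded VAE in the spirit of CDVaDE: the decoder receives the latent code concatenated with a conditioning variable (e.g., a one-hot class label), so the latent space is encouraged to capture variation not already explained by that variable. This is an illustrative sketch, not the DomId implementation; all class names, layer sizes, and hyperparameters are assumptions, and the Mixture-of-Gaussians prior of the full model is omitted for brevity.

    import torch
    import torch.nn as nn

    class ConditionallyDecodedVAE(nn.Module):
        """Illustrative sketch (not the DomId implementation): a VAE whose
        decoder is conditioned on an extra variable d (e.g., a one-hot class
        label), pushing the latent z to encode structure beyond d."""

        def __init__(self, x_dim: int, z_dim: int, d_dim: int, h_dim: int = 256):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
            self.mu = nn.Linear(h_dim, z_dim)        # posterior mean
            self.log_var = nn.Linear(h_dim, z_dim)   # posterior log-variance
            # Decoder input: latent code concatenated with conditioning variable.
            self.decoder = nn.Sequential(
                nn.Linear(z_dim + d_dim, h_dim), nn.ReLU(),
                nn.Linear(h_dim, x_dim), nn.Sigmoid(),
            )

        def forward(self, x: torch.Tensor, d: torch.Tensor):
            h = self.encoder(x)
            mu, log_var = self.mu(h), self.log_var(h)
            # Reparameterization trick: z = mu + sigma * eps.
            z = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)
            x_hat = self.decoder(torch.cat([z, d], dim=1))  # conditional decoding
            return x_hat, mu, log_var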

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43993-3_64

SharedIt: https://rdcu.be/dnwN9

Link to the code repository

https://github.com/DIDSR/DomId

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This study proposes the Conditionally Decoded Variational Deep Embedding (CDVaDE) model, which integrates additional variables as conditioning factors to guide the clustering process.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The authors propose Conditionally Decoded Variational Deep Embedding (CDVaDE), which is a step forward from Variational Deep Embedding (VaDE).

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The weaknesses are restricted to the style of the article. It would be better if a figure of the whole framework were included.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors meet all criteria on the reproducibility checklist, except that a pre-trained model is not provided.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Overall solid work, with derivations and comparisons against previous methods. It would be better if a figure of the whole framework were included, along with the detailed training process.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Convincing results in the comparison between the proposed method and previous baseline methods.

  • Reviewer confidence

    Not confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The authors explored deep clustering models to identify significant subgroups within medical image datasets. The proposed CDVaDE model integrates a conditioning mechanism to direct the clustering model to focus on discovering previously unidentified image subgroups or domains.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Designed a novel VAE-based model that uses additional information (via conditional decoding) for clustering tasks.
    • Applied the method and compared it with competing algorithms on a relevant dataset, which helps readers compare and understand the proposed method.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Some additional numerical experiments could be conducted. For clustering methods, one key indicator is the stability of the method; it would be helpful to show the stability of the proposed CDVaDE model relative to competing models.

    • It is difficult to gain a high-level understanding of the proposed architecture; the paper lacks a comprehensive diagram describing the pipeline, and the presentation of Figure 1 is poor.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    n/a

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Fig. 1 is difficult to understand for readers not familiar with the VAE architecture; it would be good to revise it substantially and explicitly depict the proposed method.

    I don’t think you need to spend more than a page describing VaDE’s ELBO derivation; it does not provide much useful information, as these are essentially the standard ELBO derivation steps.
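    For reference, the standard VaDE evidence lower bound that this comment refers to can be written compactly as follows (following Jiang et al., 2017; shown here for context, not reproduced from the paper under review):

        \mathcal{L}_{\mathrm{ELBO}}(x)
          = \mathbb{E}_{q(z, c \mid x)}\big[\log p(x \mid z)\big]
          - D_{\mathrm{KL}}\big(q(z, c \mid x) \,\|\, p(z, c)\big),

    where the Mixture-of-Gaussians prior factorizes as p(z, c) = p(z \mid c)\, p(c), with p(c) = \mathrm{Cat}(\pi) and p(z \mid c) = \mathcal{N}(\mu_c, \sigma_c^2 I).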

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I think the presentation of the main methodology and the results should be further improved before publication, but the work is technically sound.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The authors proposed a Conditionally Decoded Variational Deep Embedding (CDVaDE) model that includes conditioning factors to identify unknown subgroup structures in the data. They compared the proposed CDVaDE model with two other methods, Variational Deep Embedding (VaDE) and Deep Embedded Clustering (DEC), on the Colored MNIST and HER2 datasets, and showed that CDVaDE could identify subgroups that were not associated with the known labels.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The mathematical description of the models is clear and detailed.

    2. The analysis and comparison across the three methods are extensive.

    3. The code will be made available, which helps to reproduce the results.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. My main concern is that there is insufficient quantitative evidence that the proposed CDVaDE method can identify meaningful subgroups. No statistical test was performed.

    2. In Section 3.1, how many images are in the Colored MNIST dataset? How were the train/validation/test sets split?

    3. In Section 3.1, what is the accuracy of CDVaDE in classifying digits in Colored MNIST? The conclusion is vague without quantitative metrics.

    4. In Fig. 2, what are the labels for the different colors?

    5. In Section 3.2, “we use a subset of this dataset consisting of 672 images.” How were the images selected? Were scores/classes balanced?

    6. In Fig. 3, the argument about visual differences is somewhat subjective. Would it be possible to use quantitative measures to evaluate the similarities or differences of the images in each domain?

    7. In Fig. 3, the domain labels on the x and y axes (0/1/2 vs. 1/2/3) do not match.

    8. “In Fig. 4, HER2/neu median scores of the three clusters move closer together, illustrating the decrease of association with HER2 class labels, as intended by the formulation of the CDVaDE model.” It would be better to add statistical tests here.

    9. I found it hard to immediately understand the key take-away messages from the figures. The authors might want to highlight the key take-aways in the figure captions.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The code will be made available if the paper is accepted.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. I would suggest that the authors come up with quantitative measures to demonstrate the practical value of CDVaDE.

    2. I suggest adding statistical tests to show whether CDVaDE differs from the baseline methods.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I think this paper is borderline and I tend to reject it. It proposes a technically solid method, but there is no sufficient quantitative measure to demonstrate that the subgroups identified by the proposed method are meaningful. In addition, the visualization is relatively poor.

  • Reviewer confidence

    Somewhat confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    I appreciate the authors’ response and understand that it is challenging to devise quantitative metrics to validate unsupervised clusters. I have changed my decision to weak accept because the authors agreed to discuss potential limitations of the present study and to improve the visualization in the revised manuscript.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The review comments are mixed. The authors may answer the reviewers’ questions during the rebuttal.




Author Feedback

We thank the meta-reviewer [MR] and all reviewers [R] for the relevant suggestions provided to improve our paper. We want to emphasize that the goal of this work is “identifying the unannotated subgroups”; therefore, devising appropriate quantitative measures for this unsupervised task is inherently difficult. As stated in “The Elements of Statistical Learning” (Hastie et al.), “In the context of unsupervised learning, there is no such direct measure of success.” Thus, CDVaDE should be viewed as an exploratory tool designed to unveil unknown subgroups in a dataset, utilizing a novel conditioning mechanism to guide the clustering.

[R3] expressed concerns about quantitative and statistical analysis of results presented in Fig. 3 and 4 for HER2 data. To address quantitative analysis concerns, we would like to draw attention to the last paragraph of Sec 3.2, where we discuss how the correlation with HER2 scores decreases for CDVaDE compared to VaDE and DEC, as intended by the conditioning on HER2 class. We also reported that the overlap between VaDE and CDVaDE cluster assignments is less than 50%. [R3] noted the lack of statistical testing for this model overlap. While statistical testing would enhance the evidence, a full and proper evaluation of the unsupervised models cannot be accomplished by statistical testing alone and must involve a “review by a pathologist” (refer to Sec. 4), which is beyond the scope of this paper. Finally, the HER2 dataset is relatively small and may not provide sufficient power to establish statistically significant effects given the number of comparisons. Given the presented comparisons, descriptive statistics, and our source code (a user-friendly Python package), we firmly believe that the lack of statistical tests should not preclude the publication of this work.
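As an illustration of how such a cluster-assignment overlap can be quantified, the following is a minimal sketch (not from the paper or rebuttal; the variable names and random data are placeholders) that matches cluster IDs with the Hungarian algorithm and also reports the adjusted Rand index:

    import numpy as np
    from scipy.optimize import linear_sum_assignment
    from sklearn.metrics import adjusted_rand_score, confusion_matrix

    def cluster_overlap(labels_a: np.ndarray, labels_b: np.ndarray) -> float:
        """Fraction of samples on which two clusterings agree after
        optimally matching cluster IDs (Hungarian algorithm)."""
        cm = confusion_matrix(labels_a, labels_b)
        row, col = linear_sum_assignment(-cm)  # maximize matched counts
        return cm[row, col].sum() / len(labels_a)

    # Placeholder assignments for two models (e.g., VaDE vs. CDVaDE).
    rng = np.random.default_rng(0)
    vade_clusters = rng.integers(0, 3, size=672)
    cdvade_clusters = rng.integers(0, 3, size=672)
    print("overlap:", cluster_overlap(vade_clusters, cdvade_clusters))
    print("ARI:", adjusted_rand_score(vade_clusters, cdvade_clusters))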

[R3] has highlighted the absence of a reported accuracy for MNIST digits. The main purpose of the Colored MNIST experiments was to show that, while other clustering models inadvertently tend to cluster by color (a trivial visual characteristic), CDVaDE can be conditioned to cluster based on more complex visual features (which may or may not correspond to the digit label). In future work, we will focus on developing specialized quantitative evaluation metrics; a sentence highlighting this issue will be added to the conclusion section.

[R2] has pointed out the issue of stability, which is a complex research area in ML in its own right. Our primary contribution in this paper is a novel clustering method, and a stability comparison could be included in future studies. Additionally, due to the exploratory nature of clustering, the demand for absolute stability is less pronounced than in trust-demanding deep-learning application scenarios.
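For completeness, one common way to probe clustering stability (a hypothetical sketch, not part of the paper) is to re-run the model across random seeds and measure the mean pairwise agreement of the resulting assignments, e.g., via the adjusted Rand index:

    import itertools
    import numpy as np
    from sklearn.metrics import adjusted_rand_score

    def stability_score(runs: list) -> float:
        """Mean pairwise adjusted Rand index across repeated clustering
        runs; values near 1 indicate a stable clustering."""
        scores = [adjusted_rand_score(a, b)
                  for a, b in itertools.combinations(runs, 2)]
        return float(np.mean(scores))

    # `fit_predict_with_seed` is a hypothetical wrapper that would train the
    # clustering model (e.g., CDVaDE) with a given random seed and return
    # per-sample cluster assignments:
    # runs = [fit_predict_with_seed(seed) for seed in range(5)]
    # print("stability (mean pairwise ARI):", stability_score(runs))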

Minor: Fig. 1 caused some confusion for [R1] and [R2]; however, we tried to align our visualization of CDVaDE with established practice in this field (Jiang, 2019). We describe the encoder and decoder architectures in detail in Sec. 2.1, but will add a diagram to the supplementary materials of the final paper. It appears that [R3] has some confusion regarding the datasets. To clarify, a total of 600 images per digit were taken at random from the original MNIST dataset, and the colors used in the experiment were evenly distributed across the digits. Each digit’s images were then divided into training and validation sets with an 80/20 split. The original HER2 dataset of 964 total image patches was separated into disjoint sets of 672 and 292 images; the latter subset was held out for future research. Splitting was performed on a per-slide basis (i.e., different data splits do not contain any patches from the same slide), and stratification based on HER2 class labels was employed to balance the subsets (refer to Sec. 3.2). [R3]: Fig. 2 depicts bar graphs labeled “Color”, where each color represents a specific color of the Colored MNIST digits; in the “Digit” plots, colors correspond to digit labels. The suggested edits will be incorporated.
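The per-slide, label-stratified splitting described above could be implemented, for example, with scikit-learn’s StratifiedGroupKFold; this is a hedged sketch with placeholder arrays, not the actual DomId code:

    import numpy as np
    from sklearn.model_selection import StratifiedGroupKFold

    # Placeholder arrays: one entry per image patch.
    rng = np.random.default_rng(0)
    her2_labels = rng.integers(1, 4, size=964)  # HER2 class label per patch
    slide_ids = rng.integers(0, 100, size=964)  # slide each patch came from

    # Group by slide so no slide appears in both subsets; stratify on labels.
    splitter = StratifiedGroupKFold(n_splits=3, shuffle=True, random_state=0)
    train_idx, holdout_idx = next(
        splitter.split(np.zeros(len(her2_labels)), her2_labels, slide_ids))
    print(len(train_idx), len(holdout_idx))  # roughly a 2:1 split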




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The reviewers agree to accept this paper after the rebuttal.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    In this paper, a deep clustering model, Conditionally Decoded Variational Deep Embedding (CDVaDE), is proposed to identify subgroups in histopathology image datasets. It is a novel VAE-based method for clustering tasks. The mathematical expressions and explanations of the CDVaDE method are very detailed and easy to follow. In the supplementary material, the authors add more visualization evidence. Despite the rebuttal, I still think further numerical experiments are necessary. Nevertheless, I am inclined to accept this paper.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The rebuttal has adequately addressed the reviewers’ concerns. The authors are encouraged to incorporate clarifications and improvements based on reviewers’ comments.


