
Authors

Jing Wei Tan, Won-Ki Jeong

Abstract

Contrastive learning has gained popularity due to its robustness and good feature representation performance. However, cosine distance, the commonly used similarity metric in contrastive learning, is not well suited to represent the distance between two data points, especially on a nonlinear feature manifold. Inspired by manifold learning, we propose a novel extension of contrastive learning that leverages geodesic distance between features as a similarity metric for histopathology whole slide image classification. To reduce the computational overhead in manifold learning, we propose geodesic-distance-based feature clustering for efficient contrastive loss evaluation using prototypes without time-consuming pairwise feature similarity comparison. The efficacy of the proposed method is evaluated on two real-world histopathology image datasets. Results demonstrate that our method outperforms state-of-the-art cosine-distance-based contrastive learning methods.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43987-2_66

SharedIt: https://rdcu.be/dnwKp

Link to the code repository

https://github.com/hvcl/Deep-Manifold-Contrastive-Learning

Link to the dataset(s)

N/A


Reviews

Review #2

  • Please describe the contribution of the paper

    The authors propose a method that leverages existing approaches for clustering, distance measurement in high-dimensional representations, self-supervised learning, and multiple instance learning. Although none of these techniques is new on its own, the paper introduces, for the first time, geodesic distance as the metric used to cluster tiles in their latent representation. While geodesic distances are not new (they underpin ISOMAP and other manifold learning approaches), to my knowledge this is the first time they are used in the context of histopathology. The authors also provide baseline comparisons with two SOTA contrastive learning methods that likewise cluster latent representations, although without geodesic distances. Some clarifications (see below) would make the paper clearer to the reader and easier to reference in future work.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Self-supervised learning (contrastive learning) and multiple instance learning remain two approaches that address well the need in computational pathology for robust feature representation learning and classification with weak labels. The paper is very clear, and structuring the loss function as a combination of intra- and inter-sub-cluster losses is in line with recent work showing interesting results.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    An interesting comparison to the geodesic distance would have been other non-parametric distance metrics known to capture non-linear relationships and preserve local neighborhoods (as geodesic distance does), such as the Wasserstein distance. I also did not understand the rationale for using two distance metrics in the clustering: Euclidean (initial classes) plus geodesic (sub-clusters). Could the authors explain why geodesic distance was not used for both clustering tasks? More a limitation than a weakness, and in the interest of making the paper easier to build on for future work, I had some comments on the data used to train and evaluate the method. The authors list 168 patients and 332 slides, i.e., on average two slides per patient; could the authors clarify how these were handled? Given the task of liver sample subtyping, several histological sections of the same patient at the same time point are likely to be similar, whereas sections from different time points (pre/post treatment) may add variability, in which case the 332 slides could be treated as independent. The same applies to the second liver cancer dataset.
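
    As a toy illustration of the suggested alternative metric, the 1-Wasserstein (earth mover's) distance between two patch-feature sets, each treated as a uniform empirical distribution, could be computed as sketched below. This assumes the POT (Python Optimal Transport) package and is not part of the reviewed paper; the function name is a placeholder.

```python
# Hedged sketch: 1-Wasserstein (EMD) distance between two feature point sets,
# each treated as a uniform empirical distribution. Assumes the POT package
# (pip install pot); not part of the reviewed paper.
import numpy as np
import ot  # Python Optimal Transport

def wasserstein_between_sets(a: np.ndarray, b: np.ndarray) -> float:
    """a: (n, D) and b: (m, D) patch feature sets."""
    cost = ot.dist(a, b, metric="euclidean")   # (n, m) ground-cost matrix
    wa = np.full(len(a), 1.0 / len(a))         # uniform weights on a
    wb = np.full(len(b), 1.0 / len(b))         # uniform weights on b
    return ot.emd2(wa, wb, cost)               # optimal transport cost = W1
```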

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    From the reproducibility checklist, the only outstanding issue I noticed is the sharing of the data and code (training/evaluation). The authors checked YES, but I do not see any link in the manuscript. Could one be added?

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    In the reproducibility response the authors checked yes for data and code availability (training/evaluation); however, there is no such link in the manuscript.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is well written and clear, with the exception of the points mentioned in the comments section.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    6

  • [Post rebuttal] Please justify your decision

    Given the authors' response, my evaluation stays the same, for the following reasons. The proposed contributions, (1) a new contrastive representation learning framework based on deep manifold embedding learning and (2) geodesic distance instead of cosine distance to describe the similarity between patches, justify the novelty. The authors' commitment to releasing the code would help the community test the approach on more challenging histopathology problems. Beyond the interesting methodological changes, the question is their impact. Classification of liver cancer subtypes is not a well-known problem (how difficult it is, and what specifically makes it difficult), as there are few public datasets. In my opinion, the number of patients (not the number of WSIs) is not large enough to capture the whole spectrum of histopathological variability of IHCCs. From the link provided in the paper, there is only one PAIP challenge (2019) in liver cancer, and (1) its data do not match the numbers reported here and (2) its task is tumour segmentation. There are other public-challenge histopathology datasets that the community knows better, on which the results could have helped to better evaluate the proposed changes. I agree with R4 that testing on a natural image benchmark dataset would also have been interesting and, taken together, would have made a stronger paper. In the interest of clarity, I suggest the authors add a table (as a supplementary file or in the README of the public repo) breaking down in detail the number of patients and WSIs used for training and testing.



Review #3

  • Please describe the contribution of the paper

    In contrastive learning, cosine distance is commonly used to measure the similarity between the features of two samples. In this paper, the authors propose to leverage geodesic distance as the similarity metric. Specifically, the input patches are grouped into sub-classes by constructing a graph and clustering subgraphs based on geodesic distance, and intra-subclass and inter-subclass losses are designed to update the parameters.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The main strength of the paper is the use of geodesic distance to measure sample similarity in contrastive learning. To implement the manifold learning, the authors construct a graph from the input image patches: the nodes are the feature vectors of the patches, and the edges are weighted by the Euclidean distances between neighboring patches. The geodesic distance matrix M is then obtained with Dijkstra's algorithm, and through M the graph is divided into n subsets. The authors design intra-class and inter-class distance losses to constrain the update of the patch features.
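
    For concreteness, a minimal PyTorch sketch of such prototype-based intra-/inter-sub-cluster terms is given below. It is illustrative only, not the paper's exact loss: the hinge/margin form of the inter-class term, the function name subcluster_losses, and the assumption that every sub-cluster id in {0, ..., n_clusters-1} is non-empty are choices of this sketch.

```python
# Illustrative sketch (not the paper's exact formulation) of intra-/inter-
# sub-cluster losses built from mean-feature prototypes.
import torch

def subcluster_losses(feats, labels, n_clusters, margin=1.0):
    """feats: (N, D) patch features; labels: (N,) sub-cluster ids in {0..n_clusters-1}."""
    # Prototype of each sub-cluster = mean of the features assigned to it
    # (assumes every sub-cluster is non-empty).
    protos = torch.stack([feats[labels == c].mean(dim=0) for c in range(n_clusters)])
    # Intra-sub-cluster term: pull each feature toward its own prototype.
    intra = (feats - protos[labels]).pow(2).sum(dim=1).mean()
    # Inter-sub-cluster term: push distinct prototypes at least `margin` apart.
    pdist = torch.cdist(protos, protos)        # (C, C) pairwise prototype distances
    off_diag = ~torch.eye(n_clusters, dtype=torch.bool, device=feats.device)
    inter = torch.clamp(margin - pdist[off_diag], min=0).mean()
    return intra, inter
```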

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The geodesic matrix is determined by the structure of the nearest-neighbor graph and will not be updated during training, which may hurt model performance;
    2. When calculating the intra-class loss, is it reasonable to use the mean of all patch features belonging to a class as the cluster center?
    3. The separability of the sample features shown in Figure 3 is very good, but the accuracy in Table 1 does not seem so satisfactory, which is confusing.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Good reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. The number of cluster centers is an important hyperparameter, so comparative experiments are needed to optimize it.
    2. Edges constructed from Euclidean distance may lose global information; feature similarity could be considered for computing the graph edges instead (a brief sketch of this alternative follows this list).
    3. In MIL classification, directly concatenating the patch features in each bag yields a high feature dimension, which may hurt model performance.
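
    A small sketch of one reading of point 2, i.e., weighting the k-nearest-neighbor graph edges by a feature-similarity-derived distance (1 - cosine similarity) instead of Euclidean distance. The function name and the choice of k are illustrative, not from the paper.

```python
# Hedged sketch: k-NN graph with cosine-similarity-based edge weights
# (1 - cosine similarity) instead of Euclidean distances.
import numpy as np
from sklearn.neighbors import kneighbors_graph

def cosine_knn_graph(features: np.ndarray, k: int = 10):
    """features: (N, D) patch feature vectors; returns a sparse, symmetric distance graph."""
    g = kneighbors_graph(features, n_neighbors=k, mode="distance",
                         metric="cosine", include_self=False)
    return g.maximum(g.T)  # symmetrize so the graph is undirected
```
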
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The authors propose a novel extension of contrastive learning that leverages geodesic distance between features. Based on graph construction, they design a computationally efficient manifold learning method.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    The authors provided a good response to the questions.



Review #4

  • Please describe the contribution of the paper

    – Propose a new contrastive representation learning framework based on deep manifold embedding learning.
    – Use geodesic distance instead of cosine distance to describe the similarity between patch features in contrastive learning.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    – A reasonable modification to contrastive representation learning: contrastive learning is performed on a manifold built from a nearest-neighbor graph using geodesic distance, which enables a more fine-grained similarity measurement than common contrastive learning methods.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    – The motivation for histopathology image recognition is unclear. The entire framework design seems general to image representation learning, and the stated motivation is not specific to histopathology image recognition, so presenting the methodology through histopathology WSI classification is confusing. It may be more appropriate to evaluate it on natural scene image datasets.
    – The experimental baseline is relatively low. The CNN backbone is VGG16 and the MIL classifier is simple, both of which are outdated, so the advantages of the proposed method within SOTA WSI classification pipelines cannot be shown.
    – The comparison is incomplete. Commonly applied contrastive learning methods, including MoCo v2/v3, SimCLR, BYOL, and DINO, are not compared.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The datasets are publicly available. The authors state that the code will be released.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    – Discuss the specific motivation of the proposed method for histopathology image recognition.
    – Update the CNN backbone and the WSI classification method (MIL-, graph-, or Transformer-based) in the experiments, and then re-evaluate the proposed method. As shown in Table 1, the PCL and HCSC methods you compared cannot surpass pre-trained VGG16.
    – Compare the proposed method with more commonly applied contrastive representation learning methods.
    – The number of epochs between re-clusterings seems to be a crucial hyperparameter; I suggest the authors discuss the sensitivity of this setting with quantitative experiments. Similarly, the prototype number needs to be tuned.
    – Provide brief descriptions or references for Dijkstra's algorithm, geodesic distance, Hausdorff distance, and agglomerative clustering to make the paper more self-contained (a short sketch of these building blocks follows this list).
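
    As a brief, hedged illustration of the building blocks named in the last point (not the authors' code), the snippet below runs agglomerative clustering directly on a precomputed geodesic distance matrix and computes a symmetric Hausdorff distance between two sub-clusters. The function names, the average linkage, and the assumption of a connected neighborhood graph (no infinite entries in the matrix) are choices of this sketch.

```python
# Hedged sketch of the referenced building blocks: agglomerative clustering on a
# precomputed geodesic distance matrix, and a symmetric Hausdorff distance.
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from scipy.spatial.distance import directed_hausdorff

def subcluster_by_geodesic(geo: np.ndarray, n_clusters: int) -> np.ndarray:
    """geo: (N, N) geodesic distances with no infinite entries (connected graph)."""
    # metric="precomputed" (sklearn >= 1.2; older versions use affinity=)
    # lets the clustering operate directly on the geodesic distances.
    model = AgglomerativeClustering(n_clusters=n_clusters,
                                    metric="precomputed", linkage="average")
    return model.fit_predict(geo)  # one sub-cluster label per patch

def hausdorff(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Hausdorff distance between point sets a: (n, D) and b: (m, D)."""
    return max(directed_hausdorff(a, b)[0], directed_hausdorff(b, a)[0])
```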

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The motivation is unclear. The experiments are less convincing.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The paper presents a contrastive learning method for histopathology image classification. The use of geodesic distance is interesting. However, there’s a lack of comparison with other popular contrastive learning methods and the backbone classification model is quite old. More direct comparison with methods that are specifically designed for histopathology image classification is needed as well.




Author Feedback

  1. Use of Euclidean distance [R2, R3] In geodesic distance computation, Euclidean distance serves as a foundational step: it is used to find the nearest neighbors and build a graph representing the data manifold. The geodesic distance is then defined as the shortest-path distance on this graph, which provides a more accurate measure of proximity that accounts for the underlying structure and relationships within the data.
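
  A minimal sketch of this two-step computation (illustrative only, not the released code; the function name and the choice of k are placeholders): a Euclidean k-nearest-neighbor graph is built from the patch features, and geodesic distances are read off as shortest-path distances via Dijkstra's algorithm.

```python
# Illustrative sketch: geodesic distance matrix from patch features via a
# Euclidean k-NN graph and Dijkstra shortest paths. Not the released code.
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import dijkstra

def geodesic_distance_matrix(features: np.ndarray, k: int = 10) -> np.ndarray:
    """features: (N, D) array of patch feature vectors."""
    # Euclidean distances to the k nearest neighbors define the graph edges.
    knn = kneighbors_graph(features, n_neighbors=k, mode="distance",
                           include_self=False)
    knn = knn.maximum(knn.T)  # symmetrize: treat the neighborhood graph as undirected
    # Geodesic distance = shortest-path distance on the neighborhood graph.
    return dijkstra(csgraph=knn, directed=False)  # (N, N); np.inf if disconnected
```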

  2. Clarification of training and testing samples [R2] It is true that each patient case consists of two slides, but we split the training and testing sets by patient, i.e., no patient's slides appear in both training and testing.

  3. Reproducibility [R2] We will release the source code and related data to the public upon acceptance.

  4. Updating of geodesic matrix [R3] The geodesic matrix is updated every five training epochs as mentioned in Section 3.2.

  5. Confusion between Figure 3 and Table 1 [R3] Figure 3 illustrates the per-patch feature embedding before MIL classification; it is only an example to show the difference between geodesic and cosine distance and does not represent the entire case.

  6. Ablation study on number of prototypes [R3, R4] We conducted an ablation study on the number of prototypes; the accuracies for 2, 10, 20, 40, and 100 prototypes were 73.65%, 74.85%, 77.03%, 73.35%, and 74.02%, respectively. Due to the page limit, we reported only the best case in Table 1.

  7. Motivation for histopathology image recognition [R4] Histopathology image classification is known to be inherently ambiguous and difficult even for pathologists due to histological incompleteness, subjectiveness, and low mutation rates. Contrastive learning combined with multiple instance learning is commonly used for WSI classification, but its performance is limited by the nature of histopathology images (i.e., unclear texture differences). Our motivation is to show that geodesic distance is effective for representing WSI features because it captures intrinsic relationships and structural information better than conventional cosine-distance methods. To the best of our knowledge, this is the first study of manifold learning for histopathology image classification. As the reviewer commented, applying our method to natural image classification would be an interesting future research direction.

  8. Simple and old backbone and MIL [R4] The primary objective of our study is to enhance representation (feature) learning through a novel geodesic-distance-based loss. Hence, we aim to emphasize the relative improvement over the baseline when integrating our proposed manifold loss. Furthermore, our approach is not tied to a particular encoder or multiple instance learning (MIL) method; it can be seamlessly integrated with them, treating them as a black box, so any newer encoder or MIL method can be incorporated. Note also that although VGG16 is an older architecture, it continues to be widely employed and has demonstrated its effectiveness, particularly in the medical image domain, for example in LIDP [MICCAI22], DeepMIF [MICCAI22], and Zhang et al. [CVPR23].

  9. Comparison with common contrastive learning methods [R4] In fact, we also compared our method with the SimCLR (NT-Xent) loss in Table 3. To ensure a fair comparison, we modified the inputs by using the anchor and its prototype instead of two augmented views of the anchor. These experiments are labeled "cosine distance" in Table 3 to highlight the distinction between geodesic distance and the cosine distance commonly used in contrastive learning. Furthermore, we chose PCL and HCSC for the SOTA comparison because these methods incorporate prototypes and handle hierarchical label tasks similarly to our method.
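
  For clarity, a hedged sketch of such an NT-Xent-style baseline, where each anchor's positive is its prototype rather than a second augmented view (an illustration of the idea, not the actual implementation; the function name and temperature value are placeholders):

```python
# Hedged sketch: NT-Xent / InfoNCE-style loss with (anchor, prototype) pairs
# and cosine similarity. Illustration only, not the paper's implementation.
import torch
import torch.nn.functional as F

def nt_xent_anchor_prototype(anchors, prototypes, temperature=0.1):
    """anchors: (B, D); prototypes: (B, D), prototypes[i] is the positive of anchors[i]."""
    a = F.normalize(anchors, dim=1)
    p = F.normalize(prototypes, dim=1)
    logits = a @ p.t() / temperature                      # (B, B) cosine similarities
    targets = torch.arange(a.size(0), device=a.device)    # positive of row i is column i
    return F.cross_entropy(logits, targets)               # other prototypes act as negatives
```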




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The rebuttal has provided relatively satisfactory responses to most of the comments. The authors should revise the paper to clarify all the points.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper introduces a novel approach that leverages geodesic distance as a similarity metric. The proposed method involves grouping input patches into sub-classes by constructing a graph and clustering subgraphs based on geodesic distance.

    During the first round of review, the reviewers appreciated the innovative idea of using geodesic distance instead of cosine distance. However, they also raised questions regarding the rigor of the evaluation, the use of an outdated backbone design, and the lack of comparison with other methods. The authors responded with a comprehensive rebuttal, summarizing and addressing these concerns. As a result, the paper received two positive reviews and one negative review.

    I find the overall idea of using geodesic distance in contrastive learning to be interesting. The results demonstrate superior performance compared to PCL and HCSC methods. However, the lack of rigor in the evaluation is a noticeable drawback. Thus, it is a borderline case for me.

    Taking into account the innovative idea of using geodesic distance, my recommendation leans towards acceptance.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This is a technique aimed at effectively implementing contrastive learning. Upon reviewing the reviews and the authors’ rebuttal, the authors have addressed the majority of the issues effectively. It is crucial to include comparisons with other recent contrastive learning techniques in the final version.


