
Authors

Lanfeng Zhong, Xin Liao, Shaoting Zhang, Guotai Wang

Abstract

Segmentation of pathological images is a crucial step for accurate cancer diagnosis. However, acquiring dense annotations of such images for training is labor-intensive and time-consuming. To address this issue, Semi-Supervised Learning (SSL) has the potential for reducing the annotation cost, but it is challenged by the large number of unlabeled training images. In this paper, we propose a novel SSL method based on Cross Distillation of Multiple Attentions (CDMA) to effectively leverage unlabeled images. Firstly, we propose a Multi-attention Tri-branch Network (MTNet) that consists of an encoder and a three-branch decoder, with each branch using a different attention mechanism that calibrates features in different aspects to generate diverse outputs. Secondly, we introduce Cross Decoder Knowledge Distillation (CDKD) between the three decoder branches, allowing them to learn from each other’s soft labels to mitigate the negative impact of incorrect pseudo labels in training. Additionally, uncertainty minimization is applied to the average prediction of the three branches, which further regularizes predictions on unlabeled images and encourages inter-branch consistency. Our proposed CDMA was compared with eight state-of-the-art SSL methods on the public DigestPath dataset, and the experimental results showed that our method outperforms the other approaches under different annotation ratios.
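
As a rough illustration of the unlabeled-data objective described above, the PyTorch-style sketch below combines cross-decoder knowledge distillation between three branch outputs with entropy (uncertainty) minimization on their average prediction. The function names, temperature value, pairing scheme, and detaching of the "teacher" branch are assumptions made for illustration, not the paper's exact formulation.

    import torch
    import torch.nn.functional as F

    def cdkd_loss(student_logits, teacher_logits, T=2.0):
        # KL divergence between temperature-softened predictions of two decoder
        # branches; the teacher branch is detached so only the student is updated.
        log_p_student = F.log_softmax(student_logits / T, dim=1)
        p_teacher = F.softmax(teacher_logits / T, dim=1).detach()
        return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)

    def unlabeled_losses(logits1, logits2, logits3, T=2.0):
        branches = [logits1, logits2, logits3]
        # Cross distillation: each branch learns from the soft labels of the
        # other two branches (six ordered pairs in total).
        l_cdkd = sum(cdkd_loss(s, t, T)
                     for i, s in enumerate(branches)
                     for j, t in enumerate(branches) if i != j) / 6.0
        # Uncertainty (entropy) minimization on the average prediction.
        p_mean = torch.stack([F.softmax(z, dim=1) for z in branches]).mean(dim=0)
        l_um = -(p_mean * torch.log(p_mean + 1e-6)).sum(dim=1).mean()
        return l_cdkd, l_um

In practice, these terms would be added with suitable weights to a supervised segmentation loss computed on the labeled images; the weights and temperature used in the paper are not reproduced here.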

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43987-2_55

SharedIt: https://rdcu.be/dnwKa

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #4

  • Please describe the contribution of the paper

    In this paper, the authors proposed a semi-supervised learning method based on mutual knowledge distillation and entropy minimization for semantic segmentation in histopathology images. Their approach leverages soft pseudo-labels; to reduce the negative impact of incorrect pseudo-labels, three decoders with different attention mechanisms were designed, each aiming to learn features with a different focus. To encourage consistent predictions from the three decoders, entropy minimization of the average prediction of the three decoders was proposed. The authors validated their design on a colorectal cancer dataset and showed improved performance against baselines.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Semi-supervised learning for semantic segmentation in histopathology is of great interest to the community
    • Good list of baseline comparisons and extensive ablation studies
    • The proposed attention-based decoders as a mutual knowledge distillation mechanism seem intuitive and largely well motivated
    • The writing is very easy to follow and the organization is very clear
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The motivation behind some design choices of the mutual knowledge distillation from decoders with three different attention mechanisms is unclear:
       1.1 The authors state that the three decoders focus on different features, but would it be more effective to have three different encoders that extract features from different aspects? The decoder is needed for high-resolution mask generation, but the main information extraction is largely done by the encoder.
       1.2 It is unclear to me what the considerations were in designing three decoders (i.e., three attention mechanisms) and which kinds of attention mechanisms to choose. In the ablation studies, the performance of 2 vs. 3 decoders (each with a different attention mechanism) seemed to differ by only <0.15%-0.25% Dice. Without a reported significance level, it is hard to say that 3 sets of decoder predictions are better than 2.
    2. Considering the aforementioned ablation studies, it seems unclear whether the three attention mechanisms actually learn a different focus.
    3. It is unclear how the small portion of labeled images (5% and 10%) was selected (was it selected randomly?) and how many repeats were performed.
    4. In Table 1, although the proposed method had the highest mean Dice and JI in most cases, a few baselines with lower mean Dice/JI showed no statistically significant difference in performance compared to the proposed method. Together with Bullet Point 3, a difference in mean performance alone does not necessarily imply statistical significance, especially when the variation between experimental runs is very large (one third or more of the mean performance). As such, it is unclear whether the ablation studies with no reported significance levels reflect actual performance differences among the different settings.
    5. Only one dataset was used for experimentation.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    It should not be very difficult to reproduce the results of the paper, though there are some concerns:

    1. It is not clear whether the code will be publicly available (not mentioned in the paper, though mentioned in the reproducibility checklist).
    2. No details are given about how the small portion of labeled images was selected or how many repeats were performed.
    3. No statistical significance was reported for the ablation studies.
  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. As mentioned in the weakness section, the choice of both the number and type of attention mechanisms in the decoders is unclear. I would suggest assessing the following: is this a generic approach with which one can use various combinations of attention mechanisms, or would only the included attention mechanisms (spatial/channel/spatial+channel attention) work? What about considering multi-headed attention and/or transformers?
    2. As mentioned in the weakness section, learning with different attention mechanisms in the encoders might be an alternative or better option to extract more diverse features for mutual teaching.
    3. To demonstrate that the three attention mechanisms actually learn a different focus, it would be helpful to leverage activation maps or other explainable-AI methods.
    4. It would greatly help to validate the effectiveness of the proposed method if the authors also tested the design on other datasets, such as cell segmentation (MoNuSeg and PanNuke) and segmentation in other modalities (MRI, CT, etc.).
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The topic is of interest to the community. The proposed mutual knowledge distillation mechanism seems reasonably motivated, though I have some concerns about the design choices. The novelty seems limited, as similar mutual teaching mechanisms are common approaches and similar attention mechanisms were leveraged in, for example, “A Two-Stream Mutual Attention Network for Semi-supervised Biomedical Segmentation with Noisy Labels”, AAAI 2019.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #1

  • Please describe the contribution of the paper

    This paper proposes cross decoder knowledge distillation (CDKD) for semi-supervised learning. To apply CDKD, the paper proposes a Multi-attention Tri-branch Network (MTNet) that has three decoders with three different attention mechanisms (channel, spatial, and channel+spatial). The decoders can learn from unlabeled data through each other’s soft labels. The proposed method was compared with many SOTA methods and achieved the best performance. In addition, an ablation study was conducted to show the effectiveness of each module.
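
    As a point of reference for the three attention mechanisms mentioned above, the sketch below shows generic channel and spatial attention modules in PyTorch. These are common designs given only for orientation; the paper's actual modules (and its channel+spatial combination) may differ.

        import torch
        import torch.nn as nn

        class ChannelAttention(nn.Module):
            # Squeeze-and-excitation style channel attention (assumed design).
            def __init__(self, channels, reduction=8):
                super().__init__()
                self.fc = nn.Sequential(
                    nn.AdaptiveAvgPool2d(1),
                    nn.Conv2d(channels, channels // reduction, kernel_size=1),
                    nn.ReLU(inplace=True),
                    nn.Conv2d(channels // reduction, channels, kernel_size=1),
                    nn.Sigmoid(),
                )

            def forward(self, x):
                # Re-weight each channel by its learned importance.
                return x * self.fc(x)

        class SpatialAttention(nn.Module):
            # Spatial attention computed from channel-wise pooled statistics.
            def __init__(self, kernel_size=7):
                super().__init__()
                self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

            def forward(self, x):
                avg_map = x.mean(dim=1, keepdim=True)
                max_map, _ = x.max(dim=1, keepdim=True)
                attn = torch.sigmoid(self.conv(torch.cat([avg_map, max_map], dim=1)))
                return x * attn

    A third branch could apply both modules in sequence to obtain a combined channel+spatial attention, which is one way the three decoders could be made to calibrate features differently.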

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The idea of introducing cross decoder knowledge distillation (CDKD) for semi-supervised learning is novel and interesting.
    • The idea of introducing three different attention mechanisms (channel, spatial, and channel+spatial) for CDKD is also an interesting idea.
    • The proposed method was compared with many SOTA methods and achieved the best performance.
    • The ablation study was conducted to show the effectiveness of each module.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The evaluation was conducted using only one dataset.
    • It is unclear what kind of statistical test was performed. Did the authors correct for multiple comparisons?
    • When 10% of the data was labeled, MTNet (ensb) was better than the proposed method (MTNet + L_{cdkd} + L_{um}). Why? The explanation of MTNet (ensb) is not sufficient. How is L_{cdkd} applied after ensembling the three branches?
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    It is OK if the code will be made available after acceptance.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Please see the weaknesses.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    As noted in the strengths, the idea of introducing cross decoder knowledge distillation (CDKD) for semi-supervised learning is novel and interesting. The evaluation was well conducted to show the effectiveness of the proposed method. Therefore, my decision is “accept”.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    This paper proposes a multi-branch network for the semi-supervised pathological segmentation task. The model is encouraged to produce consistent predictions from decoders with different attention mechanisms. Experiments on the DigestPath dataset demonstrate its effectiveness.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Good performance for the semi-supervised pathological segmentation task;
    2. Extensive experiments on the DigestPath dataset.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Some typos and grammatical errors;
    2. The authors claimed that soft labels would be more effective in such a setting. However, some related works advocate that entropy minimization is also important, for example:
       [1] Xu, Zhe, et al. “All-around real label supervision: Cyclic prototype consistency learning for semi-supervised medical image segmentation.” IEEE Journal of Biomedical and Health Informatics 26.7 (2022): 3174-3184.
       [2] Wu, Yicheng, et al. “Exploring smoothness and class-separation for semi-supervised medical image segmentation.” MICCAI 2022.
       [3] Wu, Zhonghua, et al. “Dual Adaptive Transformations for Weakly Supervised Point Cloud Segmentation.” ECCV 2022.
       Please include these papers in the discussion and state the differences from them.
    3. For pathological segmentation tasks, is there some unique challenge related to SSL? Such essential analysis is lacking in this paper.
    4. Here, T-softmax is a kind of sharpening function (see the note after this list); the authors are suggested to revise some over-claimed statements.
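
    For reference on point 4, the standard temperature softmax over class logits z_c can be written (a generic definition, not restated from the paper) as:

        q_c = \frac{\exp(z_c / T)}{\sum_{c'} \exp(z_{c'} / T)}

    where a temperature T < 1 sharpens the output distribution and T > 1 softens it.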
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Good Reproducibility

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. See the above weaknesses;
    2. Add more recent comparisons, since semi-supervised segmentation is a widely studied task;
    3. Improve the language and revise some claims.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Overall, this paper is sound for semi-supervised pathological segmentation. Considering its good performance on this specific task, I lean toward “weak accept”.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    All three reviewers hold positive views on this paper, acknowledging the innovation and effectiveness of its key method, CDKD. I have decided to provisionally accept this paper. However, I suggest that the authors carefully consider the shortcomings pointed out by the reviewers, such as the weaknesses raised by Reviewer 4, and address them in the final version.




Author Feedback

N/A


