
Authors

Md Abdul Kadir, Hasan Md Tusfiqur Alam, Daniel Sonntag

Abstract

Active learning algorithms have become increasingly popular for training models with limited data. However, selecting data for annotation remains a challenging problem due to the limited information available on unseen data. To address this issue, we propose EdgeAL, which utilizes the edge information of unseen images as a priori information for measuring uncertainty. The uncertainty is quantified by analyzing the divergence and entropy in model predictions across edges. This measure is then used to select superpixels for annotation. We demonstrate the effectiveness of EdgeAL on multi-class Optical Coherence Tomography (OCT) segmentation tasks, where we achieved a 99% Dice score while reducing the annotation label cost to 12%, 2.3%, and 3%, respectively, on three publicly available datasets (Duke, AROI, and UMN).
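
As a concrete illustration of the approach the abstract describes, the sketch below shows one way edge-weighted uncertainty scoring and region selection could be implemented. This is not the authors' code: EdgeAL's exact scoring follows the equations in the paper, superpixels replace the fixed grid patches used here, and all names (`edge_weighted_uncertainty`, `select_regions`, `patch`, `budget`) are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage


def edge_weighted_uncertainty(probs, image, eps=1e-8):
    """Score each pixel by predictive entropy plus MC-pass divergence,
    weighted by an exponential edge prior from the raw image.

    probs: (T, C, H, W) softmax outputs from T stochastic (MC-dropout) passes.
    image: (H, W) grayscale OCT slice.
    """
    mean_p = probs.mean(axis=0)                              # (C, H, W)
    entropy = -(mean_p * np.log(mean_p + eps)).sum(axis=0)   # (H, W)
    # Mean KL divergence of each stochastic pass from the averaged prediction.
    kl = (probs * (np.log(probs + eps) - np.log(mean_p + eps))).sum(axis=1).mean(axis=0)
    # Edge prior: exponential of the gradient magnitude, so strong (likely
    # genuine) edges dominate weak, noise-induced gradients.
    grad = np.hypot(ndimage.sobel(image, axis=0), ndimage.sobel(image, axis=1))
    edge_w = np.exp(grad / (grad.max() + eps))
    return edge_w * (entropy + kl)                           # (H, W)


def select_regions(score_map, patch=16, budget=8):
    """Rank fixed grid patches (a stand-in for superpixels) by mean score."""
    H, W = score_map.shape
    coords = [(i, j) for i in range(0, H - patch + 1, patch)
                     for j in range(0, W - patch + 1, patch)]
    coords.sort(key=lambda ij: score_map[ij[0]:ij[0] + patch,
                                         ij[1]:ij[1] + patch].mean(), reverse=True)
    return coords[:budget]
```

The top-ranked regions would then be sent to an annotator, the model retrained, and the loop repeated until the labelling budget is exhausted.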

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43895-0_8

SharedIt: https://rdcu.be/dnwxQ

Link to the code repository

https://github.com/Mak-Ta-Reque/EdgeAL

Link to the dataset(s)

AROI: https://ipg.fer.hr/ipg/resources/oct_image_database

DUKE: https://people.duke.edu/~sf59/Chiu_BOE_2014_dataset.htm

UMN: http://people.ece.umn.edu/users/parhi/.DATA/OCT/DME


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper proposed an active learning method for OCT segmentation. Uncertain samples are selected for annotation. The method has been validated on three datasets and achieved good results on all of them.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    -Active learning is a nice topic in the community. The method has been validated on three public datasets with extensive experiments.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    -The main weakness is the lack of novelty. Using uncertainty sampling for active learning is not novel from a technical perspective [1].

    • As only part of the image is labeled, my concern is that this partial label will cause performance degradation.
    • Selecting only uncertain samples will be less representative of the overall sample space.

    [1] Nguyen, Vu-Linh, Mohammad Hossein Shaker, and Eyke Hüllermeier. “How to measure uncertainty in uncertainty sampling for active learning.” Machine Learning 111, no. 1 (2022): 89-122.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The parameters have been clearly presented, the dataset is public, and the code will be released.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    • Will the annotation area cause any sample imbalance problem? For the sample in Figure 2, only the left corner is annotated, which can be seen as a noisy label. Will this kind of low-quality label damage the performance of the network?
    • In Figure 3, why is the % of the dataset labelled not the same for the different datasets?
    • The performance of MAR and RMCDR differs considerably across the three datasets; what is the reason?
    • What are the key components that make EdgeAL better than MCDR and RMCDR?
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    3

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The novelty is not strong.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    They have replied to all my questions, and I appreciate their hard work.



Review #2

  • Please describe the contribution of the paper

    The authors propose an active learning approach based on the entropy and divergence across edges to select superpixel regions for OCT image segmentation. During evaluation on 3 datasets, their method is shown to outperform random sampling and active learning baselines.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Extensive evaluation: The proposed method is compared against 8 active learning baselines in addition to random sampling on 3 datasets. Further experiments are conducted with 5-fold cross-validation and different model architectures. The results demonstrate the generalisability and consistent improvements of the approach across different networks and weight initialisations.

    • Labelling budgets: A realistically small initial labelled set has been used.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Since entropy and divergence scores have been leveraged for active learning in previous works, the integration of edge information constitutes limited novelty in technical contributions.

    • The limitations of the proposed approach have not been discussed and the future directions should be further elaborated.

    • The placement of Tables 1 and 2, and Figure 3 can be adjusted to improve readability.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The implementation details for the models and hyperparameters are generally defined, and the datasets are publicly available. However, it is unclear how the labelled validation set is being constructed from the 20% pool and whether this labelled set is being varied across different labelling budgets.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    • A more recent hybrid uncertainty-diversity active learning baseline should be used for comparison, e.g. BADGE.

    • The discussion should be extended to include limitations of the proposed method and to elaborate on future directions.

    • To improve organisation of the paper and readability, the following edits are recommended:

      • the competing methods should be defined before the first reference to the results in Table 1.
      • Figure 3 is incorrectly referred to as Figure 1 in the text. The plots should be closer to the reference in the text.
      • The dataset used is not mentioned in the caption of Table 2.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The rating is based on the extensive evaluation process in which the proposed method is shown to outperform the baselines. However, a higher rating has not been awarded due to the limited technical novelty.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    The authors have elaborated on their technical contributions and justified their choice of baselines (fairly reasonably). There is no change to the overall score.



Review #3

  • Please describe the contribution of the paper

    This paper proposed a novel edge estimation-based active learning approach that achieved impressive experimental results on active learning for OCT segmentation.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The direction of using edges as a prior is promising, as edges contain more variance than the bulk area of the segmentation mask. Also, edges actually receive more importance and attention during the annotation procedure.

    2. The proposed use of edge entropy and edge divergence is novel and sensible.

    3. The experimental results are impressive compared to strong baselines like CoreSet.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The baselines are relatively old.
    2. An ablation study showing the effectiveness of each design component is lacking.
    3. Focusing only on edges might make the model ignore other aspects.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    no obvious issues

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. The baselines compared in this paper are relatively old. It would be better to illustrate the effectiveness of the method against more recent baselines such as Learning Loss [1] and ISAL [2].
    2. It would be more comprehensive to show the effectiveness of each design (edge entropy, divergence, and superpixels) via an ablation study, applying each of them to the method alone or in different combinations.

    [1]Donggeun Yoo and In So Kweon. Learning loss for active learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 93–102, 2019.

    [2] Zhuoming Liu, Hao Ding, Huaping Zhong, Weijia Li, Jifeng Dai, and Conghui He. Influence selection for active learning. arXiv preprint arXiv:2108.09331, 2021.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The direction being explored, edges, is promising, and the proposed method also makes sense. Although the baseline methods are relatively old, the results are still impressive. Stronger baselines and comprehensive ablation studies would make the effectiveness more convincing.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    4

  • [Post rebuttal] Please justify your decision

    I agree with R1’s view that the novelty from the uncertainty perspective is not sufficient. Although I still believe edges are something that deserves focus, the lack of an ablation study leaves the effectiveness of each component vague, which leaves me and R1 with the same question: what are the key components that make EdgeAL better than MCDR and RMCDR? This question does not seem to have been resolved in a quantitative way.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The paper proposes a method for active learning using edge information, applied to OCT images. I am not convinced of the method’s novelty, as uncertainty is a well-established concept. Moreover, the concept of using only edge divergence is not theoretically sound, as edge information is not the only criterion for OCT image segmentation. Experimental details and implementation are not clear, and the baselines are very old methods.




Author Feedback

Thank you for your comments and observations. We acknowledge that our terminology “uncertainty measure” may have been misleading, which we deeply regret. As highlighted in the paper suggested by Reviewer 1, the superiority of epistemic uncertainty measures over evidence-based uncertainty is well established. For approximating uncertainty, we utilize Monte Carlo dropout (also known as Bayesian model averaging), outlined by Equation 2 in our methodology, which is rooted in the concept of epistemic sampling for uncertainty measurement. On top of that, we have repurposed Contextual Calibration, a technique originally used in few-shot learning for language models (reference paper [30]), for finding uncertainty across edges. Our implementation of Contextual Calibration uses an exponential function strategically designed to favor stronger gradients, typically associated with genuine image edges, over gradients likely induced by noise. It enhances the precision of our analysis by effectively distinguishing authentic image features from noise artifacts. This strategy, while seemingly straightforward and well proven in non-ML-based segmentation, has been largely unexplored in AL. This makes our method distinct from MCDR and RMCDR and yields a significant enhancement in model performance by efficiently emphasizing and quantifying uncertainty in these areas.
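
For context, the snippet below is a minimal sketch of the generic Monte Carlo dropout averaging the rebuttal refers to, not a reproduction of the paper's Equation 2; the function name and parameters are assumed for illustration.

```python
import torch


@torch.no_grad()
def mc_dropout_predict(model, x, n_passes=20):
    """Monte Carlo dropout (approximate Bayesian model averaging):
    keep dropout layers stochastic at inference and average the
    softmax outputs over several forward passes."""
    model.eval()                               # freeze batch-norm statistics
    for m in model.modules():                  # ...but re-enable dropout only
        if isinstance(m, (torch.nn.Dropout, torch.nn.Dropout2d)):
            m.train()
    probs = torch.stack([torch.softmax(model(x), dim=1)
                         for _ in range(n_passes)])
    return probs.mean(dim=0), probs.var(dim=0)  # averaged prediction, disagreement
```

The per-pixel variance (or a divergence between passes and their mean) serves as the epistemic uncertainty signal that query strategies like RMCDR, and EdgeAL's edge-weighted variant, build on.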

The Reviewer’s concerns regarding the partial labeling of images are understandable. Challenges with noisy labels persist in both partial and full annotation, especially with active learning algorithms. One solution is to evaluate model performance after each active learning iteration. Also, to avoid overfitting or performance decay, it is common to retrain models on random samples after each active learning step. This also applies to EdgeAL, where these strategies can help maintain high-quality annotations and optimal performance. Moreover, in partial annotation it is highly likely that at least some parts of all unlabelled images are annotated, whereas in full-image annotation some unlabelled images are completely ignored due to active selection. As a result, the partial annotation method ensures a more uniform sampling representation of the overall sample space.

The % of the dataset labeled in Figure 3 is not the same across datasets due to the different sizes of the training sets. The labeled pixel % is determined by the ratio of annotated pixels to the total number of pixels in the training set.
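
A minimal illustration of that metric, assuming binary annotation masks; the helper name and signature are hypothetical:

```python
import numpy as np


def labeled_pixel_percentage(annotated_masks, total_train_pixels):
    """Percentage of annotated pixels relative to all pixels in the training set.

    annotated_masks: iterable of binary numpy arrays (1 = annotated pixel).
    total_train_pixels: sum of H * W over every image in the training set.
    """
    annotated = sum(int(np.count_nonzero(m)) for m in annotated_masks)
    return 100.0 * annotated / total_train_pixels
```

Because the denominator differs per dataset, the same annotation budget yields different labelled percentages on Duke, AROI, and UMN.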

(Performance difference of MAR and RMCDR): Unlike MAR, RMCDR accounts for epistemic uncertainty. When trained on the balanced AROI dataset (with eight annotated classes), the model effectively learns the data distribution. As MAR lacks this consideration, queries from MAR may be less meaningful, while RMCDR performs better due to its incorporation of epistemic uncertainty. However, with the imbalanced UMN dataset (two classes, 95:5 ratio), the model struggles to grasp the data distribution. RMCDR’s performance dips because it relies on this model knowledge. Conversely, MAR, which considers evidence-based uncertainty, copes better with the imbalanced seed data set.

Thank you for your suggestion regarding comparison with BADGE (2020), Learning Loss (2019), and ISAL (2021). While these methods are certainly valuable, they come with a relatively high computational cost due to the requirement of backpropagation or learning, which becomes increasingly demanding with volumetric data such as Optical Coherence Tomography (OCT). Since OCT data comprises numerous slices from each patient, swift computation becomes essential for annotating one patient’s data. Furthermore, these suggested methods have not been extensively explored for segmentation tasks. To maintain relevance and comparability, we chose to benchmark our method against RMCDR (2018), CORESET (2017), MAXRPR (2017), and CONF (2016), which are well established for segmentation tasks. This provides a fair and feasible comparison while fulfilling the objective.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The rebuttal does a fair job of addressing the reviewers’ concerns. However, the questions around the key contributions of the paper persist, and not all reviewers are convinced.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper presents a framework for OCT image segmentation based on active learning. The reviewers questioned the novelty of the paper and the comparative experiments. After the rebuttal, the authors elaborated on the reasons for their choice of baselines, which R2 endorsed. However, R3 still expresses concerns about the novelty of the uncertainty measure. This issue constitutes a potential case for rejection. Therefore, considering the current state of the paper, I am inclined to recommend rejection.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper proposed an active learning method for OCT segmentation with uncertain samples. The rebuttal partially addressed the concerns raised by the reviewers. I recommend acceptance.


