Authors

Tao Chen, Yitian Zhao, Lei Mou, Dan Zhang, Xiayu Xu, Mengting Liu, Huazhu Fu, Jiong Zhang

Abstract

Choroidal neovascularization (CNV) is a leading cause of visual impairment in retinal diseases. Optical coherence tomography angiography (OCTA) enables non-invasive CNV visualization with micrometer-scale resolution, aiding precise extraction and analysis. Nevertheless, the irregular shape patterns, variable scales, and blurred lesion boundaries of CNVs present challenges for their precise segmentation in OCTA images. In this study, we propose a Reliable Boundary-Guided choroidal neovascularization segmentation Network (RBGNet) to address these issues. Specifically, our RBGNet comprises a dual-stream encoder and a multi-task decoder. The encoder consists of a convolutional neural network (CNN) stream and a transformer stream. The transformer captures global context and establishes long-range dependencies, compensating for the limitations of the CNN. The decoder is designed with multiple tasks to address specific challenges. Reliable boundary guidance is achieved by evaluating the uncertainty of each pixel label. By assigning it as a weight to regions with highly unstable boundaries, the network’s ability to learn precise boundary locations can be improved, ultimately leading to more accurate segmentation results. The prediction results are also used to adaptively adjust the weighting factors between losses to guide the network’s learning process. Our experimental results demonstrate that RBGNet outperforms existing methods, achieving a Dice score of 90.42% for CNV region segmentation and 90.25% for CNV vessel segmentation.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43901-8_16

SharedIt: https://rdcu.be/dnwC0

Link to the code repository

https://github.com/iMED-Lab/RBGnet-Pytorch.git

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper proposes a new method called RBGNet for segmenting CNV in OCTA images. The proposed method uses a dual-branch encoder and a boundary uncertainty-guided multi-task decoder to capture both global long-range dependencies and local context of CNVs with significant scale variations. The method also incorporates uncertainty maps to enhance the robustness and accuracy of CNV segmentation models.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Novel method: The proposed RBGNet method is a novel approach to segmenting CNVs in OCTA images, which uses a dual-branch encoder and a boundary uncertainty-guided multi-task decoder.
    2. Multi-task optimization: The proposed method incorporates multi-task optimization to enhance the robustness and accuracy of CNV segmentation models.
    3. Uncertainty maps: The proposed method uses uncertainty maps to guide the segmentation process and improve the accuracy of CNV segmentation models. This is an interesting approach that could be applied to other medical image analysis tasks as well.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Limited generalizability: The proposed method has been evaluated on a specific dataset of OCTA images from patients with CNV-related retinal diseases. It is unclear how well RBGNet would perform on other datasets or for other types of retinal diseases.
    2. No discussion on computational efficiency: The paper does not discuss the computational efficiency of RBGNet, which could be an important factor for clinical adoption.
    3. The paper looks difficult to implement without open-source implementation of RBGNet.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The reproducibility of the paper is somewhat limited due to the lack of an open-source implementation of RBGNet. While the paper provides detailed descriptions of the proposed method and experimental setup, it would be beneficial for other researchers to have access to the code in order to reproduce the results and build upon the work. However, the paper does provide a detailed description of the dataset used for evaluation, including information on how it was collected and annotated. This information could be used by other researchers to obtain similar data for their own experiments. Overall, without the open-source implementation and dataset, the reproducibility of the paper is limited.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. It would be beneficial for other researchers to have access to an open-source implementation of RBGNet in order to reproduce the results and build upon the work. Providing an open-source implementation would increase the reproducibility and accessibility of the proposed method.
    2. The proposed method has been evaluated on a specific dataset of OCTA images from patients with CNV-related retinal diseases. It is unclear how well RBGNet would perform on other datasets or for other types of retinal diseases. Discussing potential limitations and generalizability issues could help readers better understand the scope and applicability of RBGNet.
    3. The paper does not discuss the computational efficiency of RBGNet, which could be an important factor for clinical adoption. Providing information on computational efficiency would help readers better understand how practical it is to use RBGNet in real-world clinical settings.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed RBGNet method is a novel approach to segmenting CNVs in OCTA images. The paper provides a detailed description of the proposed method and experimental setup, as well as a thorough evaluation of its performance compared to state-of-the-art methods. The use of uncertainty maps and multi-task optimization are interesting approaches that could be applied to other medical image analysis tasks as well. However, there are some limitations to the reproducibility of the paper due to the lack of an open-source implementation of RBGNet and dataset.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The authors extract image features with a combined CNN and vision transformer encoder. Vessel segmentation is performed with auxiliary tasks of vessel region segmentation, boundary segmentation, and regression to improve the primary task. Uncertainty maps are generated for each network output to target uncertain regions during training. The method improves choroidal neovascularization segmentation in OCTA images.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The segmentation masks were utilized in multiple ways to improve the main segmentation task without requiring more data. Designed loss function weights automatically (at pixel and image level) using uncertainty maps, eliminating the need for empirical values as is commonly done to select loss weights. Network architecture tailored to address the specific need of segmenting structures of different sizes and shapes.

    Good comparison to other segmentation approaches as well as analysis of the effect of different components of the proposed approach. Considerable improvement compared to other segmentation approaches and each component shown to increase performance individually.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Computing uncertainty maps using dropout would significantly increase network training time if multiple network passes must be performed to compute the variance used in the loss. The impact on training time is not described. It is not mentioned why uncertainty weights are only added to BCE loss.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The method is well described with references to previous work used to create the proposed architecture. There seems to be a few small details missing including bottleneck layer size and number of samples performed with dropout.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    The proposed method provides clear improvement for choroidal neovascularization segmentation where the vessel regions are quite dense, so the region boundary corresponds well with the outer vessel mask. It is not obvious whether the approach would provide similar benefits in other related tasks.

    Small note: some misspellings in tables and figures.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    7

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Clear motivation and contribution in the network architecture, problem formulation, and training strategy for the task of choroidal neovascularization segmentation. Thorough analysis and evaluation of proposed approach on a private OCTA dataset.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    In this paper, the authors propose a novel vision transformer-based architecture, named Reliable Boundary-Guided choroidal neovascularization segmentation Network (RBGNet), for segmenting Choroidal neovascularization from optical coherence tomography angiography images. The proposed architecture consists of an encoder and a decoder, similar to a conventional U-Net. The encoder combines convolutional layers and transformer layers to extract global and local features; whereas, the decoder employs a Bayesian network with Monte Carlo dropout to estimate pixel-level uncertainty and enhance the segmentation of ambiguous boundaries.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The authors make two major contributions in this paper. First, they introduce a novel architecture for segmentation that comprises an encoder and a decoder. The encoder integrates convolutional and transformer blocks, which differs slightly from the approach presented in [14]. Second, they apply the existing pixel-level uncertainty estimation technique to improve the segmentation of ambiguous boundaries.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Some of the limitations of the article are as follows:
    • Redundancy: The paper could be improved by removing or replacing some of the redundant information that appears in multiple sections.
    • Lack of Architecture Detail: The paper does not explain the proposed architecture in enough detail, such as the number and configuration of convolutional and transformer layers, the input and output dimensions, activation functions and other similar details. This makes it hard for other researchers to implement or reproduce the technique. A table or pseudocode that summarizes the architecture would be helpful.
    • Ambiguous Diagram: The diagram that illustrates the architecture is not clear enough to convey the structure and functionality of the model. The textual description does not complement or clarify the diagram. A more detailed and readable diagram would be beneficial.
    • Inadequate Comparison: The paper compares the proposed technique with general segmentation architectures, such as U-Net and TransUNet, rather than with state-of-the-art methods that address the same problem of CNV segmentation. This limits the evaluation and validation of the technique. A more comprehensive comparison with recent and relevant methods would be necessary.
    • Poor Representation: The paper lacks much important information that is necessary for replicating or implementing the proposed technique without consulting the code (if the authors decide to make it public). For instance, the paper does not mention the data preprocessing steps, the hyperparameters, the training procedure, or the evaluation metrics. This information should be included in the paper or in the supplementary material.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The article does not provide enough information on some key aspects of the proposed technique, such as the data preprocessing, the hyperparameters, and the evaluation metrics. It would be helpful if the authors could elaborate on these points in the paper or in the supplementary material. The code alone may not be sufficient to reproduce or apply the technique in different settings.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Here are some detailed and constructive comments for the authors:
    • The abstract is not clear enough for understanding. For example, the sentence ‘The decoder is designed with multiple tasks to address specific challenges’ in the abstract is vague. The authors should briefly explain what the multiple tasks and challenges are. Also, the sentence ‘Reliable boundary guidance is achieved by evaluating the uncertainty of each pixel label, By assigning it as a weight to regions with highly unstable boundaries, the network’s ability to learn precise boundary locations can be improved, ultimately leading to more accurate segmentation’ can be split into smaller sentences to make the point clearer.
    • In the introduction section, the authors mention ‘… deep learning-based techniques [7]’ and cite only one paper; whereas, there are many articles in the literature for CNV segmentation and the authors have included some of them later in the introduction. They should cite them here as well.
    • The authors should define what they mean by ambiguous region boundaries. How do they determine if a boundary is ambiguous?
    • The authors divide the output representations of the transformer layers in ViT into four groups, each containing three feature representations. What are the criteria for forming these groups? What are their dimensions? How do they select them? The authors do not explain this in the paper and the diagram is not self-explanatory enough.
    • The schematic diagram of the proposed architecture is not well-drawn. The authors can refer to reference [14] of their paper for a better illustration of the diagram.
    • In Section 2.3, please provide brief but comprehensive details.
    • The authors state in the introduction section: ‘However, common issues including substantial scale variations of CNV regions and low-contrast microvascular boundaries were not fully deliberated in previous network designs.’ The authors also review some of the techniques that address the same problem; that is, CNV segmentation, such as [7], [8], and [9] in their paper. Therefore, they should compare their results with those of the existing techniques to demonstrate the improvement in addition to some latest techniques.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    3

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    • Lack of Architecture Detail: The paper does not explain the proposed architecture in enough detail, such as the number and configuration of convolutional and transformer layers, the input and output dimensions, activation functions and other similar details. This makes it hard for other researchers to implement or reproduce the technique. A table or pseudocode that summarizes the architecture would be helpful.
    • Inadequate Comparison: The paper compares the proposed technique with general segmentation architectures rather than with state-of-the-art methods that address the same problem of CNV segmentation. This limits the evaluation and validation of the technique. A more comprehensive comparison with recent and relevant methods would be necessary.
    • Poor Representation: The paper lacks much important information that is necessary for replicating or implementing the proposed technique. For instance, the paper does not mention the data preprocessing steps, the hyperparameters, etc.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    Overall, the review provides a positive evaluation of the paper’s contributions, including the proposed RBGNet method and its use of uncertainty maps and multi-task optimization. However, the review also points out several limitations that affect the reproducibility of the paper and its comparison with state-of-the-art methods.

    One significant limitation highlighted in the review is the lack of detail in the proposed architecture. Without clear information on the number and configuration of layers, input/output dimensions, and activation functions, it may be challenging for other researchers to implement or reproduce the technique. Providing a table or pseudocode that summarizes the architecture would be helpful in this regard.

    Additionally, the paper’s comparison with general segmentation architectures rather than state-of-the-art methods that address the same problem of CNV segmentation is another limitation. A more comprehensive comparison with recent and relevant methods would provide a better evaluation and validation of the proposed technique.

    Finally, the paper’s lack of important information necessary for replicating or implementing the proposed technique, such as data preprocessing steps and hyperparameters, is also a significant limitation. Providing this information would increase the paper’s reproducibility and make it more useful for other researchers.

    In summary, while the review acknowledges the paper’s contributions and thorough evaluation of the proposed approach, several limitations need to be addressed to improve the reproducibility and validation of the technique.




Author Feedback

We appreciate your positive feedback on our technical novelties (e.g., “novel/interesting approach” by R1, “novel vision transformer-based architecture” by R3, and “make two major contributions” by R3), and on the effectiveness of our method (e.g., “…improves segmentation of ambiguous boundaries” by R1 and R3, “eliminating the need for empirical values” by R2, and “considerable improvement” by R2).

Q1: Generalizability (R1, R2) Our primary objective is to segment CNV regions and vessels in OCTA images, and we have demonstrated the effectiveness of our method on this task. Our method exhibits the strengths of extracting global and local information and leveraging mutual information in a multi-task learning setting. This unique combination enhances ambiguity localization in a broader context, extending its applicability to tasks with similar underlying features.

Q2: Computational efficiency (R1, R2) The uncertainty evaluation during training can increase the computational cost. We used 30 dropout samples, resulting in a training epoch duration of around 142 s. Without dropout, the time is reduced to approximately 30 s. The longer training time is a trade-off for the improved accuracy of our method. During testing, our method achieves a much faster speed of around 0.5 s per test.
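
The cost described in Q2 follows from how Monte Carlo dropout estimates uncertainty: the network is run several times with dropout active, and the per-pixel spread of the outputs is taken as the uncertainty. A minimal pure-Python sketch follows, assuming per-pixel variance as the uncertainty measure. `toy_forward` is a hypothetical stand-in for the network, and only the sample count of 30 comes from the rebuttal.

```python
import math
import random
from statistics import fmean, pvariance

def toy_forward(image, rng, drop_prob=0.5, hidden_units=4):
    """Hypothetical stand-in for one stochastic forward pass with dropout on."""
    scale = hidden_units * (1.0 - drop_prob)  # inverted-dropout rescaling
    probs = []
    for pixel in image:
        # Count how many hidden units survive dropout for this pixel.
        kept = sum(1 for _ in range(hidden_units) if rng.random() >= drop_prob)
        activation = pixel * kept / scale
        probs.append(1.0 / (1.0 + math.exp(-activation)))
    return probs

def mc_dropout_uncertainty(forward, image, num_samples=30):
    """Return per-pixel mean and variance over `num_samples` stochastic passes."""
    samples = [forward(image) for _ in range(num_samples)]
    per_pixel = list(zip(*samples))  # regroup: one tuple of samples per pixel
    mean = [fmean(values) for values in per_pixel]
    variance = [pvariance(values) for values in per_pixel]
    return mean, variance

rng = random.Random(0)
image = [0.0, 3.0]  # flattened toy "image" with two pixels
mean, variance = mc_dropout_uncertainty(lambda img: toy_forward(img, rng), image)
```

Each uncertainty estimate needs `num_samples` forward passes, which is why enabling it lengthens training.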

Q3: Why uncertainty weights are only added to BCE loss (R2) BCE loss is used for pixel-level classification, with weights taken from the uncertainty estimation. This improves the segmentation of ambiguous regions with pixel-level uncertainty. MSE loss and Dice loss are used at the image level to minimize the difference and to measure the similarity, respectively, between the prediction and the ground truth. Hence, only BCE loss is weighted by the uncertainty for pixel-level prediction.
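
As a rough illustration of Q3, the sketch below scales each pixel's BCE term by an uncertainty value. The weighting form `1 + u` is an assumption chosen for illustration, since the rebuttal does not state the exact formula, and `uncertainty_weighted_bce` is a hypothetical name.

```python
import math

def uncertainty_weighted_bce(probs, targets, uncertainty, eps=1e-7):
    """Mean BCE where each pixel's term is scaled by (1 + uncertainty)."""
    total = 0.0
    for p, y, u in zip(probs, targets, uncertainty):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        bce = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        total += (1.0 + u) * bce  # assumed weighting, not the paper's formula
    return total / len(probs)

probs = [0.9, 0.6, 0.2]
targets = [1, 1, 0]
plain = uncertainty_weighted_bce(probs, targets, [0.0, 0.0, 0.0])
weighted = uncertainty_weighted_bce(probs, targets, [0.0, 0.5, 0.0])
```

With zero uncertainty the function reduces to ordinary mean BCE; raising the uncertainty of the second pixel increases only that pixel's contribution to the loss.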

Q4: Parameter setting and reproducibility (R1, R2, R3, MR) Our method utilizes 12 Transformer layers, divided evenly into 4 groups (Fig. 2). To maintain a balance across different depths, each group is upsampled to align with the scale of the encoder (Sec. 2.1). We adopt the same number of channels and parameters specified in [16]. This choice ensures consistency and facilitates better comparison with existing methods. As for reproducibility, we aim to provide details of training hyperparameters in the “Implementation Details” section. We will also make our open-source code available and provide parameter details online.
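
The grouping in Q4 can be pictured as follows. This is a sketch assuming that consecutive layers are grouped together, which the rebuttal suggests but does not state. `group_layer_outputs` is a hypothetical helper, and strings stand in for the per-layer feature maps.

```python
def group_layer_outputs(layer_outputs, num_groups=4):
    """Split a list of per-layer outputs into equal consecutive groups."""
    size, remainder = divmod(len(layer_outputs), num_groups)
    if remainder:
        raise ValueError("number of layers must be divisible by num_groups")
    return [layer_outputs[i * size:(i + 1) * size] for i in range(num_groups)]

layer_outputs = [f"layer_{i}" for i in range(1, 13)]  # 12 Transformer layers
groups = group_layer_outputs(layer_outputs)  # 4 groups of 3 outputs each
```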

Q5: Comparison with other methods (R3, MR) We encountered difficulties in comparing with [8] and [9]. They were specifically tailored for processing B-scans in OCT tissue images, which differ significantly from the OCTA microvascular images that we focus on. As for [7], it uses both 2D and 3D images, while our work solely relies on 2D enface images, without 3D data available. Therefore, a direct comparison in terms of data availability and network feasibility is not possible. However, we still think it would be useful to compare with [8] and [9]. Due to the limited rebuttal period, we completed the comparison with [8], achieving a regional and vascular segmentation Dice of 0.8166 and 0.7582, respectively. We will include all the comparisons in the final version.
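
For reference, the Dice values quoted in Q5 measure the overlap between a predicted mask and the ground truth, using the standard definition 2|A ∩ B| / (|A| + |B|). The sketch below applies it to flattened binary masks. The convention of returning 1.0 when both masks are empty is an assumption, and `dice_score` is a hypothetical name.

```python
def dice_score(pred, target):
    """Dice coefficient of two flattened binary masks."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total

pred = [1, 1, 0, 0]
target = [1, 0, 0, 0]
score = dice_score(pred, target)  # 2 * 1 / (2 + 1)
```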

Q6: Briefly explain what multiple tasks and challenges are (R3) The multiple tasks include region and vessel segmentation, boundary prediction, and distance transformation. Among them, we primarily focus on region and vessel segmentation. Furthermore, we introduced boundary prediction to tackle the challenge of ambiguous CNV boundaries, and distance transformation was used to address irregular CNV shapes. In the final version, we will give concise descriptions of these tasks.
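
The auxiliary targets named in Q6 can typically be derived from the region mask itself. The following is a minimal sketch under stated assumptions: the boundary is taken as the foreground pixels that have a 4-connected background neighbour, and the distance map uses city-block distance to the nearest background pixel. The authors' actual target construction is not described here, and both function names are hypothetical.

```python
from collections import deque

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connectivity

def distance_map(mask):
    """City-block distance from every pixel to the nearest background pixel."""
    height, width = len(mask), len(mask[0])
    dist = [[None] * width for _ in range(height)]
    queue = deque()
    for r in range(height):
        for c in range(width):
            if not mask[r][c]:
                dist[r][c] = 0  # background pixels are the search sources
                queue.append((r, c))
    if not queue:
        raise ValueError("mask has no background pixels")
    while queue:  # multi-source breadth-first search
        r, c = queue.popleft()
        for dr, dc in NEIGHBOURS:
            rr, cc = r + dr, c + dc
            if 0 <= rr < height and 0 <= cc < width and dist[rr][cc] is None:
                dist[rr][cc] = dist[r][c] + 1
                queue.append((rr, cc))
    return dist

def boundary_map(mask):
    """Mark foreground pixels that touch the background (distance exactly 1)."""
    return [[1 if d == 1 else 0 for d in row] for row in distance_map(mask)]

mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
distances = distance_map(mask)
boundary = boundary_map(mask)
```

On this toy mask the centre pixel lies at distance 2 and is not part of the boundary, while the eight surrounding foreground pixels form the boundary ring.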

Q7: Definition of ambiguous boundary (R3) An ambiguous boundary denotes areas where the distinction between the CNV and the surrounding normal vessels is not clearly discernible. This can be attributed to multiple factors, including low contrast, irregular vascular patterns, shadowing effects, and noise.

Q8: Redundancy (R2, R3) We will thoroughly revise the wording to improve conciseness and readability.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The paper in question introduces a novel technique dubbed RBGNet, specifically designed for the segmentation of Choroidal Neovascularization (CNV) in OCTA images. The methodology utilizes a dual-branch encoder and a boundary uncertainty-guided multi-task decoder. These components work in harmony to capture global long-range dependencies and local contexts of CNVs exhibiting significant scale variations. The incorporation of uncertainty maps to enhance the robustness and precision of the CNV segmentation models represents a compelling aspect of the work. An innovative feature of the paper is the automation of loss function weights at both pixel and image levels using these maps, which eliminates the need for empirical values. The authors demonstrate a thorough comparison with other segmentation approaches and conduct an extensive analysis of the effects of the proposed method’s various components. Each component is shown to independently contribute to enhancing performance, leading to significant improvement over other segmentation techniques.

    Despite its merits, the paper presents some shortcomings. These include an insufficiently detailed explanation of the proposed architecture, an inadequate comparison with contemporary state-of-the-art methods addressing CNV segmentation, and a lack of critical information necessary for the replication or implementation of the proposed technique, such as data preprocessing steps and hyperparameters.

    The authors’ rebuttal offers detailed responses to these concerns. They affirm the versatility and potential generalizability of their methodology to tasks with similar underlying features. They acknowledge the increased computational cost due to uncertainty evaluation during training but defend it as a trade-off for enhanced accuracy. They justify the exclusive application of uncertainty weights with the BCE loss, owing to its role in pixel-level classification. The authors commit to providing a more detailed account of their architecture and the training hyperparameters in the final paper, which will improve reproducibility. They also address the challenges of comparison with other specific methods and pledge to include all relevant comparisons in the final version. The authors adequately address the concerns about their method’s multiple tasks and challenges, providing clarification about the ambiguous boundary, which refers to areas where the distinction between the CNV and the surrounding normal vessels is not clear. They assure that the final version will offer a more succinct description of these tasks.

    Given the authors’ comprehensive and satisfactory responses and the paper’s novel contributions, I lean towards recommending the paper’s acceptance. However, it is essential that the authors ensure the concerns addressed in the rebuttal are accurately reflected in the revised manuscript.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors have successfully addressed the main concerns. In their rebuttal, they have disclosed more details about their methods, the reason behind the choice of baseline approaches, and experiment design. I am pleased with their explanation.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    During the review process, several concerns were raised, mainly related to the lack of details on the architecture and of the information needed to reproduce the reported results. Furthermore, an important concern was the empirical validation, which compared against general segmentation methods rather than state-of-the-art approaches that address the exact same problem. After reading the rebuttal, I find that the authors positively addressed all of these, as well as other comments from the reviewers. I appreciate that the authors could provide results for a related work (i.e., [8]) and strongly recommend adding these in the camera-ready paper.


