Authors
Wenguang Yuan, Donghuan Lu, Dong Wei, Munan Ning, Yefeng Zheng
Abstract
Retinal edema area, which can be observed in the non-invasive optical coherence tomography image, is essential for the diagnosis and treatment of many retinal diseases. Due to the demand of professional knowledge for its annotation, acquiring sufficient labeled data for the usual data-driven learning-based approaches is time-consuming and laborious. To alleviate the intensive workload for manual labeling, unsupervised learning technique has been widely explored and adopted in different applications. However, the corresponding research in medical image segmentation is still limited and the performance is unsatisfactory. In this paper, we propose a novel unsupervised segmentation framework, which consists of two stages: the image-level clustering to group images into different categories and the pixel-level segmentation which leverages the guidance of the clustering network. Based on the observation that smaller lesions are more obvious on large scale images with detail texture information and larger lesions are easier to capture on small scale images for the large field-of-view, we introduce multiscale information into both stages through a scale-invariant regularization and a multiscale Class Activation Map (CAM) fusing strategy, respectively. Experiments on the public retinal dataset show that the proposed framework achieves a 76.28% Dice score without any supervision, which outperforms state-of-the-art unsupervised approaches by a large margin (more than 20% improvement in Dice score).
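For readers unfamiliar with the metric quoted in the abstract, the Dice score measures the overlap between a predicted mask and a reference mask. The following is a minimal sketch, not the authors' evaluation code; the function name `dice_score`, the nested-list mask format, and the convention for two empty masks are assumptions made for illustration.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as nested lists of 0/1."""
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as lesion in both masks.
    intersection = sum(1 for p, t in zip(flat_pred, flat_target) if p and t)
    # Total number of lesion pixels in the two masks combined.
    total = sum(1 for p in flat_pred if p) + sum(1 for t in flat_target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy masks: 3 predicted pixels, 3 reference pixels, 2 of them shared.
pred = [[0, 1, 1], [0, 1, 0]]
target = [[0, 1, 0], [0, 1, 1]]
score = dice_score(pred, target)  # 2 * 2 / (3 + 3)
```

A score of 1.0 means the two masks coincide and 0.0 means they share no lesion pixel.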
Link to paper
DOI: https://link.springer.com/chapter/10.1007/978-3-031-16434-7_64
SharedIt: https://rdcu.be/cVRsw
Link to the code repository
https://github.com/mangoyuan/MUIS
Link to the dataset(s)
N/A
Reviews
Review #1
- Please describe the contribution of the paper
The authors present a novel Multiscale Unsupervised Image Segmentation (MUIS) framework for the Retinal Edema Area (REA) segmentation task. It has two steps. 1: the image-level clustering groups the images in two categories. This provides guidance for the downstream segmentation task. 2: The pixel-level segmentation yields pixel-wise labels for each image. Results are very good and approach the supervised approach.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
Clearly written, good results, relevant related work, experiments well performed.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
The weights in Eq. 6 are unclear. Extra examples in the supplementary material would have been nice. Failures are not discussed (some ‘no’s in the reproducibility checklist), so open questions remain; see 8.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
no concerns
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
The sizes 192 and 96/192 seem a bit arbitrary: why these? Would, e.g., 128/192 or 128/256 have made sense? Fig. 2: the GT seems to have a small “extra” layer at the bottom compared to MUIS, while nnU-Net has it. Any reason for this? Is this layer an artifact, or essential? Is MUIS focused on elevation? Is there any reason for sometimes answering ‘no’ in the reproducibility checklist?
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
5
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
looks ok, but I see some open issues
- Number of papers in your stack
4
- What is the ranking of this paper in your review stack?
2
- Reviewer confidence
Confident but not absolutely certain
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
6
- [Post rebuttal] Please justify your decision
Given that the authors implement the suggestions given, as well as their answers, this paper would be a nice fit to MICCAI.
Review #2
- Please describe the contribution of the paper
In the manuscript, the authors have proposed a novel Multiscale Unsupervised Image Segmentation (MUIS) framework for the Retinal Edema Area segmentation task. Based on the observation that smaller lesions are more obvious on large scale images with detail texture information and larger lesions are easier to capture on small scale images for the large field-of-view, they introduced multiscale information into both stages through a scale-invariant regularization and a multiscale Class Activation Map fusing strategy, respectively.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
Overall, the manuscript is well written and well presented. The design of the framework in two stages has been effective, as well as the incorporation of multiscale information in both stages. The quantitative analysis on the public retinal dataset demonstrated the superior performance over state-of-the-art unsupervised approaches.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
The design of the proposed architecture is heavily reliant on the existing DCCS approach. Section 2.1 consists of excerpts from the original DCCS study and hence, not a contribution of this manuscript. The main methodological contributions of the manuscript are actually presented in Sec 2.2 and 2.3.
There are too many grammatical mistakes throughout the manuscript, which affects its readability.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
The contributions presented in the manuscript should be possible to be reproduced by modifying the base DCCS architecture. The authors have also claimed to release the code after review.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
Along with the comments presented above (Q#5), I would like to ask why the number of training epochs was set to 20. This number seems quite small, and I would suggest incorporating an ablation study over a varying number of epochs.
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
5
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Overall, the manuscript is well written and well presented, and the experimental analysis demonstrates the superiority of the approach over related state-of-the-art methods. However, the proposed architecture is heavily reliant on existing DCCS approach, making the theoretical contributions limited.
- Number of papers in your stack
4
- What is the ranking of this paper in your review stack?
2
- Reviewer confidence
Very confident
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
5
- [Post rebuttal] Please justify your decision
Although the authors have provided some detailed explanations clarifying some parts of the manuscript, there are still some limitations in the main methodological contributions.
Regarding the optimization of the segmentation network, the manuscript would really benefit if the authors presented the ablation study, at least in the Supplementary Material.
Review #3
- Please describe the contribution of the paper
The authors proposed a 2-stage network architecture for unsupervised segmentation of retinal edema. In the first stage images are classified as normal/edema using DCCS. In the second stage, multiscale CAM fusing strategy is applied to guide segmentation for an encoder-decoder network.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The authors introduced scale-invariant regularization in image clustering, which is considered novel since the proposed network outperforms the SOTA substantially.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- The formulation of the main novelty, the scale-invariant regularization for clustering, is not properly explained.
- More experiments should be performed on lesion types of significant scale differences to validate scale-invariant capability of the network.
- Failure analysis missing.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
The work might not be reproducible because the AI challenge dataset is no longer available to the public.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
- The abstract is not concise enough. Mechanisms of the two stages should be summarized in the abstract to bring out the novelty.
- Is the proposed network only applicable to edema? Edema is considered as a lesion of relatively large scale. How about other types of lesions of very different scales, such as hard/soft exudate and hemorrhage? Do they also benefit from scale-invariant regularization?
- More explanation is needed for Eq. 5. What is R^(-1)F(R(X))? Does the modified DCCS obtain images of multiple scales? How many different scales? What is R^(-1)? Do you scale the features back? Why so? Since this is considered the only novelty of the proposed network, more explanation is needed.
- How scalable is the network if multiple lesion classes are considered?
- In a real clinical setting, a diseased image often contains multiple lesion types. How much degradation is expected on a dataset of this nature?
- Based on table 3 for ablation study, the most noticeable improvement comes from si. The improvement from ms is not considered significant. For this reason, I consider scale-invariant regularization the only novelty of the proposed network. In addition, multiscale fusion of CAM is a necessary step for combining CAMs from multiple scales, not something considered a novelty.
- How is “MUIS w/o ms, ac” done? Which CAM is selected for segmentation?
- The authors included a section explaining the original DCCS, but should avoid copying the words directly from the original paper, e.g. “category-style latent representation in which the category information is disentangled from image style and can be directly used as the cluster assignment.”
- Failure analysis missing.
- What are the data sizes for training, validation and testing? Are all 85 OCT volumes involved in training? The validation process has to be clarified.
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
5
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Since unsupervised segmentation has not yet gained enough attention from the research community, this paper does present an interesting network combining multiscale unsupervised clustering and segmentation to achieve unsupervised segmentation. However, clarification is needed for the main novelty of the work and the validation process.
- Number of papers in your stack
3
- What is the ranking of this paper in your review stack?
1
- Reviewer confidence
Very confident
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
Not Answered
- [Post rebuttal] Please justify your decision
Not Answered
Primary Meta-Review
- Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.
The paper presents a two-stage network architecture for unsupervised segmentation of retinal edema. The topic is of interest, in particular since there is an actual need for unsupervised segmentation methods. In their feedback, the authors should address the issues and questions raised by the reviewers; in particular, they should explain Eq. 5 and motivate the choice of hyper-parameters.
- What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).
4
Author Feedback
We thank all the reviewers for their constructive comments and the recognition of our contribution. Below we summarize our replies to all reviewers’ concerns, and we believe these issues can be addressed in the final version.
Q1: Weights and other hyper-parameters (R1 and R2)
Eq. 6 is the overall loss function, which consists of four terms: mutual information regularization, disentanglement regularization, prior categorical distribution regularization and scale-invariant regularization. The weights are used to balance the terms, and their values are stated in Section 3.1. For a fair comparison, we follow the settings in DCCS for the first three weights, and choose the scale-invariant regularization weight that gives the best clustering performance. A figure showing the effect of the scale-invariant regularization weight will be added in the final version when extra space is allowed. As for the image size, the method is not sensitive to it within a certain range, as shown in rows 4 and 5 of Table 2. Considering memory and efficiency, we set the image size to 96/192 based on exploratory experiments. The number of training epochs for the segmentation network is set to only 20 because the encoder Q has already been trained in the clustering stage, so the optimization of the segmentation network converges very quickly and the performance changes little after 20 epochs.
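The weighted combination described in Q1 can be sketched as follows. This is only an illustration of how four loss terms might be balanced by scalar weights; the term names, the numeric values, and the assumption that every term is already signed so that lower is better are hypothetical and are not taken from Eq. 6 or Section 3.1 of the paper.

```python
def total_loss(terms, weights):
    """Weighted sum of named loss terms; every term must have a weight."""
    missing = set(terms) - set(weights)
    if missing:
        raise KeyError(f"no weight given for: {sorted(missing)}")
    return sum(weights[name] * value for name, value in terms.items())


# Hypothetical per-batch values of the four regularization terms.
terms = {
    "mutual_information": 0.5,
    "disentanglement": 0.2,
    "prior_categorical": 0.1,
    "scale_invariant": 0.4,
}
# Hypothetical weights: the rebuttal says the first three follow DCCS and
# the last is tuned for clustering performance, but gives no values here.
weights = {
    "mutual_information": 1.0,
    "disentanglement": 2.0,
    "prior_categorical": 3.0,
    "scale_invariant": 0.5,
}
loss = total_loss(terms, weights)  # 0.5 + 0.4 + 0.3 + 0.2
```

Sweeping only the `scale_invariant` weight while holding the other three fixed mirrors the tuning procedure the authors describe.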
Q2: Failure analysis (R1 and R3)
Thanks for pointing this out. Yes, there are a few failure cases, mostly due to very small lesions. Because most abnormal images have a relatively large edema area, those with very small edema are likely to be recognized as normal images, and their lesions can hardly be segmented. We will add visualizations and a discussion of some failure cases, along with the memory usage and average runtime, in the final version.
Q3: The generalization ability of the method when multiple lesion types or small lesions are considered (R3)
The method is easy to extend to multiple lesion types by clustering the images into multiple classes and using the CAMs of the different classes to train the segmentation network. As long as the lesions have different appearance patterns, the method is able to recognize them. Small lesions, such as exudates and hemorrhages, can also benefit from scale-invariant regularization as long as the image resolution allows the lesions to be discerned.
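To make the role of the CAMs in Q3 more concrete, here is a minimal sketch of computing a class activation map for one class, fusing the maps obtained at two input scales, and thresholding the result into a pseudo-label. The CAM formula (a weighted sum of feature channels) is the standard one; the fusion rule (nearest-neighbour upsampling followed by averaging), the threshold, and all function names are assumptions, since the reviews and rebuttal do not specify how the paper fuses the maps.

```python
def class_activation_map(features, class_weights):
    """CAM(i, j) = sum_k w_k * f_k(i, j) for one class.

    features: list of K channels, each an H x W nested list.
    class_weights: K classifier weights for the class of interest.
    """
    h, w = len(features[0]), len(features[0][0])
    return [
        [sum(wk * fk[i][j] for wk, fk in zip(class_weights, features))
         for j in range(w)]
        for i in range(h)
    ]


def upsample_nearest(cam, factor):
    """Nearest-neighbour upsampling of a 2D map by an integer factor."""
    return [
        [cam[i // factor][j // factor] for j in range(len(cam[0]) * factor)]
        for i in range(len(cam) * factor)
    ]


def fuse_cams(cam_large, cam_small, factor=2):
    """Assumed fusion rule: average the large-scale CAM with the upsampled small-scale CAM."""
    up = upsample_nearest(cam_small, factor)
    return [[(a + b) / 2.0 for a, b in zip(ra, rb)]
            for ra, rb in zip(cam_large, up)]


def pseudo_label(cam, threshold):
    """Binarize a CAM into a pseudo segmentation label."""
    return [[1 if v >= threshold else 0 for v in row] for row in cam]


class_w = [1.0, 0.5]  # hypothetical classifier weights for the "edema" class
feats_large = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]  # 2 channels, 2x2
feats_small = [[[0.5]], [[1.0]]]                                    # 2 channels, 1x1
fused = fuse_cams(class_activation_map(feats_large, class_w),
                  class_activation_map(feats_small, class_w))
labels = pseudo_label(fused, threshold=0.6)
```

For several lesion types, the same computation would be repeated with one weight vector per cluster, giving one map per class.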
Q4: Explanation of Eq. 5 and other issues (R3)
In Eq. 5, R() represents the rescale operation, while R^-1() denotes its inverse operation. In practice, we feed images at two different scales, 96 and 192, to the clustering network, as stated in Section 3.1. The bottleneck features of the encoder are rescaled to the original scale for the calculation of the scale-invariant loss. More explanation will be added in the final version. As we stated in the first paragraph of page 8, MUIS without si, ms and ac is equivalent to using the CAM of DCCS to train another segmentation network. Because of the unsupervised nature of this study, we do not split the dataset and use all volumes for both training and testing, as in previous unsupervised studies. We thank R3 for pointing out the writing issues in the abstract and Section 2.1, and will revise them accordingly in the final version.
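The rescale-encode-rescale-back idea behind the term R^-1(F(R(X))) in Q4 can be illustrated with a small sketch. It assumes a factor-2 rescale (mirroring 192 to 96), average pooling for R, nearest-neighbour upsampling for R^-1, a mean squared difference as the penalty, and a toy pixel-wise `encoder` standing in for F; none of these choices is confirmed by the source, which does not give the exact form of Eq. 5.

```python
def downscale(image, factor=2):
    """R: average-pool a 2D nested list by an integer factor.

    Assumes height and width are divisible by `factor`.
    """
    h, w = len(image), len(image[0])
    return [
        [sum(image[i + di][j + dj]
             for di in range(factor) for dj in range(factor)) / factor ** 2
         for j in range(0, w, factor)]
        for i in range(0, h, factor)
    ]


def upscale(feature, factor=2):
    """R^-1: nearest-neighbour upsampling back to the original grid."""
    return [
        [feature[i // factor][j // factor]
         for j in range(len(feature[0]) * factor)]
        for i in range(len(feature) * factor)
    ]


def encoder(image):
    """Hypothetical stand-in for the encoder F (a simple pixel-wise map)."""
    return [[2.0 * v + 1.0 for v in row] for row in image]


def scale_invariant_loss(image, factor=2):
    """Mean squared difference between F(X) and R^-1(F(R(X)))."""
    full = encoder(image)
    rescaled = upscale(encoder(downscale(image, factor)), factor)
    diffs = [(a - b) ** 2
             for row_a, row_b in zip(full, rescaled)
             for a, b in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


flat = [[3.0] * 4 for _ in range(4)]   # no detail is lost when downscaling
checker = [[0.0, 1.0], [1.0, 0.0]]     # fine detail vanishes at the small scale
loss_flat = scale_invariant_loss(flat)
loss_checker = scale_invariant_loss(checker)
```

The penalty is zero when the features agree across scales and grows when fine detail seen at the large scale is lost at the small one, which is the behaviour a scale-invariance regularizer is meant to discourage.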
Post-rebuttal Meta-Reviews
Meta-review # 1 (Primary)
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
- After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.
- What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).
Meta-review #2
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
The authors have clearly clarified the issues raised by the reviewers. The explanations in the rebuttal look reasonable and correct to me.
- After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.
Accept
- What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).
6
Meta-review #3
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
This article proposes a novel framework for the retinal edema area segmentation task. The main idea is to use multi-scale information at two levels: image level and pixel level, to perform unsupervised segmentation. Good results were obtained. The paper is well written. The rebuttal addressed most critical points. My proposal is therefore “acceptance”.
- After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.
Accept
- What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).
5