Authors
Dong Zhang, Raymond Confidence, Udunna Anazodo
Abstract
Ischemic stroke is one of the prevalent diseases with the highest mortality in low- and middle-income countries. Although deep learning-based segmentation methods have great potential to alleviate the medical resource imbalance and reduce stroke risk in these countries, existing segmentation studies are difficult to deploy in these low-resource settings because they require amounts of data (plenty-shot) and data quality (high-field and high-resolution) that are usually unavailable there. In this paper, we propose a SimIlarity-weiGhted self-eNsembling framework (SIGN) to segment stroke lesions from low-quality and few-shot MRI data by leveraging publicly available glioma data. To overcome the low-quality challenge, a novel Identify-to-Discern Network employs attention mechanisms to identify lesions from a global perspective and progressively refines the coarse prediction by focusing on ambiguous regions. To overcome the few-shot challenge, a new Soft Distribution-aware Updating strategy trains the Identify-to-Discern Network in the direction beneficial to tumor segmentation via respective optimizing schemes and adaptive similarity evaluation on glioma and stroke data. Experiments indicate that our method outperforms existing few-shot methods and achieves a Dice of 76.84% after training on 14 cases of low-quality stroke lesion data, illustrating the effectiveness of our method and its potential for deployment in low-resource settings. Code is available at: https://github.com/MINDLAB1/SIGN.
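For reference, the Dice score quoted above (and throughout the reviews below) measures voxel-level overlap between a predicted mask and the ground truth. A minimal NumPy sketch on toy arrays, not the authors' implementation:

    import numpy as np

    def dice_score(pred, target, eps=1e-7):
        # Dice = 2 * |P & T| / (|P| + |T|); 1.0 means perfect overlap.
        pred, target = pred.astype(bool), target.astype(bool)
        intersection = np.logical_and(pred, target).sum()
        return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

    # Toy 2D "masks"; real inputs would be 3D binary lesion masks.
    pred = np.array([[0, 1, 1], [0, 1, 0]])
    target = np.array([[0, 1, 0], [0, 1, 1]])
    print(dice_score(pred, target))  # ~0.667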
Link to paper
DOI: https://link.springer.com/chapter/10.1007/978-3-031-16443-9_9
SharedIt: https://rdcu.be/cVRyn
Link to the code repository
https://github.com/MINDLAB1/SIGN
Link to the dataset(s)
N/A
Reviews
Review #1
- Please describe the contribution of the paper
This work approaches the problem of ischemic stroke lesion segmentation when the training dataset is small and the MRI images have low resolution. To accomplish this task, the authors propose a framework that simultaneously trains a neural network to segment ischemic stroke lesions and brain tumors, such that the update from the ischemic stroke lesion data is stronger than the update provided by the larger brain tumor dataset. According to the authors, the key elements of their approach are 1) a module that identifies the lesions and iteratively refines the prediction (IDN), and 2) another module that transfers the optimization direction from the brain tumor problem to facilitate learning to segment the stroke lesion (SDU). The authors compare the proposed method with three other few-shot learning methods from the literature in terms of Dice, accuracy and Hausdorff distance, surpassing them by a large margin in all metrics.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The capacity to exploit a large dataset to help training with a small dataset on a different problem could help to approach problems that have not been properly explored due to the lack of data. So this work could have an important impact.
- Based on the tests, both proposed modules (IDN and SDU) had a clear impact in improving the performance of the baseline, surpassing the other three methods.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- Lack of intuition on the process. Brain tumor is irregular with a complex structure (oedema, core and enhanced regions) represented by four labels, while the ischemic stroke does not present the same complexity in structure, being represented by a single region (one label). This mismatch in complexity is not considered in the paper, and the authors do not provide a rationale for the soundness of the approach.
- Lack of clarity on the architecture. The architecture in Fig. 2 is explained briefly, omitting the definition of some modules, such as the background and foreground attentive inception, the channel and spatial attention, the number of feature maps, the places where upsampling is performed, and the use of some symbols. So it is difficult to understand how the architecture was engineered.
- Lack of detail on the training. How do the authors deal with the fact that the two problems use a different number of input sequences and that the output stages are different? Also, the training process is not adequately explained. We do not know when the SDU interacts with the network: every batch? Every epoch? And the loss function: how is it composed?
- Lack of clarity on the tests. The authors do not explain how the three methods (MLDG, PROTO and Reptile) were trained. It is also not explained why those methods were chosen and whether they can be applied in this context. For instance, in MLDG [15], the common factor was the bone, while the variation was the place, acquisition protocol, orientation, field of view, or surgical implant. So the rationale for including methods whose assumptions are quite different has to be properly explained.
- Size of the test set. Although the metrics improve over the methods used for comparison: 1) the test set is small, so the results could be due to the random choice of cases and to the inadequacy of the compared methods given the mismatch in their assumptions; 2) we do not know the architecture of the baseline, and since the authors only provide the boxplot for the baseline, we cannot compare the baseline with the proposed method and the comparison methods at the same time.
- Please rate the clarity and organization of this paper
Poor
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
The authors indicate that they will provide access to the implementation. This will help to reproduce the results, but at the cost of a careful reading of the code, since the reader has to fill in the gaps in the description found in the article.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
The motivation of the article, investigating ways of using a large dataset to train a method on a small dataset for a different problem, is interesting and may have impact. This could justify an article on this problem alone, without the additional goal of applying the method to lower-resolution images.
Although there is potential in the objective, the idea that learning to distinguish between necrotic and enhanced tissue based on T1, T2, FLAIR and especially T1C may help to distinguish ischemic tissue, using a set of MRI sequences that only partially overlaps, is not obvious. So providing a rationale for the adequacy of the method is a must. It is also important to show that the method is consistent, which could be argued with more compelling tests. For instance: 1) the authors could have presented a test on the blind test set of SISS, which would allow a comparison with state-of-the-art methods on this dataset; an improvement over these would give an indication of the strength of the proposal. This could be accomplished by not reducing the resolution of the images. 2) To give evidence of the consistency of the approach, the authors could test on another problem, for instance SPES, which distinguishes between the core and the penumbra of the ischemic lesion. An improvement here would allow arguing that the method has the potential to transfer information about the complexity of brain tumors to a problem that is less complex than tumors but more complex than SISS.
The authors should improve the description of the method. Figure 2 should have all components defined and explained, or referenced to the respective article for details. The authors should also indicate the number of feature maps, the input and output stages, and the places where upsampling is performed. The training process is not adequately explained: the authors should state when SDU updates the parameters, as it is not clear whether this happens at every batch or at every epoch. Also, when are the samples of the glioma dataset used, and in what proportion? This is not explained in the article.
As noted in the weaknesses above, the authors should revise the tests. The rationale for the selection of the comparison methods should be made clear, and a comparison with state-of-the-art methods on the problem would strengthen the paper. The training of each method should be described, and the baseline should be included in Table 1.
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
4
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The rationale for the validity of the approach is not evident and is not presented in the paper, which raises questions about the reproducibility of the results in a slightly different setup. This is exacerbated by: 1) the small size of the test dataset; 2) the lack of comparison with state-of-the-art methods on the problem; 3) the choice of comparison methods that were not designed to work in the scenario described in the paper (the use of datasets from different domains and with different semantic labels).
- Number of papers in your stack
5
- What is the ranking of this paper in your review stack?
4
- Reviewer confidence
Confident but not absolutely certain
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
5
- [Post rebuttal] Please justify your decision
As stated before, the possibility of using a large dataset with ground truth to train a segmentation method on a different problem may have an important impact. In the rebuttal, the authors clarified aspects that were less clear in the article and provided additional tests showing the strength of their proposal, so I have updated my evaluation taking this information into consideration. However, the motivation of using lower-resolution images, because of the target application, restricted the choice of methods used for comparison. If the authors had tested with full-resolution images, we would have a better idea of the true strength of the proposed method.
Review #2
- Please describe the contribution of the paper
The authors propose a novel approach to automatically segment stroke lesions on low-quality and few-shots MRI. The method exploits attention mechanisms to first identify lesions from a global perspective and then progressively refine their segmentation. A new strategy is also proposed to overcome the few-shots challenge.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The new contributions are explained well and in detail. Both the attention mechanism and the soft distribution-aware updating are novel ideas seamlessly integrated in the framework.
- This work has potential real-world applicability, aiming to improve stroke lesion segmentation in low- and middle-income countries.
- The proposed method was compared with and outperformed different state-of-the-art approaches on publicly available datasets.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- The authors describe in detail the shortage of radiologists and the low-quality MRI available in developing countries, and this can be of interest to the reader. In this work, however, high-quality datasets are degraded to simulate low-quality data. It would have been extremely interesting to also evaluate the proposed method on an acquired low-quality MRI dataset.
- Statistical tests to support the results obtained are missing.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
Code and data are publicly available.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
- Rather than splitting the dataset into training, validation and testing once, a nested cross-validation could be considered to evaluate the method over the entire dataset (see the sketch after this list).
- Fig. 1B has a typo: it should read "to be employed BY our method".
- Throughout the manuscript, the authors mention the voxel spacing without indicating the unit. I assume this is mm, and it should be specified.
- Fig. 2 could be improved. Instead of repeating ResBlock and DiscernBlock, a legend showing the color-block correspondence could be added. The text is quite small and difficult to read.
- I understand the limited number of pages available, but the Introduction is rather long and there is no Discussion section. Rather than discussing the results in the Experiment section, I would do that in a separate Discussion section.
- It would be interesting to add a time comparison between a manual annotator and the proposed automated method. This would strengthen the contribution of this work and its applicability.
- Evaluation: "Four matrices" is probably a typo; this should be "Three metrics", as only three metrics are listed.
- Conclusions: "Our further work will improve the lesion segmentation accuracy and quantify the lesion volume" is a bit vague; the authors should rather mention how they think the lesion segmentation accuracy can be further improved.
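To make the nested cross-validation suggestion concrete, here is a minimal scikit-learn sketch of the protocol. A generic classifier on toy data stands in for the paper's segmentation network, so this is a schematic under that assumption, not the authors' pipeline:

    from sklearn.datasets import make_classification
    from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
    from sklearn.svm import SVC

    # Toy data standing in for per-case inputs; the paper's model is a CNN,
    # so this only illustrates the nested-CV protocol itself.
    X, y = make_classification(n_samples=56, n_features=10, random_state=0)

    inner_cv = KFold(n_splits=3, shuffle=True, random_state=0)  # model selection
    outer_cv = KFold(n_splits=4, shuffle=True, random_state=0)  # performance estimate

    model = GridSearchCV(SVC(), param_grid={"C": [0.1, 1.0, 10.0]}, cv=inner_cv)
    scores = cross_val_score(model, X, y, cv=outer_cv)
    print(scores.mean())  # every case is used for testing exactly once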
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
6
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The authors propose a novel framework to segment stroke lesions in low-quality MRI. The contributions are explained in detail, and the proposed method could be useful in low-resource settings with a real-world application. Although it could be improved in some aspects, the manuscript is well written and would be of interest to researchers in the field.
- Number of papers in your stack
4
- What is the ranking of this paper in your review stack?
2
- Reviewer confidence
Confident but not absolutely certain
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
6
- [Post rebuttal] Please justify your decision
The authors have properly addressed most of my concerns in the rebuttal.
Review #3
- Please describe the contribution of the paper
The paper presents a framework for stroke lesion segmentation from low-quality and few-shot MRI. The authors present the Identify-to-Discern Network, which combines a pyramidal structure with attention layers and a multiscale loss. They also propose a Soft Distribution-aware Updating strategy (SDU) as a more effective alternative to pretraining on a related task. These techniques achieve good performance when co-training on glioma segmentation.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The two innovations are interesting and fairly novel. The Identify-to-Discern Network is well designed and sophisticated without being overly complicated. SDU with co-training is a simple and effective approach that successfully eliminates poor outliers compared to a pretrained baseline.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- The meta-learning baselines are not really fair. Meta-learning algorithms are typically applied when there are multiple training tasks (more than one!) drawn from the same task distribution, which seems a little different from choosing a single related task to pretrain or co-train on. It would also be helpful to have an attention-based baseline to compare IDN against, maybe a segmentation transformer like UNETR.
- The ablations are not sufficient to understand which parts of the innovations matter, or why IDN and SDU synergize. For IDN, the multi-layered supervision should be its own ablation versus the attention layers. "SDU only" should have its own results. Pretraining-then-finetuning is a well-proven approach in many other applications; help us understand why SDU makes sense and performs better in this application.
- Please rate the clarity and organization of this paper
Very Good
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
Code will be made available. Datasets are available.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
- "Pyramidal structure" is better than "coarse-to-fine". "Coarse-to-fine" usually describes an architecture that first explicitly processes a coarse version of the image and then fills in finer details.
- It would be a little better to rename "accuracy" to "precision".
- Is stroke lesion segmentation really needed for "rapid stroke diagnosis"? This seems more like a tool for clinical researchers; it could perhaps be used for radiological reporting, but not for diagnosis.
- There is no related work section.
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
5
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The proposed approach is interesting and novel, but does not provide enough comparisons/ablations to help readers understand how the different components work together to produce the more robust model that the authors achieve.
- Number of papers in your stack
5
- What is the ranking of this paper in your review stack?
4
- Reviewer confidence
Confident but not absolutely certain
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
Not Answered
- [Post rebuttal] Please justify your decision
Not Answered
Primary Meta-Review
- Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.
The reviewers are unanimous in praising the relevance of the work, which attempts to leverage larger existing datasets from tasks that appear similar, and in praising the obtained performance. However, the rationale behind the choice of tasks, modules and experimental design could be made clearer. In order to justify the claim of improvement, statistical testing is required but is currently missing.
- What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).
NR
Author Feedback
Thanks to the meta-reviewer and all reviewers for recognizing the meaningful motivations and novel innovations in our work ("all reviewers are unanimous in praising the relevance of the work"). 1. Meaningful motivations ("this work could have an important impact" [R1]; "This work has potentially a real-world applicability" [R2]). 2. Novel innovations ("both proposed modules had a clear impact in improving the performance" [R1]; "Both proposals are novel ideas seamlessly integrated in the framework" [R2]; "The two innovations are interesting and fairly novel" [R3]). We have carefully studied the reviewers' comments and respond below.

• Motivation related

Q: The interior-structure mismatch between glioma and stroke, and its implications for the soundness of the approach [R1].
A: Although glioma has more complex interior structures than ischemic stroke, this does not affect our approach's soundness, because we treated all glioma structures as abnormal tissue by merging their labels to differentiate them from normal tissue, and we trained our segmentation module to focus on the difference between normal and abnormal tissue rather than on the interior structure of abnormal tissue.

Q: Using partial MRI sequences is not as obvious as full sequences (with the extra T1C) for distinguishing ischemic tissue [R1].
A: We concur with the reviewer that full MRI sequences are more effective for distinguishing ischemic tissue, but this does not match our low-quality setting. Based on our colleagues and data from Nigeria and Ghana, stroke imaging in Africa is limited, and most imaging centers cannot perform contrast enhancement, which motivated us to exclude T1C.

• Experiment related

Q: The test dataset is small [R1] and cross-validation is suggested [R2].
A: We performed a new four-fold validation. The new result (Dice 75.92%) is consistent with the conclusion from the old result (Dice 76.84%) that our method is effective for low-quality and few-shot stroke lesion segmentation.

Q: Statistical tests were missing [R2 & AC].
A: We performed t-tests to show the statistical significance of the proposed modules (IDN, SDU, IDN & SDU) against the baseline method (see the sketch after this feedback). The p-values on Dice are 0.01, 0.04, and 0.0007, indicating that the improvement by our methods is statistically significant (p-values < 0.05). More details will be included in our final version.

Q: Attention-based baseline experiment and the "SDU only" ablation [R3].
A: The attention-based baseline (UNETR) achieves a Dice of 66.22% (lower than 71.69% by our "IDN only"), illustrating our IDN's superiority in the low-quality segmentation task. "SDU only" based on UNETR achieves a Dice of 68.93%, showing that our SDU is generally effective for segmentation networks.

Q: 1) Test on the SISS blind set to compare with the newest methods; 2) test on SPES to prove the approach's consistency [R1].
A: Although these two high-quality datasets do not match our motivation, we ran the experiments. 1) The newest method, MCA-DN by Anusha, achieved a Dice of 79.40% on high-quality MRIs, and we obtained 75.92% on low-quality MRIs, showing our method's competence. 2) The baseline method (BraTS-pretrained and SPES-finetuned on T1C & T2) achieves a Dice of 34.25%, while our method achieves 56.46%, showing our method consistently transfers complex information. Because the challenge website was not responding, all the above results (including MCA-DN) were obtained on the publicly visible data.

• Presentation related

Q: Lack of network architecture and training details [R1].
A: Due to the page limit, we simplified these details in the initial submission. We have open-sourced our code to avoid future confusion.

Q: Lack of clarity on the comparison methods [R1].
A: We chose the three methods because they are the latest few-shot medical segmentation methods based on well-known algorithms (MAML, etc.; over 6k citations). Moreover, MLDG states that it is generalizable (line 15, page 1) rather than being specific to the spine.

All other minor comments will be carefully considered and addressed.
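To illustrate the kind of paired significance test described in the rebuttal, here is a hedged sketch using SciPy. The per-case Dice values below are made up for illustration; only the reported p-values (0.01, 0.04, 0.0007) come from the rebuttal itself:

    import numpy as np
    from scipy import stats

    # Hypothetical per-case Dice scores; the same test cases are segmented
    # by both the baseline and the proposed model, hence a *paired* t-test.
    baseline_dice = np.array([0.62, 0.58, 0.71, 0.65, 0.60, 0.68, 0.63, 0.59])
    proposed_dice = np.array([0.74, 0.70, 0.78, 0.76, 0.69, 0.80, 0.75, 0.72])

    t_stat, p_value = stats.ttest_rel(proposed_dice, baseline_dice)
    print(f"t = {t_stat:.3f}, p = {p_value:.4f}")  # p < 0.05 -> significant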
Post-rebuttal Meta-Reviews
Meta-review # 1 (Primary)
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
The topic of the paper is interesting, and the authors have adequately addressed the main comments stemming from the reviews. The revised version should be an interesting paper to publish in the main proceedings.
- After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.
Accept
- What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).
6
Meta-review #2
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
The original paper was received positively, with a number of concerns about intuition and clarity. The authors have satisfactorily addressed the issues raised by the reviewers, and I am glad to accept the paper for presentation at MICCAI.
- After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.
Accept
- What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).
8