
Authors

Yicheng Wu, Zhonghua Wu, Hengcan Shi, Bjoern Picker, Winston Chong, Jianfei Cai

Abstract

New lesion segmentation is essential to estimate the disease progression and therapeutic effects during multiple sclerosis (MS) clinical treatments. However, the expensive data acquisition and expert annotation restrict the feasibility of applying large-scale deep learning models. Since single-time-point samples with all-lesion labels are relatively easy to collect, exploiting them to train deep models is highly desirable to improve new lesion segmentation. Therefore, we proposed a coaction segmentation (CoactSeg) framework to exploit the heterogeneous data (i.e., new-lesion annotated two-time-point data and all-lesion annotated single-time-point data) for new MS lesion segmentation. The CoactSeg model is designed as a unified model, with the same three inputs (the baseline, follow-up, and their longitudinal brain differences) and the same three outputs (the corresponding all-lesion and new-lesion predictions), no matter which type of heterogeneous data is being used. Moreover, a simple and effective relation regularization is proposed to ensure the longitudinal relations among the three outputs to improve the model learning. Extensive experiments demonstrate that utilizing the heterogeneous data and the proposed longitudinal relation constraint can significantly improve the performance for both new-lesion and all-lesion segmentation tasks. Meanwhile, we also introduce an in-house MS-23v1 dataset, including 38 Oceania single-time-point samples with all-lesion labels. Codes and the dataset are released at https://github.com/ycwu1997/CoactSeg.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43993-3_1

SharedIt: https://rdcu.be/dnwM2

Link to the code repository

https://github.com/ycwu1997/CoactSeg

Link to the dataset(s)

https://github.com/ycwu1997/CoactSeg

https://portal.fli-iam.irisa.fr/msseg-2/data/


Reviews

Review #2

  • Please describe the contribution of the paper

    In this paper, the authors present a deep-learning based method for the detection and segmentation of new lesions as observed in Multiple Sclerosis. They propose to use heterogeneous data, some with longitudinal follow-up and some without, in a model that produces both all-lesion segmentation and new-lesion segmentation. New lesion segmentation is learnt from the difference image between two time points. The learning is performed in two stages: first the learning of the overall lesion segmentation, and second a regularisation to ensure longitudinal consistency in the segmentation of new lesions.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    • Very clear motivation and presentation.
    • Interesting idea regarding both the heterogeneity and the regularisation component of the loss.
    • Good experimental design (comparison to SOTA, comparison to human raters, and ablation study).

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    • Absence of statistical analysis in the results, and absence of any indication of variation in the proposed metrics.
    • No mention of non-deep-learning methods for detection of new lesions (some exist though!).
    • No justification for the removal from the F1 metric of lesions smaller than 11 voxels, which seems odd when some of the appearing lesions may be quite small.
    • No consideration for disappearing lesions, which are also part of the MS process.
    • No consideration or discussion of more than two time points. Clinically, looking at more than 2 time points is also important.
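
To make the concern about excluding lesions smaller than 11 voxels concrete, the following is a minimal Python sketch on toy masks. The matching rule (a lesion counts as detected if it overlaps the other mask by at least one voxel), the 6-connectivity, and the helper names `connected_components` and `lesion_f1` are assumptions made for illustration; they are not taken from the paper or from the MSSEG-2 evaluation protocol.

```python
from collections import deque

# 6-connected neighbourhood offsets (an assumption for this sketch).
NEIGHBOURS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def connected_components(voxels):
    """Group a set of (x, y, z) voxels into 6-connected components."""
    remaining = set(voxels)
    components = []
    while remaining:
        seed = remaining.pop()
        component, queue = {seed}, deque([seed])
        while queue:
            x, y, z = queue.popleft()
            for dx, dy, dz in NEIGHBOURS:
                n = (x + dx, y + dy, z + dz)
                if n in remaining:
                    remaining.remove(n)
                    component.add(n)
                    queue.append(n)
        components.append(component)
    return components


def lesion_f1(pred, truth, min_size=1):
    """Lesion-wise F1 after dropping lesions with fewer than min_size voxels.

    Hypothetical matching rule: a lesion is matched if it shares at least
    one voxel with the kept lesions of the other mask.
    """
    pred_lesions = [c for c in connected_components(pred) if len(c) >= min_size]
    true_lesions = [c for c in connected_components(truth) if len(c) >= min_size]
    if not pred_lesions or not true_lesions:
        return 0.0  # convention chosen for this sketch
    pred_kept = set().union(*pred_lesions)
    true_kept = set().union(*true_lesions)
    precision = sum(1 for c in pred_lesions if c & true_kept) / len(pred_lesions)
    recall = sum(1 for c in true_lesions if c & pred_kept) / len(true_lesions)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy case: one 12-voxel lesion and one 3-voxel lesion in the reference;
# the prediction finds only the large one.
large = {(x, 0, 0) for x in range(12)}
small = {(x, 5, 5) for x in range(3)}
truth = large | small
pred = set(large)

f1_all = lesion_f1(pred, truth, min_size=1)        # small lesion counts as a miss
f1_filtered = lesion_f1(pred, truth, min_size=11)  # small lesion is ignored
```

In this toy case the missed 3-voxel lesion lowers the unfiltered score but leaves the filtered score untouched, which is why the choice of threshold deserves a justification.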

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Currently not reproducible until the code and data are made available, but this is all supposed to happen at publication. Details on training are given, but there is limited information (unless present in the repo) on the data used in training / validation / testing.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    • Regarding the evaluation of the results, there should be adequate statistical analysis across the methods / raters and a presentation of the range of performance to be able to reach any conclusion. Also, please check that the values are normally distributed before presenting mean / sd; median / IQR may be better suited if this is not the case.

    • In terms of the choice of metrics, please note that Jaccard and Dice are mathematically related and do not bring any different perspective. Similarly, using a distance measure twice may not be the best choice. For the F1, the strategy employed for localisation and assignment should be specified. Please consider using the information in Maier-Hein, L., & Menze, B. (2022). Metrics reloaded: Pitfalls and recommendations for image analysis validation. arXiv:2206.01653.

    • Given the potential influence on the results, please do justify why you would choose not to consider (and only for the F1 calculation) the smallest lesions.

    • There is very limited information on the registration between time points and the potential problem of atrophy, which is especially present in MS. In addition, the time lag between time points may be important to declare. Choices in registration strategies and reference may introduce a lot of bias in the analysis; see Reuter, M., Schmansky, N. J., Rosas, H. D., & Fischl, B. (2012). Within-subject template estimation for unbiased longitudinal image analysis. Neuroimage, 61(4), 1402-1418, for explanations of these considerations in the context of dementia.

    • Some have attempted to solve the question of new lesions without any deep learning strategy - it would be good to mention these methods as well when setting up the context of the problem

    • MS is a pathology in which multiple follow-up measurements are usually performed. Mentioning in the discussion the limitation of only using two time points and the possibility for extension would be really valuable.

    Minor typos:

    • When mentioning the augmentation strategies, avoid “like”, which may suggest that more strategies were used but not specified. Consider stating something like: “Among common augmentation strategies, we chose to employ XX and XX by applying XX with random XXX.”
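
The remark above that Jaccard and Dice are mathematically related can be checked directly: for any pair of non-empty masks, J = D / (2 - D). The sketch below uses made-up toy voxel sets (not data from the paper) purely to illustrate the identity.

```python
def dice(a: set, b: set) -> float:
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two non-empty voxel sets."""
    return 2 * len(a & b) / (len(a) + len(b))


def jaccard(a: set, b: set) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B| for two non-empty voxel sets."""
    return len(a & b) / len(a | b)


# Toy masks given as sets of voxel coordinates (illustrative only).
pred = {(0, 0, 0), (0, 0, 1), (0, 1, 1), (1, 1, 1)}
truth = {(0, 0, 1), (0, 1, 1), (1, 1, 1), (2, 1, 1), (2, 2, 1)}

d = dice(pred, truth)
j = jaccard(pred, truth)

# Jaccard is a monotone function of Dice, so on any single case the two
# metrics order competing segmentations identically.
assert abs(j - d / (2 - d)) < 1e-12
```

Because one metric is a fixed function of the other per case, reporting both adds a column to the table without adding a new view of the segmentation quality.
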
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The idea was well constructed and presented, but the analysis (especially the absence of statistical testing, the inadequate choice of metrics, and the removal of small lesions) slightly dampened my enthusiasm.
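
As an illustration of the kind of statistical reporting requested here, the following Python sketch summarises per-case scores with median and quartiles and compares two methods with an exact paired sign-flip permutation test. The per-case Dice values are invented placeholders, the helper names are hypothetical, and the sign-flip test is only one possible choice (a Wilcoxon signed-rank test is a common alternative); none of this is taken from the paper.

```python
from itertools import product
from statistics import median, quantiles


def median_iqr(values):
    """Return (median, Q1, Q3) using the default 'exclusive' quartile method."""
    q1, _, q3 = quantiles(values, n=4)
    return median(values), q1, q3


def paired_sign_flip_pvalue(a, b):
    """Exact two-sided sign-flip permutation test on paired differences.

    Enumerates all 2**n sign patterns, so it is only practical for small n.
    """
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    hits = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        # Small tolerance guards against floating-point round-off in ties.
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed - 1e-12:
            hits += 1
    return hits / total


# Placeholder per-case Dice scores for two hypothetical methods (same cases).
method_a = [0.62, 0.70, 0.55, 0.81, 0.47, 0.66, 0.73, 0.59]
method_b = [0.58, 0.66, 0.56, 0.75, 0.40, 0.61, 0.70, 0.57]

med, q1, q3 = median_iqr(method_a)
p_value = paired_sign_flip_pvalue(method_a, method_b)
print(f"method A: median {med:.3f} (IQR {q1:.3f} to {q3:.3f}), p = {p_value:.4f}")
```

A paired test is used because both methods are evaluated on the same cases; reporting the quartiles alongside the median also gives the indication of variation that the review asks for.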

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The authors proposed a deep learning model that can be trained on both new-lesion annotated two-time-point data and single-time-point data for segmenting multiple sclerosis (MS) lesions, with the ability to identify new MS lesions on brain MR scans. It was demonstrated that the proposed regularizer controlling the longitudinal relation can enhance the segmentation performance.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The proposed method is technically sound and well described.
    • In the proposed method, only one network may be used for segmenting both all MS lesions and new lesions. This may make model training and development easy, and improve clinical workflow.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Small training and validation datasets
    • No separate testing dataset: difficult to judge the generalizability of the proposed method
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Code and data would be released upon acceptance. Validating reproducibility will be straightforward.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    • Is the longitudinal brain difference computed with spatially aligned scans? Please clarify.
    • Fig 1: Not clear what the arrows specifically mean
    • Sect 4:
      • “MS lesions are always small” - Can we be more specific? Some MS lesions can be diffuse and relatively large.
      • How were the competing methods trained and optimized?
      • “The simple stage-by-stage training strategy” - Please explain this.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    • Clear description and technically sound method
    • Potential for improving clinical workflow and making model development easier
  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #1

  • Please describe the contribution of the paper

    This paper presents a deep-learning based, multi-task framework for MS lesion segmentation and activity from heterogeneous data. The proposed Coaction Segmentation (CoactSeg) framework has shown good performance in segmenting new lesions longitudinally. Last but not least, the authors are planning to publicly release the dataset used for part of their experiments.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This work presents a unified model that can be trained on heterogeneous data, i.e., both new-lesion annotated two-time-point data and all-lesion annotated single-time-point data. Annotated data for medical imaging tasks are precious and may come with different types of annotations depending on the research protocol, so they may not be suitable for a model that accepts only one type of data. The proposed method solves this problem for general MS tasks (lesion segmentation and lesion activity).

    The evaluation has been made on one of the recent public datasets on lesion activity (MICCAI-MSSEG2 challenge), and the proposed method achieved the best performance on all the reported metrics compared to two of the top teams that participated in the challenge.

    The proposed method is designed specifically for MS-related tasks. The experiments and evaluation are complete, with details and ablation studies. The authors mentioned that they will publicly release the dataset from a specific region, which will improve the diversity of MS patient data sources.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    My major comment is not about the proposed method but about the task that the authors try to solve. MS lesion segmentation is a challenging task, as shown in recent survey papers and in real-world practice. This is also supported by the evidence in the paper (Table 2): the performance of four different human experts compared to their consensus segmentation masks varies from 58.56 to 77.52 in terms of Dice score, while the performance of the compared methods varies from 53.07 to 63.82 in Dice score. Lesion segmentation and activity is a task that depends strongly on the annotator, so the trustworthiness of the consensus ground-truth mask is questionable.

    Participants in the MICCAI-MSSEG2 challenge did not receive the testing dataset until the challenge day. The performance of the top participating teams was close, and no team beat all others on all the metrics. The experimental setting of the compared and proposed methods on this dataset needs to be specified. Only the results of two participating teams were included, possibly because other methods are not publicly available, but the top team should be included for a complete evaluation.

    The presentation of the paper needs to be improved, which will be discussed below.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The method is easy to follow and reproduce. The authors will release the code at a later time.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    The authors include general CV metrics to evaluate the proposed method; however, since this is a clinical application, they may try to evaluate their results from a clinical perspective. If the authors would like to focus on new lesion generation, a metric other than Dice (and related metrics) should be considered. The authors can reproduce the top method from MSSEG2 and include it in the evaluation results.

    In terms of the paper presentation, Fig. 1 is hard to understand even though it has a long description. I guess the authors are trying to state that their method can help save the cost of expert annotation and data acquisition, but the presentation is really confusing. Also, the use of different axial slices for each column makes it even more confusing. The authors should consider removing or rearranging the elements of this figure to keep the presentation of the paper clear.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    My overall score for this paper is based on its completeness and on its specific design for solving the common heterogeneous data problem in MS studies, although MS-related tasks are challenging due to the subjective nature of manual labelling.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The paper presents an interesting idea to deal with heterogeneous and longitudinal MS data, and the method is technically sound. The paper is well motivated and presented and will be of interest to the readers. Please clarify the points raised by the reviewers before final submission.




Author Feedback

N/A


