
Authors

Anamaria Vizitiu, Antonia T. Mohaiu, Ioan M. Popdan, Abishek Balachandran, Florin C. Ghesu, Dorin Comaniciu

Abstract

Longitudinal lesion or tumor tracking is an essential task in different clinical workflows, including treatment monitoring with followup imaging or planning of re-treatments for radiation therapy. Accurately establishing correspondence between lesions at different timepoints, recognizing new lesions or lesions that have disappeared is a tedious task that only grows in complexity as the number of lesions or timepoints increase. To address this task, we propose a generic approach based on multi-scale self-supervised learning. The multi-scale approach allows the efficient and robust learning of a similarity map between multi-timepoint image acquisitions to derive correspondence, while the self-supervised learning formulation enables the generic application to different types of lesions and image modalities. In addition, we impose optional supervision during training by leveraging tens of anatomical landmarks that can be extracted automatically. We train our approach at large scale with more than 50,000 computed tomography (CT) scans and validate it on two different applications: 1) Tracking of generic lesions based on the DeepLesion dataset, including liver tumors, lung nodules, enlarged lymph-nodes, for which we report highest matching accuracy of 92%, with localization accuracy that is nearly 10% higher than the state-of-the-art; and 2) Tracking of lung nodules based on the NLST dataset for which we achieve similarly high performance. In addition, we include an error analysis based on expert radiologist feedback, and discuss next steps as we plan to scale our system across more applications.
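To make the idea of a "similarity map between multi-timepoint image acquisitions" concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each scale yields one embedding vector per voxel, that all scales have been resampled to a common grid, and that per-scale cosine similarities are simply averaged; the function names `l2_normalize` and `match_point` are hypothetical.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors along `axis` to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def match_point(query_embs, target_embs):
    """Locate the voxel in a follow-up scan that best matches a query voxel.

    query_embs:  list of (C,) vectors, one per scale, taken at the lesion
                 location in the baseline scan.
    target_embs: list of (D, H, W, C) arrays, one per scale, for the
                 follow-up scan (assumed resampled to one common grid).
    Returns the best-matching voxel index and its mean cosine similarity.
    """
    sim = None
    for q, t in zip(query_embs, target_embs):
        s = l2_normalize(t) @ l2_normalize(q)  # (D, H, W) cosine similarity map
        sim = s if sim is None else sim + s
    sim = sim / len(query_embs)                # averaging scales is an assumption
    idx = np.unravel_index(np.argmax(sim), sim.shape)
    return idx, float(sim[idx])

# Synthetic demo: random embeddings stand in for network outputs.
rng = np.random.default_rng(0)
targets = [rng.normal(size=(4, 5, 6, 16)) for _ in range(3)]  # three scales
voxel = (2, 3, 1)
queries = [t[voxel] for t in targets]  # query identical to one target voxel
best_idx, best_score = match_point(queries, targets)
```

A caller could additionally compare the returned score against a threshold to flag lesions with no plausible match; that threshold would be a further assumption, not something stated in the abstract.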

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43907-0_55

SharedIt: https://rdcu.be/dnwdC

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper presents a multi-scale self-supervised learning solution for longitudinal lesion or tumor tracking in treatment monitoring workflows and re-treatment planning in radiation therapy. The authors address the challenges posed by complex data, lack of large annotated datasets, and lesion identification by leveraging contrastive learning and pixel-wise feature representations from unlabeled and unpaired data. They also introduce multi-scale embeddings to improve system robustness. Furthermore, the model is designed to benefit from biologically-meaningful points (anatomical landmarks) to ensure spatial coherence of tracked lesion locations.

    The paper’s potential impact in the clinical environment includes improved treatment response assessment, more accurate adaptation of radiotherapy plans, and a reduced burden on medical professionals. However, there are several points that need to be edited in the paper to enhance its clarity and effectiveness.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    In this study, the authors present an optimal self-supervised learning solution for longitudinal lesion or tumor tracking in CT images captured at multiple time points. They have experimentally demonstrated superior performance compared to the current state-of-the-art model (SAM). Although not entirely clear, the novel proposed methods contributing to the performance improvement are as follows:

    • Introducing multi-scale embeddings to enhance system robustness
    • Proposing pixel-wise feature representations derived from unlabeled and unpaired data
    • Designing the model to leverage biologically-meaningful anatomical landmarks for ensuring the spatial coherence of tracked lesion locations
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The main weaknesses of this study that should be highlighted are as follows:

    • First, the study’s framework is strikingly similar to the state-of-the-art model SAM. Although the authors claim to propose a multi-scale solution, the multi-scale feature embeddings in the decoder are, in fact, equivalent to SAM’s global and local embeddings. The only difference in this paper is the use of three more diverse scales for training, which is not substantial enough to be considered a significant innovation.

    • Second, the lack of appropriate citations in the methods section makes it difficult for readers to discern the authors’ new contributions, as the pipeline is almost identical to that of SAM.

    • Lastly, the absence of an ablation study poses a challenge in determining the factors that led to performance improvement over SAM. It is therefore difficult to assess which specific components contributed to the enhanced results.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    To reiterate, although the main factors contributing to the performance improvement are not entirely clear, if adequately addressed and clarified during the revision process, the results seem to be reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    • Please add a thorough ablation study to better understand the factors that led to performance improvement over SAM. Without it, it is difficult to assess which specific components contributed to the enhanced results.

    • Utilizing unpaired datasets for training can indeed aid in learning anatomical landmarks distributed across spatially different locations. However, this approach seems to be specifically tailored for follow-up CT scans taken over time rather than other general clinical scenarios (e.g., detecting lesions within a single CT scan as in SAM paper). The generalizability of this algorithm to other clinical scenarios remains uncertain. Further investigation and evaluation would be helpful to comment on the performance of the proposed method in these alternative contexts.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    3

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Based on the aforementioned major weak points, it has been determined that it may not be easy to address and resolve these issues, even with revisions. In this study, while there were minor modifications compared to SAM, it has been determined that these contributions may not be sufficient for acceptance at MICCAI.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    The author has responded diligently, but has not sufficiently addressed potential weaknesses anticipated during the review, such as sufficient contribution compared to existing research. However, the successful application of their proposed method to a large-scale study is deemed to be a meaningful contribution. Therefore, I have revised my decision.



Review #2

  • Please describe the contribution of the paper

    The authors aim at improvement of the SAM model developed by Yan et al. (https://arxiv.org/pdf/2012.02383.pdf). SAM employs contrastive learning during pre-training for generation of pixel-wise similarity embeddings, a global and a local one, that allow tracking of anatomical locations in the downstream task. For the paper at hand, the authors utilize multi-scale embeddings, intending more fine grained localization. Furthermore, anatomical landmarks are involved in computation of the loss, while for SAM pixels are completely randomly chosen. The model is trained on a private CT dataset involving 52,487 patients and tested on the DLS dataset, also used by SAM, and a subset of the NLST dataset. The authors also retrained SAM on this data and were able to show that their model achieves superior performance.
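For readers unfamiliar with pixel-wise contrastive pre-training, the sketch below shows a generic InfoNCE-style loss for a single anchor voxel. It is illustrative only: the exact loss, sampling scheme, and temperature used in SAM or in the reviewed paper are not given in this review, so the function `info_nce` and its default `temperature` are assumptions.

```python
import numpy as np

def info_nce(anchor, positive, negatives, temperature=0.5):
    """Generic InfoNCE loss for one anchor embedding.

    anchor, positive: (C,) embeddings of the same anatomical location seen
                      in two augmented views (or at a shared landmark).
    negatives:        (N, C) embeddings of other locations.
    The temperature value is a placeholder, not taken from the paper.
    """
    def unit(x):
        return x / (np.linalg.norm(x, axis=-1, keepdims=True) + 1e-8)

    a, p, n = unit(anchor), unit(positive), unit(negatives)
    # Index 0 holds the positive pair, the rest are negatives.
    logits = np.concatenate([[a @ p], n @ a]) / temperature
    logits = logits - logits.max()  # numerical stability
    log_prob_positive = logits[0] - np.log(np.exp(logits).sum())
    return float(-log_prob_positive)

# Synthetic demo: a matching positive should give a lower loss than a
# positive that points away from the anchor.
rng = np.random.default_rng(0)
anchor = rng.normal(size=32)
negatives = rng.normal(size=(64, 32))
loss_matching = info_nce(anchor, anchor + 0.01 * rng.normal(size=32), negatives)
loss_mismatched = info_nce(anchor, -anchor, negatives)
```

In this framing, the landmark-based supervision described above would amount to choosing some positive pairs at corresponding anatomical landmarks rather than at purely random voxels.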

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. By incorporation of the multi-scale approach and involvement of anatomical landmarks, the authors were able to show that their approach outperforms the SAM model of Yan et al.
    2. A dataset involving 52,487 patients and their respective CT images was collected. This represents a large data source that few medical studies are able to acquire.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The differences to SAM are not clear. The structure of the paper limits recognition of novel contributions introduced by the authors. It seems like several steps presented in the methods section were already introduced in SAM. Furthermore, a change of notations in reference to the initial publication, i.e. Section 3.2, makes comparison hard.
    2. Results in Table 1 are adopted from [8]. However, training in [8] was performed on a different dataset. Even though models in [8] were probed on the same test set, Table 1 indicates direct comparability, which is not the case.
    3. In the novelty statement, the authors write that the “multi-scale approach ensures a high degree of robustness and accuracy for small lesions”. However, no evaluations are provided that take lesion size into account and could therefore support this statement.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Models were trained on a private dataset; testing was performed on two public datasets. Code of the model has not been made available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. Please make your own work more clear. Specify which parts were already introduced by Yan et al. and emphasize how your work differs from that.
    2. You list radiotherapy planning as a possible application case. RT planning requires semantic segmentation of tumor regions and organs at risk. Please specify how the model could work in this case.
    3. In Table 1, p-values of 10^-6 are given for some of the results, but the values seem to be very close (6.5 ± 9.6 vs. 5.9 ± 7.1). Which testing method has been applied here?
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    • improvement over baseline but no clear description of own work
    • large dataset
  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    3

  • [Post rebuttal] Please justify your decision

    In agreement with reviewer #1, the paper does not allow for recognition of a novel contribution. For me, this issue could also not be resolved by the author feedback.



Review #3

  • Please describe the contribution of the paper

    This paper proposes a contrastive self-supervised learning framework for the lesion tracking task. Comprehensive experiments on two datasets show the effectiveness of the proposed method, and the successful small-lesion tracking results are impressive.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Lesion tracking is interesting and clinically practical for assisting diagnosis. The experimental comparisons are comprehensive.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The proposed method is not very novel; for example, volume augmentation-based contrastive learning has been widely studied in medical tasks.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Reproducible

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    None

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The small-lesion tracking results are good; however, the method has limited novelty.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper received three reviews with two accept and one reject. From the detailed comments, this is a paper with mixed opinions, so we invite the authors for rebuttal. Overall, some questions are raised that need to be adequately addressed through rebuttal, but this seems a solid work, particularly with its large-scale empirical and informative evaluation.




Author Feedback

We would like to thank the reviewers and area chair for their effort in providing a constructive and detailed review. Please find below clarifications to the main points raised.

  1. Lack of clarity in terms of novelty (Reviewer #1, Reviewer #2): Inspired by [5], our proposed method brings two elements of novelty from a technical point of view: (1) the multi-scale approach for the anatomical embedding learning and (2) a positive sampling approach that incorporates anatomically significant landmarks across different subjects. With these two strategies, our goal is to ensure a high degree of robustness in the computation of the lesion matching across different lesion sizes and varying anatomies. Furthermore, a significant focus and contribution of our research is the experimental study at a very large scale: we have (1) trained a pixel-wise self-supervised system using a very large and diverse dataset of 52,487 CT volumes and have (2) tested and evaluated on two publicly available datasets. Notably, one of the datasets, NLST, presented challenging cases with 68% of lesions being very small (i.e., radius <5mm). To better emphasize and clarify these contributions, we will revise the manuscript and include appropriate citations in the methods section.
  2. Importance of including an ablation study (Reviewer #1): We have performed ablation experiments with the two proposed components (see previous paragraph), both of which contributed to improved performance. For example, we empirically recognized that our method without the multi-scale component has difficulties tracking small lesions, e.g., lung nodules. We found that adopting a multi-scale approach (instead of the global/local approach as proposed in [5]) can lead to embeddings that better capture the anatomical location and are able to handle lesions that vary in size or appearance at different scales. Moreover, the changes proposed in this work help to alleviate the confusion caused by left-right body symmetries (e.g., the apices of the lungs). This effect challenged the tracking of small nodules in the lungs using [5]. We will add this information into the final version of our manuscript.
  3. Clarification on comparison to other solutions (Reviewer #2): The reference results in Table 1 are from reference [8], except for reference method [5] (as recognized by Reviewer #2 and indicated in the caption of Table 1). The exact same test set was used to compute the performance of each approach listed in the table; however, some approaches have not been retrained. Given the clear superiority of approach [5] compared to all reference solutions, we focused on achieving a direct comparison against approach [5], retraining and tuning it on our own dataset (see Section 4.2, first paragraph). To confirm the significance of the improvement achieved by our method compared to [5], we conducted a paired t-test for statistical analysis. The calculated p-values for both evaluation metrics were found to be less than 10^-6. We will better clarify this aspect in the revised manuscript to avoid any confusion.
  4. Missing evaluation on small lesions (Reviewer #2): We indeed focused on evaluating and achieving a high accuracy of our proposed method on small lesions. In the NLST dataset 68% of lesions are very small (radius <5mm). This was mentioned in the submitted manuscript (Dataset and Setup, Page 6; Evaluation, Page 7). Visual results are also shown in the supplementary material. We will better highlight this aspect in the introduction section of the revised manuscript.
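The paired t-test mentioned in point 3 can yield a very small p-value even when the two group means are close and the standard deviations are large, because it tests the per-case differences rather than the group spreads. The sketch below illustrates this on synthetic data only; the numbers are not the paper's results, and the helper names `paired_t_statistic` and `welch_t_statistic` are hypothetical.

```python
import numpy as np

def paired_t_statistic(a, b):
    """t statistic of the per-case differences a[i] - b[i]."""
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return d.mean() / (d.std(ddof=1) / np.sqrt(d.size))

def welch_t_statistic(a, b):
    """Unpaired (Welch) t statistic, shown only for contrast."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    standard_error = np.sqrt(a.var(ddof=1) / a.size + b.var(ddof=1) / b.size)
    return (a.mean() - b.mean()) / standard_error

# Synthetic localization errors: both methods share a large per-lesion
# difficulty term, and method A is worse by a small, consistent margin.
rng = np.random.default_rng(0)
n_lesions = 1000
difficulty = rng.gamma(shape=0.6, scale=10.0, size=n_lesions)
errors_a = difficulty + rng.normal(0.6, 1.0, size=n_lesions)
errors_b = difficulty + rng.normal(0.0, 1.0, size=n_lesions)

t_paired = paired_t_statistic(errors_a, errors_b)
t_unpaired = welch_t_statistic(errors_a, errors_b)
```

Because the shared difficulty term cancels in the per-lesion differences, `t_paired` comes out far larger than `t_unpaired` on this data, which is consistent with a small p-value despite heavily overlapping error distributions.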




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    After integrating all information and reading the submission, the AC agrees that this paper presents an interesting Self-Supervised Learning (SSL) method for the lesion tracking problem. SAM is also SSL, and this work shows some quantitative improvement in tracking/matching accuracy. Overall, it is sufficiently interesting, novel, and adequately evaluated.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper studied the longitudinal lesion or tumor tracking problem and proposed a multi-scale self-supervised learning approach. The method is evaluated on two tasks, i.e., lesion tracking on DeepLesion and lung nodule tracking on NLST. The rebuttal provided additional details about method novelty, ablation study, and more comparisons. The rebuttal addresses a number of concerns of the reviews. The major concern is the clarification of the methodology contributions in this work compared with previous methods.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The paper addresses an important problem of tracking lesions and uses two large public datasets for training and evaluation. The authors have responded to the reviewers’ key concerns and seem to have addressed them. However, one important thing to note is that neither DeepLesion nor NLST is a radiotherapy dataset per se. In particular, NLST has precancerous lesions too. This should be clarified in the paper. Given the importance of the topic and that the approach is novel, I recommend this paper for acceptance.


