Authors

Wen Tang, Han Kang, Haoyue Zhang, Pengxin Yu, Corey W. Arnold, Rongguo Zhang

Abstract

Evaluating lesion progression and treatment response via longitudinal lesion tracking plays a critical role in clinical practice. Automated approaches for this task are motivated by prohibitive labor costs and time consumption when lesion matching is done manually. Previous methods typically lack the integration of local and global information. In this work, we propose a transformer-based approach, termed Transformer Lesion Tracker (TLT). Specifically, we design a Cross Attention-based Transformer (CAT) to capture and combine both global and local information to enhance feature extraction. We also develop a Registration-based Anatomical Attention Module (RAAM) to introduce anatomical information to CAT so that it can focus on useful feature knowledge. A Sparse Selection Strategy (SSS) is presented for selecting features and reducing memory footprint in Transformer training. In addition, we use a global regression to further improve model performance. We conduct experiments on a public dataset to show the superiority of our method and find that our model performance has improved the average Euclidean center error by at least 14.3% (6mm vs. 7mm) compared with the state-of-the-art (SOTA). Code is available at https://github.com/TangWen920812/TLT.
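The reported figures can be checked with a short sketch. It assumes the center error is the Euclidean distance in millimetres between predicted and ground-truth lesion centers, and that the matching accuracy counts a prediction as correct when that distance is within a fixed threshold. The paper may define its threshold differently (for example, adapting it to lesion radius); the function names and toy coordinates below are illustrative only.

```python
import numpy as np

def center_errors_mm(pred_mm, true_mm):
    """Euclidean distance (mm) between predicted and true lesion centers."""
    pred = np.asarray(pred_mm, dtype=float)
    true = np.asarray(true_mm, dtype=float)
    return np.linalg.norm(pred - true, axis=1)

def mean_euclidean_distance(pred_mm, true_mm):
    """Average center error over all lesions."""
    return float(center_errors_mm(pred_mm, true_mm).mean())

def matching_accuracy(pred_mm, true_mm, threshold_mm=10.0):
    """Fraction of lesions whose center error is within threshold_mm (assumed rule)."""
    return float((center_errors_mm(pred_mm, true_mm) <= threshold_mm).mean())

def relative_reduction(baseline, ours):
    """Relative error reduction of `ours` with respect to `baseline`."""
    return (baseline - ours) / baseline

# Toy centers in mm (made up for illustration, not taken from the dataset).
pred = [[10, 10, 10], [50, 40, 30], [0, 0, 0]]
true = [[13, 14, 10], [50, 40, 30], [0, 12, 0]]

med = mean_euclidean_distance(pred, true)
acc = matching_accuracy(pred, true, threshold_mm=10.0)
gain = relative_reduction(7.0, 6.0)  # the 6mm vs. 7mm comparison in the abstract
print(f"MED={med:.2f} mm, accuracy@10mm={acc:.2f}, relative gain={gain:.1%}")
```

With the rounded values quoted in the abstract, (7 - 6) / 7 is about 14.3%, which matches the stated improvement.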

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16446-0_19

SharedIt: https://rdcu.be/cVRSZ

Link to the code repository

https://github.com/TangWen920812/TLT

Link to the dataset(s)

N/A

Reviews

Review #1

Please describe the contribution of the paper

A transformer-based lesion tracker is proposed. In the feature selection stage, a sparse selection strategy is used to reduce cost. A registration-augmented cross-attention transformer then predicts the location, followed by a global regression module.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. Significant improvement in performance, from 79.5 to 87.7 in CPM@10mm.
2. Clear ablation study, including SSS, RAAM-CAT, and the global regressor.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
1. The registration benchmark is somewhat outdated, and some newer references could be added. Lesion tracking is similar to video motion tracking, and the search/template image pairs are similar to two frames of a video. Some references on video motion tracking/optical flow: Teed, Zachary, and Jia Deng. “Raft: Recurrent all-pairs field transforms for optical flow.” European conference on computer vision. Springer, Cham, 2020. Qin, Chen, et al. “Joint learning of motion estimation and segmentation for cardiac MR image sequences.” International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, Cham, 2018.
2. Motivation/results mismatch. The largest gain appears to come from SSS, yet the stated motivation for SSS is reducing computation cost. Further discussion of why SSS improves performance would be appreciated.
3. Some reasoning is missing. It is not clear why an affine transformation is used. Lesion deformation is more like a non-rigid registration task. An explanation of why affine is used would be appreciated.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

In the abstract the authors state that the code will be published. The dataset is publicly available as well.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
1. The largest gain appears to come from SSS, yet the stated motivation for SSS is reducing computation cost. Further discussion of why SSS improves performance would be appreciated.
2. It is not clear why an affine transformation is used. Lesion deformation is more like a non-rigid registration task. An explanation of why affine is used would be appreciated.
3. Some discussion of dense tracking/object tracking would be appreciated.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The novelty and performance are good, but I have some concerns about the motivation and about missing discussion and references.
Number of papers in your stack

5
What is the ranking of this paper in your review stack?

3
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

5
[Post rebuttal] Please justify your decision

Elastic registration does take more time, but some learning-based registration methods are time-efficient. I tend to keep my original rating.

Review #2

Please describe the contribution of the paper

The authors present a novel approach using Transformers to propagate the center of a lesion from the baseline to the follow-up scan.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The paper addresses the clinically relevant problem of lesion tracking over time. In general, I enjoyed reading the manuscript. It is well structured, has a clear motivation, and is well written.
- the authors propose a novel approach using Transformer to predict the center of the propagated lesion in the follow-up scan
- The experiments were conducted on the publicly available DeepLesion dataset, indicating good performance.
- code will be available on Git
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- The paper lacks a discussion of the weaknesses or limitations of the presented work, which would give more insight into the method (e.g. for which lesion types the method works well, or in which cases it fails).
- Overall, in my opinion, there is an imbalance between the sections. The introduction and related work section are relatively long, but the results are only briefly presented and not discussed at all.
- some information is missing: The output is the predicted center coordinate and a classification result. However, after reading the paper, I don’t know what is classified (maybe I missed it?!)
- some related works are missing: Hering, A., Peisen, F., Amaral, T., Gatidis, S., Eigentler, T., Othman, A., & Moltz, J. H. (2021, August). Whole-Body Soft-Tissue Lesion Tracking and Segmentation in Longitudinal CT Imaging Studies. In Medical Imaging with Deep Learning (pp. 312-326). PMLR. Moltz, J. H., D’Anastasi, M., Kießling, A., Pinto dos Santos, D., Schülke, C., & Peitgen, H. O. (2012). Workflow-centred evaluation of an automatic lesion tracking software for chemotherapy monitoring by CT. European radiology, 22(12), 2759-2767.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
An analysis of situations in which the method failed. [Not Applicable] -> It is applicable but not done.
- code will be available on Git
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
- please don’t introduce abbreviations in the abstract
- The Learn2Reg image registration challenge paper is a good reference for several registration methods!
Questions:
- Why are the images resampled to 2mm for the deeds algorithm? It can also handle larger images and maybe the registration quality is better with a 1mm resolution.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

6
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper presents an interesting method to track lesions over time. The evaluation is based on similar work and shows good results.

Overall, it’s good work, but nothing totally groundbreaking either. A discussion is missing and should be added!
Number of papers in your stack

5
What is the ranking of this paper in your review stack?

1
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

6
[Post rebuttal] Please justify your decision

I appreciate the answers that the authors have given. The paper gains further in quality with the addition of a thorough discussion.

However, I don’t see such a big gain in quality that I would further increase the score. The paper is not groundbreaking enough for me to do so either.

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

Transformer Lesion Tracker

This submission tackles the problem of tracking lesions in longitudinal images. The originality resides in leveraging the attention of transformers to improve the localization of lesions across images. This enables local and global information to drive the tracking of lesions. The method further uses, in sequence, an affine registration module (RAAM) to better extract anatomical features in the attention mechanism (CAT), and a feature selection step (SSS) prior to training the transformer in order to reduce the memory burden. The evaluation is performed on a public challenge dataset of about 4000 pairs of images, with a comparison against several variants of state-of-the-art tracking and registration methods.

Only two reviews are available; one is unfortunately missing. After carefully reading the paper, I have to agree with the reviewers, who both have a positive appreciation of the submission but also find the methodology not necessarily ground-breaking. One specifically questions the motivation and how much the feature selection (SSS) contributes to the overall improvements (R1); this may be assessed, for instance, by extending the ablation study. This potentially does not jeopardize the value of the contribution, as demonstrated by the results. The other has a valid suggestion in requesting a discussion of when the method could fail (R2). Space could be made available by summarizing the long introduction or the lengthy methodological description. Important technical details are also missing, such as the nature of the classification output (confusion for R2). Lesions may also grow or shift non-rigidly with respect to their surrounding background. The choice of an affine registration method (confusion for R1) may therefore appear inadequate. All these elements should be clarified in a rebuttal.
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

4

Author Feedback

As noted by the reviewers, the introduction and method sections of our paper are long and the discussion could include additional insights. We will compress the introduction and methods sections and add content to the discussion as suggested by the reviewers.

1) Motivation of SSS and why does SSS improve performance? We apologize for the misunderstanding. As mentioned in the abstract, SSS is presented for both selecting features and reducing memory, rather than only for reducing memory. The ability to select features is the reason that SSS improves performance. When we looked at the results, we found that there were many small lesions in the dataset, such as lung nodules. If these small lesions are cropped from the original image, as in a video tracking task, and sent to the network, then due to downsampling the feature map will be very small, and over the last several downsampling steps it will always be a single voxel. This could lead to a decline in performance. SSS solves this problem by selecting voxels on the last feature map. Even if only one voxel on the feature map is selected, this voxel can still obtain more surrounding information in the CNN than without SSS. In addition, the result in Table 2 shows the effect of different thresholds on SSS, which also means that the effectiveness of SSS comes from the variation of the receptive field. Specifically, the CPM increases from 84.02 to 87.37, and the MED decreases from 6.35 to 5.98, for SSS vs. no SSS while enabling the other two modules. We would add this result if space permitted. We are sorry that we overlooked some necessary discussion and some key descriptions of SSS. Our original intention is that the innovations of the three modules are equally important in the paper. We will revise the corresponding section accordingly.

2) When can the method fail? Based on our observations, when the registration method failed, our model would sometimes fail as well. This is because we use registration to feed anatomical information to the transformer, and anatomical information helps the transformer accelerate convergence, which forms a dependency. In addition, when there are similar lesions in similar locations, such as two solid nodules at the edge of the right upper lung that differ by only a few slices along the z-axis, the model will also be confused. This discussion will be added to the paper. We will also add some sample failure cases in the Supplementary.

3) Why affine registration? First, we use affine registration only to give the transformer an approximate lesion location area as coarse attention, to help accelerate convergence and improve accuracy. Second, non-rigid registration imposes a restriction on the attention, which may limit the model’s ability to learn local variation and details. Last, affine registration is much faster than elastic registration (0.5s per case vs. 15s per case). For the whole process, we do not want the registration step to drastically reduce overall speed. A non-rigid transformation has little effect on the determination of the center position of the lesion but may be more helpful for the size and segmentation of the lesion.

4) Other details. The classification head classifies whether a voxel from the output is inside a lesion. It is very similar to a video tracking framework. We will revisit the technical details to clarify. We followed the same setting as in DLT and used 2mm in the DEEDS algorithm for a fair comparison. This setting is a compromise solution following DEEDS. In DEEDS, the resolution of CT was resampled to 1.94mm - 2.32mm with the same size of 256*256. Based on the reviewer’s advice, we will mention the potential of 1mm experiments in future work.
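As an illustration of point 3, the sketch below shows how an affine transform could yield an approximate follow-up location that is then turned into a soft, coarse attention map. It is only a sketch under stated assumptions: the rebuttal does not specify how RAAM builds its attention, so the Gaussian form, the `sigma` value, the function names, and the toy affine matrix are all hypothetical.

```python
import numpy as np

def map_center_affine(center_xyz, affine_4x4):
    """Map a 3D point through a 4x4 homogeneous affine matrix."""
    point = np.append(np.asarray(center_xyz, dtype=float), 1.0)
    return (np.asarray(affine_4x4, dtype=float) @ point)[:3]

def gaussian_attention(shape, center_xyz, sigma):
    """Soft 3D attention map that peaks at center_xyz (voxel coordinates).

    The Gaussian shape is an assumption made for this example.
    """
    grids = np.meshgrid(*[np.arange(n) for n in shape], indexing="ij")
    sq_dist = sum((g - c) ** 2 for g, c in zip(grids, center_xyz))
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

# Hypothetical affine result: identity rotation/scale plus a small translation.
affine = np.eye(4)
affine[:3, 3] = [2.0, -1.0, 3.0]

baseline_center = (10.0, 12.0, 8.0)                      # lesion center in the baseline scan
followup_guess = map_center_affine(baseline_center, affine)
attention = gaussian_attention((24, 24, 24), followup_guess, sigma=3.0)

peak = np.unravel_index(attention.argmax(), attention.shape)
print("approximate follow-up center:", followup_guess, "attention peak:", peak)
```

Because the map is soft rather than a hard mask, a model weighted by it can still attend to voxels away from the registered location, which is consistent with the rebuttal's argument that the registration should guide rather than constrain the attention.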

For all the above replies, we will add and modify the corresponding sections in the paper to avoid confusion. At the same time, for some related articles mentioned by the two reviewers, we will also add them to the references.
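The selection step described in point 1 of the rebuttal can be sketched as follows. This is a minimal illustration, assuming the selection is driven by a per-voxel weight map compared against a threshold; the weight map used here (a Gaussian around the lesion position), the tensor sizes, and the function name `sparse_select` are hypothetical and not taken from the paper or its code.

```python
import numpy as np

def sparse_select(feature_map, weight_map, threshold):
    """Keep only the feature vectors at voxels whose weight exceeds threshold.

    feature_map: (C, D, H, W) array, e.g. the last CNN feature map.
    weight_map:  (D, H, W) array scoring each voxel (hypothetical input).
    Returns tokens of shape (N, C) and their voxel coordinates of shape (N, 3).
    """
    mask = weight_map > threshold
    coords = np.argwhere(mask)        # (N, 3), row-major order
    tokens = feature_map[:, mask].T   # (N, C), same order as coords
    return tokens, coords

# Toy feature map computed on a whole (uncropped) volume.
rng = np.random.default_rng(0)
feature_map = rng.normal(size=(4, 8, 8, 8))

# Hypothetical weight map: a Gaussian centered on the lesion position.
grids = np.meshgrid(*[np.arange(8)] * 3, indexing="ij")
sq_dist = sum((g - 4) ** 2 for g in grids)
weight_map = np.exp(-sq_dist / (2.0 * 1.5 ** 2))

tokens, coords = sparse_select(feature_map, weight_map, threshold=0.5)
print("dense voxels:", weight_map.size, "selected tokens:", tokens.shape[0])
```

Since the features are computed on the full volume before selection, each retained voxel still carries context from its surroundings, while the number of tokens passed to the transformer, and therefore the size of the attention matrix, drops sharply.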

Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

Transformer Lesion Tracker

The rebuttal has clarified the motivation for the contributions of the selection and registration modules, and has provided a discussion of possible limitations. The general consensus is positive. However, a serious concern exists, as the authors propose a major rewrite of the introduction and method sections to free up space for the necessary changes. The scientific merit of the paper remains valid. For these reasons, the recommendation is towards Acceptance.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

6

Meta-review #2

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

This paper presented a new approach using Transformer to predict the center of the propagated lesion in the follow-up scan. Though not ground-breaking, the method has some novelty and it achieved superior performance on the public DeepLesion dataset. The rebuttal partially addressed some concerns from the reviewers. This paper has only two reviews. Since both are positive, I recommend accepting this work.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

4

Meta-review #3

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The authors did a good job of addressing the comments raised by the reviewers. The unanimous decision by the reviewers is for acceptance of the work. The authors should include the important details provided in their rebuttal in the final version of the paper or in the supplemental material.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

5
