
Authors

Xueyang Li, Han Xiao, Weixiang Weng, Xiaowei Xu, Yiyu Shi

Abstract

Colorectal cancer is a prevalent form of cancer, and many patients develop colorectal cancer liver metastasis (CRLM) as a result. Early detection of CRLM is critical for improving survival rates. Radiologists usually rely on a series of multi-phase contrast-enhanced computed tomography (CECT) scans done during follow-up visits to perform early detection of the potential CRLM. These scans form unique five-dimensional data (time, phase, and axial, sagittal, and coronal planes in 3D CT). Most of the existing deep learning models can readily handle four-dimensional data (e.g., time-series 3D CT images) and it is not clear how well they can be extended to handle the additional dimension of phase. In this paper, we build a dataset of time-series CECT scans to aid in the early diagnosis of CRLM, and build upon state-of-the-art deep learning techniques to evaluate how to best predict CRLM. Our experimental results show that a multi-plane architecture based on 3D bi-directional LSTM, which we call MPBD-LSTM, works best, achieving an area under curve (AUC) of 0.79. On the other hand, analysis of the results shows that there is still great room for further improvement.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43987-2_37

SharedIt: https://rdcu.be/dnwJS

Link to the code repository

https://github.com/XueyangLiOSU/MPBD-LSTM

Link to the dataset(s)

https://github.com/XueyangLiOSU/MPBD-LSTM


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper addresses the prediction of liver metastasis from colorectal cancer from multi-phase CT scans using an AI model. Working with higher-dimensional data (time-series volumetric scans as well as contrast-enhanced phases) is a challenging task in AI. The authors attempt to predict metastasis in the liver using multi-phase CT scans.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper is well written in terms of English grammar and technical content, with well-described patient inclusion and exclusion criteria, results, and discussion. Predicting metastasis is a genuinely challenging task in AI, as cancer spread from the primary tumor location is difficult to identify. This work proposes a solution to this problem. The developed methodology has considerable clinical potential to help radiologists.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Although this is a good attempt, the work lacks a large set of 5D samples, and hence the authors have relied on data augmentation. Also, the augmentation considers only rotation and no other geometric transformations. This might not produce the diverse dataset that an image pool requires. This is just an observation.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The experimental setup and the steps to reproduce are clear in the paper. The work can be reproduced with the mentioned system configuration.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. As mentioned in the abstract, the 5D claim is questionable. In liver CT, the multiple phases are themselves acquired at different times, so phase is the 4th dimension. To my understanding, time and phase therefore form a single dimension, i.e. the 4th and not the 5th. The authors should clarify how 5D arises. If this is a follow-up study in which multi-phase imaging is repeated at each visit, then follow-up time is the 4th dimension and phase is the 5th, which is acceptable.

    2. The delayed phase is not mentioned in CECT. Only two-phase images are considered, but in some places three-phase images are mentioned.

    3. It is essential to report the CT image acquisition parameters, the diagnostic quality of the images, and any cases with quantum or computational noise. What did you do with such data? Was any noise reduction applied? For noisy images or images of poor diagnostic quality, was there any variation in the final results? Did you notice any variation in tumor voxel intensity when different contrast agents were used in the CT scans?

    4. Only the objective assessment of the results is discussed. What about subjective assessment? Did you involve any radiologists? How many? Was there any inter-observer variation in the results? Any bias?

    5. It is not mentioned whether the dataset consists of retrospective or prospective cases.

    6. How were the datasets de-identified? Was there any preprocessing? How were missing slices or data handled? How was data completeness ensured?

    7. Only the GPU is mentioned for training and inference. Which software libraries, frameworks, and packages were used? Are these models available in an archive such as GitHub or Code Ocean?

    8. Rescaling, flipping, shearing, and translation are equally important in addition to rotation. What is the reason for restricting the angle to between -30 and +30 degrees?

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    With my experience in this domain, I appreciate the authors’ work and its presentation in this paper. The merits slightly outweigh the weaknesses. The comments have to be addressed, and I would like to see the revised version with strong proof or justification.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    6

  • [Post rebuttal] Please justify your decision

    All comments are implemented except comment #7.



Review #2

  • Please describe the contribution of the paper

    This paper presents a new dataset of time-series CECT scans and proposes a new network based on an existing SOTA model. The proposed method, along with other SOTA methods, was evaluated on the new dataset.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Provision of a new dataset.
    • I really like the style of defining the problem, collecting dataset, and building upon existing solutions to solve the problem better.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Although I like the style of how the authors present the method, the model itself lacks sufficient innovation. Therefore the technical merit is limited.
    • Is the dataset introduced in this paper unique in the field? If not, I’d like to see experiments on similar datasets.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Please see the weakness section.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    With more analysis and experiments, I think this paper can be a good clinical paper. However, the presented technical merit may not be adequate for MICCAI.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    6

  • [Post rebuttal] Please justify your decision

    Although the authors didn’t address all of my concerns in the rebuttal, I think they provided additional essential information for the dataset they proposed. Overall, I think the dataset itself can be a good contribution to the community. I raise my score to accept.



Review #3

  • Please describe the contribution of the paper

    The authors propose an LSTM-based 5D prediction model for colorectal liver metastases. It includes 3D views, time-series information, and additionally multiple phases (arterial and venous). These phases have not been used by previous models, but they contain critical information about normal and abnormal blood vessels necessary for this clinical diagnosis. A new dataset has been curated, and two 3D-LSTMs have been employed in parallel for better clinical predictions.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The problem is clinically well motivated, and for the first time the A and V phases have been incorporated into modeling colorectal liver metastases prediction. Parallel 3D-LSTMs process each of these phase images individually, and their outputs are finally combined with an averaging function. Ablations have been performed with respect to the number of timepoints and phases.
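
The late fusion described here, where each phase is processed by its own branch and the branch outputs are averaged, can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' network: `encode_phase` is a hypothetical stand-in for a per-phase 3D-LSTM branch, and the feature vectors are made-up toy values.

```python
from statistics import fmean
from typing import List, Sequence


def average_fusion(branch_outputs: Sequence[List[float]]) -> List[float]:
    """Element-wise mean of equally sized per-phase output vectors."""
    if not branch_outputs:
        raise ValueError("need at least one branch output")
    length = len(branch_outputs[0])
    if any(len(v) != length for v in branch_outputs):
        raise ValueError("all branch outputs must have the same length")
    return [fmean(column) for column in zip(*branch_outputs)]


def encode_phase(scan_summary: List[float]) -> List[float]:
    """Hypothetical placeholder for one per-phase branch (identity here)."""
    return list(scan_summary)


# Toy features for the arterial (A) and venous (V) phases.
arterial = encode_phase([0.25, 0.75, 0.5])
venous = encode_phase([0.75, 0.25, 0.0])
fused = average_fusion([arterial, venous])
```

Averaging keeps the fused vector the same size as each branch output, which is one reason it is a cheap alternative to concatenation; whether it is the better choice is exactly what the reviewers question.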

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The dataset is highly imbalanced, with a 25% positive rate. How does the method handle this? It is written that the number of timepoints can vary from 2 to 6, but the authors experiment with only 3 fixed timepoints (3 x 2 x 64 x 64 x 64). I find this a discrepancy; can you please clarify it? Also, how do the authors handle this inconsistent timepoint information? I know LSTMs have an inherent padding mechanism, but there is no mention of it. Moreover, are the time intervals between scans regular or irregular? 3D-LSTMs are quite outdated given the advent of ViTs 4 years ago. Is there any reason the authors do not use an available pretrained 3D medical image transformer instead of RNN approaches? A 2-modal/phase input into a transformer is quite common. The positional information of the same CT slice in different phases would not change if the scans are registered; are they not? If registered, concatenation may not be a bad choice. I am familiar with all of the compared methods, but they are quite outdated except SimVP; even a simple Temporal ConvNet (TCN) comparison is worth trying out. Adopting Bi-LSTMs is not particularly novel. Mostly, people use multiple stacks of bi-LSTMs for better abstraction ability. Both the w/o multiplane and inter-plane connection results are quite good; is the proposed method's improvement over them statistically significant? I do not find the justification for removing inter-plane connections convincing. If the A and V phases mimic a same-image, different-look setup, why did the authors not pursue contrastive learning with these phases (instead of traditional augmentations)?

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The model parameters and experimental settings are provided. Code will be released. Dataset will be available on request.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Please address the points in weakness section.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    3

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Limited novelty, with a naive adoption of bi-LSTM and an averaging function at the end to accommodate multiple phases.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    While the reviewers appreciate the written quality of the paper, its focus on a highly relevant clinical problem, and the series of ablation experiments demonstrating the potential of the method, R#3 was fairly critical of the paper, pointing out class imbalance issues in the in-house dataset, the novelty of the approach (also raised by other reviewers), and whether the proposed method indeed has clear advantages over existing methods. The authors should discuss these points in the rebuttal phase.




Author Feedback

Q1: 5D data (R1) Yes, we consider the follow-up time as 4th D and multiphase as 5th D.
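
To make the five axes concrete, the sketch below names them and computes the size of one patient's input. The shape 3 x 2 x 64 x 64 x 64 is the one quoted by Reviewer #3; the axis order and the float32 storage type are assumptions made for illustration and are not confirmed by the rebuttal.

```python
from math import prod

# Axis order is an assumption made for illustration.
AXES = ("time", "phase", "depth", "height", "width")
SHAPE = (3, 2, 64, 64, 64)  # per-patient input size quoted by Reviewer #3

BYTES_PER_FLOAT32 = 4  # assumed storage type


def describe(shape: tuple, axes: tuple) -> dict:
    """Map each axis name to its length."""
    if len(shape) != len(axes):
        raise ValueError("shape and axes must have the same rank")
    return dict(zip(axes, shape))


layout = describe(SHAPE, AXES)
n_values = prod(SHAPE)  # 3 * 2 * 64**3 = 1,572,864
size_mib = n_values * BYTES_PER_FLOAT32 / 2**20

print(layout)
print(f"{n_values} values, {size_mib:.1f} MiB as float32")
```

Under these assumptions a single input occupies 6.0 MiB before any intermediate activations, which gives a rough sense of why memory limits come up later in the rebuttal.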

Q2: Delayed phase is not mentioned in CECT and no subjective assessment (R1) As advised by our radiologists, the delayed phase is primarily used to distinguish existing malignant liver tumors from benign lesions. Yet for our problem, all CT scans captured contain no tumor, i.e., patients had not been diagnosed with colorectal cancer liver metastasis (CRLM) at the time of these follow-up scans. The clinical problem is to predict whether or not CRLM will ultimately occur based on a patient’s time-series scans. This also makes our problem and dataset unique from the existing tumor classification or segmentation works. As this is a retrospective study, the binary patient-level labels indicating whether CRLM ultimately occurred to a patient can be accurately obtained. Therefore, in the context of our problem, the delayed phase would not add value, and no subjective assessment is possible.

Q3: Dataset acquisition parameters (R1) Our study uses CT images with the following acquisition parameters: window width 150, window level 50, radiation dose 120kV, slice thickness 1mm, and slice gap 0.8mm. The images underwent manual quality control to exclude any scans with noticeable artifacts or blurriness and to verify the completeness of all slices. We used the same contrast agent for all scans.

Q4: Data de-identification (R1) Each patient was assigned a unique number for anonymization.

Q5: Model availability (R1) All models are available on GitHub.

Q6: Data augmentation (R1) We tried most of the suggested augmentation techniques. However, rotation between -30 and +30 degrees and Standard Scale Jittering (SSJ) boosted performance the most.
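
As a rough sketch of the rotation part of this augmentation, the code below draws an angle from the stated range and applies the corresponding in-plane rotation to a coordinate. Uniform sampling, rotation about the volume centre, and rotation within a single plane are all assumptions; the rebuttal does not specify them, and SSJ is not shown.

```python
import math
import random


def sample_rotation_deg(rng: random.Random, limit_deg: float = 30.0) -> float:
    """Draw a rotation angle uniformly from [-limit_deg, +limit_deg]."""
    return rng.uniform(-limit_deg, limit_deg)


def rotate_xy(x: float, y: float, angle_deg: float) -> tuple:
    """Rotate an in-plane coordinate counter-clockwise about the origin."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return (cos_t * x - sin_t * y, sin_t * x + cos_t * y)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
angles = [sample_rotation_deg(rng) for _ in range(5)]

# A voxel 32 units from the centre, rotated by the maximum angle.
corner = rotate_xy(32.0, 0.0, 30.0)
```

Whether one sampled angle is shared across all timepoints and phases of a patient is not stated; sharing it would keep the scans of one patient mutually consistent.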

Q7: Discrepancy of timepoints (R3) Our released dataset contains 269 patients with 2-6 CT scans each. The intervals between scans are slightly irregular, as we do not have precise control over the timing of the follow-up visits. In our study, we used the 170 patients who have 3 or more scans. For those with more than 3 scans, the first three are used. The complete dataset could be of interest to the community, for example to study how to further utilize the data from the patients who have only 2 scans.
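
The selection rule described here (keep patients with at least three scans, then use only the first three) can be sketched as below. The record layout, patient identifiers, and scan labels are hypothetical and are not taken from the released dataset.

```python
MIN_SCANS = 3  # timepoints used per patient in the study


def select_cohort(scans_by_patient: dict, n_timepoints: int = MIN_SCANS) -> dict:
    """Keep patients with at least n_timepoints scans, truncated to the first n."""
    return {
        patient_id: scans[:n_timepoints]
        for patient_id, scans in scans_by_patient.items()
        if len(scans) >= n_timepoints
    }


# Toy records: patient id -> chronologically ordered scan labels (made up).
records = {
    "P001": ["t0", "t1"],
    "P002": ["t0", "t1", "t2"],
    "P003": ["t0", "t1", "t2", "t3", "t4"],
}
cohort = select_cohort(records)
```

With this rule every retained patient contributes exactly three timepoints, so no padding or masking is needed, at the cost of discarding later scans and all two-scan patients.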

Q8: Data imbalance (R3, R4) The imbalance in the dataset is induced naturally in the patients participating in the study. To handle this issue, we oversampled 60% of positive cases using Standard Scale Jittering in training.
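
One possible reading of this oversampling step is sketched below: a randomly chosen 60% of the positive training cases are duplicated, and each duplicate would then be passed through the augmentation. This interpretation, the function name, and the toy labels are assumptions; the rebuttal does not give the exact procedure.

```python
import random


def oversample_positives(labels, fraction=0.6, seed=0):
    """Return training indices after duplicating a random `fraction` of the
    positive cases (each duplicate would be augmented before use)."""
    rng = random.Random(seed)
    positives = [i for i, y in enumerate(labels) if y == 1]
    n_extra = round(fraction * len(positives))
    extra = rng.sample(positives, n_extra)
    return list(range(len(labels))) + extra


# Toy labels with a 25% positive rate, as Reviewer #3 describes.
labels = [1] * 5 + [0] * 15
indices = oversample_positives(labels)
n_pos = sum(labels[i] for i in indices)
```

In this toy example the positive share rises from 5 of 20 to 8 of 23, so the training set remains imbalanced, only less so.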

Q9: Novelty of the approach (R2, R3, R4) We would like to clarify that the focus of this article is not on introducing a novel method, but rather, a unique and valuable dataset of a challenging and important clinical problem. Briefly speaking, 1) our dataset is 5-D, which presents challenges in computation and opportunities in feature extraction; we have to skip some of the SOTA models due to memory limits; we tweaked some simpler models to handle the 5-D, which works fine but leaves room for further improvement. 2) Our dataset does not contain any tumor and yet we need to predict whether tumors will occur in the future based on the time-series scans. This differs from the classical tumor classification or segmentation datasets.

Q10: Positional info of registered images (R3) The images are not registered.

Q11: Comparison with Temporal ConvNet (TCN) (R3) Initially, we contemplated the use of TCN. However, it required substantial memory (>32 GB) and exceeded our machine’s capacity. After pruning it to below 24 GB, the AUC is 66.5, lower than our model.

Q12: Statistical significance over model w/o multiplane and w/ inter-plane connection (R3) The proposed model yielded the best AUC score compared with other models. The removal of inter-plane connections was a decision aimed at achieving an efficiency and performance balance as it significantly speeds up the training process. Again, these are all baseline methods and we hope with our datasets better algorithms can be developed in the future.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Given the clinically relevant problem, originality of the method and clarity of the paper, the overall reviews tend to the positive side. The authors’ rebuttal provided a very detailed, point-by-point answer to the majority of R3’s points of criticism, including class imbalance, timepoints and novelty, which from my perspective is adequate (R3 has not followed up). Using time-series data (CECT) capturing tissue changes for predicting outcomes in colorectal liver mets is a very challenging and timely problem, and I would tend to accept the paper.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Key Strengths:

    • Building of a 5D CECT dataset for prediction of liver metastases
    • Clear writing and presentation

    Key Weaknesses:

    • Novelty of proposed prediction approach (stacking of 3D LSTM modules)
    • Limited comparisons in experiments

    The rebuttal helped clarify a number of concerns, as acknowledged by reviewers, including some method details regarding sampling/augmentation and the explanation for the models compared (substantial memory). I feel that the primary contribution is really the dataset itself, presuming it is shared. In the rebuttal, the authors state that the focus is the unique and valuable dataset, but I still don’t see explicit assurance that it will be shared. (Also regarding focus: the title of the paper suggests the method, not the dataset, is the focus.) Given the potential dataset contribution and enthusiasm by most reviewers, I recommend accept.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The work intends to address an important and challenging clinical problem by building a new dataset for the purpose. The rebuttal is satisfactory and has addressed most of the concerns. Though the technical novelty is limited, as the authors pointed out, the contribution of the work focuses on the introduction of the dataset to the community and will bring benefits to future research on the problem. I think that the work would be of interest to the community.


