
Authors

Peng Yang, Yuchen Zhang, Haijun Lei, Yueyan Bian, Qi Yang, Baiying Lei

Abstract

In treating acute ischemic stroke (AIS), determining the time since stroke onset (TSS) is crucial. Computed tomography perfusion (CTP) is vital for determining TSS by providing sufficient cerebral blood flow information. However, the CTP has small samples and high dimensions. In addition, the CTP is multi-map data, which has heterogeneity and complementarity. To address these issues, this paper demonstrates a classification model using CTP to classify the TSS of AIS patients. Firstly, we use dynamic convolution to improve model representation without increasing network complexity. Secondly, we use multi-scale feature fusion to fuse the local correlation of low-order features and use a transformer to fuse the global correlation of higher-order features. Finally, multi-head pooling attention is used to learn the feature information further and obtain as much important information as possible. We use a five-fold cross-validation strategy to verify the effectiveness of our method on the private dataset from a local hospital. The experimental results show that our proposed method achieves at least 5% higher accuracy than other methods in TTS classification task.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43904-9_54

SharedIt: https://rdcu.be/dnwH0

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    The authors propose a DL architecture to predict TSS from CTP. Several techniques are involved: dynamic convolution replaces the regular convolution layers; feature fusion in the first 3 stages achieves multi-scale fusion of the perfusion maps, with self-attention in the last 2 stages; and multi-head pooling attention is added at the end.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    To the best of my knowledge, this is the first work to use CT perfusion alone to predict TSS with DL. CT imaging for LVO and TSS is key to EVT triage. Some thought has been put into designing a network tailored to the small sample size and high dimensionality of CTP. Very thorough literature review.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The clinical usage is limited. It is rare to have CTP available within the EVT treatment window in many first-line primary stroke centers. NCCT/CTA is instead the common protocol that needs to be investigated for TSS.
    2. Multi-scale feature fusion for multi-sequence modeling is quite common. What is the point of using self-attention for just 4 feature maps? Given the computational overhead of self-attention, the authors should probably provide a comparison against a simple learnable weighting of each feature map to see if self-attention is necessary.
    3. What is your original data information? Basic acquisition info and original resolution, voxel size, and slice thickness should be presented.
    4. Given that your input is 256x256x32, is the backbone a 3D or 2D conv model? I can’t find this in the description. Also, what is your basic backbone? Is it based on ResNet or VGG?
    5. Directly comparing CT and MR performance in Table 4 does not make sense. Also, many of those methods use diffusion imaging only.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Code availability is checked. The data is not available, although I believe CTP data of similar or larger size can easily be found at many institutions to try the proposed method. Some parts of the method lack details; the code is needed to double-check them.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. Add a comparison with NCCT/CTA performance instead, and also consider fusion of NCCT/CTA + CTP.
    2. Better leverage Tmax, as it may contain far more useful information.
    3. Given the small sample size, 5-fold CV may not be enough; consider more exhaustive experimental approaches and transfer learning/SSL.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    1. The clinical relevance is very low, even for an application-oriented paper. The authors did not explain why using CTP alone is worthwhile during the urgent triage of AIS patients.
    2. Technically, although the authors made efforts to mitigate the small sample size, it still limits the value of the reported model performance, and a larger multi-cohort study is necessary.
  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    4

  • [Post rebuttal] Please justify your decision

    The rebuttal addressed only some of my minor concerns. The work itself has limited innovation, so I looked for more clinical impact. Based on my experience in AIS, using CTP alone for TSS prediction has very limited value in a real clinical setting, and the authors did not address this.



Review #2

  • Please describe the contribution of the paper

    In this paper, the authors propose an end-to-end deep learning architecture for acute ischemic stroke onset time classification from CTP-based perfusion maps. This architecture leverages the principles of dynamic convolution to reduce the network complexity, as well as multi-modal fusion to combine information from four different CTP maps, resulting in more accurate and robust classifications. The authors compare the proposed architecture with other variations of itself, as well as with state-of-the-art methods using different imaging modalities, showing superior performance based on AUC, accuracy, and sensitivity. The authors conclude that by using complementary information from CTP-based perfusion maps, their proposed approach could assist medical practitioners in better estimating the ischemic stroke onset time, enhancing the clinical decision-making process.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The authors of this study propose a novel and well-justified architecture for classifying ischemic stroke onset time. While previous works have also added attention-based modules to enhance the classification of the ischemic stroke onset time (dois: 10.1111/jon.13043; 10.1016/j.compmedimag.2021.101926), the proposed model takes a more sophisticated approach by employing diverse modules that effectively integrate local and global relationships across multi-scale feature representations. Notably, the proposed approach is applied to CTP imaging for the first time to the best of our knowledge, whereas most works focus on DWI or FLAIR images. The authors conducted a comprehensive evaluation of their proposed method against various state-of-the-art methods that use other imaging modalities, and performed an ablation study of their model by comparing it against multiple variations of itself. To ensure statistical significance, proper statistical analysis was performed on the results. Through this rigorous approach, the authors demonstrate the contribution of their proposed multi-modal fusion strategy for accurately classifying the time since stroke onset.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    One of the main weaknesses of the paper is that the authors evaluate their method by “comparing it with other approaches on the same dataset.” However, the 11 approaches chosen for comparison were not originally developed for ischemic stroke onset classification, which may have biased the results. More to this point, modifications made to the comparison approaches should be described at least briefly so that it can be assessed whether their relatively worse performance is due to the merit of the proposed method or rather to deficiencies in the modified comparison methods.

    Overall, several aspects of the clinical motivation, data collection, preprocessing, and experimental analysis are not sufficiently described. Specific examples are provided in box 9: constructive feedback. In addition, the authors did not acknowledge the limitations of their study nor discuss directions for future work.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors have provided a comprehensive description of the model, which includes mathematical formulations of the attention modules, as well as a clear graphical illustration. A clear description of the data and its preprocessing is warranted.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Regarding the introduction, it is unclear how the CTP-based perfusion maps are used in clinical practice to calculate the time since stroke onset. Are thresholding methods typically used for this purpose? More to this point, I think this paper lacks a justification for why CTP maps are assumed to be the best features for ischemic stroke onset time classification, especially compared to other imaging modalities (e.g., DWI, FLAIR) or even the CTP image sequences directly. It would be useful to discuss whether this is the first time that CTP maps have been used for ischemic stroke onset time classification or if previous studies have explored this modality.

    Moreover, the authors suggest in the introduction that “CNNs are not capable of effectively extracting features from high-dimensional CTP data”. However, some deep learning-based works have been successfully used to extract features from raw CTP in the past (dois: 10.1016/j.media.2019.101589, 10.1016/j.media.2022.102610).

    Regarding the methods section, it is not clear whether the analysis of the CTP maps is done in 2D or 3D and a description of the MLP is missing. The paper could also benefit from a more thorough explanation of the differences between normal and dynamic convolutions, as well as the benefits of using the latter besides reducing model complexity. To my understanding, dynamic convolutions are typically used when input features have different sizes or dimensions, so it is not entirely clear how they help in this particular case where the input images are the same size. Additionally, the authors state that dynamic convolutions “cannot increase the model complexity.” Thus, providing evidence that supports this statement is necessary, such as a table that shows the total number of parameters for each model in the ablation study, corresponding to those from Table 2.
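
    For readers unfamiliar with the term, the sketch below illustrates one common formulation of dynamic convolution, in which several same-shaped kernels are blended by input-dependent attention weights before a single convolution is applied. Whether the paper uses exactly this variant is an assumption; the toy 1D kernels, the `attention_logits` values (which a small attention branch would normally predict from the input), and all function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_kernels(kernels, attention_logits):
    """Blend K same-shaped kernels into one kernel using attention weights."""
    weights = softmax(attention_logits)
    size = len(kernels[0])
    return [sum(w * k[i] for w, k in zip(weights, kernels)) for i in range(size)]


def conv1d_valid(signal, kernel):
    """Plain sliding-window (cross-correlation) convolution without padding."""
    n = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(n))
        for i in range(len(signal) - n + 1)
    ]


# Hypothetical toy values: two candidate kernels and one input signal.
kernels = [[1.0, 0.0, -1.0], [0.25, 0.5, 0.25]]
signal = [0.0, 1.0, 2.0, 3.0, 4.0]

# Assumption: in a real layer these logits are predicted from the input itself.
attention_logits = [0.0, 0.0]

kernel = aggregate_kernels(kernels, attention_logits)
output = conv1d_valid(signal, kernel)
```

    In this formulation the layer stores all K candidate kernels but applies only one blended kernel per input, which is why a parameter-count table of the kind requested above would be informative.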

    Regarding the experiments section, some key details are missing or ambiguous. In particular, I would like to read about the perfusion analysis, including the selection of the AIF and deconvolution algorithm (if applicable). If CTP maps were approximated directly from tissue curves rather than the deconvolved residual curves, a justification would be warranted. A more detailed description of the data and its preprocessing steps is missing. Were all patients treated with mechanical thrombectomy? Furthermore, it is not clear how the authors handled the class imbalance during training and evaluation. For example, did they use any oversampling or weighting to address this issue?

    As mentioned in section 6: Main Weaknesses, the authors evaluate their method by “comparing it with other approaches on the same dataset.” However, the 11 approaches chosen for comparison were not originally developed for ischemic stroke onset classification. For future works, authors should consider replicating a state-of-the-art method (doi: 10.1111/jon.13043) and training it on their dataset for a fairer and more informative comparison.

    Moreover, the process by which the model hyperparameters were chosen is not clearly specified, which may concern some readers. Starting with such a low learning rate of 0.00001 is usually appropriate when fine-tuning a pre-trained model, but this is not the case. Furthermore, it is unclear whether the choice of 50 epochs was based on any analysis of the convergence behavior of the model. While 50 epochs may be sufficient to train some models, it is possible that the model has not fully converged at this point, especially if the model is complex. It would be useful for the authors to provide more information on how they determined that the model had converged after 50 epochs. An Early Stopping callback could be used in future experiments to monitor the validation loss and stop training when it no longer improves to ensure the model was not overfitting.
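
    The early-stopping suggestion above can be sketched without any framework. This is a minimal illustration, not the authors' training code: the `EarlyStopping` class, the `patience` and `min_delta` parameters, and the validation-loss values are all hypothetical.

```python
class EarlyStopping:
    """Stop training once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.best_epoch = -1
        self.bad_epochs = 0

    def step(self, epoch, val_loss):
        """Record one validation loss; return True when training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            # Improvement: remember it and reset the counter.
            self.best_loss = val_loss
            self.best_epoch = epoch
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Hypothetical validation-loss curve that improves and then degrades.
val_losses = [0.70, 0.62, 0.58, 0.57, 0.59, 0.60, 0.61, 0.63]

stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, loss in enumerate(val_losses):
    if stopper.step(epoch, loss):
        stopped_at = epoch
        break
```

    Reporting the epoch with the best validation loss per fold would also show whether a fixed budget of 50 epochs is too short or too long.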

    Using t-SNE for analyzing features does not provide new insights into the paper. Thus, it may be reasonable to consider removing this part from the paper. On the other hand, expanding the discussion on the chart analysis can help highlight the models’ strengths and weaknesses.

    In the future, it would be interesting to investigate the relevance of each perfusion map for the network’s decision-making process. Using XAI methods, such as saliency maps (doi: 10.48550/arXiv.1706.03825), the authors could potentially identify which specific parts of each perfusion map are most informative.

    Apart from that, there are a few formatting errors that should be addressed:

    • Introduction: the term “neural image” refers to images used to study the structure and function of the nervous system at the cellular and molecular levels and is perhaps not the intended word of choice. Would substitute with “neuroimaging”.
    • Moving Figure 2 and 3 earlier in the paper would be beneficial for the readers.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper shows novelty in its particular approach to fusing information from CTP-based perfusion maps for classifying time since stroke onset. The major weaknesses of the paper are mostly points of clarification related to the clinical motivation, data preprocessing, and the choice of comparison methods. Overall, the proposed model represents a promising contribution to the field and has the potential to advance the identification of acute ischemic stroke patients who are within the recommended treatment-window.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This work proposes a classification model for determining the time since stroke onset in acute ischemic stroke patients using computed tomography perfusion (CTP). The model uses dynamic convolution and multi-map fusion to address the small sample size and high dimensionality of CTP data.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. This paper presents a novel approach to TSS prediction with deep learning, which has the potential for high clinical impact by reducing the number of patients that are not appropriately treated due to insufficient information about the time of the stroke onset.
    2. The proposed neural architecture is carefully tailored for use in situations where only a small number of samples are available for model training. This regime is very common in research involving clinical data, so the proposed architecture could have applicability in many areas of research.
    3. The authors perform a careful ablation study to test each component of the architecture and to compare with SOTA methods. This is especially important since the paper is concerned with a private dataset for which other results are unavailable.
    4. The authors use a robust evaluation procedure.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The paper has many clarity issues throughout in the following categories: (a) typographic and grammatical errors (e.g. TTS vs TSS, repeat of “cerebral blood volume” in first intro paragraph), (b) confusing phraseology (e.g., “the network layers are too deep to cause over-fitting”), (c) insufficient explanations (e.g. in 2.2, last paragraph, the explanation of how certain feature maps are treated as sets of tokens needs more motivation), and (d) low-quality plots/tables (figure 2 is unreadable and figure 3 has no use, table 2 has single wrapped digits).

    While these issues are individually minor, they accumulate to make the paper very difficult to follow, harming reproducibility and detracting from the otherwise strong work.

    2. Some of the conclusions of the paper do not appear to be supported by the reported data given the measured dispersion in performance metrics. For example, in the abstract the authors claim that “the method achieves at least 5% higher accuracy than other methods…”. Looking at Table 1, the central value of accuracy (82.99%) for the proposed method is 5.43 percentage points above the runner-up (ResNet34 at 77.56% accuracy). However, the reported standard deviations are 4.14% and 4.57% for the proposed method and ResNet34, respectively (I am assuming the +/- values represent std. dev.; it is not stated as far as I can tell). No t-test or other statistical test is put forward to support the claim of “at least 5% greater”.
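
    The dispersion argument above can be made concrete with a rough check computed from the summary statistics alone. This is only a sketch under assumptions that the page does not confirm: that the +/- values are sample standard deviations over the five cross-validation folds (n = 5 per method), and that the two sets of fold accuracies can be treated as independent samples (Welch's t-test). The function name `welch_t_from_summary` is hypothetical.

```python
import math


def welch_t_from_summary(mean_a, std_a, n_a, mean_b, std_b, n_b):
    """Welch's t statistic and degrees of freedom from means, std devs, and sizes."""
    var_a = std_a ** 2 / n_a  # squared standard error of mean A
    var_b = std_b ** 2 / n_b  # squared standard error of mean B
    t_stat = (mean_a - mean_b) / math.sqrt(var_a + var_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    dof = (var_a + var_b) ** 2 / (
        var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1)
    )
    return t_stat, dof


# Accuracies quoted from Table 1 in the review; n = 5 folds is an assumption.
t_stat, dof = welch_t_from_summary(82.99, 4.14, 5, 77.56, 4.57, 5)
print(f"t = {t_stat:.2f}, dof = {dof:.1f}")
```

    Under these assumptions the statistic comes out at roughly 2 with about 8 degrees of freedom, which is below the two-sided 5% critical value of about 2.31 for 8 degrees of freedom. A paired test on the per-fold accuracies could give a different answer, so the per-fold results would be needed to settle the question.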
  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The paper contains sufficient details about the methods, but the lack of clarity significantly detracts from the reproducibility. The authors indicate in their reproducibility checklist that they will make source code available, but this is not indicated in the manuscript. The dataset is private.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    I recommend that the authors carefully edit the entire manuscript for readability. Many long sentences can be expanded into multiple sentences for greater clarity. For example, the difficult passage “for multi-map information fusion of low-order features, considering the small area of acute stroke focus, a multiscale…” can be unpacked as “In the first three stages, we fused low-order multimap features with a multiscale attention module. This compensates for the small area of acute stroke focus.”

    I recommend that the authors remove figures 2 and 3, which serve no real purpose in the narrative and use the space to give clearer explanations of their motivations and process.

    The following sections could benefit from expansion: (a) motivation and literature. Currently 1 very cramped paragraph is devoted to the former and 1.5 to the latter. In the motivation section, it would be helpful to talk about how an automated system such as proposed would integrate with clinical care. (b) In the literature section, it would be helpful to elaborate on the various categories of prior art and why their drawbacks motivate your work. Which aspects of the previous literature does your work draw upon? The 2nd literature paragraph transforms into a methods paragraph halfway through. The details mentioned are redundant with those given in the methods section. Better to comment on the high-level idea of the method in the literature section rather than stepping through the architecture. Specifically, help the reader understand what you mean about the importance of “low-order” and “high-order” features and why your architecture has the necessary characteristics relative to prior art. (c) As noted above, the last paragraph in 2.2 needs to be broken into 2 or maybe 3 paragraphs to explain the tokenization idea and why this is the right thing to do.

    In the abstract, last sentence, TTS should be TSS.

    “cerebral blood volume” is listed twice in the first intro paragraph. The first instance should be “cerebral blood flow”.

    The authors use “core” to refer to the convolution kernel. This is unconventional. Probably better to use “kernel”. Translation error?

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Though the paper has many structural issues, the underlying methodological innovation and rigorous experimental design merit acceptance.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The authors introduce an end-to-end deep learning architecture to predict time since stroke onset (TSS) using computed tomography perfusion (CTP). To address the small sample size and high dimensionality of CTP, the presented architecture utilizes principles of dynamic convolution, multi-modal multi-scale feature fusion, self-attention, etc. On a cohort of 200 AIS patients, the pipeline demonstrated high sensitivity/specificity (and other metrics) and high performance compared with different methods. Additional evaluation using ablation studies is documented. The paper is well-written and organized, and has a novel technical contribution and potential applicability in many areas of research. I align with all the reviewers on the work's contribution and the relatively complete analysis with sufficient experiments and discussions. A few points are missing that would enhance the work. The authors need to (1) clarify and justify some points (e.g., provide details on how the CTP maps are generated and a description of the MLP), the data preprocessing, and the choice of comparison methods (R1, R2, R3); (2) revise the paper's conclusion to align with the presented results; (3) expand results validation beyond 5-fold cross-validation (R1) and add other competing methods developed for ischemic stroke onset classification (R2); (4) add details on hyperparameter settings and optimization for reproducibility (R2). Please also pay attention to the other important comments to improve the readability of the work.




Author Feedback

We sincerely thank all reviewers for their valuable comments. We have revised the grammar and content issues raised one by one. Our responses to the major weaknesses follow.

(1) CTP maps (R1, R2, R3): CTP can dynamically reflect changes in brain tissue blood perfusion. It is a cutting-edge technology in the field of CT application, with important value in the early diagnosis of ischemic cerebrovascular diseases and in guiding thrombolytic therapy. Therefore, we study TSS classification on CTP. Continuous dynamic scanning of the area of interest is performed to obtain CT images and the time-density curve of each pixel in the selected layer. The M1 segment of the MCA is selected for the AIF curve. Mathematical modeling is then performed using perfusion map calculation methods to obtain the hemodynamic parameters and perfusion images. Since this part is not our research focus, we did not elaborate on it.

(2) Data preprocessing (R1, R2, R3): The number of slices in the CTP perfusion maps varies, so we uniformly resized the maps to 256×256×32 and applied Min-Max normalization to obtain our preprocessed data.

(3) Method details (R1, R2, R3): Our method is a 3D network. The preprocessed CTP is input directly into the network so that both intra-slice and inter-slice correlations are considered. MLP is a multi-layer perceptron. We use a simple three-layer MLP as our classification module to obtain the predicted labels.

(4) Comparison methods (R1, R2, R3): Our comparison methods are 3D methods, namely basic 3D networks and newer 3D networks available on GitHub. We have also found some SOTA TSS classification methods to compare against.

(5) Results validation should be expanded beyond the 5-fold cross validation (R1): We removed the t-SNE section and added an analysis of the effectiveness of the fusion method. We used a t-test to validate the effectiveness of our method; due to space constraints, p-value<0.05 is mentioned in the title of Table 1. Compared with five different feature fusion methods, the experimental results show that the fusion method we use has the best performance.

(6) Other competing methods developed for ischemic stroke onset classification should be added (R2): In Table 4, we have chosen relevant works for comparison. Due to different data types and the lack of open-source code, we did not replicate these methods on our dataset.

(6) Details on hyperparameters settings and optimization (R2, R3): Training setting: The parameters are optimized using the Adam optimizer, with the learning rate set to 0.00001. No pre-trained model is used during training. A fixed step decay learning-rate strategy is adopted, with the step size set to 15 and γ set to 0.8. The number of training iterations is 50. Dynamic convolution setting: The initial weights are randomly generated. Attention blocks are used to update the weights, and the batch is treated as a dimension variable for group convolution. Because the weights of the group convolution differ, the weights of the dynamic convolution also differ. Transformer fusion setting: We set the number of Transformer layers to 8, with each layer having 4 parallel attention heads. Multi-head pooling attention setting: We set the number of layers to 1, with 8 parallel attention heads.

(7) Limitations and prospects (R2): Our work has some limitations that were not mentioned in the paper. As the reviewer notes, CTP is not yet widely used in the diagnosis of AIS, which limits research on it. In the future, we will try fusing more commonly used data (e.g. NCCT/CTA, FLAIR, DWI) with CTP to achieve TSS diagnosis. We may also combine the segmentation of lesion areas with the classification of time windows to help doctors develop auxiliary plans faster.
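
The normalization and learning-rate settings described in the feedback can be illustrated with a short sketch. It is not the authors' code: it assumes that Min-Max normalization rescales each volume to [0, 1], that the "number of iterations" means 50 epochs, and that the step decay multiplies the rate by γ every 15 epochs. The function names and the toy voxel values are hypothetical, and the resize to 256×256×32 is omitted.

```python
def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant input: avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def step_decay_lr(epoch, base_lr=1e-5, step_size=15, gamma=0.8):
    """Learning rate after multiplying by gamma once every step_size epochs."""
    return base_lr * gamma ** (epoch // step_size)


# Hypothetical intensities standing in for one flattened perfusion map.
voxels = [12.0, 40.0, 96.0, 180.0]
normalized = min_max_normalize(voxels)

# Learning rate at every epoch, assuming 50 epochs in total.
schedule = [step_decay_lr(epoch) for epoch in range(50)]
for epoch in (0, 15, 30, 45):
    print(epoch, schedule[epoch])
```

Under these assumptions the rate is reduced three times over the run, at epochs 15, 30, and 45.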




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors have addressed the comments concisely. They need to add the results for the competing TSS methods and to comment in the discussion on the clinical value of TSS prediction in the clinical setting (point #7 in their rebuttal).



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Most of the major concerns (such as CTP maps, data preprocessing, methods details) have been well clarified.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Thanks for the rebuttal. I have checked the review comments and rebuttal, but some issues still remain: 1) some key components should be further justified, e.g., the use of self-attention for just 4 feature maps in multi-scale feature fusion; 2) the clinical usage needs to be further discussed; 3) the presentation of this paper should be substantially improved.


