Authors

Haoyuan Huang, Baoer Liu, Lijuan Zhang, Yikai Xu, Wu Zhou

Abstract

Prediction of microvascular invasion (MVI) in hepatocellular carcinoma (HCC) has important clinical value for treatment decisions and prognosis. Diffusion-weighted imaging (DWI) intravoxel incoherent motion (IVIM) models have been used to predict MVI in HCC. However, fitting the parameters of the IVIM model with the typical nonlinear least squares method is computationally expensive, and its accuracy is degraded by noise. In addition, the ability of IVIM parameter values alone to characterize tumors is limited. To overcome these difficulties, we propose a novel transformer-based multi-task deep learning network that simultaneously performs IVIM model parameter fitting and MVI prediction. Specifically, we utilize the transformer’s powerful long-distance feature modeling ability to encode deep features of different tasks, and then generalize self-attention to cross-attention to match features that are beneficial to each task. In addition, inspired by the Compact Convolutional Transformer (CCT), we design the multi-task learning network based on CCT to enable the transformer to work on small medical image datasets. Experimental results on clinical HCC IVIM data show that the proposed transformer-based multi-task learning method outperforms current attention-based multi-task learning methods. Moreover, MVI prediction and IVIM model fitting based on multi-task learning both perform better than their single-task counterparts. Finally, IVIM model fitting improves the ability of IVIM to characterize MVI, providing an effective tool for clinical tumor characterization.
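
For readers unfamiliar with IVIM, the quantity being fitted can be illustrated with the standard bi-exponential signal model. The sketch below is the textbook form and is not taken from the paper's repository; the function name, the b-values, and the parameter values are illustrative assumptions.

```python
import math

def ivim_signal(b, s0, f, d, d_star):
    """Bi-exponential IVIM signal at diffusion weighting b (s/mm^2).

    s0: signal at b = 0; f: perfusion fraction in [0, 1];
    d: tissue diffusion coefficient; d_star: pseudo-diffusion coefficient.
    """
    return s0 * (f * math.exp(-b * d_star) + (1.0 - f) * math.exp(-b * d))

# Illustrative (assumed) values only, not taken from the paper.
b_values = [0, 50, 200, 800]
signals = [ivim_signal(b, s0=1.0, f=0.2, d=0.001, d_star=0.05) for b in b_values]
```

Fitting means recovering `f`, `d`, and `d_star` (and optionally `s0`) from signals measured at several b-values, which is the step the abstract describes as costly and noise-sensitive under nonlinear least squares.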

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16449-1_26

SharedIt: https://rdcu.be/cVRU6

Link to the code repository

https://github.com/Ksuriuri/TRMT

Link to the dataset(s)

N/A

Reviews

Review #1

Please describe the contribution of the paper

This paper focuses on prediction of MVI of HCC. The authors used a transformer approach in the network to obtain deep features of the images and, furthermore, they proposed a cross-attention block to improve the performance.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

Although this paper uses well-known techniques and algorithms, the authors assemble a robust and (from my point of view) innovative solution. They provide proper methodology and implementation details and compare their results with previous studies.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

Perhaps the most significant problem is the very small number of samples. 114 HCC cases is a really small dataset, and I cannot be sure how this proposal would perform on a dataset of adequate size for this kind of technique.
Please rate the clarity and organization of this paper

Excellent
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

No comments. Everything is clear
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

If possible, I strongly suggest increasing the number of samples to test the proposed method on a dataset of adequate size. Although cross-validation is really useful for this small dataset (together with removal of outliers or selection of relevant features), I am a bit concerned about possible overfitting or other problems related to this kind of very small dataset.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

6
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The proposed method, the implementation, and the compared results of this paper are good. My only concern is related to the very small dataset.
Number of papers in your stack

4
What is the ranking of this paper in your review stack?

2
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

Not Answered
[Post rebuttal] Please justify your decision

Not Answered

Review #2

Please describe the contribution of the paper

In this work, the authors propose a transformer-based multi-task learning method to perform (1) MVI prediction and (2) IVIM parameter fitting at the same time, leveraging the fact that these two tasks are related to each other. The method is compared to two other multi-task learning methods and to single-task methods for each of the two tasks.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The use of a transformer and multi-task learning to leverage information from two linked tasks is interesting. The comparison to other methods is extensive.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

This paper is an incremental improvement of the method described in [21]. The additional aspects are not clearly stated in the paper which makes it difficult to assess its novelty.

The paper is also not very well written and is not easy to read or follow. It would benefit from careful English proofreading.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Source code seems to be available.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
Some points in the method are not really clear:
- The authors claim that the Compact Convolutional Transformer (CCT) enables the transformer to work with small datasets of medical images, but this is not proven in any experiment.
- The IVIM model parameter fitting task is not clear: How is it self-supervised? The ground truth assessment is also not clear.
- What would the results be if the method from [21] were not modified to use ResNet-18?
- The training, evaluation, and testing data splits are not explained.
No results are shown in the paper. The results shown in the supplementary materials should appear in the paper. I know that space is limited but table 1 and 2 could be merged into 1 single table for example.

Minor points:
- Table 1 and 2: best results should be highlighted for better clarity.
- ref 22: typo: gd-eob-dtpa-enhanced??
- [19,21] are 2 studies so in intro: a recent study –> recent studies
- page 4: both two task-specific embeddings is passed to the task shared transformer block –> both task-specific embeddings are passed to the task shared transformer block
- What is ViT?
- Parameters in Eq. 1 are not introduced.
- p7 spacial => spatial
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

4
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Incremental work from [21].
Number of papers in your stack

4
What is the ranking of this paper in your review stack?

1
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

4
[Post rebuttal] Please justify your decision

The clarity of the method has not been addressed, and some unsupported claims are still present after the rebuttal. No effort has been made to include results such as IVIM parameter maps in the main paper.

Review #3

Please describe the contribution of the paper

This paper aims to simultaneously perform IVIM model parameter fitting and MVI prediction using a transformer-based multi-task learning method. The originality of the work is to combine the advantages of convolutional networks and transformers to jointly achieve IVIM model parameter fitting and MVI prediction, which allowed the authors to obtain better results than those obtained when performing a single task (fitting or prediction).
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Design a transformer-based multi-task learning model
- Perform simultaneous IVIM parameter model fitting and MVI prediction
- Demonstrate better results with multi-task strategy.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- The methodological motivation of the work is not appropriately justified
- No IVIM parameter maps have been shown.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

No elements on paper reproducibility could be found in the paper.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
This paper aims to simultaneously perform IVIM model parameter fitting and MVI prediction using a transformer-based multi-task learning method. The originality of the work is to combine the advantages of convolutional networks and transformers to jointly achieve IVIM model parameter fitting and MVI prediction, which allowed the authors to obtain better results than those obtained when performing a single task (fitting or prediction).

However, the methodological motivation of the work was not appropriately justified. The authors’ stated motivation is that “the parameter fitting of the IVIM model based on the typical nonlinear least squares method has a large amount of computation, and its accuracy is disturbed by noise.” But the paper did not deal with these two aspects. In addition, the authors claimed that their method performs better IVIM parameter fitting, yet they never show IVIM parameter maps anywhere in the paper, which leaves this claim unclear.

A few typos:
- “embeddings is passed”: are.
- Fig. 1, “Average polling”: pooling.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

6
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
- Design a transformer-based multi-task learning model
- Perform simultaneous IVIM parameter model fitting and MVI prediction
- Demonstrate better results with multi-task strategy
- No IVIM parameter maps have been shown, which are essential in IVIM applications.
Number of papers in your stack

5
What is the ranking of this paper in your review stack?

2
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

Not Answered
[Post rebuttal] Please justify your decision

Not Answered

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

The authors present a method to simultaneously estimate IVIM parameters and classify cancer data. The coupling of the tasks yields an improved performance compared to models for each task separately. The concept is interesting and somewhat novel in the context of IVIM prediction. The reviewers’ main comments refer to dataset size, clarity, and some claims that are not really supported by experiments. In my opinion, a fair comparison should include a classifier coupled with an auto-encoder from the images to assess the specific value of IVIM here. Further, the practical value of estimating the IVIM parameters rather than using another constraint on the classifier is not clear. Usually, IVIM biomarkers are used as a proxy to classify tumor types.
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

6

Author Feedback

Reviewer #1: (1) 114 HCC are a really small dataset and I cannot be sure of the performance of this proposal with a dataset of proper dimension for this kind of technique. Re: To our knowledge, there is currently no public dataset for IVIM-DWI data. Although our dataset is small, we have taken measures to ensure the reliability of the reported performance. First, we fully augmented the training data. Then, we loaded weights pretrained on ImageNet for ResNet-18 and used dropout to further avoid overfitting. In addition, we used four-fold cross-validation to reduce the impact of data partitioning on performance evaluation. We also showed the loss and ROC curves in the supplementary materials to demonstrate the reliability of the experiment. Finally, we provided the source code of the model so that readers can verify or apply our model more widely.
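
As an illustration of the cross-validation mentioned in this reply, a plain four-fold split over 114 cases could look like the sketch below. The exact partitioning used by the authors is not described on this page (Reviewer #2 notes the same), so the shuffling, the seed, and the function name are assumptions; the split is also not stratified by MVI label.

```python
import random

def k_fold_indices(n_samples, k=4, seed=0):
    """Return k (train, validation) index pairs; each sample is validated once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, val))
    return splits

# 114 matches the number of HCC cases cited in the reviews.
splits = k_fold_indices(114, k=4)
```

With 114 cases each validation fold holds 28 or 29 samples, which gives a sense of how few cases each reported metric rests on.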

Reviewer #2: (1) This paper is an incremental improvement of the method described in [21]. The additional aspects are not clearly stated in the paper which makes it difficult to assess its novelty. Re: Our method is different from [21]. [21] used an intra-task attention mechanism, while we propose an inter-task attention mechanism and model the long-range dependencies in features by introducing a transformer module, which increases the representation ability of the model and obtains richer information conducive to MVI prediction. In addition, our method provides more accurate IVIM parameters, which offer clinical interpretation of tumor perfusion and diffusion and reflect the microstructure information of tissues.

(2) The author claim that Compact Convolutional Transformer (CCT) enable the transformer to work with small dataset of medical images but this is not proven in any experiment. Re: CCT has been demonstrated to be effective on several small datasets in [5], and we think it is unnecessary to add additional experiments to verify this conclusion. In addition, the experimental results in Table 2 show that CCT works well on our dataset and brings some performance improvement. The loss curve in Fig. 3 of the supplementary material also shows that the CCT-based model converges stably on our dataset.
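
The inter-task attention described in the replies to Reviewer #2 amounts to dot-product attention in which the queries come from one task's features and the keys and values from the other's. The sketch below shows only that general mechanism in plain Python; it is not the authors' implementation, the learned projection matrices and multi-head structure are omitted, and the token values and variable names are hypothetical.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product attention: queries from task A, keys/values from task B.

    Each argument is a list of token vectors (lists of floats).
    """
    dim = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

# Hypothetical token features for the two tasks (2 tokens each, dimension 2).
mvi_tokens = [[1.0, 0.0], [0.0, 1.0]]
ivim_tokens = [[0.5, 0.5], [1.0, -1.0]]
mvi_enriched = cross_attention(mvi_tokens, ivim_tokens, ivim_tokens)
```

Passing the same token list as queries, keys, and values reduces this to ordinary self-attention, which is the sense in which the abstract speaks of generalizing self-attention to cross-attention.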

Reviewer #3: (1) Not appropriately justify the methodological motivation of the work. (The authors motivated “the parameter fitting of the IVIM model based on the typical nonlinear least squares method has a large amount of computation, and its accuracy is disturbed by noise.” But, the paper didn’t deal with these two aspects.) Re: The above is the motivation for fitting IVIM parameters using a deep learning network. The direct motivation of this work is given in the second paragraph of the introduction: recent research shows that IVIM parameters are related to MVI [11,20,22]. Therefore, we speculate that MVI prediction and IVIM parameter fitting may be two closely related tasks that can promote each other through multi-task learning. Similar to the typical multi-task architecture of classification and segmentation, we use classification and fitting to form a multi-task architecture.

Meta-Reviewer: (1) In my opinion a fair comparison should include a classifier coupled with an auto-encoder from the images to assess the specific value of IVIM here. Re: Since IVIM parameters are very sensitive to motion and artifacts in DWI, their performance in lesion classification is limited [21], so classification performance is not a suitable index for evaluating the specific value of IVIM parameters.

(2) The practical value of estimating the IVIM parameters rather than using another constraint on the classifier is not clear. Re: IVIM parameters include both perfusion and diffusion information and reflect the microstructure information of tissues. They play an important role in tumor identification, differentiation of benign from malignant lesions, and efficacy evaluation. In addition, IVIM parameters can be obtained through self-supervised learning without additional labels.
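
One common way to set up self-supervised IVIM fitting is to score predicted parameters by how well the IVIM model reconstructs the measured signal, so that no ground-truth parameter maps are needed. The sketch below assumes that formulation; this page does not confirm that it matches the authors' loss, and all names and numeric values are illustrative.

```python
import math

def ivim_signal(b, s0, f, d, d_star):
    """Standard bi-exponential IVIM signal model."""
    return s0 * (f * math.exp(-b * d_star) + (1.0 - f) * math.exp(-b * d))

def reconstruction_loss(measured, b_values, params):
    """Mean squared error between measured signals and the IVIM model
    evaluated at the predicted parameters (s0, f, d, d_star)."""
    s0, f, d, d_star = params
    predicted = [ivim_signal(b, s0, f, d, d_star) for b in b_values]
    return sum((m - p) ** 2 for m, p in zip(measured, predicted)) / len(measured)

# Synthetic check with assumed values: the generating parameters give zero loss.
b_values = [0, 50, 200, 800]
true_params = (1.0, 0.2, 0.001, 0.05)
measured = [ivim_signal(b, *true_params) for b in b_values]
loss_true = reconstruction_loss(measured, b_values, true_params)
loss_wrong = reconstruction_loss(measured, b_values, (1.0, 0.4, 0.002, 0.05))
```

In a network, `params` would be the model's output for a voxel and this loss would be minimized by gradient descent; the only supervision is the acquired DWI signal itself.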

Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The authors aimed to explain their contribution better in the rebuttal. While still borderline in terms of proper evaluation, the paper makes an interesting contribution that is relevant to the MICCAI community.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

12

Meta-review #2

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The use of the dot-product attention to exchange information from different tasks is interesting and somewhat novel. The results also show the merits of using CCT and the proposed framework. The authors should carefully proofread the paper. For example, “Cacinoma” in the title is a typo.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

4

Meta-review #3

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

Overall, the technical contribution is not large, but has some value in application.
After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

Accept
What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

6

Transformer Based Multi-task Deep Learning with Intravoxel Incoherent Motion Model Fitting for Microvascular Invasion Prediction of Hepatocellular Carcinoma