Authors

Michal K. Grzeszczyk, Szymon Płotka, Beata Rebizant, Katarzyna Kosińska-Kaczyńska, Michał Lipa, Robert Brawura-Biskupski-Samaha, Przemysław Korzeniowski, Tomasz Trzciński, Arkadiusz Sitek

Abstract

Medical data analysis often combines both imaging and tabular data processing using machine learning algorithms. While previous studies have investigated the impact of attention mechanisms on deep learning models, few have explored integrating attention modules and tabular data. In this paper, we introduce TabAttention, a novel module that enhances the performance of Convolutional Neural Networks (CNNs) with an attention mechanism that is trained conditionally on tabular data. Specifically, we extend the Convolutional Block Attention Module to 3D by adding a Temporal Attention Module that uses multi-head self-attention to learn attention maps. Furthermore, we enhance all attention modules by integrating tabular data embeddings. Our approach is demonstrated on the fetal birth weight (FBW) estimation task, using 92 fetal abdominal ultrasound video scans and fetal biometry measurements. Our results indicate that TabAttention outperforms clinicians and existing methods that rely on tabular and/or imaging data for FBW prediction. This novel approach has the potential to improve computer-aided diagnosis in various clinical workflows where imaging and tabular data are combined. We provide a source code for integrating TabAttention in CNNs at https://github.com/SanoScience/Tab-Attention.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43990-2_33

SharedIt: https://rdcu.be/dnwLO

Link to the code repository

https://github.com/SanoScience/Tab-Attention

Link to the dataset(s)

N/A

Reviews

Review #2

Please describe the contribution of the paper

The authors propose a new deep learning method that introduces a module for attention learning using tabular data, extending the Convolutional Block Attention Module to the temporal dimension. The method is validated on the FBW estimation task.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The paper introduces a module for attention learning based on tabular data.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

There were no significant differences between the proposed method and some of the other methods on the FBW estimation task.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The authors provided code for integrating TabAttention in CNNs on GitHub. I think the article is reproducible.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

1) The results presented in tables should be shown as figures to provide an intuitive comparison. 2) The authors believe that their key advantage is that the method does not require any additional effort from clinicians. I hope you can explain this and compare it with BabyNet.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

There were no significant differences between the proposed method and some of the other methods on the FBW estimation task. But the proposed method is interesting.
Reviewer confidence

Somewhat confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

5
[Post rebuttal] Please justify your decision

Some comments have been addressed well.

Review #3

Please describe the contribution of the paper

In this work the authors present a method for fetal birth weight estimation from 92 fetal abdominal ultrasound video scans and tabular data (6 different fetal biometry measurements).

The authors propose to fuse the tabular data and US data using an attention pipeline consisting of a Channel and Spatial (CBAM) and a Temporal (TAM) Attention Module. The results show that the proposed method can match other baselines.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The combination of multiple data sources, namely tabular data and 2D US video, is very relevant and can be applied to many other applications.
- The clinical motivation is clear, as the weight of the fetus correlates directly with the risk of the birth and with survival.
- The discussion and conclusion section is great to read and contains many valuable insights and limitations, such as statistical significance with p values and the information overlap of tabular and imaging data.
- There are many details about the method, including formulas and figures.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- The authors see their work as an extension of CBAM, but there is only limited ablation and no comparison to CBAM (missing: 3D res + tabular, 3D res + CBAM…)
- As discussed in the limitations, the results are only marginally better than other methods and are not statistically significant.
- The dataset is small, with only 92 datapoints, and there is no statistically significant difference between the proposed method and the other baseline methods.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
- dataset is private
- code is publicly and anonymously available on GitHub
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
- many abbreviations, some not even used: machine learning (ML), ISUOG; Linear Regression is already pretty short
- in abstract: “enhance performance of any CNN”: this has not been shown for all CNNs
- the precision and recall should be especially high for the <3000g and >4000g classes, as these are our potential risk classes. It would also be helpful to show the results of the different bins.
- In Figure 1, integrate Mc, Ms, Mt and S’, S’’ …
- The intermediate temporal feature maps S’ come from the 3D conv. But the output of the TabAttention is also called S’, only in bold. I think this should be a different letter to make it clear that this is the output.
- Processing and reshaping the 6 tabular data points into the spatial dimension in the Spatial Attention Module sounds unintuitive to me, as the spatial dimension is much larger than the information in the 6 tabular scalars.
- In the Temporal Attention Module the tabular data is added to each step of the sequence and is therefore often duplicated. Maybe an extra feature at t=0 could contain the tabular information only once for the sequence, instead of adding this information to every T.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

4
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper targets the clinically relevant problem of fetal birth weight estimation, proposing a pipeline to integrate tabular data with 2D US video scans. The results show only very limited improvements, which are not very reliable due to the small dataset size. The method seems sound, but the integration of the tabular data into every step in the TAM creates a lot of duplication, and there might be a better way of processing this information.
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

4
[Post rebuttal] Please justify your decision

I thank the authors for their rebuttal, which helped clarify many points. As attention can attend to any step in a sequence, I still see no rationale for having the additional information in each step of the sequence. In summary, I think this work still comes with some loose ends, and both the dataset and the method could be improved further.

Review #5

Please describe the contribution of the paper

This work presents a method, TabAttention, to estimate fetal birth weight using both imaging and tabular data with DL. The method achieves similar results to the current state-of-the-art image and/or tabular-based approaches in estimating FBW.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Well written paper.
- Results are clearly communicated.
- Ablation studies and comparison to SOTA are convincing.
- Limitations are well outlined.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- The accuracy difference to SOTA is minimal.
- No study is done to see the effect of the different attention modules, e.g. plugging CAM, SAM or TAM in and out to see which one has the biggest effect.
- What computational resources/time are needed for the methods in Table 1? Is your method computationally more or less efficient in comparison to SOTA?
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The code is publicly available.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

See weaknesses.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

I like the paper overall. However, I would like to see the effect of the different modules.
Reviewer confidence

Somewhat confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

5
[Post rebuttal] Please justify your decision

My concerns are mostly addressed. That is why I am increasing my score.

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

This paper introduces TabAttention, a method that combines attention mechanisms with tabular data to enhance CNN performance in medical data analysis, specifically for fetal birth weight estimation. The reviews have mixed opinions, with some strengths and weaknesses identified.

The strengths of the paper include the integration of tabular and imaging data, its clinical relevance, and a clear analysis of results. However, weaknesses were also highlighted, such as limited improvements over baseline methods, the lack of comparison with the Convolutional Block Attention Module (CBAM), issues with integrating tabular data, and the absence of a discussion on computational efficiency.

While the paper addresses a relevant problem, it falls short in demonstrating significant improvements over existing methods and lacks thorough comparisons. The authors should address the following points in their rebuttal:

Clarify the significance of the proposed method compared to existing approaches. Reviewer 2 questioned the added value of the proposed method compared to other methods, while Reviewer 3 highlighted the lack of comparison with CBAM. The authors should provide a clear justification for the significance and novelty of their method, addressing how it extends CBAM and why it offers advantages in the context of fetal birth weight estimation.

Provide a more detailed interpretation of the results. Reviewer 2 and Reviewer 3 both mentioned the lack of significant differences between the proposed method and other baselines in terms of fetal birth weight estimation. The authors should provide a thorough analysis and discussion of the results, including the strengths and limitations of their approach, and explain why the marginal improvements observed are still valuable in the clinical context.

Address the concerns regarding the integration of tabular data in the attention modules. Reviewer 3 pointed out issues with the integration of tabular data in the Spatial and Temporal Attention Modules. The authors should clarify the rationale behind this integration, address the duplication of information, and explore potential alternative approaches for incorporating tabular data effectively.

Discuss the computational efficiency of the proposed method. Reviewer 5 raised a question about the computational resource requirements of the method compared to other state-of-the-art approaches. The authors should provide information on the computational efficiency of their method and compare it to the computational requirements of existing methods to highlight its advantages in terms of efficiency.

Author Feedback

We thank the Reviewers for their valuable feedback. Below are our clarifications and additional evaluations, which we will include in the camera-ready paper.

Lack of significant improvement over SOTA (R2,R3,R5,MR): We have now recalculated both MAE and MAPE and included the concatenation method, which reflects the superior performance of our approach with the combined use of imaging (I) and tabular data (T). Each entry lists MAE, MAPE:
- ResNet+concat in FC layer (I+T): 307, 9.0
- BabyNet (I): 294, 8.5
- ResNet (I): 289, 8.5
- Clinicians (T): 205, 5.9
- ResNet+DAFT (I+T): 182, 5.6
- BabyNet+DAFT (I+T): 172, 5.2
- TabAttention (I+T): 170, 5.0
We have to clarify the oversight in the original submission, where we reported results of two baselines (ResNet, BabyNet) with the DAFT module (therefore trained I+T). Our corrected findings also underscore the limitations of methods based exclusively on imaging data and that simple concatenation of tabular data does not improve performance. This highlights the need for developing more sophisticated methods. The value of our contribution lies in bridging the gap between attention learning and utilizing tabular data in CNNs, demonstrating the significant role of both imaging and tabular data in FBW estimation.
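For readers unfamiliar with the two error metrics quoted in this rebuttal, the following is a minimal sketch of how MAE and MAPE are conventionally computed. The function names and the example weights are illustrative assumptions and are not taken from the paper or its repository.

```python
def mae(y_true, y_pred):
    """Mean absolute error, in the unit of the targets (grams for birth weight)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, expressed in percent."""
    return 100.0 * sum(abs(t - p) / t for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical birth weights in grams (not data from the study).
true_fbw = [3000.0, 3500.0, 4000.0]
pred_fbw = [3150.0, 3400.0, 4200.0]

print(f"MAE:  {mae(true_fbw, pred_fbw):.1f} g")
print(f"MAPE: {mape(true_fbw, pred_fbw):.2f} %")
```

Because MAPE divides each error by the true weight, the same absolute error counts for more on small newborns than on large ones, which is one reason both metrics are usually reported together.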

Advantages in the context of FBW estimation (R2,MR): Our approach does not impose additional tasks on clinicians since it utilizes data that is already part of standard clinical care. Not only does it surpass the performance of clinicians and models relying solely on tabular or imaging data, but it also competes effectively with other methods, achieving the lowest prediction error.

The clarification of CBAM extension (MR): We extend CBAM into the temporal dimension via a novel Temporal Attention Module (TAM) and improve the performance of all modules through the integration of dynamically learned tabular embeddings within the attention learning process.

Limited ablation study (R3,R5,MR): We acknowledge the need for a more comprehensive evaluation. Each entry lists MAE, MAPE:
- ResNet(I): 289, 8.5
- +CBAM(I): 292, 8.4
- +TAM(I): 288, 8.4
- +CBAM(I+T): 271, 7.7
- +TAM(I+T): 180, 5.5
- TabAttention (I+T): 170, 5.0
Incorporating plain CBAM/TAM brings lower performance gains when they use imaging data only. Notably, all attention modules enhanced with tabular data are necessary, and their contributions may vary depending on the dataset used.

Potential information duplication in TAM and rationale behind SAM (R3,MR): Depending on the information in tab. data, different features may hold importance in various sections of the video. The tabular data in TAM is embedded into only one scalar per frame (Tx1x1x1) and our results suggest that TAM is particularly important in the US videos for enhancing the representation of frames with standard planes. In SAM, to create attention maps simultaneously from pooled features and tab. embeddings, we follow the intuition from TAM and methods that encode features into 2D space (e.g. SuperTML).
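The disagreement here concerns how the tabular information enters the temporal sequence. The sketch below only contrasts the shapes of the two options discussed: one tabular scalar attached to each of the T frames (as the rebuttal describes) versus a single extra token at t=0 (as Review #3 suggests). The function names, the use of concatenation, and the toy values are assumptions for illustration; this is not the TabAttention implementation.

```python
def per_frame_conditioning(frame_descriptors, tab_scalars):
    """Attach one tabular scalar to each frame descriptor: T x D -> T x (D + 1)."""
    if len(frame_descriptors) != len(tab_scalars):
        raise ValueError("need exactly one tabular scalar per frame")
    return [list(desc) + [s] for desc, s in zip(frame_descriptors, tab_scalars)]


def single_token_conditioning(frame_descriptors, tab_token):
    """Prepend one tabular token at t=0: T x D -> (T + 1) x D."""
    width = len(frame_descriptors[0])
    if len(tab_token) != width:
        raise ValueError("tabular token must match the frame descriptor width")
    return [list(tab_token)] + [list(desc) for desc in frame_descriptors]


# Hypothetical pooled descriptors for T=4 frames with D=3 features each.
frames = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9], [1.0, 1.1, 1.2]]

seq_a = per_frame_conditioning(frames, tab_scalars=[0.5, 0.1, 0.9, 0.3])
seq_b = single_token_conditioning(frames, tab_token=[0.2, 0.2, 0.2])

print(len(seq_a), len(seq_a[0]))  # sequence length and width for option A
print(len(seq_b), len(seq_b[0]))  # sequence length and width for option B
```

In the first option every frame can carry a different tabular-derived value, which matches the authors' argument that different parts of the video may matter depending on the tabular data. In the second, the information appears once and self-attention would have to route it to the relevant frames.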

Computational efficiency (R5,MR): While there is an overhead on trainable parameters due to tab. data embeddings the number of GFLOPs is comparable to other methods resulting in a similar inference time (<1s for all methods) and training time (~2.5h per fold); #parameters[M], GFLOPs: ResNet (33.2, 52.1), BabyNet (20.6, 50.5), ResNet + DAFT (33.2, 52.1), BabyNet + DAFT (20.6, 50.5), TabAttention (51.3, 52.4).
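To put the reported numbers in proportion, the relative overhead can be worked out directly from the figures above. The sketch takes ResNet as the reference model, which is an assumption made here for illustration; the helper name is hypothetical.

```python
# (#parameters [M], GFLOPs) as reported in the rebuttal.
models = {
    "ResNet": (33.2, 52.1),
    "BabyNet": (20.6, 50.5),
    "ResNet + DAFT": (33.2, 52.1),
    "BabyNet + DAFT": (20.6, 50.5),
    "TabAttention": (51.3, 52.4),
}


def relative_overhead(value, baseline):
    """Percentage increase of value over baseline."""
    return 100.0 * (value - baseline) / baseline


ref_params, ref_gflops = models["ResNet"]  # assumed reference model
tab_params, tab_gflops = models["TabAttention"]

param_overhead = relative_overhead(tab_params, ref_params)
gflops_overhead = relative_overhead(tab_gflops, ref_gflops)

print(f"parameters: +{param_overhead:.1f} %")
print(f"GFLOPs:     +{gflops_overhead:.2f} %")
```

Under that assumption the parameter count grows by roughly half while the GFLOPs change by well under one percent, which is consistent with the statement that inference time is similar across methods.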

Small and private dataset (R3): TabAttention already outperforms clinicians as shown in the paper and achieves the lowest prediction error. With a bigger dataset, we expect further performance gains. Upon acceptance, we plan to publish our dataset.

Performance on risk cases is critical (R3): Our method achieves the best performance gain for large newborns (>4000g) (MAE 182 vs 291 clinicians). It is robust to outliers as there is no detectable correlation between true FBW and prediction error (Pearson coeff. of -0.029). It matches or exceeds the performance of clinicians for other ranges: <3000g (213 vs 204), 3000-4000g (160 vs 189).
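The two analyses mentioned in this paragraph, MAE per weight range and the correlation between true weight and prediction error, can be sketched as follows. The rebuttal does not say whether the error is signed or absolute, so the use of absolute error is an assumption, as are the function names, the handling of the range boundaries, and the toy data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


def mae_by_range(y_true, y_pred, low=3000.0, high=4000.0):
    """MAE for the <low, low-high and >high weight ranges (grams)."""
    bins = {"<3000g": [], "3000-4000g": [], ">4000g": []}
    for t, p in zip(y_true, y_pred):
        if t < low:
            key = "<3000g"
        elif t <= high:
            key = "3000-4000g"
        else:
            key = ">4000g"
        bins[key].append(abs(t - p))
    # Skip empty ranges so that no division by zero occurs.
    return {k: sum(v) / len(v) for k, v in bins.items() if v}


# Hypothetical weights in grams (not data from the study).
true_fbw = [2800.0, 3200.0, 3600.0, 4200.0]
pred_fbw = [2900.0, 3050.0, 3700.0, 4350.0]
abs_err = [abs(t - p) for t, p in zip(true_fbw, pred_fbw)]

per_range = mae_by_range(true_fbw, pred_fbw)
r = pearson(true_fbw, abs_err)
print(per_range)
print(f"Pearson r between true FBW and absolute error: {r:.3f}")
```

A coefficient close to zero, as the authors report, would indicate that the error does not grow or shrink systematically with the true birth weight.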

Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The paper received mixed reviews. The reviewers agreed that the proposed method is interesting, but there remain some concerns about the limited improvement over state-of-the-art methods.

Meta-review #2

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

After rebuttal, only one reviewer (among three) weakly rejects this paper. Along with my reading of the paper and rebuttal, a decision of accept is recommended according to the overall quality of the paper. However, I encourage the authors to revise the paper per the reviewers’ suggestions (especially those of Reviewer 3) in the final version, which would help enhance the impact of the paper on the MICCAI society.

Meta-review #3

Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

The rebuttal does not fully address the concerns raised by reviewers.

TabAttention: Learning Attention Conditionally on Tabular Data