
Authors

Nina Weng, Martyna Plomecka, Manuel Kaufmann, Ard Kastrati, Roger Wattenhofer, Nicolas Langer

Abstract

Eye movements can reveal valuable insights into various aspects of human mental processes, physical well-being, and actions. Recently, several datasets have been made available that simultaneously record EEG activity and eye movements. This has triggered the development of various methods to predict gaze direction based on brain activity. However, most of these methods lack interpretability, which limits their technology acceptance. In this paper, we leverage a large dataset of simultaneously measured electroencephalography (EEG) and eye-tracking data, proposing an interpretable model for gaze estimation from EEG data. More specifically, we present a novel attention-based deep learning framework for EEG signal analysis, which allows the network to focus on the most relevant information in the signal and discard problematic channels. Additionally, we provide a comprehensive evaluation of the presented framework, demonstrating its superiority over current methods in terms of accuracy and robustness. Finally, the study presents visualizations that explain the results of the analysis and highlights the potential of attention mechanisms for improving the efficiency and effectiveness of EEG data analysis in a variety of applications.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43895-0_69

SharedIt: https://rdcu.be/dnwzB

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper presents an attention-based deep learning framework for EEG signal analysis, which allows the network to focus on the most relevant information in the signal and discard problematic channels.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Using a deep learning framework for EEG signal analysis.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The main part of the paper, the Attention-CNN, needs more explanation.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    According to the authors’ answers to the checklist as well as the information provided in the article, the reproducibility appears to be possible. The actual code is not included.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Some parts of the article need more explanation. Adding informative figures and diagrams, especially in the methodology section, can help a lot.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper contains innovative points in terms of applying a deep model for EEG signal processing and can be used for some medical applications.

  • Reviewer confidence

    Somewhat confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    Attention-CNN used to predict direction and position of eye movements based on EEG. Attention-CNN performs better than standard CNN.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Useful for accounting for the influence of eye movements on the EEG and also determining eye movements from the EEG. Will have general applicability in both scientific and clinical studies

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Difficult to determine how useful the method is from the metrics provided

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Details clear. Data and code not available, although data may become available via another paper on arXiv.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    See weaknesses above

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is clear. The method is somewhat novel. Application is novel. Having more interpretable performance metrics would be good.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper proposes an attention-based CNN for gaze estimation from EEG signals. The model combines convolutional blocks with squeeze-and-excitation blocks and a self-attention block, where the attention blocks are used to weigh EEG electrodes differently. Experiments on the public EEGEyeNet dataset suggest the effectiveness of incorporating attention blocks in the model. Interpretability analyses indicate that the attention blocks are (a) able to detect the electrical difference between left and right brain areas for long saccades and (b) less susceptible to noisy EEG electrodes.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. In general, this paper is clearly written and easy to understand.
    2. Quantitative results and qualitative interpretability analyses suggest the effectiveness of the attention blocks in the model.
    3. There are detailed ablations showing the importance of each attention component in the model (squeeze-and-excitation blocks and self-attention block).
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. There is no comparison to methods in existing studies, e.g., benchmark models in the EEGEyeNet paper.
    2. In general, the method descriptions lack intuition and justifications. For example, in Section 2.2, the authors present the attention-CNN architecture, but there is no justification of (a) why the squeeze-and-excitation (SE) block and/or self-attention (SA) block are used and (b) why SE blocks are repeated multiple times but the SA block is only used once. At the end of page 4, the authors explain the difference between the original transformer and the proposed model (softmax vs global average pooling and sigmoid), but no intuition is provided.
    3. Some important details about the dataset are missing. For example, how many EEG electrodes are there? What’s the sequence length of a 1-second EEG clip?
    4. No details about how hyperparameters were selected in Section 3.1 “Implementation details”.
    5. In Conclusion, the authors claim that “our proposed approach was less susceptible to noise”, which is a general statement. However, only 1 example was given in Figure 4 to support this claim.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Dataset is publicly available. No information about code release is provided.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. Please provide comparisons to existing methods, e.g., models in EEGEyeNet paper.
    2. In Methods, please provide more intuition and justifications behind the model architecture. For example, in Section 2.2, the authors can explain why SE and SA blocks could be useful and why SE blocks are inserted in each residual block but the SA block is only needed once after the residual blocks. At the end of page 4, the authors can explain why global average pooling + sigmoid activation are needed to obtain the attention weights.
    3. Please provide more details about the dataset. See my comments above.
    4. Please provide more details about hyperparameter tuning, if any.
    5. Was Figure 3 based on a single sample or all samples? Was it from the held-out test set? Please clarify. If it was from a single sample, it would be good to show more examples and see if the pattern is consistent across examples.
    6. In Figure 3, instead of thresholding the attention weights and showing the electrodes in yellow, it would be more informative to show the attention weights for the electrodes as well.
    7. Similarly for Figure 4, it would be great to see if the same pattern exists across multiple examples. Additional examples can be presented in the Supplement.
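
For readers unfamiliar with the squeeze-and-excitation (SE) block discussed in comment 2, the following is a minimal NumPy sketch of the generic mechanism: pool each channel to a single number, pass the result through a small bottleneck, and use sigmoid gates to rescale the channels. It is not the authors' implementation. The channel count, reduction ratio, random weights, and the function name `se_block` are all assumptions made for illustration.

```python
import numpy as np

def se_block(x, w1, w2):
    """Generic squeeze-and-excitation over the channel axis (illustrative only).

    x  : (channels, time) feature map
    w1 : (channels // r, channels) bottleneck weights (hypothetical)
    w2 : (channels, channels // r) expansion weights (hypothetical)
    """
    squeezed = x.mean(axis=1)                     # squeeze: global average pool over time
    hidden = np.maximum(w1 @ squeezed, 0.0)       # excitation, part 1: bottleneck + ReLU
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # excitation, part 2: sigmoid gates in (0, 1)
    return x * gates[:, None], gates              # rescale every channel by its gate

# Assumed toy sizes; the 500 time steps mirror a 1-second clip at 500 Hz.
rng = np.random.default_rng(0)
channels, time_steps, reduction = 16, 500, 4
x = rng.standard_normal((channels, time_steps))
w1 = rng.standard_normal((channels // reduction, channels)) * 0.1
w2 = rng.standard_normal((channels, channels // reduction)) * 0.1

y, gates = se_block(x, w1, w2)
```

Because the gates are computed independently per channel, the block is cheap enough to repeat inside each residual block, which is the kind of placement the comment asks the authors to justify.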
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper aims to address an important application, but there are several major weaknesses that need to be addressed (see my comments above).

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    The authors have addressed my major concerns in the rebuttal.



Review #4

  • Please describe the contribution of the paper

    The authors propose an interpretable deep learning-based method for two tasks, namely the position task and the direction task, using EEG signals. They construct the network by incorporating convolutional layers, a squeeze and excitation block, as well as self-attention layers, among others. Since the method draws inspiration from the transformer, it proves effective in providing interpretations for the corresponding regions of EEG signals in gaze estimation.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The authors aim to design a model for both the position task and the direction task utilizing the self-attention mechanism, similar to the transformer architecture. Consequently, the regression results can be interpreted by leveraging the self-attention block.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The authors’ claim of providing a comprehensive evaluation of their proposed framework, demonstrating its superiority over current methods in terms of accuracy and robustness, is not adequately supported by the experimental results. The scope of the experiments is very limited and insufficient. For instance, in Table 2, the performance comparison is only made within their own subset of the framework. While their full framework comprises CNN+SE+SA, the baselines considered are only CNN, CNN+SE, and CNN+SA. According to EEGEyeNet[16], current baselines for CNN-based methods include CNN, PyramidalCNN, EEGNet, InceptionTime, and Xception. Therefore, it is not logical to claim that the proposed framework has been verified to achieve superiority over current methods.

    Furthermore, the correlation between the interpretation of signal intensity and eye movement is not clearly established. Although the authors provide case studies for interpretation, the two cases in Figure 3 (d) and (e) alone are not convincing in delivering meaningful interpretations. Moreover, there is no description of medical interpretation. Instead of solely providing these two cases, the authors could have described the correlation between the amplitude of eye movement and specific electrode locations.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    While the hyperparameters are well-described in the “Implementation Details” section, the utilization of optional blocks, such as the SE block and SA block, is not clearly explained. The authors utilize the residual block four times in their model. However, it is not clear what specific conditions or criteria are used to determine when to incorporate the SE block and SA block within these residual blocks. Additionally, in Section 2.2, it is important to note that the “blocks in blue” depicted in Figure 2 include not only the SA block but also the SE block. Consequently, the diagram in Figure 2 can be somewhat confusing.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    It is necessary to compare the proposed method with more CNN-based networks, as mentioned in point 6 above.

    The authors introduce additional operations on M_att, unlike the transformer, where M_att is multiplied by V. A more detailed explanation is required to clarify the rationale behind performing additional average pooling and sigmoid operations.

    The interpretation of results based on only two cases is insufficient. To enhance the quality of the research and prepare for publication in a journal, the authors should provide more cases and conduct statistical analysis, such as exploring the correlation between the amplitude of saccades and EEG signals or the amplitude of saccades and attention.

    Minor suggestion: “Deep Learning” is not a proper noun, so it can be written simply as “deep learning.”

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    3

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper is not recommended for MICCAI 2023. As mentioned in the weaknesses above, the proposed framework in the paper does not demonstrate superiority over current methods, and the interpretation aspect is limited. Furthermore, there is a lack of description regarding medical interpretation as well.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The model combines convolutional blocks with squeeze-and-excitation blocks and a self-attention block, where the attention blocks are used to weigh EEG electrodes differently. Experiments on the public EEGEyeNet dataset suggest the effectiveness of incorporating attention blocks in the model. Interpretability analyses indicate that the attention blocks are (a) able to detect the electrical difference between left and right brain areas for long saccades and (b) less susceptible to noisy EEG electrodes. The paper’s strengths include clarity, organization, and innovation. However, weaknesses include inadequate explanations and lack of comparisons to existing methods. Please consider addressing the following points (among others) in the rebuttal: 1) Some parts of the article need more explanation, and informative figures and diagrams should be added, especially in the methodology section; 2) It is difficult to determine how useful the method is from the metrics provided; 3) Please provide more details about the dataset and about hyperparameter tuning, if any; 4) More interpretation of Fig. 3 and Fig. 4 is needed, and it would be more informative to show the attention weights for the electrodes.




Author Feedback

We sincerely appreciate the reviewers’ insightful comments. We are encouraged that they found our paper novel and applicable to both scientific and clinical studies. Based on the comments, we have revised the paper by adding detailed diagrams, broadening the metrics to include visual angles, and providing dataset details and the hyperparameter tuning process. Some of these revisions cannot be shown within the rebuttal format but will be ready for the final version. Below, we address specific comments:

  1. More explanations about the model design

     1.1 Justification of why the Squeeze-and-Excitation (SE) block is repeated while the Self-Attention (SA) block is not
     We place the attention block near the output end of the network, where deeper levels of semantic information can be accessed. This positioning does not lead to data leakage between channels, since the convolution layers operate only on the time dimension. SE blocks are repeated because they are placed inside the residual blocks, a placement that was among the best-performing configurations examined in the SE network paper (J. Hu et al., 2017).

     1.2 Intuition behind the difference between the original transformer and the proposed model
     While the SE block represents one type of channel attention, the SA block concentrates on spatial aspects, producing attention in the form of a matrix. To vectorize the matrix, we apply Global Average Pooling over the channel dimension and pass the result through a sigmoid function. We prefer sigmoid over softmax because of its performance in situations where all electrodes carry important information and should receive high weights. Unlike softmax, which could inadvertently even out the weights, the sigmoid function preserves electrode significance.

  2. More explanations on the effectiveness of the proposed model
     Our methods were evaluated both quantitatively, through error measurements, and intuitively, by visualizing electrode attention. We used Euclidean distance for the absolute position task and RMSE for the direction task, measured in pixels and radians, respectively. Table 2 reveals a 5-10% improvement in gaze prediction when incorporating attention blocks, highlighting the benefits of electrode-wise attention. The heatmaps in Figure 4-c, which visualize the electrode weights from the attention blocks, support these findings, showing shifting attention on the prefrontal area and the capability to bypass noisy electrodes. More sample visualizations will be presented in the supplemental materials.

  3. Quantification of explainability
     Quantifying explainability for signal-format data like EEG is challenging due to limited existing research. Nevertheless, we would like to provide another piece of evidence supporting the validity of our method's explainability. In the Direction Task test set, there are 272 samples with at least one noisy electrode. The attention block’s effectiveness is evidenced by smaller weights on these noisy electrodes compared to non-noisy ones. Results indicate that 42%/26% of noisy electrodes and 19%/3% of non-noisy ones had normalized attention weights below 0.05/0.01, suggesting the attention block’s ability to reduce the weights of abnormal electrodes.

  4. Details about dataset and hyperparameter (HP) tuning
     Our dataset includes data from 128 EEG electrodes, and each EEG clip is 500 samples long (a 1-second recording at a sampling rate of 500 Hz). HPs such as the number of convolutional layers, kernel size, and hidden feature length were selected from candidate sets based on validation performance. Other HPs, such as batch size, number of epochs, and learning rate, were selected based on the available GPU power and then carefully examined to prevent overfitting or underfitting.

  5. Comparison to EEGEyeNet benchmark
     We have run the benchmark models provided by EEGEyeNet, and their performance is approximately 10-15% worse than that of the proposed method. For brevity, the previous version of the manuscript only showed the comparison with CNN, as it has been shown to be the best of the models in the benchmark.
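
The sigmoid-versus-softmax argument in point 1.2 can be illustrated with a small NumPy sketch. It is not the authors' code. The 4x4 attention matrix, its values, and the choice of pooling axis are assumptions made for illustration; the rebuttal does not specify them.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical raw attention matrix (electrodes x electrodes).
# Columns 0-2 stand for informative electrodes, column 3 for a noisy one.
n_electrodes = 4
m_att = np.tile(np.array([3.0, 3.0, 3.0, -3.0]), (n_electrodes, 1))

# Global average pooling over one axis (assumed here to be axis 0)
# turns the matrix into one score per electrode.
pooled = m_att.mean(axis=0)

soft = softmax(pooled)  # weights compete and must sum to 1
sig = sigmoid(pooled)   # each weight is gated independently in (0, 1)

print("softmax:", np.round(soft, 3))
print("sigmoid:", np.round(sig, 3))
```

With these assumed scores, softmax gives each of the three informative electrodes only about one third of the total weight, because the weights must sum to one. Sigmoid keeps each of them above 0.95 while still pushing the noisy electrode below 0.05, which matches the stated motivation that several electrodes can be important at the same time.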




Post-rebuttal Meta-Reviews

Meta-review #1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The model combines convolutional blocks with squeeze-and-excitation blocks and a self-attention block, where the attention blocks are used to weigh EEG electrodes differently. Experiments on the public EEGEyeNet dataset suggest the effectiveness of incorporating attention blocks in the model. Interpretability analyses indicate that the attention blocks are (a) able to detect the electrical difference between left and right brain areas for long saccades and (b) less susceptible to noisy EEG electrodes. The paper’s strengths include clarity, organization, and innovation. The method is useful for accounting for the influence of eye movements on the EEG and also for determining eye movements from the EEG, and it will have general applicability in both scientific and clinical studies. However, weaknesses were also found. For example, there is no comparison to methods in existing studies, and many important details about the dataset are missing. After the rebuttal, most concerns are addressed, but the correlation between the interpretation of signal intensity and eye movement is not clearly established. Combining the comments of all the reviewers and my own, it is an interesting paper whose merits slightly outweigh its weaknesses.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors address several important aspects in their rebuttal, such as an extensive explanation of the architectural building blocks, and they provide a comparison to SOTA (10%-15% better performance). I think this addresses the major points raised by R4, who has not updated scores and was the most critical one before review. I think the paper has merit and is therefore a good candidate to be considered at MICCAI. It is pivotal that the authors include the mentioned points in the camera-ready version of the manuscript.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors have satisfactorily addressed some of the reviewers’ comments in the rebuttal, leading to reviewers upgrading their scores. I therefore recommend acceptance at this stage.


