
Authors

Asif Hanif, Muzammal Naseer, Salman Khan, Mubarak Shah, Fahad Shahbaz Khan

Abstract

It is imperative to ensure the robustness of deep learning models in critical applications such as healthcare. While recent advances in deep learning have improved the performance of volumetric medical image segmentation models, these models cannot be deployed for real-world applications immediately due to their vulnerability to adversarial attacks. We present a 3D frequency domain adversarial attack for volumetric medical image segmentation models and demonstrate its advantages over conventional input or voxel domain attacks. Using our proposed attack, we introduce a novel frequency domain adversarial training approach for optimizing a robust model against voxel and frequency domain attacks. Moreover, we propose a frequency consistency loss to regulate our frequency domain adversarial training, which achieves a better tradeoff between the model's performance on clean and adversarial samples.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43895-0_43

SharedIt: https://rdcu.be/dnwyW

Link to the code repository

https://github.com/asif-hanif/vafa

Link to the dataset(s)

https://www.synapse.org/#!Synapse:syn3193805/wiki/217789

https://www.creatis.insa-lyon.fr/Challenge/acdc/databases.html


Reviews

Review #5

  • Please describe the contribution of the paper

    This paper proposes a method for optimizing deep learning models for medical image segmentation that is resilient to adversarial attacks. The method uses a 3D frequency domain attack tailored to medical imaging data, resulting in a higher success rate while maintaining similar perceptual similarity. The paper also introduces a frequency-domain adversarial training approach that improves the model’s robustness against voxel and frequency-based attacks.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The paper introduces a frequency-domain adversarial training method that enhances the robustness of the volumetric segmentation model against both voxel and frequency-domain based attacks. This training strategy is particularly important in medical image segmentation, where the accuracy and reliability of the model are crucial for clinical decision making.

    • The paper provides a thorough evaluation of the proposed method, including experiments on two different datasets and comparisons with other state-of-the-art methods. The results demonstrate that the proposed method outperforms other methods in terms of both accuracy and robustness against adversarial attacks.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The main issue is that the proposed adversarial attack method in the paper will not be publicly available, making it challenging for non-experts to replicate the method.

    • The datasets used are quite small in scale. The performance on larger datasets and different modalities should be discussed.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Code will not be provided. It may be hard for unfamiliar audiences to implement.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    The authors can discuss limitations or potential areas for improvement in their proposed method. It would be helpful to include some discussion of these issues in future work, especially on larger datasets and different modalities for the medical image segmentation task.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper brings interesting contributions, a frequency domain adversarial attack and a frequency consistency loss, with clear and concise writing.

  • Reviewer confidence

    Somewhat confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper proposes a min-max objective for adversarial training of volumetric medical image segmentation models in the frequency domain. The authors introduce the Volumetric Adversarial Frequency Attack (VAFA) in the maximization step, a frequency domain-based adversarial attack specially designed for 3D volumetric data. In the minimization step, they design volumetric adversarial frequency domain training (VAFT) based on a consistency loss to achieve robustness against adversarial attacks. Experiments are conducted on two public datasets and two segmentation models.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The completeness of their approach is high: both an attack and a defense against it have been proposed for MIS. This paper applies a min-max-based approach to the 3D medical image segmentation problem in the frequency domain. The authors show that their frequency domain-based adversarial attack achieves a higher fooling rate than voxel domain-based attacks, and that the 3D method is superior to the 2D method for MIS. The motivations of all their components are reasonable.
    2. The experiments are sufficient to demonstrate the effectiveness of each component of the proposed method: two public datasets, two widely used models, and ample ablation studies.
    3. The paper is well written and easy to follow.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Since the frequency domain is the main topic of this paper, it would be better for the authors to explain why performing attacks in the frequency domain achieves much better performance (a higher fooling rate) than in the voxel domain.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Although the datasets and the models used in experiments are public, the authors didn’t claim they will open source code in the future. However, I believe the technical details provided in the paper are enough for reproduction.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Please see the weakness section.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The completeness of the proposed approach is high, and the experiments are sufficient to demonstrate the method's effectiveness.

  • Reviewer confidence

    Somewhat confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    This paper presents a technique for developing a resilient volumetric medical segmentation model that can withstand adversarial attacks in the frequency domain using the 3D DCT. The method employs the Volumetric Adversarial Frequency Attack (VAFA), which applies the DCT to the original 3D image and modifies the resulting DCT coefficients to generate adversarial samples. This approach enhances the robustness of the segmentation model against adversarial attacks.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    In this paper, the authors propose an adversarial attack method for volumetric medical image segmentation using the DCT. The proposed method demonstrates the effectiveness of frequency-domain attacks for 3D medical image segmentation by showing the highest fooling rate compared to other voxel-based attacks. Additionally, this paper introduces the frequency consistency loss to ensure robustness for both adversarial attack in the frequency domain and clean images for segmentation.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Several studies have utilized DCT for adversarial attacks; however, it remains unclear whether the authors of this paper have reviewed such research. Furthermore, the paper may not provide a clear understanding of how the quantization table functions, and the explanation of how a limited quantization threshold prevents all DCT coefficients from becoming zero may be inadequate. Readers may also have concerns about the absence of epsilon in the 10th line of Algorithm 1, which updates the quantization table, and whether this is intentional. Moreover, readers may question whether using the L-infinity norm for limiting the quantization threshold and the L1 norm for the frequency consistency loss is the optimal approach.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Maybe. The paper provides sufficient details regarding the model architecture, hyperparameters, and training dataset to enable reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    This study is interesting research utilizing the DCT, and the authors have presented a new method in the field of Volumetric Medical Segmentation. The paper is well-written and well-structured. However, it is regrettable that the explanation of how quantization tables work is insufficient due to the constraint of the paper’s length. Moreover, it would have been better if the adversarial attack presented in this paper had been proven to be effective using a more diverse range of 3D datasets in medical fields such as VerSe, SKM-TEA, and ATLAS, among others. Furthermore, one correction needs to be made. On page 3 of the paper, line 9, “In the minimization step” should be changed to “In the maximization step” to be valid.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper advances an interesting topic through a novel method. Adversarial attacks seriously compromise the security of DNNs, which also has a significant impact on the field of medical image segmentation. This paper introduces a solution to these issues by modifying existing methods to perform adversarial attacks with the highest fooling rate and by introducing ways to prevent such attacks. Although the paper has some weak explanations that take some time to understand, considering its novelty and reproducibility, it is acceptable for publication.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    Overview: The paper introduces a technique to create a robust volumetric medical segmentation model resistant to adversarial attacks in the frequency domain using 3D-DCT. The approach incorporates Volumetric Adversarial Frequency Attack (VAFA) to generate adversarial samples by altering the DCT coefficients of the original 3D image. The method includes a min-max objective for adversarial training and introduces frequency consistency loss to ensure robustness.

    Strengths of the work: The proposed method demonstrates high effectiveness of frequency-domain attacks on 3D medical image segmentation, with the highest fooling rate compared to other voxel-based attacks. The incorporation of frequency consistency loss ensures robustness against adversarial attacks in the frequency domain and clean images for segmentation. The paper is well-written, well-structured, and offers a novel method in the field of volumetric medical segmentation.

    Weaknesses of the work: The paper doesn’t adequately explain how quantization tables work and why attacks in the frequency domain achieve better performance. There are concerns about missing details in the algorithm and the choice of norms for limiting quantization threshold and frequency consistency loss. The proposed adversarial attack method will not be publicly available, which hinders replication by non-experts. The work relies on a limited range of 3D datasets, and its performance on larger datasets and different modalities isn’t discussed.




Author Feedback

We are grateful to the reviewers for their time, valuable feedback, and insightful comments on our submission, which we highly appreciate.

Code: We will make our well-documented code and pre-trained model publicly available.

Reviewer # 2 We appreciate the suggestions. While we have reviewed works on frequency domain attacks (e.g., Long et al., ECCV 2022), they primarily focus on 2D natural images. In contrast, our study is the first to investigate and address the adversarial vulnerabilities of 3D medical segmentation models using frequency domain adversarial training. Our approach takes into account the 3D nature of the data in the attack and utilizes SSIM loss to maintain proximity between adversarial and clean samples. By employing adversarial training, our method enhances model robustness against different adversarial attacks, while the frequency consistency loss (Eq. 4) achieves a balance between model accuracy on both clean and adversarial samples.

Quantization Table Explanation: The DCT coefficients tensor “D(x)” undergoes element-wise division by a learnable quantization table “q”, resulting in “D(x)/q” (note: the shapes of D(x) and the quantization table “q” are the same). After the division, the values of “D(x)/q” are rounded using a differentiable rounding operation, which rounds some values down to zero. We then perform de-quantization, i.e. element-wise multiplication of the rounded values of “D(x)/q” by the same quantization table “q”. This step reconstructs the quantized DCT coefficients. The adversarial image is obtained by taking the inverse DCT of the quantized coefficients.

Quantization Threshold: During quantization, the DCT coefficients tensor “D(x)” undergoes element-wise division by the quantization table “q” (i.e. D(x)/q). Since the quantization table is in the denominator of the division, higher quantization table values increase the likelihood of more DCT coefficients being rounded down to zero. To control the number of DCT coefficients being set to zero, we constrain the values of the quantization table to a maximum threshold denoted “q_max”. By adjusting the magnitude of “q_max”, it becomes possible to regulate the extent of information drop in the quantization process. The strength of the adversarial attack is directly proportional to q_max (please see Table 2).
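As a rough illustration of the quantize/de-quantize pipeline and the q_max constraint described above, here is a minimal NumPy/SciPy sketch. All names are illustrative, and it uses plain (non-differentiable) rounding and a fixed table, whereas the paper uses a differentiable rounding operation and a learnable quantization table:

```python
# Illustrative sketch only -- not the authors' implementation.
import numpy as np
from scipy.fft import dctn, idctn

def quantize_dct(x, q, q_max=20.0):
    """Quantize the 3D DCT coefficients of volume x with table q."""
    q = np.clip(q, 1.0, q_max)            # l-infinity constraint on the table
    D = dctn(x, norm="ortho")             # D(x): 3D DCT coefficients
    D_quant = np.round(D / q) * q         # round then de-quantize; coefficients
                                          # with |D| < q/2 are zeroed out
    return idctn(D_quant, norm="ortho")   # adversarial volume via inverse DCT

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 8))        # toy volume
q = np.full(x.shape, 2.0)                 # same shape as D(x); larger values
x_adv = quantize_dct(x, q)                # drop more frequency information
```

Since the table sits in the denominator, raising its entries pushes more of D(x)/q below 0.5 in magnitude, so more coefficients round to zero, matching the information-drop behavior the authors describe.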

Missing Epsilon: Since the quantization table is supposed to contain integer values, we use only the sign of the loss gradient (+1 or -1) so that the updated values of the quantization table are also integers.
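A minimal sketch of such a sign-only update, assuming an integer-initialized table and the q_max clipping threshold from the paper (function and variable names are illustrative, not from the authors' code):

```python
# Illustrative sketch only -- not the authors' implementation.
import numpy as np

def update_q(q, grad, q_max=20):
    """Sign-only ascent step: q stays integer-valued and within [1, q_max]."""
    q = q + np.sign(grad).astype(q.dtype)  # each entry moves by +1, 0, or -1
    return np.clip(q, 1, q_max)

q = np.full((4, 4, 4), 10, dtype=np.int64)                 # integer table
grad = np.random.default_rng(1).standard_normal(q.shape)   # toy loss gradient
q_new = update_q(q, grad)
```

Because only the gradient's sign is used, no fractional step size (epsilon) is needed and the table never leaves the integer grid.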

L-infinity Norm: The L-infinity norm is used to limit the quantization threshold (q_max) because it constrains the perturbation within a specified maximum limit for each DCT coefficient independently. The L1 norm in Eq. 4 helps preserve sharp boundaries in the prediction on the adversarial sample during adversarial training.

Reviewer # 3 Please note that adversarial attacks aim to create samples that closely resemble clean samples. Pixel/voxel domain attacks typically limit the adversarial noise using the l-infinity norm, while our frequency domain attack (VAFA) generates samples without restricting pixel/voxel values yet maintains the visual similarity of adversarial samples to clean samples. Moreover, frequency domain attacks can selectively perturb frequency components that are perceptually less prominent to human observers but important for the model's predictions.

Reviewer # 5 We will make our code and pre-trained model publicly available. Regarding the future directions of our work, one interesting aspect is to explore the ability to selectively drop information in different frequency regions of the image spectrum, such as the “low,” “middle,” or “high” frequency regions. By examining the effectiveness of adversarial attacks in specific frequency ranges, one can gain insights into the impact of frequency-specific perturbations on the robustness of deep learning models.


