Authors

Bryar Shareef, Min Xian, Aleksandar Vakanski, Haotian Wang

Abstract

Capturing global contextual information plays a critical role in breast ultrasound (BUS) image classification. Although convolutional neural networks (CNNs) have demonstrated reliable performance in tumor classification, they have inherent limitations for modeling global and long-range dependencies due to the localized nature of convolution operations. Vision Transformers have an improved capability of capturing global contextual information but may distort the local image patterns due to the tokenization operations. In this study, we proposed a hybrid multitask deep neural network called Hybrid-MT-ESTAN, designed to perform BUS tumor classification and segmentation using a hybrid architecture composed of CNNs and Swin Transformer components. The proposed approach was compared to nine BUS classification methods and evaluated using seven quantitative metrics on a dataset of 3,320 BUS images. The results indicate that Hybrid-MT-ESTAN achieved the highest accuracy, sensitivity, and F1 score of 82.7%, 86.4%, and 86.0%, respectively.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43901-8_33

SharedIt: https://rdcu.be/dnwDH

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    (1) The authors propose an architecture that takes advantage of both Swin Transformers and CNNs. (2) The authors devise an attention mechanism called Anatomy-Aware Attention, which improves the representation capability based on the anatomy of the breast. (3) It is a multi-task approach that can perform segmentation and classification.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    (1) This paper presents an AAA module and demonstrates that the module yields a good improvement. (2) The hybrid multitask CNN-Transformer network outperforms the state-of-the-art methods on publicly available datasets.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    (1) Poor writing. The authors should double-check their paper. For example, on page 4, row 8, the text reads “refer to ? for implementation”. In Fig. 1, what do the arrows and the dashed arrows mean? Why is a yellow arrow used? (2) Anonymity issue. On page 3, last row, the authors state that ‘MT-ESTAN [3] is a CNN-based multitask learning network developed by our team’. Isn’t that against the anonymity rule? (3) The authors claim that their method is multi-task and can perform segmentation. However, no quantitative or qualitative segmentation results are provided for evaluation.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    I suggest that the authors share their code and logs. Nevertheless, the reproducibility is satisfactory because the method is not complicated and is described in detail.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    (1) The authors should evaluate the segmentation performance. (2) The authors claim that a multi-task scheme can improve the generalization performance of the model. This should be further discussed in the experiments. (3) Why can multi-task learning help regularize the model and prevent overfitting? Explain in more detail. (4) Double-check the paper and improve the writing.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This method achieves good classification results and the method is rather novel to this area. However, this work is half done. The experiments should be further explored and the writing should be double-checked.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The authors propose a network architecture for the classification and segmentation of breast ultrasound (BUS) images, which combines a CNN and a Swin Transformer. It is largely based on their previous work (references [3] and [20] in the paper). Compared with their previous work, the contribution of this paper is a novel attention module called Anatomy-Aware Attention (AAA), which is essentially a Swin Transformer extended with a couple of layers (shown in Fig. 3) and which improves the performance of the whole model (presented in Table 3).

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The developed approach is compared to other models and is shown to perform better on almost all evaluation measures (except specificity) (Table 2).
    • The ablation study (Table 3) shows that the developed attention module is worthwhile and can improve performance.
    • The datasets are diverse (Table 1), which makes the results convincing.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The paper lacks technical quality. There are many undefined symbols and the part related to, in my opinion, the main novelty of the paper, the proposed attention module (Section 2.2), is unclear and poorly described (see below for details)
    • The paper contains many inconsistencies, typos, and errors (see below)
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    As the technical quality of the paper is low and there is no link to public code, the work would be hard to reproduce.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    The paper is well structured, but its technical quality is low. The main problem is that the main contribution, the AAA block, is not well described. Practically all equations (1-7) contain undefined or poorly defined symbols. Some of them (the f symbols in Eqs. 1-4) are relatively easy to guess or to find in the original paper [22], but technical details are lacking in the description of the new attention layers in Eqs. 5-7. In particular, it is unclear how exactly the AVG and MAX pooling layers work (their sizes) and how the layers were upsampled (i.e., how U works). There are again undefined symbols: \sigma is probably the sigmoid and \times is probably point-wise multiplication. Incidentally, I would not use the cross symbol, as it is often used for cross products, which makes no sense here.
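
    The notation point above can be made concrete. The following is only a sketch, assuming that \sigma denotes the logistic sigmoid and that the intended operation is an element-wise (Hadamard) product; the symbols A (an attention map) and F (a feature map) are placeholders for illustration and are not taken from the paper.

    ```latex
    \sigma(x) = \frac{1}{1 + e^{-x}},
    \qquad
    (A \odot F)_{c,i,j} = A_{c,i,j} \, F_{c,i,j}
    ```

    Writing \odot instead of \times keeps the element-wise product visually distinct from the cross product.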

    Other comments:

    • The description in Section 2.3 does not correspond to the block for the classification task in Fig. 1. It looks like it was copied from some of the previous papers but not modified appropriately.
    • There should be 562 instead of 62 in Table 1
    • I did not understand what is meant by “height shift (0.2)” and “width shift (0.2)”. What are the units?
    • I would replace “Fig. 1 shows the details of” with “Fig. 1 shows the schema of”
    • There is an undefined reference in the first paragraph of page 4.
    • The contributions in the list at the end of the Introduction should use the same grammatical form (the last two bullets are in the passive form, the first one is in the active form).
    • The text in Fig. 1 should be enlarged
    • There are some typos (e.g. i^th instead of i^{th}, “with with” instead of “with”)
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Even though the main contribution of the paper (AAA module) is not well described the results in Tables 2 and 3 look interesting. I believe the technical quality can be improved and therefore I recommend accepting the paper.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The article presents a multi-task learning model for the classification and segmentation of breast tumors using ultrasound imaging. The authors propose combining a CNN and a Swin Transformer to leverage the advantages of each model, namely long-range dependencies for Transformers and semantic structure for CNNs. The authors also propose a new attention block (Anatomy Aware Attention) for breast tumor classification. The learning model is trained and validated on four public datasets. The classification is compared to eight models from the literature, including both CNN and ViT approaches. The authors’ multi-task model improves accuracy, sensitivity, F-score, area under the curve, and false negative rate.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The study is well-designed with the use of a large dataset from various sources.

    The comparison of the classification is relevant with enough models from the literature to understand the advantage of the proposed method.

    The attention block proposed by the authors, which integrates and enhances anatomical information, is relevant and well-validated with an ablation study highlighting the addition of this block in breast cancer classification.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The multi-task aspect of the model is not utilized in the study, which raises questions about the usefulness of this method. There is no study conducted on the segmentation of the model, making it impossible to know the quality of the model on this topic.

    Additionally, it would have been interesting to highlight the advantages of the multi-task model over a single classification model.

    The study is based on two assumptions about the strengths and weaknesses of ViT and CNN, namely long-range evaluation for ViT and loss of spatial context for CNN. However, there is no citation to confirm or validate these assumptions.

    There is no review of the state of the art on multi-task models using CNN and ViT for classification and segmentation, which is problematic.

    Upon conducting a quick search, I found this article on the topic:

    TANG, Suigu, YU, Xiaoyuan, CHEANG, Chak Fong, et al. Transformer-based multi-task learning for classification and segmentation of gastrointestinal tract endoscopic images. Computers in Biology and Medicine, 2023, vol. 157, p. 106723.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The proposed method is validated on four public datasets, and the results are provided and can be reproduced. However, it would be beneficial to provide the code for better reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Thank you for your work

    Your method shows promising results for breast cancer classification, and the comparison is exhaustive. However, the fact that segmentation is entirely excluded from the results is a significant issue. This can be approached in two ways:

    • The introduction can be reformulated to highlight the utility of multi-tasking for classification, justifying why the study was only conducted on this topic.
    • A section on segmentation can be added to the results.

    Additionally, a small correction needs to be made on page 4, line 7: [20,?].

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Although the context and topic are awkwardly defined, namely the definition and highlighting of a multi-task model and the validation solely on classification without any information on the segmentation part, the study conducted on the classification part is well-done, and we can see the added value of the proposed method on this topic.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    6

  • [Post rebuttal] Please justify your decision

    Most of the major weaknesses have been addressed by the authors, namely the justification of a multitask model for classification and the justification of the strengths and weaknesses of ViT and CNN. Several publications have been added to support the study’s approach. From a technical standpoint, the study remains interesting, with a solid comparison of the results.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This work has received mixed ratings. Please prepare a rebuttal to address the reviewers’ concerns. These concerns include poor writing, an anonymity issue, no evaluation of segmentation, novelty, a lack of technical quality, and reformulating the introduction to highlight the use of multi-tasking for classification.




Author Feedback

We thank the reviewers for their effort and constructive comments.

  1. Poor writing (R1), typos and errors (R1, R2&R3) RSP: Two senior researchers have proofread the manuscript carefully, improved the writing, and fixed typos and errors.

  2. Technical quality and inconsistencies (R2) RSP: We have revised Section 2.2 to improve the technical quality and resolve the inconsistencies. For example, all notations in Eqs. (1-7), including f, f_^, and sigma, are defined and described; details about the pooling operations are added; and the details of the upsampling blocks are added to Section 2.3.

  3. The segmentation performance evaluation (R1 & R3) RSP: The proposed multitask network has tumor classification (primary) and segmentation (secondary) tasks. The secondary task was introduced to help improve the performance of the primary task. We have added the dice score coefficient (DSC) and Jaccard index (JI) to evaluate the segmentation, and results have been added to Tables 2 and 3, and discussed in Sections 3.4 and 3.5. The proposed network achieves an average DSC of 0.84 and JI of 0.76 on the test dataset.

  4. Why multitask learning and why it prevents overfitting, and ability to improve the generalizability in experiments (R1&R3) RSP: We have added one paragraph in the Introduction section to explain the details of using multitask learning, and discuss why it can regularize the model and prevent overfitting. The discussion of improvement of the generalization of multitask learning has also been added to Section 3.4. “When training a deep multitask learning model on multitasks simultaneously, the shared representations learned by the model can capture common features and patterns relevant to both tasks. By sharing features between tasks, the model can leverage learned representations from one task to benefit the other. For example, the features learned for tumor segmentation are valuable for tumor classification. Similarly, features learned for tumor classification assist in segmentation by providing clues about where tumor boundaries may lie. The sharing of features allows for the transfer and reuse of knowledge between tasks, which improves the model’s ability to generalize and perform well even on small datasets. It introduces inductive bias and reduces the overfitting of models by acting as a regularizer [29] …” (Pages 2-3)

  5. The description in Section 2.3 does not correspond to the block for the classification task in Fig. 1. (R2) RSP: This section was not described clearly, but it corresponds to the segmentation and classification branches in Fig. 1. We have revised it below. “The segmentation branch in Fig. 1 outputs dense mask predictions of BUS tumors. It consists of four blocks, Up Blocks 1-4, each with three convolutional layers and one upsampling layer (size (2, 2) and stride (2, 2)). The settings of the convolutional layers in the Up Blocks are adopted from MT-ESTAN [20]. In addition, the blocks receive four skip connections from the MT-ESTAN encoder, i.e., there is a skip connection from each MT-ESTAN block 1 to 4. The classification branch consists of three dense layers, a dropout layer (50%), and the final dense layer that predicts the tumor class into benign or malignant.”

  6. Citation to confirm the two assumptions of ViT and CNN. RSP: We cited the three following papers in the Introduction (paragraph 4, page 2) to support the two assumptions. [6] Ahmed et al., BTS-ST: Swin transformer network for segmentation and classification of multimodality breast cancer images, 2023. [7] Dosovitskiy et al., An image is worth 16x16 words: Transformers for image recognition at scale, 2020. [28] Tang et al., Transformer-based multitask learning for classification and segmentation of gastrointestinal tract endoscopic images, 2023.

  7. Reproducibility and code sharing (R1, R2&R3) RSP: It is our plan to share the code, pre-trained models, and datasets. We will share the website link after the double-blinded review process.
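
The two segmentation metrics named in response 3, the Dice score coefficient (DSC) and the Jaccard index (JI), can be sketched for a single pair of binary masks as follows. This is a minimal illustration and not the authors' evaluation code: the function name `dice_and_jaccard`, the nested-list mask format, and the convention of returning 1.0 when both masks are empty are all assumptions made here.

```python
def dice_and_jaccard(pred, target):
    """Return (DSC, JI) for two binary masks given as nested lists of 0/1.

    Hypothetical helper for illustration; not taken from the paper.
    """
    p = [v for row in pred for v in row]
    t = [v for row in target for v in row]
    if len(p) != len(t):
        raise ValueError("masks must contain the same number of pixels")

    intersection = sum(1 for a, b in zip(p, t) if a and b)
    pred_area = sum(1 for a in p if a)
    target_area = sum(1 for b in t if b)
    union = pred_area + target_area - intersection

    # Assumed convention: two empty masks count as a perfect match.
    if union == 0:
        return 1.0, 1.0

    dsc = 2.0 * intersection / (pred_area + target_area)
    ji = intersection / union
    return dsc, ji


# Toy masks: 3 predicted pixels, 3 ground-truth pixels, 2 in common.
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]
dsc, ji = dice_and_jaccard(pred_mask, true_mask)
print(f"DSC={dsc:.3f} JI={ji:.3f}")  # DSC=0.667 JI=0.500
```

For a single image the two scores are linked by JI = DSC / (2 - DSC). Averages taken over a test set, such as the 0.84 and 0.76 reported in the rebuttal, need not satisfy that identity exactly.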




Post-rebuttal Meta-Reviews

Meta-review #1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    R3 has upgraded the score from ‘5. weak accept’ to ‘6. accept’, since the rebuttal has addressed almost all of their concerns. The final ratings of this work are therefore ‘4. weak reject’, ‘5. weak accept’, and ‘6. accept’. After reading the rebuttal, I find it convincing, as it provides many details. I think this work can be accepted now.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The main weaknesses mentioned have largely been addressed in the rebuttal, such as the justification of a multitask model for classification, writing issues, segmentation performance evaluation, comparison between CNN and Transformer, and so on. I prefer to accept it.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The issues behind the weak reject include anonymity. It is too late to resolve this now; the authors should be more careful in future. The issues raised with the writing are quite vague, and the authors have taken steps to improve the writing overall. All questions raised appear to have been addressed, particularly those of the weak-reject reviewer. The other two reviewers recommend acceptance. I agree that if the paper is revised according to the rebuttal from the authors, this paper can be accepted.


