
Authors

Dong Yang, Ziyue Xu, Yufan He, Vishwesh Nath, Wenqi Li, Andriy Myronenko, Ali Hatamizadeh, Can Zhao, Holger R. Roth, Daguang Xu

Abstract

Neural Architecture Search (NAS) has been widely used for medical image segmentation by improving both model performance and computational efficiency. Recently, the Vision Transformer (ViT) model has achieved significant success in computer vision tasks. Leveraging these two innovations, we propose a novel NAS algorithm, DAST, to optimize neural network models with transformers for 3D medical image segmentation. The proposed algorithm is able to search the global structure and local operations of the architecture under a GPU memory consumption constraint. The resulting architectures reveal an effective relationship between convolution and transformer layers in segmentation models. Moreover, we validate the proposed algorithm on large-scale medical image segmentation data sets, showing its superior performance over the baselines. The model achieves state-of-the-art performance in the public challenge of kidney CT segmentation (KiTS’19). The implementation will be publicly available at [LINK].
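The memory-constrained search described in the abstract can be pictured with a minimal differentiable-NAS sketch. This is an illustrative assumption, not the authors' DAST implementation: the function `search_objective`, the hinge-style penalty, and the cost tables are hypothetical, showing only one common way a GPU-memory budget can enter a search objective.

```python
import math

# Illustrative sketch (NOT the paper's implementation): in differentiable
# NAS, each candidate operation in each layer gets an architecture weight;
# a GPU-memory constraint can be folded into the search objective as a
# penalty on the expected memory of the sampled architecture.

def softmax(xs):
    # Numerically stable softmax over raw architecture scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def search_objective(task_loss, arch_weights, op_memory_costs, budget, lam=1.0):
    """Penalized objective: task loss plus a hinge on expected memory over budget.

    arch_weights: raw (pre-softmax) scores per candidate op, per layer.
    op_memory_costs: memory cost of each candidate op, per layer (same shape).
    budget: memory budget; only the excess above it is penalized.
    lam: penalty strength (hypothetical hyperparameter).
    """
    expected_mem = 0.0
    for layer_w, layer_c in zip(arch_weights, op_memory_costs):
        probs = softmax(layer_w)
        expected_mem += sum(p * c for p, c in zip(probs, layer_c))
    # Hinge: no penalty while the expected memory stays within budget.
    return task_loss + lam * max(0.0, expected_mem - budget)
```

For example, with two layers, two candidate ops of cost 2 and 4 each, and uniform architecture scores, the expected memory is 6; with a budget of 5 the objective adds a penalty of 1 on top of the task loss.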

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43898-1_71

SharedIt: https://rdcu.be/dnwB5

Link to the code repository

N/A

Link to the dataset(s)

https://kits19.grand-challenge.org/

http://medicaldecathlon.com/


Reviews

Review #2

  • Please describe the contribution of the paper

    The paper proposes a NAS method named DAST.

    • DAST learns the relationship between convolutions and transformers within the search space of segmentation networks.
    • DAST optimizes the memory consumption of the searched architecture.
  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The paper is well-written and organized
    • The proposed architecture looks sound and reasonable with introducing transformer to the search space
    • The experiment is well-conducted with ablations on different memory constraints
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The paper lacks comparisons on other segmentation datasets, which leaves the contribution of DAST not thoroughly validated.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    I am uncertain about the reproducibility of the paper

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    The authors should address the weakness listed above.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The method and motivation are clear; if the authors can explain why they did not extend the experiments to other datasets, the paper should be accepted.

  • Reviewer confidence

    Somewhat confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #1

  • Please describe the contribution of the paper

    The authors propose a Neural Architecture Search-based approach to finding an optimal semantic segmentation network for cross-sectional images which includes transformers in its search space. The results on two publicly-available datasets demonstrate that it achieves very impressive performance, and the attention visualizations reveal interesting long-range dependencies being used during inference.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    With such impressive results in other domains, it has felt inevitable for some time that vision transformers would eventually come to supplant fully convolutional networks for semantic segmentation tasks, but thus far their performance has very often fallen short – especially in cross-sectional imaging, where nnU-Net still seems to dominate. This paper is the first I have seen that demonstrates the superiority of a transformer on a very popular cross-sectional segmentation benchmark.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The paper has few weaknesses, but its argument could be strengthened by applying the method to other challenge datasets. Two is a good start, but there are many others that would be good candidates.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The reproducibility of this paper is excellent. The authors have committed to release their code and conducted their experiments entirely on publicly available data, with the final results reported on the private test set of a challenge, which allows their method to be demonstrably superior on a public leaderboard.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    This is a very strong paper. I will be interested to see whether this approach also exceeds the state of the art on other cross-sectional segmentation tasks.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    8

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper uses a novel, transformer-based neural architecture search method to exceed the state of the art on a publicly-available benchmark with more than 2,000 submissions. The approach appears novel and the paper is well-written with fascinating visualizations demonstrating what the attention mechanism is singling out. The authors have also adhered to best practices in reporting and reproducibility.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    Unlike current approaches to neural architecture search (NAS) that focus on searching over convolutional deep learning components, the paper proposes a new approach that incorporates and learns the relationship between convolutions and transformers in the given search space for medical segmentation tasks. The NAS is performed on the pancreas dataset from the Medical Segmentation Decathlon, and the searched architectures are evaluated on the KiTS’19 dataset.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The main strengths of the paper are as follows:

    1. Overall, the paper is well-written and easy to follow.
    2. The authors state that they will release their code upon acceptance.
    3. They achieve state-of-the-art performance on the KiTS’19 challenge testing set.
    4. Good comparison between DiNTS and DAST on the pancreas dataset.
    5. Nice ablation on memory constraints and visualization of the attention mechanism.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The main weaknesses of the paper are as follows:

    1. The paper claims that existing NAS works focus on convolutional networks, yet several papers have developed transformer-based NAS; see [1] for a survey.
    2. Minor language problems: e.g., “Searching algorithms, [include] reinforcement…, have been proposed”; “[In contrary]”; etc.
    3. No ablation study on the additional MSA after the transformer block. What is the intuition, why does it work, and how would the model perform without it?
    4. It would be good to provide more results.
    5. Please see detailed comments for more.

    [1] Chitty-Venkata, K.T., Emani, M., Vishwanath, V. and Somani, A.K., 2022. Neural architecture search for transformers: A survey. IEEE Access, 10, pp.108374-108412.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Reproducible

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. Several other papers work on transformer-driven NAS. Can the authors include them in the literature review, distinguishing their contributions from those works? I understand the model structure is different, but those works are not compared against in the results section either.
    2. The paper reports only the DSC metric, which is understandably the most common metric in segmentation; however, reporting other metrics (HD95, recall, precision, etc.) would strengthen the performance comparison.
    3. The search was performed on the pancreas dataset and validated only on the KiTS’19 dataset. The DiNTS and DAST comparison could be elaborated and discussed in more detail.
    4. Fig. 2: Can you improve the caption to explain what DiNTS and DiNTS-160 (and similarly DAST and DAST-96) mean?
    5. The intuition behind using an additional MSA after the transformers is poorly explained, and no ablation on it is provided.
    6. The “Ablation on the attention mechanism” section discusses the attention inside the transformer blocks; is that right? If so, please clarify that part so it is not confused with the additional MSA.
    7. The Discussion section is weak. Given that little space is left for it, I suggest trimming the beginning of the introduction to make room; the first two paragraphs of the introduction are quite general, do not serve a strong purpose, and could be combined into a single paragraph.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The validation is limited (only the KiTS’19 dataset, with the pancreas dataset used for the architecture search). Also, the performance comparison is against standalone models rather than other NAS-based models. The literature review is limited, especially on transformer-based NAS. On the positive side, the proposed approach outperforms other methods on the KiTS’19 challenge testing set.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper proposes a NAS strategy to find an optimal semantic segmentation network. Instead of searching only over convolutional deep learning components, it explores the relationship between convolutions and transformers in the given search space for medical image segmentation tasks. The experimental results show that the searched architecture achieves SOTA performance. The paper is well written and easy to follow.

    I have several concerns about this paper. 1. Although the proposed method achieves SOTA, the results in Table 1 are not particularly impressive, as it surpasses previous methods by only 0.5–2%. 2. Validation on more datasets would make the proposed DAST more convincing.

    Overall, this paper offers some good insights, the method has been validated as effective on public datasets, and it is well written. I am inclined to conditionally accept it.




Author Feedback

N/A


