
Authors

Zhihao Chen, Qi Gao, Yi Zhang, Hongming Shan

Abstract

While various deep learning methods have been proposed for low-dose computed tomography (CT) denoising, most of them leverage the normal-dose CT images as the ground-truth to supervise the denoising process. These methods typically ignore the inherent correlation within a single CT image, especially the anatomical semantics of human tissues, and lack the interpretability on the denoising process. In this paper, we propose a novel Anatomy-aware Supervised CONtrastive learning framework, termed ASCON, which can explore the anatomical semantics for low-dose CT denoising while providing anatomical interpretability. The proposed ASCON consists of two novel designs: an efficient self-attention-based U-Net (ESAU-Net) and a multi-scale anatomical contrastive network (MAC-Net). First, to better capture global-local interactions and adapt to the high-resolution input, an efficient ESAU-Net is introduced by using a channel-wise self-attention mechanism. Second, MAC-Net incorporates a patch-wise non-contrastive module to capture inherent anatomical information and a pixel-wise contrastive module to maintain intrinsic anatomical consistency. Extensive experimental results on two public low-dose CT denoising datasets demonstrate superior performance of ASCON over state-of-the-art models. Remarkably, our ASCON provides anatomical interpretability for low-dose CT denoising for the first time. Source code is available at https://github.com/hao1635/ASCON.
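The abstract credits a channel-wise self-attention mechanism for letting the network adapt to high-resolution input. The general idea can be illustrated with a minimal NumPy sketch. This is not the authors' ESAU-Net layer: the function name, the plain matrix projections standing in for 1×1 convolutions, and the L2 normalisation are all assumptions made for illustration. The point it shows is that the attention map has shape C×C, so its size does not depend on the spatial resolution of the input.

```python
import numpy as np

def channel_self_attention(x, w_q, w_k, w_v):
    """Generic channel-wise self-attention (illustrative, not ESAU-Net itself).

    x   : feature map of shape (C, H, W)
    w_* : hypothetical (C, C) projection matrices standing in for 1x1 convs
    Returns the attended feature map (C, H, W) and the (C, C) attention map.
    """
    c, h, w = x.shape
    tokens = x.reshape(c, h * w)              # each channel is one token of length H*W
    q = w_q @ tokens
    k = w_k @ tokens
    v = w_v @ tokens
    # Assumed normalisation: unit-length rows keep the logits bounded.
    q = q / (np.linalg.norm(q, axis=1, keepdims=True) + 1e-8)
    k = k / (np.linalg.norm(k, axis=1, keepdims=True) + 1e-8)
    logits = q @ k.T                          # (C, C): independent of H and W
    logits = logits - logits.max(axis=1, keepdims=True)
    attn = np.exp(logits)
    attn = attn / attn.sum(axis=1, keepdims=True)   # row-wise softmax
    out = attn @ v                            # (C, H*W)
    return out.reshape(c, h, w), attn
```

Because the same C×C weights apply to any H×W, a layer of this kind can in principle be trained on patches and applied to full-size images, which is the property Review #1 highlights.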

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43999-5_34

SharedIt: https://rdcu.be/dnwwN

Link to the code repository

https://github.com/hao1635/ASCON

Link to the dataset(s)

N/A


Reviews

Review #2

  • Please describe the contribution of the paper

    This paper proposed an Anatomy-aware Supervised Contrastive Learning Framework to mine the inherent anatomical semantics of human tissues. Specifically, ASCON consists of two novel designs: an efficient self-attention-based Unet (ESAU-Net) to capture both local and global contexts and a multi-scale anatomical contrastive network (MAC-Net) to improve anatomical consistency at the pixel level. Experimental results on two datasets demonstrate the effectiveness of the proposed method.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The innovation of this work is strong. The authors analyze that the current denoising method causes over-smoothing because the level of noise in CT images varies depending on the type of tissues. This work proposed a multi-scale anatomical contrastive network (MAC-Net) that recognizes the anatomical semantics for effectively denoising diverse tissues and Fig. 3 provides anatomical interpretability for LDCT denoising for the first time.
    2. I find the paper to be well-written and straightforward. The methodological details are elaborated and the motivation is verified in the experimental results.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The Mayo datasets contain chest and abdomen data, but the authors only provide results for the abdomen.
    2. Why did the authors extract patches with a size of 256×256 from the Mayo-2020 dataset for testing instead of using the original images? Stitching the output patches back into the original image may produce boundary effects.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    This paper is reproducible with the provided implementation details. Making the source code publicly accessible is desirable.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. To enhance the clarity and comprehensibility of the motivation, it is suggested for the authors to present examples of noise levels of various tissues in CT images.
    2. The hyperparameters for MAC-Net sampling vary across different datasets. What is the basis for the selection? It is suggested that the authors add some description.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper presented a novel method for low-dose CT denoising with anatomical interpretability. This paper also provided sufficient methodological details and experimental evidence.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #1

  • Please describe the contribution of the paper

    This paper presents a novel anatomy-aware supervised contrastive learning framework (ASCON) for low-dose CT denoising. It has two novel designs: an efficient self-attention-based U-Net (ESAU-Net) and a multi-scale anatomical contrastive network (MAC-Net). The experimental results show that the proposed ASCON performs better than existing SOTA methods and offers anatomical interpretability. The ablation study also validates the effectiveness of each component.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The introduced ESAU-Net can efficiently leverage self-attention for low-dose CT denoising. Compared to conventional self-attention, the one in ESAU-Net can be trained with patches and tested on full-size images.
    2. The introduced MAC-Net can leverage contrastive and non-contrastive learning to extract anatomical information in a global-local manner, which is better than perceptual loss and does not need adversarial training.
    3. On the result side, the anatomical interpretability provided by the proposed method might be interesting for this low-level task.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Although the authors provide the pseudo algorithm, the entire training looks complicated. Some comments on pseudo algorithms and concise explanation about math equations could improve the readability.
    2. Although the proposed method can introduce semantic interpretability, its utility in clinical use is less clear.
  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The proposed method may be reproducible since the implementation details, network architecture in supplemental materials, and pseudo algorithm are provided.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. Some comments on pseudo algorithms and concise explanation about math equations could improve the readability.
    2. On the right side of Eq. (5), summation symbol was missing.
    3. Further discussion on semantic interpretability should be provided.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    7

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper offers two things: an efficient denoising model with self-attention, and an auxiliary loss (model) to measure semantic information. The semantic interpretability seems interesting. The results also demonstrate its effectiveness.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The paper proposes ASCON, a supervised deep-learning method for low-dose CT denoising. ASCON consists of two components: an attention-based U-Net that generates the denoised results, and a disentangled U-Net that processes both LDCT and NDCT images for contrastive learning. ASCON produced images with better quantitative measurements than other deep learning methods. This is an interesting paper incorporating contrastive learning for low-dose CT denoising.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The paper proposes to incorporate contrastive learning for low-dose CT denoising and achieve better results over other competing methods.
    2. It explores anatomical information in CT denoising.
    3. The disentangled contrastive component enforces similarities in both patch-level and pixel-level.
    4. The paper demonstrates anatomical semantic information in a CT denoising network for the first time.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. In Fig.2, the authors said Red-CNN produced blurred reconstruction. But, in terms of the smoothness of images, ASCON and Red-CNN produced visually similar results to me.
    2. As shown in Fig.2 and Table 2, it seems like contrastive loss only brings incremental improvements to the results. It would be better if the authors could demonstrate the effectiveness of the proposed contrastive learning framework in a different aspect.
    3. Will the additional MAC-Net significantly increase the testing time over other networks? Or is MAC-Net only used during training to calculate the contrastive loss?
  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Given the clear description of the algorithm, it should be easy to reproduce.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. Add statistical testing to quantitative results.
    2. It seems that the proposed method produced images with even lower noise levels than the normal-dose images. More clinically relevant evaluations will be necessary to assess the image quality and to make sure that the proposed method does not blur out important features.
    3. Compare the testing time of different methods.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    An interesting paper that incorporates contrastive learning for low-dose CT denoising and demonstrates anatomical semantic information for the first time in CT denoising. However, the improvements over other methods seem incremental. It would be better if the authors could show more cases to demonstrate the effectiveness of the contrastive framework.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper proposed a novel denoising framework for low-dose CT imaging. The anatomical priors are carefully introduced via the local/global contrastive learning process. Experimental results demonstrate its promising accuracy on public datasets. All reviewers agreed that this paper has strong novelty and a smooth presentation.




Author Feedback

We thank the reviewers and the meta-reviewer for the valuable comments, which help to improve the quality of this paper. We respond to your concerns as follows.

To R1 Q1.1 More details on pseudo algorithms and math equations for improved readability. Thanks. We will revise the final paper accordingly.

Q1.2 The utility of semantic interpretability. First, the semantic information can improve the interpretability of the deep-learning-based model, which makes it more clinically acceptable. Second, since the noise levels of different tissues vary, anatomical semantic information can guide the network to perform tissue-aware denoising. Third, due to the effective extraction of anatomical semantic features, our denoising network can be easily transferred to other medical imaging tasks, such as segmentation.

To R2 Q2.1 Only reported abdomen results on the Mayo datasets. Our method aims to explore anatomical semantic information of various tissues during denoising, while the diversity of tissue types in chest data is not as rich as in abdominal data.

Q2.2 Concerns on the testing on patches. Thanks for pointing out this issue. We’d like to clarify that for Mayo-2020, we only trained our model with a patch size of 256×256 and still used the full 512×512 images for testing. We will make this clear in the final version.
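The patch-based training setup described in Q2.2 can be sketched as a paired random crop. This is an illustrative sketch only: the function name `random_paired_patch` and the uniform sampling of crop positions are assumptions, not details taken from the paper or its repository. It shows the one constraint that matters for supervised denoising, namely that the low-dose and normal-dose patches are cut from the same location.

```python
import numpy as np

def random_paired_patch(ldct, ndct, patch_size, rng):
    """Crop the same random square window from an LDCT/NDCT image pair.

    ldct, ndct : 2-D arrays of identical shape (e.g. 512 x 512)
    patch_size : side length of the training patch (e.g. 256)
    rng        : numpy.random.Generator used to draw the crop position
    """
    if ldct.shape != ndct.shape:
        raise ValueError("LDCT and NDCT images must have the same shape")
    h, w = ldct.shape
    # Generator.integers excludes the upper bound, hence the +1.
    top = int(rng.integers(0, h - patch_size + 1))
    left = int(rng.integers(0, w - patch_size + 1))
    window = (slice(top, top + patch_size), slice(left, left + patch_size))
    return ldct[window], ndct[window]
```

At test time no cropping or stitching is involved under this setup: the full image is passed through the network in one forward pass, which avoids the boundary effects raised in Review #2.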

Q2.3 Examples of noise levels of various tissues in CT images. CT imaging is based on differences in the absorption of X-ray photons by various tissues. However, this can also result in varying levels of noise in different tissue regions of the reconstructed image. For example, in the LDCT image of Fig. 2, the standard deviation (SD) of an ROI in the liver is 63.23 HU, while the SD of an ROI in the muscle is 44.73 HU. We will make this clear in the final version.
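A per-tissue noise estimate of the kind quoted in Q2.3 can be obtained as the standard deviation of pixel values inside a homogeneous region of interest. The sketch below assumes a square ROI and the sample standard deviation (`ddof=1`); the rebuttal does not state the ROI shape, size, or estimator the authors used, and the function name is hypothetical.

```python
import numpy as np

def roi_std_hu(image_hu, row, col, size):
    """Sample standard deviation (in HU) of a square ROI.

    image_hu : 2-D array of CT values already converted to Hounsfield units
    row, col : top-left corner of the ROI
    size     : side length of the ROI in pixels
    """
    roi = image_hu[row:row + size, col:col + size]
    if roi.shape != (size, size):
        raise ValueError("ROI extends beyond the image")
    return float(roi.std(ddof=1))
```

The ROI should sit entirely inside one tissue; an ROI that straddles an edge would mix anatomical contrast into the noise estimate.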

Q2.4 Selection of the hyperparameters for MAC-Net sampling. We selected the hyperparameters empirically based on the selected patch size and followed the previous works we refer to.

To R3 Q3.1 ASCON and RED-CNN produced visually similar results in Fig. 2. As shown in Fig.2, ASCON and RED-CNN produced visually similar results in low-contrast areas after denoising. However, the results of RED-CNN blurred the edges between different tissues, such as the liver and blood vessels, while the results of ASCON smoothed the noise and maintained the sharp edges.

Q3.2 Incremental improvements of contrastive loss. Although the difference in quantitative results is not significant, it can be seen from Fig. 2 that our ASCON retains structural details and edges more clearly than the ESAU-Net trained without the two contrastive losses. In addition, as shown in Fig. 3, our method can provide anatomical semantic interpretability owing to the contrastive loss.

Q3.3 Statistical testing to quantitative results. Thank you for your constructive suggestion. We will report statistical testing in the final version.

Q3.4 More clinical-relevant evaluations. We selected a lesion area and computed the contrast-to-noise ratio (CNR) between the lesion area and its surrounding area: ASCON (0.742), RED-CNN (0.693), EDCNN (0.531), DU-GAN (0.523). Our method achieved the best CNR, which indicates better low-contrast detectability. This will be added to the final version.
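The rebuttal does not give the CNR formula that was used. One common definition divides the absolute difference between the mean lesion value and the mean background value by the standard deviation of the background; the sketch below assumes that definition, so its numbers are not guaranteed to match the values reported above. The function name is hypothetical.

```python
import numpy as np

def contrast_to_noise_ratio(lesion, background):
    """CNR = |mean(lesion) - mean(background)| / std(background).

    lesion, background : arrays of pixel values (any shape) sampled from the
    lesion ROI and from the surrounding tissue. Assumed definition; other
    variants pool the noise of both regions.
    """
    lesion = np.asarray(lesion, dtype=float)
    background = np.asarray(background, dtype=float)
    noise = background.std()
    if noise == 0:
        raise ValueError("background has zero variance; CNR is undefined")
    return float(abs(lesion.mean() - background.mean()) / noise)

# Toy values: lesion mean 60, background mean 40, background SD 2.
print(contrast_to_noise_ratio([60, 60, 60, 60], [38, 42, 38, 42]))  # prints 10.0
```

A higher CNR for the same lesion means the denoiser lowered background noise without flattening the lesion contrast, which is the link to low-contrast detectability made in the response.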

Q3.5 Testing time comparison of different methods. MAC-Net is only used during training and does not affect the testing time. We computed the testing time of three methods for denoising a single image: ASCON (0.046 s), RED-CNN (0.048 s), and WGAN-VGG (0.046 s). This will be added to the final version.
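A per-image testing time like the ones in Q3.5 is typically measured by averaging several forward passes after a short warm-up. The sketch below is a generic timing harness under that assumption; the rebuttal does not describe the authors' measurement protocol, and `denoise_fn` is a hypothetical placeholder for any callable that maps one image to its denoised output.

```python
import time

def mean_inference_time(denoise_fn, image, warmup=2, repeats=10):
    """Average wall-clock seconds per call of denoise_fn(image).

    warmup  : untimed calls that absorb one-off costs (caching, lazy init)
    repeats : timed calls averaged into the result
    """
    for _ in range(warmup):
        denoise_fn(image)
    start = time.perf_counter()
    for _ in range(repeats):
        denoise_fn(image)
    return (time.perf_counter() - start) / repeats
```

When the model runs on a GPU, kernels are launched asynchronously, so the device must be synchronised before each clock reading (for example with `torch.cuda.synchronize()` in PyTorch); otherwise the measurement only captures launch overhead.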


