
Authors

Sutanu Bera, Prabir Kumar Biswas

Abstract

Deep neural networks have been extensively studied for denoising low-dose computed tomography (LDCT) images, but some challenges related to robustness and generalization still need to be addressed. It is known that CNN-based denoising methods perform optimally when all the training and testing images have the same noise variance, but this assumption does not hold in the case of LDCT denoising. As the variance of the CT noise varies depending on the tissue density of the scanned organ, CNNs fail to perform at their full capacity. To overcome this limitation, we propose a novel noise-conditioned feature modulation layer that scales the weight matrix values of a particular convolutional layer based on the noise level present in the input signal. This technique creates a neural network that is conditioned on the input image and can adapt to varying noise levels. Our experiments on two public benchmark datasets show that the proposed dynamic convolutional layer significantly improves the denoising performance of the baseline network, as well as its robustness and generalization to previously unseen noise levels.
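
The weight-scaling idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the per-input-channel scale vector, the function names `modulate_weights` and `conv2d_valid`, and the naive convolution loop are all assumptions made for illustration. The sketch only shows that scaling a kernel per input channel conditions the layer on an external signal.

```python
import numpy as np

def modulate_weights(weight, scale):
    """Scale a conv kernel per input channel (assumed form of modulation).

    weight: (c_out, c_in, k, k), scale: (c_in,)
    """
    return weight * scale[None, :, None, None]

def conv2d_valid(x, weight):
    """Naive 'valid' cross-correlation: (c_in, H, W) -> (c_out, H-k+1, W-k+1)."""
    c_out, c_in, k, _ = weight.shape
    h, w = x.shape[1] - k + 1, x.shape[2] - k + 1
    out = np.zeros((c_out, h, w))
    for i in range(h):
        for j in range(w):
            patch = x[:, i:i + k, j:j + k]  # (c_in, k, k)
            # Contract over input channels and both kernel axes.
            out[:, i, j] = np.tensordot(weight, patch, axes=3)
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 6, 6))         # hypothetical 3-channel feature map
weight = rng.normal(size=(4, 3, 3, 3))  # hypothetical 3x3 kernel, 4 output channels
scale = np.array([0.5, 1.0, 2.0])      # hypothetical noise-derived scales

modulated = conv2d_valid(x, modulate_weights(weight, scale))
```

Because convolution is linear, modulating the kernel this way is equivalent to rescaling the input channels, but it keeps the conditioning inside the layer's weights.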

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43999-5_9

SharedIt: https://rdcu.be/dnwjj

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper proposes a novel noise encoder for modulating network weights, aiming for a robust and generalizable LDCT denoising method. As described, the encoded noise vector further fine-tunes the network weights according to the noise information and guides the learning of the LDCT module accordingly. Experiments on two CT imaging datasets show that the method outperforms the baseline UNet model. Further ablations confirm the method’s generalizability.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    As described, the proposed noise encoder is novel and easy to implement, and the resulting performance improvement seems worth the additional parameters and FLOPs it introduces.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The method is only verified with the simple UNet backbone, which does not seem comparable with other SOTA methods. The main concern is whether the proposed noise encoder would bring a similar improvement to other methods.
    2. Beyond the direct motivation and the numerical verification, I wonder whether the noise encoder represents the real-world noise level. Real-world noise is very complex and important for imaging; how does each component of the encoded vector relate to the noise? And can the vector be tuned further when accurate prior noise knowledge is available?
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The proposed method seems easy to implement in any backbone.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    According to the weaknesses described above:

    1. It would be better to provide additional experiments with different backbones, to make sure that the proposed noise encoder achieves a similar improvement.
    2. Real-world noise needs to be considered, with an analysis of how each component of the encoded vector relates to the noise.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The method seems easy to implement and effective on the UNet backbone, but the comments above about the simple backbone and the limited analysis of the algorithm still leave me concerned about its broader application.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    This paper proposes a network with weight modulation and dynamic convolution layers to denoise LDCT images acquired from tissues of varying density.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The dynamic convolution layer with weight modulation is interesting. However, the experiments should be extended to verify that the network can effectively denoise other LDCT images acquired from different sites, for example by applying a network trained on a chest dataset to a brain dataset.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The contribution is not significant. The authors did not compare their network to any state-of-the-art learning-based method. Moreover, the results indicate that the improvement over a standard U-Net is very small.

  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    This work is reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. Figure 1 is difficult to understand. All notations (wl, sl, …) should be shown in the figure.
    2. All tables should be integrated into a single table.
    3. The authors should compare their network to state-of-the-art learning-based LDCT methods such as the following:

    [1] Z. Huang, J. Zhang, Y. Zhang and H. Shan, “DU-GAN: Generative Adversarial Networks With Dual-Domain U-Net-Based Discriminators for Low-Dose CT Denoising,” in IEEE Transactions on Instrumentation and Measurement, vol. 71, pp. 1-12, 2022, Art no. 4500512, doi: 10.1109/TIM.2021.3128703.

    [2] Yufei Tang, Qiang Du, Jiping Wang, Zhongyi Wu, Yunxiang Li, Ming Li, Xiaodong Yang, Jian Zheng, “CCN-CL: A content-noise complementary network with contrastive learning for low-dose computed tomography denoising,” Computers in Biology and Medicine, Volume 147, 2022, 105759. https://doi.org/10.1016/j.compbiomed.2022.105759.

    [3] Wang, D., Wu, Z., Yu, H. (2021). TED-Net: Convolution-Free T2T Vision Transformer-Based Encoder-Decoder Dilation Network for Low-Dose CT Denoising. In: Lian, C., Cao, X., Rekik, I., Xu, X., Yan, P. (eds) Machine Learning in Medical Imaging. MLMI 2021. Lecture Notes in Computer Science, vol 12966. Springer, Cham. https://doi.org/10.1007/978-3-030-87589-3_43

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    1. The contribution is not significant.
    2. The authors did not compare their network to any state-of-the-art learning-based method.
  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper proposes a denoising network based on dynamic convolution for low-dose CT image denoising. Experiments demonstrate that the denoising network using the proposed method outperforms the baseline in robustness and generalization.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper provides a noise-conditioned module to address the adverse effect of SNR variation in denoising tasks. The introduced noise-conditioned weight modulation improves the network’s generalization ability and robustness. Experimental results show that the denoising method outperforms the baselines.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The paper needs to clarify how it proposes to modulate the denoising network with a weighted vector.

    1. Why can the anatomy encoder, applied after a second-order Laplacian filter, serve as a modulating signal?
    2. How is the noise characteristic passed efficiently to the backbone network through the 2-layer MLP?
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    This paper uses public datasets and the baseline network is currently publicly available. The description of the method is relatively clear and the framework is easy to reproduce.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    CNN-based denoising networks fail to perform optimally in medical image denoising when the signal varies significantly because the data are collected from different parts of the human body, under different conditions, and with different devices. The paper uses a modulating signal derived from the input signal to condition the denoising convolutional layer. The proposed method is simple and easy to train and deploy. Experiments on public databases show that the proposed method outperforms the baseline network, especially on out-of-distribution data. However, as mentioned in the weaknesses, the paper needs to clarify:

    1. Why the anatomy encoder after a second-order Laplacian filter can serve as a modulating signal.
    2. How the noise characteristic is passed efficiently to the backbone network through the 2-layer MLP.

    Minor issues:

    3. Too many typos; two examples: a) Page 2, in the paragraph on weight modulation: “a D-dimensional embedding” ?D b) Page 8: It can be seen that M5 completely failed to remove noise from these images despite the fact the M5 was trained using the abdominal image. “the fact the”?
    4. It is hard to work out the meaning of M1, M2, …, M8 in Figures 2, 3 and 5, as the labels themselves are not descriptive.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper tries to solve a practical issue in a simple way, with good results, and it is relatively well written.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The paper proposes a denoising network for low-dose CT images. I agree with the reviews. The paper possesses some strengths; however, the reviewers raise some concerns. I recommend a rebuttal phase for the authors.




Author Feedback

We thank all reviewers for their constructive comments and for their appreciation of our novel and simple framework for robust and generalisable low-dose CT denoising. We will implement the necessary revisions to strengthen the paper.

Reply to R1: Our primary goal was to demonstrate the effectiveness of dynamic weight modulation in addressing variations in noise levels in low-dose CT scans. Instead of using a resource-intensive network, we chose a widely accepted and computationally efficient backbone. This decision enabled us to validate our concept without relying on extensive computational resources. However, we recognise the importance of validating our proposed method with other architectural designs. We are confident that our framework can handle more complex architectures and achieve superior performance compared to the specific architecture employed in our study. We will add these experiments in a future extension of the work.

The low-dose CT images utilised in our experiments accurately represent real low-dose noise, which is not simply additive but comprises complex signal-dependent noise. These noisy images are generated in accordance with the principles of CT physics and are commonly employed in the literature to validate low-dose CT denoising algorithms. We are confident that our noise encoder can effectively encode real noisy images, as it has successfully encoded noisy images from two extensive low-dose CT image datasets. Understanding the relationship between each component of the noise encoder’s output embedding and the noise itself is a fascinating prospect. However, the high dimensionality of the encoded embedding makes this analysis challenging. In our study, we explored this aspect by applying a t-SNE projection, which reduces the dimensionality of the embedding. Through this analysis, we observed that the embeddings cluster into distinct groups based on the noise level, providing valuable insights into the encoding process and noise characteristics. How to fine-tune this embedding so that it can be used as prior knowledge is an interesting question, which we plan to explore in an extension of the study.

Reply to R2: It would be inappropriate to directly compare the proposed method with other state-of-the-art approaches. In this study, our focus is not on introducing a novel architecture for low-dose CT denoising but rather on presenting a dynamic convolution layer designed to address variations in noise levels within low-dose CT images. This dynamic layer can be seamlessly integrated into any network architecture, enhancing the denoising performance specific to that architecture. Therefore, our aim is to demonstrate the effectiveness of this dynamic layer rather than to make direct comparisons with other existing methods.

Reply to R3: In our study, we employed the Laplacian filter to extract the high-frequency components of the image, primarily focusing on the noise. This extracted noise serves as the input for the noise encoding network. The convolutional layer within the noise encoder generates an output that can be regarded as the features or characteristics of the input noisy image. These features are then passed through a two-layer MLP (Multi-Layer Perceptron) to transform them into an embedding. This embedding is subsequently utilised as an input for the dynamic convolution layer, where it plays a role in modulating the weights of the network.
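
The pipeline described in the reply to R3 (Laplacian filtering, feature extraction, two-layer MLP, embedding) can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' code: the 4-neighbour Laplacian kernel, the two global statistics that stand in for the convolutional encoder, the ReLU activation, and the names `high_pass` and `noise_embedding` are all hypothetical choices.

```python
import numpy as np

# Standard 4-neighbour discrete Laplacian (assumed; the exact kernel is not given).
LAPLACIAN = np.array([[0.0, 1.0, 0.0],
                      [1.0, -4.0, 1.0],
                      [0.0, 1.0, 0.0]])

def high_pass(image):
    """Apply the 3x3 Laplacian to a 2-D image, keeping only the 'valid' region."""
    h, w = image.shape
    out = np.zeros((h - 2, w - 2))
    for di in range(3):
        for dj in range(3):
            out += LAPLACIAN[di, dj] * image[di:di + h - 2, dj:dj + w - 2]
    return out

def noise_embedding(image, w1, b1, w2, b2):
    """Laplacian residual -> summary features -> two-layer MLP -> embedding."""
    residual = high_pass(image)
    # Stand-in for the convolutional noise encoder: two global statistics.
    features = np.array([residual.std(), np.abs(residual).mean()])
    hidden = np.maximum(w1 @ features + b1, 0.0)  # ReLU (assumed activation)
    return w2 @ hidden + b2

rng = np.random.default_rng(0)
w1, b1 = rng.normal(size=(4, 2)), np.zeros(4)  # hypothetical MLP parameters
w2, b2 = rng.normal(size=(3, 4)), np.zeros(3)

ramp = np.tile(np.arange(8.0), (8, 1))                # smooth, noise-free image
noisy = ramp + rng.normal(scale=1.0, size=ramp.shape)  # same image plus noise
embedding = noise_embedding(noisy, w1, b1, w2, b2)
```

The Laplacian cancels constant and linearly varying regions, so the residual of the smooth ramp is zero while the noisy version leaves a non-zero residual. That property is what makes a high-pass residual a plausible input for a noise encoder.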


