Authors

Chenglong Ma, Zilong Li, Junping Zhang, Yi Zhang, Hongming Shan

Abstract

Sparse-view computed tomography (CT) is a promising solution for expediting the scanning process and mitigating radiation exposure to patients; the reconstructed images, however, contain severe streak artifacts, compromising subsequent screening and diagnosis. Recently, deep learning-based image post-processing methods along with their dual-domain counterparts have shown promising results. However, existing methods usually produce over-smoothed images with loss of details due to i) the difficulty in accurately modeling the artifact patterns in the image domain, and ii) the equal treatment of each pixel in the loss function. To address these issues, we concentrate on the image post-processing and propose a simple yet effective FREquency-band-awarE and SElf-guidED network, termed FreeSeed, which can effectively remove artifacts and recover missing details from the contaminated sparse-view CT images. Specifically, we first propose a frequency-band-aware artifact modeling network (FreeNet), which learns artifact-related frequency-band attention in the Fourier domain for better modeling the globally distributed streak artifact on the sparse-view CT images. We then introduce a self-guided artifact refinement network (SeedNet), which leverages the predicted artifact to assist FreeNet in continuing to refine the severely corrupted details. Extensive experiments demonstrate the superior performance of FreeSeed and its dual-domain counterpart over the state-of-the-art sparse-view CT reconstruction methods. Source code is made available at https://github.com/Masaaki-75/freeseed.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43999-5_24

SharedIt: https://rdcu.be/dnwwD

Link to the code repository

https://github.com/Masaaki-75/freeseed

Link to the dataset(s)

https://ctcicblog.mayo.edu/2016-low-dose-ct-grand-challenge/


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper tackles the reconstruction of sparse-view CT images by proposing a FREquency-band-awarE and SElf-guidED network, termed FreeSeed. The network contains a FreeNet, which uses FFCs embedded in a U-Net structure, together with a SeedNet, which performs artifact refinement. The authors evaluate the method on the 2016 NIH-AAPM Mayo Clinic Low Dose CT Grand Challenge and compare the results to other state-of-the-art networks for sparse-view CT reconstruction. Furthermore, the authors included an ablation study.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    I like the idea of using Fast Fourier Convolutions to tackle the present artifacts. The evaluation is also comprehensive and includes other state-of-the-art methods.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    I believe the motivation for using the image domain only is very weak, and it becomes almost absurd given that a dual-domain method including their network (included in the ablation study) outperforms the network that works purely in the image domain.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Based on the information provided in the paper, it seems that the authors have taken steps to ensure the reproducibility of their research (except for the DuDoFreeSeed network used in the ablation study).

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Could you please provide references to support the claims of the disadvantages of using sinogram data?

    In your discussion of image-domain methods, you mention that they may suffer from a performance bottleneck due to the complexity of modeling global artifacts in the image domain alone. Can you explain why this is not applicable to your method?

    In the section on contributions, you mention that your network improves performance across all views. Could you please clarify what you mean by “views”?

    In the overview of your method, you state that the SeedNet is involved only in the training phase. However, it is not clear to me how it works, as it is part of the network and contributes to detail recovery. Can you explain how it can improve the output when it is not applied during testing?

    Please refer to Figure 2 within Figure 1 so the reader can look up what the SeedNet is (or combine them into one figure).

    In the experiments, you state that 5,410 slices were randomly selected for training and the remaining 526 slices were used for testing, from a total of 10 anonymous patients. Could you confirm that the splitting was not completely random and that the complete data from one patient was used for testing? Random splitting of the 5,410 slices may go against common practice since adjacent slices of one patient could correlate with each other.

    In the ablation study, I noticed that the FreeSeed full version of the network without the bandpass attenuation maps (using normal convolutional layers instead of FFC) was missing. It would be helpful to include this version for comparison. Additionally, a comparison with the application of a normal bandpass filter in the frequency domain would be interesting.

    I am curious about why the authors proposed an image-domain network when the ablation study clearly shows that the dual-domain version leads to better results.

    To improve the clarity and repeatability of the paper, it would be beneficial to provide more detailed information regarding the DuDoFreeSeed network.

    Additionally, there is a typo in the paper where it says “third and third rows”, which could be corrected for better readability.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    While I like the idea, the motivation is pretty weak and the experiments show that the method should be changed to a dual-domain approach.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    This work proposes an image-based artifact removal framework for sparse-view CT images based on frequency-band-aware artifact modeling.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper is well organized and well written, the motivation is clear, and the results are convincing.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. For Eq. 4, why use the L2 loss? Would the L1 loss be better?
    2. For Eq. 5, how is the mask M calculated, i.e., what is the transformation T? What are the formats of M and T?
    3. For the ablation study: 1) What does the simple masked loss L mean, and what does 1+M mean? 2) In Tab. 2, why does the performance decrease from variant (3) to variant (4)? 3) Are there any visualization results for the ablation study to show the effectiveness of each step? 4) Does the algorithm for calculating the mask M affect the performance? Does the mask quality significantly affect the performance?
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The work can be reproduced from the paper if some additional details are provided.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Some details should be provided for better understanding; please see the weaknesses.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Please see the weaknesses.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    6

  • [Post rebuttal] Please justify your decision

    After reading the author feedback with more details provided, I choose to recommend accepting it.



Review #3

  • Please describe the contribution of the paper

    This paper proposes a simple yet effective method, FreeSeed, which can remove artifacts and recover missing details from contaminated sparse-view CT images.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The topic is interesting and practical. The motivation is significant.
    2. Fourier-domain features are used in CT reconstruction.
    3. An additional refinement network is proposed to refine the severely corrupted details.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    It does not seem plausible that the proposed method can extract the artifact features.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    I think it is reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Though the Fourier convolution block can be used to capture frequency-domain features, the extracted features contain not only the artifact features but also the original CT slice information. The proposed method cannot strip the background full-view images, which may make \hat{A} inconsistent with the hypothesized behavior shown in Fig. 1.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    3

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    as in 9

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The paper tackles the reconstruction of sparse-view CT images by proposing a FREquency-band-awarE and SElf-guidED network, termed FreeSeed.

    The motivation for using the image domain only is very weak, and it becomes almost absurd given that a dual-domain method including their network (included in the ablation study) outperforms the network that works purely in the image domain.

    1. For Eq. 4, why use the L2 loss? Would the L1 loss be better?
    2. For Eq. 5, how is the mask M calculated, i.e., what is the transformation T? What are the formats of M and T?
    3. For the ablation study: 1) What does the simple masked loss L mean, and what does 1+M mean? 2) In Tab. 2, why does the performance decrease from variant (3) to variant (4)? 3) Are there any visualization results for the ablation study to show the effectiveness of each step? 4) Does the algorithm for calculating the mask M affect the performance? Does the mask quality significantly affect the performance?




Author Feedback

We thank the reviewers for their thorough summaries and valuable feedback. Below, we respond to the reviewers in order.

To AC & R1

① Concern about motivation: Great question. We agree that FreeSeed can be changed to a dual-domain method. Considering the potential secondary artifact issues [1,2] and the typically limited access to the raw data due to commercial privacy, the image-domain version is more flexible and simpler in practice, with fewer parameters that can be trained with a relatively small amount of data. We prioritize its potential to outperform previous dual-domain methods and allow a miscellany of extensions for further improvement if raw data is available.
[1] 10.3390/s19183941 (Sensors’19) [2] 10.48550/arXiv.1907.00273 (CVPR’19)

② Bottleneck of image-domain methods: Previous image-domain methods struggle with a limited receptive field for global streak artifacts. Instead, FreeSeed easily achieves an image-wide receptive field with FFC.

③ Clarification of expressions: Sorry for the confusion. “across all views” means “across different sparse scenarios”. We will fix it and other typos (e.g. “third and third rows” → “third and fifth rows”) in the next version.

④ Role of SeedNet: SeedNet is a proxy module that dynamically highlights the area with large residual error and guides the FreeNet to weigh in this area.

⑤ Dataset: We apologize for the ambiguity. We confirm that the dataset is split based on patients w/o information leakage between training and testing, and the complete data of one patient was used for testing.

⑥ Ablation: Thanks for the constructive suggestions. PSNRs of FreeSeed w/o FFC are 34.49/38.35/42.89/48.64 dB under 18/36/72/144 views, respectively, which are inferior to FreeSeed and validate the effectiveness of FFC. More details will be added to the next version.

⑦ Detail of DuDoFreeSeed: The sinogram sub-network (φ) of DuDoFreeSeed is a U-Net for sinogram inpainting, where each encoding stage takes in both sinogram/features and the inpainting mask. Then FBP(φ(S)) and FBP(S) are concatenated and fed to FreeSeed; S is the input sinogram.
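The image-wide receptive field mentioned in the response on image-domain bottlenecks can be illustrated with a minimal NumPy sketch. It is not the authors' FFC implementation: `spectral_filter` and the hand-set `band_weights` are hypothetical names introduced here, and in an FFC-style layer the weights would be learned rather than fixed. The sketch only shows that a pointwise operation on the Fourier spectrum lets a single input pixel influence far more output pixels than a 3x3 spatial convolution could.

```python
import numpy as np

def spectral_filter(image, band_weights):
    """Scale each Fourier coefficient of `image` by `band_weights`, then invert.

    Illustrative only: `band_weights` stands in for a learned band-pass map.
    """
    spectrum = np.fft.rfft2(image)  # shape (H, W // 2 + 1)
    return np.fft.irfft2(spectrum * band_weights, s=image.shape)

# A single bright pixel in an otherwise empty 16x16 image.
image = np.zeros((16, 16))
image[3, 5] = 1.0

# Hypothetical hand-set weights that keep only the lowest frequency bands.
band_weights = np.zeros((16, 9))
band_weights[:2, :2] = 1.0

out = spectral_filter(image, band_weights)

# Number of output pixels affected by the single input pixel.
# A 3x3 spatial convolution could affect at most 9.
influenced = int(np.count_nonzero(np.abs(out) > 1e-12))
print(influenced)
```

Because every Fourier coefficient depends on every pixel, one spectral layer already mixes information across the whole image, whereas stacked small spatial kernels need many layers to do so.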

To AC & R2

① Loss function: The PSNRs of Eq. 4 with L1 loss were 34.79/38.45/43.06/49.00 dB under 18/36/72/144 views, suggesting that L1 did not ensure stable performance gain when FFC was used.

② Mask calculation: As in Sec. 2.3, transformation T converts the predicted residual \hat{A} into a binary mask M using its mean value as threshold, i.e. M_ij=1 if \hat{A}_ij > mean(\hat{A}). Note that the mask is adaptively generated during training and guides FreeNet to focus on most erroneous areas for further refinement. It outperforms fixed-value thresholding that can be affected by mask quality.

③ Setting of simple masked loss: Thanks for the valuable feedback. This setting (Variant 4) assesses SeedNet in facilitating FreeNet to refine the erroneous area. A naive way (Variant 4) to do this is to multiply a coefficient map (1+M) on the loss map (A_f - \hat{A}), with a mask M ≥ 0 for extra penalty. As stated in Sec. 3.3, the performance gap between Variants 3 and 4 probably results from discontinuous gradients with such a mask.

④ Visualization: These similar variants exhibit subtle differences in the reconstructed results, which were not included in the manuscript. We’d like to draw your attention to Fig. S3 in the supplementary material for visual comparisons of FreeNet and FreeSeed.
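The mean-threshold mask and the naive (1+M)-weighted loss described in the responses above can be sketched in NumPy as follows. This is a hedged illustration rather than the released code: the function names are hypothetical, and the use of a squared-error loss map is an assumption based on the L2 loss of Eq. 4.

```python
import numpy as np

def mean_threshold_mask(pred_artifact):
    """Binary mask M with M_ij = 1 where \\hat{A}_ij > mean(\\hat{A})."""
    return (pred_artifact > pred_artifact.mean()).astype(pred_artifact.dtype)

def simple_masked_l2(target_artifact, pred_artifact, mask):
    """Mean of the squared-error map weighted by (1 + M).

    Assumption: the loss map is the squared difference (L2), as in Eq. 4.
    """
    loss_map = (target_artifact - pred_artifact) ** 2
    return float(np.mean((1.0 + mask) * loss_map))

# Toy 2x2 example: only the bottom-right prediction exceeds the mean (0.4).
pred = np.array([[0.0, 0.1], [0.3, 1.2]])
target = np.array([[0.0, 0.0], [0.5, 1.0]])

mask = mean_threshold_mask(pred)
loss = simple_masked_l2(target, pred, mask)
print(mask)
print(loss)
```

The mask depends on the current prediction, so it changes as training proceeds; the hard threshold is also the step that makes the weighting discontinuous, which is the explanation the authors offer for the gap between Variants 3 and 4.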

To R3

① Concern about the features: Thanks for your insightful comment. We agree that the extracted features contain not only the artifact pattern due to the entanglement of artifacts and details. However, they do capture rich information of artifact patterns, as evident in the learned band-pass maps in Fig. 1. We’d also like to clarify that our method does not strip the background full-view images, as stated by the reviewer. In contrast, the original CT slice information in the features necessitates SeedNet which further refines the residual learning. We hope this can help in re-evaluating the paper.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    In the rebuttal, the authors addressed most of the comments. I believe it would be better to shift the main focus to FreeSeed with dual-domain learning and leave the single image-domain version as a side product.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The idea of using Fast Fourier Convolutions (with learnable band-pass attention maps) to tackle the present artifacts is an innovative approach. The results, both in terms of quantitative and qualitative analysis, clearly indicate the superior performance of FreeSeed compared to existing sparse-view CT reconstruction methods.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Although R1 and R2 recommended accepting this paper, R2 raised a serious problem in the initial review phase. However, R2 did not input anything in the post-rebuttal evaluation phase. I looked at the paper and rebuttal carefully. The authors did not address R2’s concerns well; in particular, they did not even provide any explanation of the feature extraction with artifact patterns. They just said their method “does capture rich information of artifact patterns” in their response. This statement confused me. What rich information does the method capture? How can one quantify whether the information is rich? How can the information from artifact patterns be validated? So I do not think the paper is in good shape until the authors clear all concerns. I recommend rejecting the paper.


