Authors

Wenjun Xia, Ziyuan Yang, Qizheng Zhou, Zexin Lu, Zhongxian Wang, Yi Zhang

Abstract

Sparse-view computed tomography (CT) is one of the primary means of reducing radiation risk, but sparse-view reconstructions are contaminated by severe artifacts. With carefully designed regularization terms, iterative reconstruction (IR) algorithms can achieve promising results. With the introduction of deep learning techniques, regularization terms learned with convolutional neural networks (CNNs) have attracted much attention and can further improve performance. In this paper, we propose a learned local-nonlocal regularization-based model called RegFormer to reconstruct CT images. Specifically, we unroll the iterative scheme into a neural network and replace handcrafted regularization terms with learnable kernels. The convolution layers are used to learn local regularization with excellent denoising performance. Simultaneously, transformer encoders and decoders incorporate the learned nonlocal prior into the model, preserving structures and details. To improve the ability to extract deep features during iteration, we introduce an iteration transmission (IT) module, which further improves the efficiency of each iteration. The experimental results show that our proposed RegFormer achieves competitive performance in artifact reduction and detail preservation compared to some state-of-the-art sparse-view CT reconstruction methods.
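
The unrolling idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the RegFormer implementation: the system matrix `A`, the scaled transpose used as `backproject`, and the hand-written smoothness gradient standing in for a learned regularizer are all assumptions chosen for a toy 1-D problem. In the paper's setting, each stage would instead use learned convolutional and transformer blocks.

```python
import numpy as np

def unrolled_reconstruction(A, y, backproject, regularizers, step=1.0):
    """Apply one regularized update per stage (one stage = one unrolled iteration)."""
    x = backproject(y)  # initial estimate from the measurements
    for reg in regularizers:
        residual = A @ x - y  # data-fidelity residual
        x = x - step * (backproject(residual) + reg(x))
    return x

# Toy problem (all values hypothetical): 20 measurements of a 30-sample signal.
rng = np.random.default_rng(0)
n, m = 30, 20
A = rng.standard_normal((m, n))
x_true = np.sin(np.linspace(0.0, np.pi, n))
y = A @ x_true

L = np.linalg.norm(A, 2) ** 2    # squared spectral norm of A
D = np.diff(np.eye(n), axis=0)   # first-difference operator
lam = 0.05

def backproject(r):
    """Scaled transpose of A; a stand-in for the back-projection step."""
    return A.T @ r / L

def smooth_grad(x):
    """Gradient of 0.5 * lam * ||D x||^2; a stand-in for a learned regularizer."""
    return lam * (D.T @ (D @ x))

def objective(x):
    """Scaled data-fidelity term plus the smoothness penalty."""
    return 0.5 * np.sum((A @ x - y) ** 2) / L + 0.5 * lam * np.sum((D @ x) ** 2)

x0 = backproject(y)
x_rec = unrolled_reconstruction(A, y, backproject, [smooth_grad] * 10)
```

Because every stage reuses the same update rule, swapping `smooth_grad` for a trained network would change only the list passed as `regularizers`.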

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16446-0_75

SharedIt: https://rdcu.be/cVRUj

Link to the code repository

https://github.com/Deep-Imaging-Group/RegFormer

Link to the dataset(s)

N/A

Reviews

Review #1

Please describe the contribution of the paper

This paper proposed a novel transformer-based iterative reconstruction model for the sparse-view CT reconstruction problem. The authors combined convolution and transformer branches to simultaneously extract local and non-local regularization terms. In addition, an iteration transmission module was introduced to further improve the efficiency of each iteration. The experimental results demonstrated competitive performance in artifact reduction and detail preservation.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

This paper introduced a novel unfolded iterative reconstruction model, which combined convolution and transformer branches to simultaneously extract local and non-local regularization terms.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

1) In Section 1, the authors state that one of their contributions is replacing back-projection with FBP to improve the performance of the network. However, many previous works, e.g., AirNet, have already demonstrated the efficiency of the FBP operator. 2) In Fig. 4, the PSNR of FISTA-Net is higher than that of TGV, while its SSIM is lower. Could you please explain this contradictory result? 3) In Section 4, the authors state that independent iterations make the network extract only shallow features. Since the proposed network is built by unfolding iterations, the iteration blocks are cascaded to form the whole network. Thus, the network should be capable of extracting both shallow and deep features. 4) The manuscript gives little discussion of why the method improves the results or how it could be further improved.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The manuscript has clearly described the network architecture and the reconstruction workflow. Most of the work is reproducible.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

1) One of the main contributions of this paper is the iteration transmission module. AirNet contains dense connections between iterations, which is a different form of iteration transmission. It is recommended to compare the proposed network with AirNet. 2) An ablation study is needed to verify the effectiveness of the proposed iteration transmission module. 3) As shown in Figure 3, local and nonlocal regularizations are applied alternately in the proposed method. Does this work better than a workflow using only nonlocal regularization? Please explain or provide a comparison. 4) In the Results section, only 64-view reconstructions are presented. Can the proposed method handle reconstruction from fewer projections with little degradation? This should be added to the manuscript. 5) According to the paper, both the nonlocal iteration block (NLIB) and the iteration transmission (IT) module improve the reconstructions. Which of the two contributes more to the improvement? Please provide a comparison. 6) The authors should revise the writing. There are some grammatical errors, e.g., in Section 2.2, ‘After the merged windows are fed into …’
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

This paper proposed a novel hybrid network in an unfolding manner, which combined convolution and transformer branches to simultaneously extract local and non-local regularization terms. However, more comparison results and an ablation study are still needed to verify the main contributions.
Number of papers in your stack

5
What is the ranking of this paper in your review stack?

2
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #2

Please describe the contribution of the paper

This paper proposed a sparse-view CT reconstruction network that incorporates a transformer. The proposed network introduced FBP into the iterative scheme in place of back-projection to accelerate convergence.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

a. The motivation to introduce transformer is clearly described. b. A new module called iteration transmission is proposed to connect the iterations to improve the deep feature extraction and structure preservation. c. The results look promising.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

a. The FBP filter is fixed to the RL filter; a learned filter may be better. b. The training is supervised; unsupervised training would be more clinically valuable. c. More task-based evaluations would better demonstrate the performance.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The implementation of the network looks simple, but we still hope the authors release the code.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

a. The authors should clarify the motivation of IT module, i.e. why existing networks cannot extract deep features. b. Task-based evaluations should be added. c. Clinical metrics should be adopted.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

7
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

This paper introduced a transformer architecture into an unrolling-based CT reconstruction network, proposed a novel module to extract deeper features, and achieved promising results.
Number of papers in your stack

1
What is the ranking of this paper in your review stack?

1
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #3

Please describe the contribution of the paper

1. Combines local and nonlocal regularizers with neural networks to obtain a general prior. 2. Applies an IT module to connect the iterations and improve deep feature extraction.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

1. Designs an iteration block to simultaneously learn the local and nonlocal regularizers. 2. Proposes an IT module to enable communication between different iterations.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

1. Only small 2D images are reconstructed. It is unclear whether the method can handle real, large 3D data volumes. 2. Using FBP to replace backprojection violates the requirement that the forward and backward operators be conjugate (A and A^T). The authors are expected to perform careful evaluations. 3. As shown in Figs. 4 and 6, only subtle image quality improvement is observed. The real impact of the work is thus doubtful.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Well done.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

The authors are expected to provide an evaluation on a full-dimension dataset, usually with more than a few hundred 2D projections, each larger than 1024×1024. 3D CT images are also expected, to fully demonstrate the computation speed and memory consumption of the proposed network. The FBP operation is not a common way to perform backprojection, although the literature mentions it (ref. 28). It destroys the conjugate property of the forward and backward operations. It is unclear whether the authors derived a formula that justifies the FBP operation. The presented results are at an ideal low noise level. The authors are expected to evaluate the algorithm under high noise fluctuation.
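
The conjugate property raised here can be checked numerically with a dot-product test: for a true adjoint pair, <Ax, y> equals <x, A^T y> for any x and y. The sketch below is a toy illustration with a random matrix, not the authors' projector. The pseudoinverse is used only as a hypothetical stand-in for an approximate-inverse operator such as FBP.

```python
import numpy as np

def adjoint_gap(forward, backward, n, m, rng, trials=5):
    """Largest relative mismatch between <forward(x), y> and <x, backward(y)>."""
    worst = 0.0
    for _ in range(trials):
        x = rng.standard_normal(n)
        y = rng.standard_normal(m)
        lhs = np.dot(forward(x), y)
        rhs = np.dot(x, backward(y))
        worst = max(worst, abs(lhs - rhs) / max(abs(lhs), abs(rhs)))
    return worst

rng = np.random.default_rng(0)
n, m = 30, 20
A = rng.standard_normal((m, n))   # hypothetical system matrix
A_pinv = np.linalg.pinv(A)        # approximate inverse, stand-in for FBP

# Forward operator paired with its exact transpose.
gap_transpose = adjoint_gap(lambda x: A @ x, lambda y: A.T @ y, n, m, rng)
# Forward operator paired with an approximate inverse instead of the transpose.
gap_pinv = adjoint_gap(lambda x: A @ x, lambda y: A_pinv @ y, n, m, rng)
```

In this toy setting `gap_transpose` stays at floating-point round-off while `gap_pinv` does not, which is the kind of mismatch the requested evaluation would need to quantify for the real projector.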
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Similar publications can be found in the literature of the past five years. All claim good results, but none is readily usable in the clinic. Many authors do not understand the clinical requirements and focus only on algorithm parameter tuning. This manuscript still follows the old approach, although it includes a detailed description of the network structure and results.
Number of papers in your stack

4
What is the ranking of this paper in your review stack?

3
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.
Contribution

A transformer-based iterative reconstruction model for the sparse-view CT reconstruction problem, which combines convolution and transformer branches to simultaneously extract local and non-local regularization terms. An iteration transmission module was introduced to further improve the efficiency of each iteration. The experimental results demonstrated competitive performance in artifact reduction and detail preservation.
Strengths
- Novel unfolded iterative reconstruction model, combining convolution and transformer branches to simultaneously extract local and non-local regularization terms.
- Iteration transmission module further promotes the efficiency of each iteration.
- Transformer is well motivated and described.
- Promising results.
Weaknesses to address
- Authors indicate that incorporating filtered backprojection to improve performance is a novel contribution, but many previous works, e.g., AirNet, have already done this. The authors may need to carefully evaluate the effect of destroying the conjugate nature of the operation. They may also see if the optimal filter could be learned.
- Unclear whether the method easily extends to large 3D volumes, and the reliance on supervised training may be a problem for a method to be applied in a clinical setting. Discussion needed.
- Image quality improvement is subtle, so the real impact may be doubtful. Task-based evaluations may be needed to demonstrate clinical utility.
- Clarify whether code will become available on acceptance.
What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

2

Author Feedback

N/A

A Transformer-Based Iterative Reconstruction Model for Sparse-View CT Reconstruction