
Authors

Wangduo Xie, Matthew B. Blaschko

Abstract

CT images corrupted by metal artifacts have serious negative effects on clinical diagnosis. Considering the difficulty of collecting paired data with ground truth in clinical settings, unsupervised methods for metal artifact reduction are of high interest. However, it is difficult for previous unsupervised methods to retain structural information from CT images while handling the non-local characteristics of metal artifacts. To address these challenges, we propose a novel Dense Transformer based Enhanced Coding Network (DTEC-Net) for unsupervised metal artifact reduction. Specifically, we introduce a Hierarchical Disentangling Encoder, supported by the high-order dense process and a transformer, to obtain densely encoded sequences with long-range correspondence. Then, we present a second-order disentanglement method to improve the dense sequence’s decoding process. Extensive experiments and model discussions illustrate the effectiveness of DTEC-Net, which outperforms the previous state-of-the-art methods on a benchmark dataset and greatly reduces metal artifacts while restoring richer texture details.
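
The idea of a densely connected encoder, where each stage receives the concatenation of the input and all earlier stage outputs so that low-level features stay available, can be sketched as follows. This is an illustration only, not the authors' implementation (no code repository is linked): feature maps are reduced to short lists of channel values, and the names `dense_encode` and `toy_stage` are hypothetical.

```python
from typing import Callable, List

Feature = List[float]  # toy stand-in for a feature map: one number per channel


def dense_encode(x: Feature, stages: List[Callable[[Feature], Feature]]) -> List[Feature]:
    """Run the stages with dense connectivity.

    Stage k receives the concatenation of the original input and the
    outputs of all earlier stages, so low-level features are never dropped.
    """
    collected = [x]
    for stage in stages:
        # Channel-wise concatenation of everything produced so far.
        stage_input = [value for feature in collected for value in feature]
        collected.append(stage(stage_input))
    return collected


def toy_stage(inp: Feature) -> Feature:
    # Hypothetical stage (assumption): emits two channels, the mean and the max.
    return [sum(inp) / len(inp), max(inp)]


features = dense_encode([1.0, 3.0], [toy_stage, toy_stage])
for level, feature in enumerate(features):
    print(level, feature)
```

In a real network each stage would be a learned block (for example a convolutional or transformer layer) and the concatenation would run along the channel axis of a tensor; the control flow is the part this sketch is meant to show.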

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43907-0_8

SharedIt: https://rdcu.be/dnwb6

Link to the code repository

N/A

Link to the dataset(s)

https://nihcc.app.box.com/v/DeepLesion

spineweb.digitalimaginggroup.ca


Reviews

Review #2

  • Please describe the contribution of the paper

    This paper introduces a Dense Transformer based encoder-decoder solution to metal artifact reduction on CT images. The solution tackles the problem in an unsupervised fashion, which avoids the obstacle faced by supervised counterparts: paired clean and contaminated images are hard to find. Architecturally, the authors propose the hierarchical disentangling encoder (HDE) to incorporate low-level features before decoding. In the decoding phase, they also introduce a second-order disentanglement to enhance the network. The empirical study shows that all these components are beneficial and contribute to the final SOTA performance.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The encoder retains the low-level features. As illustrated in Fig. 3 and Eq. 1, the low-level features are preserved by both the skip connection and the dense concatenation.
    2. The transformer building block allows long-range dependencies to be learned.
    3. The ablation study is well designed to confirm the effectiveness of each building block, and the final performance pushes the boundary of the SOTA.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Why do low-level features matter? Conceptually, this makes sense for final image generation, but I see neither a reference stating it nor empirical evidence confirming that the lower-level features directly help. I would suggest the authors at least show some evidence that low-level information is one of the key points. For instance, if the X_l connection in Fig. 2(a) were removed, would it matter in the end?

    2. Fig. 2 is not clear to me; I struggled to understand it. (1) What do x_a and y_c mean? (2) How are x_{l, h, m, s} and c_{x, y} named? (3) Are these encoders parallel to each other, or does each encoder simply correspond to one DTD module in Fig. 3?

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    I have no concerns about reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Please address the concerns I raised in the weaknesses section. I strongly suggest that the authors revise Fig. 2.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    All building blocks in the solution are well designed, but they are also complicated. The performance is satisfactory, while the scope of this paper is relatively limited.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #4

  • Please describe the contribution of the paper

    The method introduces a hierarchical disentangling encoder, to obtain densely encoded sequences with long-range correspondence. Besides, the presented second-order disentanglement method is used to improve the dense sequence’s decoding process. Experiments show the performance is better than ADN.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The proposed disentanglement method is interesting; it helps unsupervised MAR capture better features and model the globally corrupted artifacts.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The method seems to propose only a newly designed disentangling method and network structure, while lacking novel insight about the unsupervised setting of the MAR problem. In addition, the experimental results are incomplete; for example, model performance across different shapes and sizes of metal implants is not reported. Clinical data is also not included, which makes the work less practical.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Clear to reproduce.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    To improve the work,

    1. The authors should reconsider the unsupervised setting itself: why is the proposed network structure suitable only for this setting, and why not explore it in the supervised setting?
    2. More complete experiments should be included, such as testing with different shapes and sizes of metal implants to confirm the model performance.
    3. Additional experiments should be conducted on clinical data, since the MAR problem should be considered for practical usage. Testing only on simulated data is not enough.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The lack of contribution to the unsupervised setting of the MAR problem, and the lack of complete experiments.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    4

  • [Post rebuttal] Please justify your decision

    The authors provide reasons why unsupervised MAR is important in itself, but the motivation of the method is still lacking, i.e., why the proposed method matches the setting. Specifically, from the perspective of network structure, although the proposed structure improves unsupervised MAR performance, whether the structure would also help in the supervised setting needs to be discussed as well.



Review #5

  • Please describe the contribution of the paper

    The authors propose a Dense Transformer based Enhanced Coding Network (DTEC-Net) for unsupervised metal artifact reduction. The authors adopt a Swin Transformer to better recover low-level characteristics. Some results have been obtained in metal artifact removal.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The authors propose DTEC-Net for metal artifact reduction. Extensive experiments show that DTEC-Net outperforms the previous state-of-the-art methods on a benchmark dataset.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The proposed DTEC-Net is rather a combination of known methodologies than an exceptional contribution to the field.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The proposed DTEC-Net is a combination of existing methods, the process is clear, and the release of code at the time of publication is mentioned in the abstract. I believe it is reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. The authors should design the underlying network for metal artifact removal rather than just a combination of known methodologies.
    2. Given that the transformer-based comparison method is better than all the other comparison methods for all cases, it would be interesting to use more transformer-based approaches for comparison.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Though the paper may present some interest from the application point of view, it’s rather a combination of known methodologies than an exceptional contribution to the field, so I recommend weak acceptance.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The paper received mixed reviews, with two weak accept and one weak reject recommendations. The area chairs considered the paper and the reviewers’ comments and agreed with the following strengths of the paper, i.e., disentanglement learning is interesting for metal artifact reduction; there are generally lots of experimental evaluations; the paper is generally clear to reproduce. At the same time, there are some concerns with the paper: the contribution for unsupervised metal artifact reduction is not clear, particularly compared with existing methods; some necessary experimental evaluations, including the effectiveness of low-level features are lacking; some parts of descriptions and figures are not clear enough. Based on the reviewers’ comments, the area chair decided to invite the authors to provide a rebuttal to address these concerns.




Author Feedback

We appreciate the reviewers’ comments. We will incorporate this feedback while preparing the camera ready version of our paper.

(R#2) Q1(Low-level features matter): The importance of low-level information in image restoration is supported by Sections 3.3-(2) and 3.3-(3) of our reference [20] (ICLR 2023). Moreover, the “Different Model Architectures” part of Section 2.1 in the supplementary material of [20] also illustrates this point.
Q2(Description of Fig.2): (1) According to the first paragraph of Section 2.1, x_a denotes inputs with artifacts, and y_c denotes the unpaired clean images without metal artifacts (MA). (2) According to the second paragraph of Section 2.1, X_l is defined as the “list” of hierarchical information. x_h denotes the “high” semantic features, and x_m denotes the “MA” parts in latent space. x_s represents the overall “structural part” of the image (the part without artifacts). c_x and c_y denote the latent features of the traditional first-order disentanglement, which is detailed in [8], cited below Fig. 2. The subscripts x and y distinguish the two components. (3) The encoders are parallel to each other.

(R#4) Q1(Contribution to unsupervised setting):

  1. It is almost impossible to obtain a large-scale real clinical dataset containing both MA-affected images and clean images of the same area, which supervised training requires. For example, when performing real-time CT imaging of a patient with a cardiac pacemaker, the pacemaker with its metal part cannot be removed to obtain an MA-free image. We can therefore only aim to improve the performance of unsupervised methods rather than supervised ones.
  2. We noticed the importance of low-level information and long-range correspondence for removing MA, so we designed the HDE and DTD according to this insight. We also proposed the newly designed disentangling method to make this approach more feasible. Extensive experiments demonstrate that the algorithm is very effective in unsupervised MA removal, so we achieve our goal of boosting unsupervised methods and giving new insight into unsupervised MA removal.

Q2(Complete experiment for measuring performance): The benchmark dataset was proposed by [21]. The dataset and its slight variants are widely adopted for evaluating algorithms’ performance by [8, 9, 11, 15]. The test dataset we used does include metal implants of different sizes: [9, 14, 41, 41, 43, 90, 158, 316, 321, 765] (in pixels). It also contains a range of shapes, from arcs and circles to screws, etc. It is therefore well suited to measuring the performance of different methods.
Q3(Clinical experiments): We did experiments on a real clinical dataset. Our method is trained on real datasets in an unsupervised manner and can significantly remove MA at test time. Our method restores more detail in the metal implant area than the baseline method ADN [8] and is sharper at the edges of different tissues. Because the real dataset has no ground truth, the numerical results for PSNR/SSIM/MSE cannot be calculated. We will present the qualitative results in the appendix.
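
To illustrate why metrics such as MSE and PSNR need a ground-truth image, here is a minimal sketch using the standard definition PSNR = 10 log10(MAX^2 / MSE). The function names and the toy pixel values are assumptions for illustration and are not taken from the paper or its datasets.

```python
import math
from typing import Sequence


def mse(reference: Sequence[float], restored: Sequence[float]) -> float:
    """Mean squared error between a ground-truth image and a restored image."""
    if len(reference) != len(restored):
        raise ValueError("images must have the same number of pixels")
    return sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)


def psnr(reference: Sequence[float], restored: Sequence[float], max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB; max_value is the largest possible pixel value."""
    err = mse(reference, restored)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / err)


# Toy flattened images with intensities in [0, 1] (made-up values).
reference = [0.0, 0.5, 1.0, 0.5]
restored = [0.1, 0.5, 0.9, 0.5]

print(f"MSE  = {mse(reference, restored):.4f}")
print(f"PSNR = {psnr(reference, restored):.2f} dB")
```

Both functions take the clean reference as an argument, which is exactly what a real clinical scan with a metal implant does not provide; this is why only qualitative comparison is possible there.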

(R#5) Q1(Design the underlying network): Densely encoded sequences with long-range correspondence and second-order disentanglement are the components we designed specifically for MA reduction. The contribution is also detailed in the answer to (R#4)’s Q1.
Q2(More transformer-based approaches): The third row in Table 1 indicates that the result obtained using only the transformer is 0.8 dB lower than our method, with an MSE 4.51 higher than ours. At the same time, the ablation study reported that great instability appears in generative adversarial training if only a transformer is used. So, using only the transformer cannot achieve good results in reducing metal artifacts. More transformer-based approaches with high performance and stability can be explored in the future.




Post-rebuttal Meta-Reviews

Meta-review #1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper studied the unsupervised metal artifact reduction problem and proposed a Transformer based method to perform artifact reduction based on hierarchical features and disentanglement. The method is evaluated and compared with multiple baseline methods. The rebuttal provides additional details about the method, figure, experiments, etc. The discussion about how the method advances the unsupervised image enhancement domain (a number of methods can deal with unpaired data-based image translation) still requires additional analysis.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors addressed some of the key concerns raised by the reviewers. The question regarding unsupervised metal artifact reduction is a bit curious, given that the approach uses unpaired sets of images and the introduction clearly lays out the rationale for not using supervised learning with sinograms. There could always be more comparisons, but the authors seem to have done a fairly reasonable job in addressing the reviewers’ concerns. Assuming the authors will fix the images, which do need to be updated, the paper could be accepted for publication.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper introduces a Dense Transformer based encoder-decoder solution to metal artifact reduction on CT images. The solution tackles the problem in an unsupervised fashion, which avoids the obstacle faced by supervised counterparts: paired clean and contaminated images are hard to find. The major strengths of the paper are: 1) a DTEC-Net for metal artifact reduction is provided; 2) experiments show promising results for DTEC-Net; 3) the ablation study is well designed. However, the paper lacks motivation for the proposed method and novel insight about the unsupervised setting of the MAR problem: why does the proposed model work well on this topic? Additional experiments are necessary to evaluate the effectiveness of the proposed method. Combining the reviewers’ comments with my own, it is a fair paper whose weaknesses very slightly outweigh its merits.


