
Authors

Jiadong Zhang, Zhiming Cui, Caiwen Jiang, Jingyang Zhang, Fei Gao, Dinggang Shen

Abstract

Positron emission tomography (PET) is an important medical imaging technique, especially for brain and cancer disease diagnosis. Modern PET scanner is usually combined with computed tomography (CT), where CT image is used for anatomical localization, PET attenuation correction, and radiotherapy treatment planning. Considering radiation dose of CT image as well as increasing spatial resolution of PET image, there is growing demand to synthesize CT image from PET image (without scanning CT) to reduce risk of radiation exposure. However, existing works perform learning-based image synthesis to construct cross-modality mapping only in the image domain, without considering of the projection domain, leading to potential physical inconsistency. To address this problem, we propose a novel PET-CT synthesis framework by exploiting dual-domain information (i.e., image domain and projection domain). Specifically, we design both image domain network and projection domain network to jointly learn high-dimensional mapping from PET-to-CT. The image domain and the projection domain can be connected together with a forward projection (FP) and a filtered back projection (FBP). To further help the PET-to-CT synthesis task, we also design a secondary CT-to-PET synthesis task with the same network structure, and combine the two tasks into a bidirectional mapping framework with several closed cycles. More importantly, these cycles can serve as cycle-consistent losses to further help network training for better synthesis performance. Extensive validations on the clinical PET-CT data demonstrate the proposed PET-CT synthesis framework outperforms the state-of-the-art (SOTA) medical image synthesis methods with significant improvements.
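The abstract's central idea is that the image domain and the projection (sinogram) domain can be connected by a forward projection (FP) and a filtered back projection (FBP). As a rough intuition for why this coupling is consistent, the sketch below uses a toy linear operator in place of the real Radon transform; the names `A`, `fp`, and `fbp` are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

# Toy stand-in for the forward-projection operator (the real one would be
# a Radon transform); a tall full-column-rank matrix makes FBP(FP(x)) ~= x.
rng = np.random.default_rng(0)
n_pix, n_bins = 16, 32                      # tiny image / sinogram sizes
A = rng.standard_normal((n_bins, n_pix))    # hypothetical system matrix

def fp(image):
    """Forward projection: image domain -> projection (sinogram) domain."""
    return A @ image

def fbp(sinogram):
    """Approximate inverse, standing in for filtered back projection."""
    return np.linalg.pinv(A) @ sinogram

image = rng.standard_normal(n_pix)
recovered = fbp(fp(image))
# Round-tripping through the projection domain recovers the image, which is
# what allows image-domain and projection-domain networks to supervise each
# other in a dual-domain framework.
print(np.allclose(recovered, image))
```

Because the round trip is (approximately) the identity, losses computed in either domain constrain the same underlying mapping, which is the premise of the dual-domain design.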

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16446-0_72

SharedIt: https://rdcu.be/cVRUg

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    To obtain high-quality CT images while reducing the risk of radiation exposure, this paper proposed a novel framework that exploits dual-domain information to synthesize CT images from PET images. Specifically, the authors designed a main PET-to-CT synthesis task and a secondary CT-to-PET synthesis task, employing four networks to jointly learn both image and projection domain information. The FP and FBP are employed to connect the image domain and projection domain, thereby constructing a bidirectional synthesis framework with several closed cycles. Furthermore, a two-stage training strategy with dual-domain consistency and cycle consistency is adopted to facilitate network training for superior synthesis performance. The experimental results demonstrate that the proposed method significantly outperforms other state-of-the-art medical image synthesis methods in PET-CT synthesis.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    a) Different from previous cross-modality image synthesis methods that only exploit image domain information, the authors designed a novel dual-domain PET-CT synthesis framework to perform synthesis in both image and sinogram domains. b) To enhance the PET-to-CT synthesis performance, the authors introduce a secondary CT-to-PET synthesis task and a two-stage training strategy with dual-domain consistency and cycle consistency to build a bidirectional mapping framework, encouraging structural consistency and stable convergence. c) Experimental results show the effectiveness of the main contributions and the state-of-the-art image synthesis performance of the proposed method. d) The methodology section, including the network architecture as well as the objective functions, is clearly described, and the entire paper is easy to understand.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    a) The motivation for this paper is not clear. In the “Abstract” section, the authors mentioned that “Considering radiation dose of CT image as well as increasing spatial resolution of PET image, there is growing demand to synthesize CT image from PET image (without scanning CT) to reduce risk of radiation exposure.” To the best of our knowledge, CT scanning is cheaper, with sufficient training samples in clinical practice, while PET scanning is relatively more expensive with limited available data. In addition, PET scanning also suffers from radiation exposure risk. Why use PET to synthesize CT? b) Although the authors claimed that the proposed method “is the first time to exploit dual-domain information in cross-modality image synthesis tasks, especially for PET-CT synthesis”, I still doubt that the innovation of this paper is sufficient. As shown in [7], a dual-domain network for improving CT image quality was proposed as early as 2019. And the bidirectional mapping idea, which comes from CycleGAN [4], could also date back to 2019. c) As displayed in Table 1 and Table 2 of the “Experiments” section, the SSIM metric obtained by the Base variant has no significant improvement over RU-Net, while the authors claim the secondary CT-to-PET task could contribute to the PET-to-CT synthesis task in structural consistency. Does the CT-to-PET task really work? What’s more, in Table 2, the SSIM result of the model incorporating both the image domain cycle-consistent loss and the cross-domain cycle-consistent loss is even worse than that of the model only employing the cross-domain cycle-consistent loss. d) In the “Experiments” section, the training details, such as the number of training epochs and whether a cross-validation strategy is adopted, are not stated. e) The proposed method has not been compared with existing cross-modality image synthesis methods. Furthermore, the comparisons with M-GAN [2], P2PGAN [6], and U-Net [12] are not appropriate, since [2] and [12] are designed for PET image synthesis, and [6] is proposed for natural image translation. It is unfair to compare the proposed framework with these methods, which do not aim at the CT synthesis task.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The description of the training details in this paper is insufficient. Moreover, the authors did not give any positive response regarding the code release in the reproducibility checklist. So I think this paper does not have good reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    a) The authors should state the motivation more clearly. To the best of my knowledge, this is the first work to synthesize CT images from PET. I wonder why the authors use PET, which is more expensive to acquire and has limited available data, to generate CT images that are much cheaper and easier to obtain? b) The authors should give a more detailed review of existing cross-modality synthesis methods, especially for CT synthesis. c) “Training four networks in Stage 2 will bring more computational costs with little benefit compared with the case of only training image domain networks.” The paper lacks the analysis of the concrete performance and computational cost gaps induced by training two/four networks. It would be better to add a corresponding ablation study or a comparison experiment. d) To validate the superiority of the proposed method, the authors should compare it with state-of-the-art methods for CT synthesis. e) In the second paragraph of “Ablation Study” section, “necessity of employing tje secondary” should be changed to “…employing the secondary”.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    3

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper implemented a PET-to-CT synthesis framework by exploiting both image and sinogram information and constructing bidirectional cycle mapping. The authors have conducted several comparison and ablation experiments to prove the effectiveness of the proposed method and the validity of the key components. However, its disadvantages are also obvious. As mentioned before, the motivation for this article was not stated clearly, since the cost of acquiring abundant PET images for training is much higher than that of CT. In addition, the details of existing CT synthesis methods need to be better described. Another serious problem is that current comparison methods are somewhat outdated and not sufficient to prove the superiority of the proposed method. Finally, the innovation of this paper is not sufficient, since the dual-domain information has been introduced to enhance CT quality before. And the cycle consistent constraint has also been widely used.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    4

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    4

  • [Post rebuttal] Please justify your decision

    In the rebuttal, the authors explained their novelty in jointly exploring dual-domain information for two different modalities. However, although the authors have tried to answer the questions raised by reviewers, I think that the motivation for this paper is still not clear, since the cost of acquiring abundant PET images for training is much higher than that of CT.

    To sum up, I can raise my score by only one point.



Review #2

  • Please describe the contribution of the paper

    The authors propose a novel dual-domain PET-CT synthesis framework. They design training strategy learning PET-to-CT mapping jointly in both projection and image domains. Additional CT-to-PET mapping is also learned to help the main PET-to-CT task. Extensive experiments show the new framework outperforms the SOTA methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The method is novel in using dual-domain learning for image-to-image translation. It makes sense to use the FP and FBP operators to connect the image and projection domains.
    • The experiments are extensive and support the conclusion.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The two-stage training strategy is complex and not elegant.
    • The ablation study should compare models with and without the dual-domain losses to support the claimed effectiveness of dual-domain learning.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The dataset is private and the code is not released.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    see p5

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The idea is interesting and the experiments partly support the main contributions. It is novel to use dual domain learning in PET to CT transformation and cycle consistent losses help to boost the performance.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    2

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    6

  • [Post rebuttal] Please justify your decision

    The rebuttal resolves all of my concerns and I increase the original rating.



Review #3

  • Please describe the contribution of the paper

    In this paper, the authors proposed a novel dual-domain (image domain and projection domain) PET-CT synthesis framework. By using forward and filtered back projection to connect the image and projection domains, the proposed two-stage training strategy with dual-domain consistency and cycle consistency achieved better synthesis performance. The proposed method outperformed the SOTA methods through extensive validation on clinical PET-CT data.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper is generally well written. By exploiting the dual-domain information, the proposed method outperformed the previous SOTA methods substantially in both quantitative and visual evaluations.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    This method is based on 2D image processing. However, volumetric information in 3D data is essential for the majority of medical tasks.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The method description is clear with sufficient details.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    1. In the Introduction, second sentence: “PET records the consumption of glucose in organs, revealing their metabolic characteristics and thus potentially the disease status”. This is not accurate. Only the FDG tracer records the consumption of glucose in organs; there are other types of PET tracers as well.
    2. In the abstract and introduction, the authors claimed that “existing works perform learning-based image synthesis to construct cross-modality mapping only in the image domain, without considering of the projection domain, leading to potential physical inconsistency”. This is not correct. Please refer and cite this paper for accurate literature review: Shi, Luyao, et al. “A novel loss function incorporating imaging acquisition physics for PET attenuation map generation using deep learning.” International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, Cham, 2019.
    3. In Section 3.1, image volumes were sampled into 2D slices and then split into training/validation/testing samples. Did the authors make sure that the training/validation/testing slices are from different patients for each set? If so, please mention that in the text.
    4. This method is based on 2D image processing. However, volumetric information in 3D data is essential for the majority of medical tasks. I can understand that expanding this work to 3D processing can be challenging due to increased computation workload, more complex forward/backward projection and limited data. However, if the authors can show that the proposed method is better than a 3D U-Net or a 3D GAN (with a typical 32x32x32 3D patch size for example), it can make this work more impactful and more convincing for real clinical applications (optional).
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    7

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Despite focusing on 2D data processing (which is understandable in early development), the proposed method exploited both image domain and projection domain information and achieved significantly improved results. This might potentially make an impact in clinical applications once the method is adapted to 3D volume data processing.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper proposed a novel dual-domain (image domain and projection domain) PET-CT synthesis framework with forward and backward projections to connect image and projection domains. The idea is interesting and the experiments partly support the main contributions. However, the motivation was not stated clearly, since the cost of acquiring abundant PET images for training is much higher than that of CT. This method is based on 2D image processing, while volumetric information in 3D data is essential for the majority of PET/CT tasks.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    3




Author Feedback

We thank all reviewers for their positive and thoughtful comments, which help us improve the paper quality. All questions are answered point by point below.

Q1(Meta-R&R1): Motivation was not stated clearly. A: Our main motivation is to reduce CT radiation risk during PET/CT scanning. In clinical application, PET in a PET/CT scanner is acquired to detect cancer in patients, and CT is collected for anatomical localization and PET attenuation correction. To reduce the radiation risk from the CT scan, we propose a novel medical synthesis method to synthesize a CT image from each PET image for non-diagnostic usage (e.g., attenuation correction of the acquired raw PET). Actually, similar works have been previously explored for the PET-CT synthesis task [1,4], as also confirmed by R3 for its clinical impact.

Q2(Meta-R&R3): The method is based on 2D image processing, while volumetric information in 3D data is essential for the majority of PET/CT tasks. A: Thanks for this great comment. First, we would like to emphasize that our proposed PET-CT synthesis framework can deal with both 2D and 3D cases. But, in this paper, we demonstrate its ability in the 2D case, considering 1) the large slice thickness (e.g., 3mm) of our current data and 2) the small number of 3D training samples at the current stage. Currently, we are collecting more training samples, and will then demonstrate the performance of our method in the 3D case.

Q3(R1): Dual-domain information and bidirectional mapping in CycleGAN have been proposed before. A: To the best of our knowledge, dual-domain knowledge has been well exploited for single-modality cases, but not jointly for two different modalities (i.e., PET and CT in our paper). Specifically, we combine dual-domain knowledge with two modalities to build a bidirectional closed loop with multiple cycles (i.e., mapping in cycles). More importantly, inspired by the initial idea of CycleGAN, we further design three cycle-consistent losses (one image-domain cycle and two cross-domain cycles) to keep content consistency across the two modalities in dual domains. This novelty is also confirmed by R2 and R3.
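As a rough illustration of the cycle-consistent losses described above, the sketch below uses identity functions as placeholder synthesis networks and a toy linear operator in place of the real FP/FBP; all names (`G_pet2ct`, `G_ct2pet`, `fp`, `fbp`) are hypothetical stand-ins, not the paper's actual models or losses.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((24, 12))           # toy forward-projection matrix
fp = lambda x: A @ x                        # image -> projection domain
fbp = lambda s: np.linalg.pinv(A) @ s       # projection -> image domain
G_pet2ct = lambda x: x                      # placeholder synthesis networks
G_ct2pet = lambda x: x

def l1(a, b):
    """Mean absolute error, a common choice for consistency losses."""
    return np.abs(a - b).mean()

pet = rng.standard_normal(12)
# Image-domain cycle: PET -> CT -> PET should return to the input.
loss_img_cycle = l1(G_ct2pet(G_pet2ct(pet)), pet)
# Cross-domain cycle: a synthesized CT, pushed through FP and brought back
# via FBP, should match itself, tying the two domains together.
loss_cross_cycle = l1(fbp(fp(G_pet2ct(pet))), G_pet2ct(pet))
print(loss_img_cycle, loss_cross_cycle)
```

With perfect (identity) networks both losses are near zero; during training, minimizing them pushes the learned mappings toward this consistent regime.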

Q4(R1): SSIM has no significant improvement when comparing ‘Base’ with ‘RU-net’, indicating the secondary CT-to-PET task is useless. In Table 2, SSIM drops by 0.001 when adding the image domain cycle-consistent loss. A: As for the first question, our results actually demonstrate the usefulness (instead of uselessness) of the secondary CT-to-PET task, by comparing ‘Base+S’ (Table 1) with ‘RU-net’ (Table 2). In particular, SSIM increases from 95.6% to 98.1%, with a p-value less than 0.05, indicating a statistically significant improvement. As for the second question, when we add the image domain cycle-consistent loss, we get comparable performance on the SSIM metric (despite a slight decrease of 0.1%), but great improvement on both the PSNR (1.8%) and NRMSE (7.8%) metrics, indicating the effectiveness of the image domain cycle-consistent loss.

Q5(R1): It is unfair to compare the proposed framework with M-GAN, p2pGAN and U-Net, which do not aim at the CT synthesis task. A: It is worth noting that these models have actually been widely used for medical image synthesis tasks, i.e., PET-CT synthesis in [1,4,a]. Also, most current medical image synthesis papers adopt these models for performance comparison [b]. [a] Arm, et al. Independent attenuation correction of whole-body PET using a deep learning approach with generative adversarial networks. EJNMMI Research (2020). [b] Wang, et al. Synthesize high-quality multi-contrast magnetic resonance imaging from multi-echo acquisition using multi-task deep generative model. TMI (2020).

Q6(R2): The ablation study should prove the effectiveness of dual-domain learning. A: Actually, we have proved the effectiveness of dual-domain learning in the last paragraph of the ablation study section.

Q7(R1&R3): Lack of details. A: In the final revised paper, we will add more details of training and data, and correct the inaccurate statement about FDG-PET. We split the data by patients. We will release the code after acceptance of the paper.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper proposed a novel dual-domain (image domain and projection domain) PET-CT synthesis framework with forward and backward projections to connect image and projection domains. The idea is interesting and the experiments partly support the main contributions. However, the motivation was not stated clearly, since the cost of acquiring abundant PET images for training is much higher than that of CT. This method is based on 2D image processing, while volumetric information in 3D data is essential for the majority of PET/CT tasks. The rebuttal partially addressed some concerns and two reviewers raised their scores. Overall, I think the merits out-weigh the limitations; therefore, I recommend accepting this work.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    5



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    I believe the authors have provided answers to most of the major concerns. The application is important to the MICCAI community and the paper should be accepted. The additional information provided in the rebuttal should be included in the final paper.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    4



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper proposed a dual-domain (image domain & projection domain) learning framework for cross-modality (PET-CT) synthesis, which is an interesting attempt. The claimed technical contribution is sound, and the experimental results are in general supportive. On the other hand, the current method is 2D, and the selected comparing methods lack 3D based models. Meanwhile, I have the same question as Reviewer#1 about the motivation of PET-to-CT synthesis: in addition to expense, it seems PET may also need CT for attenuation correction as a common practice. Despite these, the overall quality of the paper is good.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    8


