
Authors

Xinyi Yang, Bennett Chin, Michael Silosky, Daniel Litwiller, Debashis Ghosh, Fuyong Xing

Abstract

Deep neural networks have recently achieved impressive performance in automated tumor/lesion quantification with positron emission tomography (PET) imaging. However, deep learning usually requires a large amount of diverse training data, which is difficult to obtain for some applications, such as neuroendocrine tumor (NET) image quantification, because of the low incidence of the disease and the expensive annotation of PET data. In addition, current deep lesion detection models often suffer from performance degradation when applied to PET images acquired with different scanners or protocols. In this paper, we propose a novel single-source domain generalization method, which learns from human annotation-free, list mode-synthesized PET images, for hepatic lesion identification in real-world clinical PET data. We first design a specific data augmentation module to generate out-of-domain images from the synthesized data and incorporate it into a deep neural network for cross-domain-consistent feature encoding. Then, we introduce a novel patch-based gradient reversal mechanism and explicitly encourage the network to learn domain-invariant features. We evaluate the proposed method on multiple cross-scanner 68Ga-DOTATATE PET liver NET image datasets. The experiments show that our method significantly improves lesion detection performance compared with the baseline and outperforms recent state-of-the-art domain generalization approaches.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43904-9_12

SharedIt: https://rdcu.be/dnwGP

Link to the code repository

https://github.com/xyang258/livdetSDG

Link to the dataset(s)

N/A


Reviews

Review #2

  • Please describe the contribution of the paper

    This paper aims to generalize lesion detection tasks for PET images acquired from different scanners.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper aims to generalize lesion detection tasks for PET images acquired from different scanners by implementing the following strategies: 1) a novel data augmentation technique using multi-scale random convolutions to produce diverse textured PET images, covering various image representations from different scanners; 2) a cross-domain consistency loss for enhancing generalization; 3) a domain classification loss employing patch gradient reversal to achieve domain feature representation invariance with respect to local texture.
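    To make strategy 1 concrete, the multi-scale random-convolution idea can be illustrated with a minimal sketch, assuming a PyTorch pipeline and 3D PET volumes; the function name, kernel sizes, and weight scaling below are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn.functional as F

def random_conv_augment(x, kernel_sizes=(1, 3, 5, 7)):
    # x: (N, C, D, H, W) batch of PET volumes; 2D slices work analogously with conv2d.
    k = int(kernel_sizes[torch.randint(len(kernel_sizes), (1,)).item()])
    c = x.shape[1]
    # Filter weights are re-drawn on every call, so each batch sees a fresh
    # random "texture domain" while the underlying structure is preserved.
    # The divisor is a rough variance-preserving scaling.
    weight = torch.randn(c, c, k, k, k, device=x.device) / (c * k ** 3) ** 0.5
    return F.conv3d(x, weight, padding=k // 2)
```

    Sampling the kernel size per call provides the multi-scale behavior; blending the output with the original image is a common variant.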

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    I did not find any significant weaknesses.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    It is fine

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    The paper is well-written, logically structured, and enjoyable to read. I have a few minor comments:
    1) It would be beneficial to discuss any limitations associated with the proposed method.
    2) While additional experiments may not be necessary, I am curious about the model’s performance when trained on dataset 1 and tested on dataset 2, in order to support the claim of generalization between the two datasets.
    3) Adding some transformer-based results to the experiments would make the evaluation more compelling.
    4) Please define “gGR”.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    7

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is commendable, with substantial contributions, and it is logically structured and well written.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The authors aim to train a lesion segmentation model from synthesized PET liver images with neuroendocrine tumors (lesions are generated in images of healthy subjects, so no manual annotations are required). To deal with the domain shift between the synthesized data and the real images, they propose a single-source domain generalization method based on a UNet architecture. The main components of the method are: (a) a data augmentation module used to generate out-of-domain samples, with the model constrained by a cross-domain consistency loss on the feature encodings of the synthesized and augmented data; (b) a patch-based gradient reversal mechanism (a patch-based version of domain-adversarial training [4]) used to encourage domain-invariant feature representation learning, so that the network generalizes to unseen domains. The method is evaluated on cross-scanner PET liver NET image datasets and compared with the state of the art.
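    As a reading aid for component (a), a minimal sketch of a cross-domain consistency term is given below, assuming a PyTorch encoder and an L2 penalty between feature encodings; the exact loss form used by the authors may differ.

```python
import torch.nn.functional as F

def cross_domain_consistency(encoder, x_syn, x_aug):
    # Encode the synthesized image and its texture-augmented counterpart,
    # then penalize differences between the two feature encodings so the
    # encoder becomes insensitive to the augmentation-induced domain shift.
    f_syn = encoder(x_syn)
    f_aug = encoder(x_aug)
    return F.mse_loss(f_aug, f_syn)
```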

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    To my knowledge, the method is rather novel, combining data augmentation, cross-domain consistency and patch-based gradient reversal for domain invariance. The ablation study and the comparison with previous works show promising results.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Some parts of the method should be clarified, some better motivated. See detailed comments below.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    I think the code and data are not provided. The description is not sufficient for reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. It could be interesting to compare the model trained on the synthesized data with a model trained on one of the real datasets.

    2. I think the statistical test is not correct. It is not clear whether different splits are used for the 5 runs used to evaluate the variation. But regardless of this, I think the independence assumption is violated and the test should be corrected; see [1] and the sketch after this list. [1] Nadeau, Claude, and Yoshua Bengio. “Inference for the generalization error.” Advances in Neural Information Processing Systems 12 (1999).

    3. “we randomly split the list mode-synthesized dataset and the Real1 dataset into 60%, 20% and 20% for training, validation and testing, respectively. Due to the relatively small size of Real2, we use a two-fold cross-validation for model evaluation on this dataset.” I thought the unseen domains would be kept for testing; I do not understand the splits.

    4. “lesion detection model H = E + D”, it is not correct to represent it as a sum.

    5. Why is SDG more realistic than MDG? A model can likely be trained with data from multiple domains.

    6. Please clarify and motivate the foreground/background and image intensity inversion of x_M. Including “the voxels that have a distance greater than half of the image…”.

    7. With L_{con}, you try to make the model invariant to local textures and other variations in the augmented images. I think this lacks motivation: requiring that the augmented images have identical semantic content (such as lesion presence, quantity and position) does not necessarily mean the features should be invariant to the augmentations.

    8. Eq. 5: maybe it would be cleaner to also have a lambda for L_{det}, with the lambdas summing to one or something similar.

    9. “with a threshold (i.e., 0) to binarize the map followed by a connected component analysis.” This is not clear. Is the prediction map \hat{y}? It is continuous, so I would expect a threshold of e.g. 0.5. And what is the connected component analysis? See the sketch after this list.

    10. Replace SGD by SDG in multiple places.
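    Regarding point 2, a minimal sketch of the corrected resampled t-test of Nadeau and Bengio [1] is given below, assuming n paired performance differences obtained from n random train/test resamplings; variable names are illustrative.

```python
import numpy as np
from scipy import stats

def corrected_resampled_ttest(diffs, n_train, n_test):
    # diffs: per-split performance differences between two models,
    # one value for each of the n resampled train/test splits.
    diffs = np.asarray(diffs, dtype=float)
    n = len(diffs)
    # Replace the naive 1/n variance factor with 1/n + n_test/n_train
    # to account for the overlap between resampled training sets.
    denom = np.sqrt(diffs.var(ddof=1) * (1.0 / n + n_test / n_train))
    t_stat = diffs.mean() / denom
    p_value = 2.0 * stats.t.sf(abs(t_stat), df=n - 1)
    return t_stat, p_value
```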
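    Regarding point 9, "connected component analysis" usually means grouping neighboring foreground voxels of the binarized map into candidate lesions. A minimal sketch, assuming the prediction map is logit-valued (so a threshold of 0 corresponds to a probability of 0.5), is:

```python
import numpy as np
from scipy import ndimage

def extract_detections(pred_map, threshold=0.0):
    # Binarize the prediction map; for a logit-valued map, 0 corresponds to p = 0.5.
    binary = pred_map > threshold
    # Group neighboring foreground voxels into labeled components,
    # each component being one detected lesion candidate.
    labeled, num_components = ndimage.label(binary)
    centroids = ndimage.center_of_mass(binary, labeled,
                                       index=range(1, num_components + 1))
    return labeled, centroids
```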

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    It is a good method with promising results, yet the paper requires multiple fixes to clarify the method, better motivate some choices, correct the statistical tests, etc.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #4

  • Please describe the contribution of the paper

    This paper presents a UNet-like framework, trained on simulated single-source data, for hepatic lesion detection in two additional clinical datasets. Existing random convolution operators and GAN-based discriminators are adopted to improve the generalization performance.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. This study adopts both simulated and clinical data for testing.
    2. The experiments and validations are relatively complete.
    3. Additional ablation studies improve the credibility of the proposed method.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The technical novelty is relatively limited, given the widely used UNet-like structure and the existing random convolution layer.
    2. The paper arrangement is somewhat problematic: the important qualitative image results should be placed in the main text instead of the supplementary material, since the comparisons in the main text are all quantitative.
    3. The descriptions of the methodology are unclear and obscure in some parts.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    I can’t tell. In the submission system, their answer to “releasing training code/model” is “Yes”, but the manuscript does not state that the source code/model will be released online.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. What is the logical reasoning behind this sentence: “… which introduces an additional layer of algorithm variability, and this will be particularly challenging for PET images that typically have a poor signal-to-noise ratio and low spatial resolution …”?
    2. The qualitative comparison figures should be put in the main text instead of SI. Instead, the less important Fig. 1 (and even Fig. 3) can be put in SI.
    3. Please correct grammatical errors (if any) in this sentence “… Instead of back propagating reversed gradient from a single-value domain-label prediction of the entire input image, we …”
    4. What is the meaning of multiplying the gradients of the classifier C by -1? Is this the GAN-like adversarial discriminator?
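    For context on point 4: multiplying the classifier gradients by -1 is the gradient reversal layer from domain-adversarial training (Ganin & Lempitsky), which plays a role similar to an adversarial discriminator but without a separate generator objective. A minimal PyTorch sketch of the general mechanism (not necessarily the authors' exact implementation) is:

```python
import torch

class GradReverse(torch.autograd.Function):
    # Identity in the forward pass; scales gradients by -lambda in the backward pass.
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # The reversed gradient pushes the encoder toward features that the
        # domain classifier cannot tell apart, i.e., domain-invariant features.
        return -ctx.lam * grad_output, None

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)
```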
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    1. The qualitative comparison figures should be put in the main text instead of SI. Instead, the less important Fig. 1 (and even Fig. 3) can be put in SI.
    2. What is the meaning of multiplying the gradients of the classifier C by -1? Is this the GAN-like adversarial discriminator?
    3. What is the logical reasoning behind this sentence: “… which introduces an additional layer of algorithm variability, and this will be particularly challenging for PET images that typically have a poor signal-to-noise ratio and low spatial resolution …”?
    4. Please correct grammatical errors (if any) in this sentence “… Instead of back propagating reversed gradient from a single-value domain-label prediction of the entire input image, we …”
  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper proposes a novel single-source domain generalisation approach for lesion detection in real PET images. Reviewers appreciated the main strengths and acknowledged the technical novelty of the paper, the effectiveness of the proposed modules demonstrated by the ablation studies, and the promising experimental results shown in the comparisons. However, reviewers also raised some writing issues, such as the unclear motivation of some proposed modules and some minor grammar errors. These issues should be addressed in the final version.




Author Feedback

We appreciate the valuable comments from the area chair and the reviewers. We will revise the manuscript based on the comments and suggestions when preparing the camera-ready version.


