
Authors

Cosmin I. Bercea, Benedikt Wiestler, Daniel Rueckert, Julia A. Schnabel

Abstract

Early and accurate disease detection is crucial for patient management and successful treatment outcomes. However, the automatic identification of anomalies in medical images can be challenging. Conventional methods rely on large labeled datasets which are difficult to obtain. To overcome these limitations, we introduce a novel unsupervised approach, called PHANES (Pseudo Healthy generative networks for ANomaly Segmentation). Our method has the capability of reversing anomalies, i.e., preserving healthy tissue and replacing anomalous regions with pseudo-healthy (PH) reconstructions. Unlike recent diffusion models, our method does not rely on a learned noise distribution nor does it introduce random alterations to the entire image. Instead, we use latent generative networks to create masks around possible anomalies, which are refined using inpainting generative networks. We demonstrate the effectiveness of PHANES in detecting stroke lesions in T1w brain MRI datasets and show significant improvements over state-of-the-art (SOTA) methods. We believe that our proposed framework will open new avenues for interpretable, fast, and accurate anomaly segmentation with the potential to support various clinical-oriented downstream tasks.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43904-9_29

SharedIt: https://rdcu.be/dnwG7

Link to the code repository

https://github.com/ci-ber/PHANES

Link to the dataset(s)

https://brain-development.org/ixi-dataset/

https://fcon_1000.projects.nitrc.org/indi/retro/atlas.html

https://fastmri.org


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper presents an unsupervised anomaly detection (UAD) algorithm that builds on recent works leveraging simple autoencoder architectures to enhance learning of the latent representation space. The proposed architecture is a two-step process. First, a simple autoencoder is trained in a standard way on normal data and used to produce anomaly maps on patient images. Rough binary lesion maps are then generated by applying a low threshold to these maps, so that they encompass both the true anomaly and some false detections. Next, a pre-trained inpainting network is applied to the area masked by the binary lesion map to reconstruct pseudo-healthy (PH) tissue. Finally, the lesion map is obtained by subtracting this pseudo-healthy image from the original one. The model is evaluated first on synthetic lesion data and then on a stroke lesion dataset to assess anomaly detection performance. Comparison with the state of the art demonstrates on-par performance.
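The pipeline summarized in this review (residual thresholding, inpainting of the masked area, subtraction) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the threshold value, and `mean_fill_inpaint` (a trivial stand-in for the pre-trained inpainting network) are all assumptions made for illustration.

```python
import numpy as np

def coarse_mask(image, reconstruction, threshold):
    """Binary mask of pixels whose autoencoder residual exceeds a low threshold."""
    return np.abs(image - reconstruction) > threshold

def mean_fill_inpaint(image, mask):
    """Hypothetical stand-in for an inpainting network:
    fills masked pixels with the mean of the unmasked pixels."""
    filled = image.copy()
    filled[mask] = image[~mask].mean()
    return filled

def pseudo_healthy(image, mask, inpaint_fn):
    """Keep the original pixels outside the mask, use inpainted pixels inside it."""
    inpainted = inpaint_fn(image, mask)
    return np.where(mask, inpainted, image)

def anomaly_map(image, pseudo_healthy_image):
    """Final anomaly map: absolute difference between input and PH image."""
    return np.abs(image - pseudo_healthy_image)

# Toy example: a uniform "healthy" image with a bright 2x2 "lesion".
image = np.full((8, 8), 0.5)
image[2:4, 2:4] = 1.0
reconstruction = np.full((8, 8), 0.5)  # assumed autoencoder output

mask = coarse_mask(image, reconstruction, threshold=0.1)
ph = pseudo_healthy(image, mask, mean_fill_inpaint)
residual = anomaly_map(image, ph)
```

In this toy case the residual is non-zero only inside the lesion, because pixels outside the mask are copied unchanged from the input.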

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    - UAD is a timely topic in the community. The paper is an interesting contribution attempting to bridge the gap between simple auto-encoders, which perform poorly and generate a high false detection rate, and more complex architectures that enable perfect reconstruction of both normal and pathological tissues and thus lack sensitivity at inference.
    - The paper is well written.
    - It includes a comparison with SOTA models.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    - The authors evaluate the performance of their model on stroke lesion segmentation tasks. As shown in Figure 4, the dataset includes lesions of variable size, thus mixing a segmentation task on big lesions (e.g., the first 3 rows of Fig. 4) and a detection task on small lesions. As stated in the literature (reference [15] of this paper), UAD architectures may not be useful for segmenting major lesions that can be retrieved by not-so-complex thresholding strategies. I would thus recommend that the authors report separate detection performance (e.g., AUPRC) for “small” and “big” lesions. This would significantly improve the technical soundness of the study.
    - The paper takes advantage of another model proposed by the authors whose reference is withheld. This model is shown to provide competitive performance, e.g., in Tables 1 and 2. Numerous citations to this model (reference [23]) without any methodological details impair the evaluation of the soundness and novelty of the proposed contribution.
    - As stated above, the comparison with selected SOTA models is a positive point of the study. However, the choice of these models should be justified; a short description should be provided, as well as some implementation details. Also, the authors should consider including a comparison with Latent Transformer Models (LTM) based on VQ-VAE or VQ-GAN models, which were recently shown to perform well for detection tasks on the WMH dataset ((Pinaya et al, MEDIA 22), (Pinon et al, MIDL 23)). At the least, these references should be included and discussed.
    - The same comment regarding insufficient methodological details applies to the deep inpainting model, which constitutes the main novel methodological contribution of the proposed pipeline.
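The size-stratified evaluation suggested in this review could be computed along the following lines. This is only a sketch under stated assumptions: the `size_threshold` separating “small” from “big” lesions is a hypothetical parameter, the function names are illustrative, and AUPRC is approximated by a plain NumPy average-precision estimate that does not treat tied scores specially.

```python
import numpy as np

def average_precision(labels, scores):
    """Average precision: mean of the precision values at the rank of each positive."""
    labels = np.asarray(labels).ravel().astype(bool)
    scores = np.asarray(scores, dtype=float).ravel()
    n_pos = labels.sum()
    if n_pos == 0:
        return float("nan")  # undefined without positive voxels
    order = np.argsort(-scores, kind="stable")  # rank by descending score
    hits = labels[order]
    precision_at_k = np.cumsum(hits) / np.arange(1, hits.size + 1)
    return float(precision_at_k[hits].sum() / n_pos)

def auprc_by_lesion_size(masks, score_maps, size_threshold):
    """Pool voxels per group ("small" vs "big" lesions) and compute AP per group.

    size_threshold is a hypothetical voxel-count cut-off chosen by the user.
    """
    groups = {"small": ([], []), "big": ([], [])}
    for mask, score in zip(masks, score_maps):
        mask = np.asarray(mask).astype(bool)
        key = "small" if mask.sum() < size_threshold else "big"
        groups[key][0].append(mask.ravel())
        groups[key][1].append(np.asarray(score, dtype=float).ravel())
    results = {}
    for key, (group_masks, group_scores) in groups.items():
        if group_masks:
            results[key] = average_precision(
                np.concatenate(group_masks), np.concatenate(group_scores)
            )
        else:
            results[key] = float("nan")  # no case fell into this group
    return results
```

Pooling voxels within each group mirrors a dataset-level AUPRC; a per-case average would be an equally defensible choice and would weight small lesions more heavily.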

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors do not provide their code as well as any link to repositories of the SOTA models of their comparative analysis. No information provided on the hyperparameter setting. Reproducibility is thus limited.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Please consider addressing the points listed in section 6.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Interesting topic but the paper lacks methodological details as well as a more detailed analysis of the results.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The paper proposes a novel unsupervised approach called PHANES for detecting anomalies in medical images. The method uses latent generative networks to create masks around possible anomalies, which are refined using inpainting generative networks to produce pseudo-healthy (PH) reconstructions. The effectiveness of PHANES is demonstrated in detecting stroke lesions in T1w brain MRI datasets and shows significant improvements over state-of-the-art methods. The proposed framework has the potential to support various clinical-oriented downstream tasks.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Novel approach: The paper proposes a novel unsupervised approach called PHANES for detecting anomalies in medical images. The method uses latent generative networks to create masks around possible anomalies, which are refined using inpainting generative networks to produce pseudo-healthy (PH) reconstructions. This approach is unique and has not been explored before in the context of anomaly segmentation.
    2. Clinical feasibility: The proposed framework has the potential to support various clinical-oriented downstream tasks. The ability to generate pseudo-healthy versions of images containing anomalies can be a useful tool in supporting clinical studies.
    3. Strong evaluation: The effectiveness of PHANES is demonstrated in detecting stroke lesions in T1w brain MRI datasets and shows significant improvements over state-of-the-art methods. The paper also provides a comprehensive evaluation of the proposed method, including comparisons with several baselines and visualization of the results.
    4. Precise localization: PHANES demonstrates more precise localization, especially for subtle anomalies, compared to diffusion models, which tend to yield more false positives.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Limited evaluation: The evaluation of the proposed method is limited to stroke lesion segmentation in T1w brain MRI datasets. While the results are promising, the generalizability of the method to other types of anomalies and medical imaging modalities is not demonstrated.
    2. Lack of comparison with other state-of-the-art methods: The paper compares the proposed method with a limited set of baselines, and some of the state-of-the-art methods in anomaly segmentation are not included in the comparison. For example, the paper does not compare the proposed method with deep learning-based methods such as Deep SVDD and Deep Autoencoder-based Anomaly Detection, which have shown promising results in anomaly segmentation.
    3. Lack of discussion on the limitations of the proposed method: The paper does not discuss the limitations of the proposed method, such as the sensitivity of the method to the choice of hyperparameters and the impact of the size and quality of the training dataset on the performance of the method.
    4. Lack of novelty in some aspects: While the proposed method is novel in its approach to anomaly segmentation, some aspects of the method, such as the use of generative networks for image inpainting and the use of diffusion models for anomaly detection, have been explored in prior work.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Despite giving no information on the availability of code and data, the paper provides sufficient details to facilitate the replication of the experiments and the evaluation of the proposed method.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Please see my comments for strengths and weaknesses.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper’s contribution lies in presenting a new approach for anomaly segmentation in medical images that can potentially improve clinical applications. The proposed method is novel, clinically feasible, and shows strong performance in detecting anomalies in medical images. My only concern is the interpretability of the generated data in medical images, as we cannot easily determine whether the generated data is indeed a surrogate for real data in practice.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The authors propose an unsupervised anomaly segmentation framework called PHANES for reversing anomalies in brain images by preserving healthy tissues and replacing abnormal regions with pseudo-healthy reconstructions. The authors compared their approach with different state-of-the-art methods and validated their models on stroke lesions in brain T1w MRIs.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Unsupervised: PHANES can replace abnormal regions with pseudo-healthy reconstructions and requires no expert annotations. Comparison with the state of the art: PHANES was compared with various state-of-the-art methods and demonstrated superior performance in stroke lesion segmentation in terms of AUPRC and Dice score.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Limited testing sample: the proposed method was tested on a limited sample of only 30 images.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Models were developed on public datasets to ensure reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Explore the performance of other anatomies and imaging modalities: Evaluate PHANES on additional anatomies and imaging modalities, such as healthy liver/kidney/spleen CT and MRI data from the CHAOS dataset and liver tumors from the LITS dataset or something in a similar setting.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The authors’ comparison of their approach with different state-of-the-art methods demonstrates the effectiveness of PHANES in stroke lesion segmentation. PHANES is an unsupervised approach. Its performance is superior to that of state-of-the-art methods, and its reproducibility is in good shape. Although the proposed method was tested on a limited sample of only 30 images, the authors have presented a strong case for future work to address this limitation.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The performance comparison should be more comprehensive, including detection performance (e.g., AUPRC) for “small” and “big” lesions. More details on the benchmarking algorithms should be given. Some state-of-the-art approaches are not included in the comparison. Discussions of existing methods, limitations, and future work are missing.




Author Feedback

We are thankful for the reviewers’ positive feedback on our manuscript and their recognition of our work as interesting, novel, and extensively evaluated on both synthetic (N=30) and stroke (N=655) anomalies. We are grateful for their recognition of our superior performance compared to state-of-the-art (SOTA) methods in our comparative analysis, as well as our method’s ability to reverse abnormalities. Our approach preserves healthy tissues while effectively in-painting anomalous regions with pseudo-healthy reconstructions.

In our manuscript, we have made careful selections, including popular methods like VAE, as well as SOTA methods in medical imaging such as DAE, and the current leading approach in the related industrial defect inspection field (PatchCore). Furthermore, we have incorporated recent reconstruction-based methods such as SI-VAE, knowledge-distillation methods (MKD), and the most recent diffusion probabilistic models (AnoDDPM). Additional methods suggested by reviewers were not included due to space constraints but will be valuable additions to extensions of this work.

We are making our code available to the community at the time of publication, as an earlier date could have compromised the anonymous submission of a separate, synergistic, yet quite distinct work.

Once again, we sincerely thank the reviewers for their insightful feedback and valuable suggestions. We will incorporate these comments into our revised manuscript to the best of our abilities.


