Authors

Yinghao Zhang, Donghuan Lu, Munan Ning, Liansheng Wang, Dong Wei, Yefeng Zheng

Abstract

The recent success of deep learning relies heavily on large amounts of labeled data. However, acquiring manually annotated symptomatic medical images is notoriously time-consuming and laborious, especially for rare or new diseases. In contrast, normal images from symptom-free healthy subjects, which require no manual annotation, are much easier to acquire. In this regard, deep learning-based anomaly detection approaches using only normal images are actively studied, achieving significantly better performance than conventional methods. Nevertheless, previous works were committed to developing a specific network for each organ and modality separately, ignoring the intrinsic similarity among images within the medical field. In this paper, we propose a model-agnostic framework to detect the abnormalities of various organs and modalities with a single network. By imposing organ and modality classification constraints along with a center constraint on the disentangled latent representation, the proposed framework not only improves the generalization ability of the network towards the simultaneous detection of anomalous images with various organs and modalities, but also boosts the performance on each single organ and modality. Extensive experiments with four different baseline models on three public datasets demonstrate the superiority of the proposed framework as well as the effectiveness of each component. The code will be released after the anonymous review.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43898-1_23

SharedIt: https://rdcu.be/dnwAU

Link to the code repository

https://github.com/lianjizhe/MADDR_code

Link to the dataset(s)

N/A

Reviews

Review #1

Please describe the contribution of the paper

The paper introduces a methodology for training a system that detects abnormal images by combining multiple modalities and organs. The main idea is to use a common encoder for multiple image modalities and organs instead of training separate systems for each modality and organ. The proposed common encoder produces a feature vector partitioned into 3 subsets of features: 1) a modality-focused subset, 2) an organ-focused subset, and 3) a subject-specific subset. The system is trained on normal images from the multiple organs and multiple modalities with a combined multi-objective loss, namely the original loss of the model being adapted to the framework, a categorical loss for the modality, a categorical loss for the organ type, and a center constraint on the subject-specific codes. The work shows that combining images from multiple modalities and sources (organs) in the training of a single model benefits the accuracy of anomaly detection.
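
The combined objective summarized above can be sketched as follows. This is only an illustrative NumPy sketch, not the authors' implementation: the function names, the linear classification heads `W_o` and `W_m`, the use of softmax to obtain class probabilities, the squared-distance form of the center constraint, and the loss weights are all assumptions made for illustration.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    # Mean negative log-probability of the true class.
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked + 1e-12).mean())

def split_latent(z, d_organ, d_modality):
    # Partition each latent vector into organ, modality and subject-specific parts.
    z_o = z[:, :d_organ]
    z_m = z[:, d_organ:d_organ + d_modality]
    z_c = z[:, d_organ + d_modality:]
    return z_o, z_m, z_c

def combined_loss(z, organ_labels, modality_labels, W_o, W_m, center,
                  base_loss, d_organ, d_modality, weights=(1.0, 1.0, 1.0)):
    # base_loss stands in for the original loss of the baseline model.
    z_o, z_m, z_c = split_latent(z, d_organ, d_modality)
    # Assumed: a linear head followed by softmax gives class probabilities.
    loss_o = cross_entropy(softmax(z_o @ W_o), organ_labels)
    loss_m = cross_entropy(softmax(z_m @ W_m), modality_labels)
    # Assumed: the center constraint is the mean squared distance to a center.
    loss_c = float(((z_c - center) ** 2).sum(axis=1).mean())
    w_o, w_m, w_c = weights
    total = base_loss + w_o * loss_o + w_m * loss_m + w_c * loss_c
    return total, (loss_o, loss_m, loss_c)

# Toy usage with random values (hypothetical sizes: 3 organs, 2 modalities).
rng = np.random.default_rng(0)
z = rng.normal(size=(4, 12))
W_o = rng.normal(size=(4, 3))
W_m = rng.normal(size=(4, 2))
organ_labels = rng.integers(0, 3, size=4)
modality_labels = rng.integers(0, 2, size=4)
center = split_latent(z, 4, 4)[2].mean(axis=0)
total, parts = combined_loss(z, organ_labels, modality_labels, W_o, W_m,
                             center, base_loss=0.5, d_organ=4, d_modality=4)
print(total, parts)
```

In an actual training setup the encoder, the heads, and possibly the center would be learned jointly; the sketch only shows how the four terms might be combined into one scalar.
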
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

Proposal: The idea of combining multiple organs and modalities is intriguing, and the conducted experiments show some improvement. Simplicity: the example shows the framework is simple to implement and should be at least as good as the baseline. Good presentation: The paper structure is easy to follow.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

It would be nice to include other baselines with more modalities and organs. It would be interesting to see how this work would leverage other organs, such as breast (mammography) or heart, for which data are also available. Another issue is the type of image preprocessing, which is standard for adapting commonly used architectures in object detection and recognition but may not be the most appropriate for medical images. Reducing the size to 256 x 256 may hide anomalies that are small relative to the size of the entire image. It would be nice to elaborate on these points.
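
The resizing concern can be made concrete with a little arithmetic. The sketch below assumes a square image and uniform rescaling; the original side length of 2048 pixels and the 16-pixel lesion are hypothetical values chosen for illustration and do not come from the paper.

```python
def resized_extent(lesion_px, original_side, target_side=256):
    # Side length (in pixels) of a lesion after uniformly rescaling a square
    # image from original_side to target_side.
    return lesion_px * target_side / original_side

# Hypothetical example: a 16-pixel-wide lesion in a 2048 x 2048 image.
print(resized_extent(16, 2048))
```

Under these assumptions the lesion shrinks by a factor of 8 per side, so it would cover only a handful of pixels in the 256 x 256 input.
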
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Enough detail seems to be provided to replicate the experiments. The models are based on existing baselines with simple enough modifications. The only parts missing from the main paper are the hyperparameters and the type of optimizer.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

It was not clear whether the perceptual loss is particular to the DPA method or whether it appears across all other methods. It would be good to make the distinction so that it becomes clear whether there is a difference between L_rec at the end of page 3 and L_b in equation (1). Very minor: some misspellings can be found throughout the document, such as “regrading” which should read “regarding.”
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Clear explanation and simplicity of the approach
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #3

Please describe the contribution of the paper

The main contribution of the paper is a model-agnostic framework for detecting anomalies in multiple organs and modalities simultaneously. It is achieved by dividing the latent representation into three parts that represent the organ, modality, and individual information, respectively.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. The proposed approach provides a novel way that improves the generalization ability of models for detecting anomalies across multiple organs and modalities simultaneously. This saves considerable time and effort as there is no need to retrain the model for various organs and modalities. Additionally, the fact that the proposed approach does not rely on a specific model further demonstrates its strong generalizability.
2. The experiments are solid and well-designed. The results indicate that models trained using the proposed framework outperform the baseline models.
3. The paper is well-organized and easy to follow.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

My primary concern is that the mathematical part in the method section is insufficiently rigorous and detailed. For example, I didn’t find the mathematical definitions of the three parts of the latent representation (z_o, z_m, and z_c), and it is also unclear to me how the categorical information is converted into probabilities.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

I consider the paper reproducible, as the authors declare that the code will be released after the anonymous review.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
1. First of all, I would suggest adding mathematical definitions of z_o, z_m and z_c, and a detailed explanation of how the probability distributions P^k(z_o^i), P^l(z_m^i) are computed. Otherwise, it would be hard to fully understand the entire framework.
2. Some notations in Fig.1 seem to be inconsistent with those in the main text. It seems to me Z_c^1 should be z_o, and Z_c^2 -> z_m and Z_s -> z_c?
3. In equation (1), there are L_o(X, Y) and L_m(X, Y), where Y is described as the label. However, it seems that Y may refer to different types of labels in the two terms. In L_o, Y refers to the organ class label, while in L_m, Y refers to the modality class label, as indicated in equations (2) and (3). Further explanation to clarify this difference would be beneficial.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

6
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

This paper has strong contributions, and the evaluation indicates significantly better performance compared with the baselines. Therefore, I believe it should be accepted by the conference.
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #2

Please describe the contribution of the paper

This study proposed a model-agnostic deep representation disentanglement (MADDR) framework for anomaly detection in medical images, aiming to train a generic network by combining normal images of various organs and modalities. The idea is simple yet appears to be effective when evaluated on three benchmark datasets and MADDR showed consistent improvements in all metrics compared to state-of-the-art methods.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

1) The proposed framework is well-motivated, with a generic network combining healthy images of various organs and modalities by imposing organ/modality classification constraints and center constraints on the disentangled latent representation.

2) Although the idea is simple and straightforward, the proposed MADDR showed consistent improvements when combined with baseline anomaly detection methods on three benchmark datasets.

3) The visualization of the encoded latent representation showed improved segregation of two classes.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

1) The technical novelty of the proposed method is incremental. That being said, the hyperparameter analysis of the loss weights presented in the supplementary should be moved to the main manuscript, as it is an important piece to better understand the proposed framework.

2) The method section is lengthy and contains redundant information. It could be shortened for clarity and leave more space for result analysis.

3) It would be helpful to include statistical analysis to verify that the improvements with respect to baseline methods are significant.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The description is clear and the method seems reproducible.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

Please refer to the weakness above.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

6
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

In general a good paper trying to solve an important problem. While the technical novelty is limited, the proposed method showed impressive performance on multiple cohorts compared to other SOTAs.
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

The idea of leveraging multi-organ and multi-modal images for universal anomaly detection is interesting. The method itself is simple but is shown to be very effective, and it should be easy to reproduce. Clarifications on the technical and methodological parts of the method are suggested for inclusion in the final submission.

Author Feedback

N/A


A Model-Agnostic Framework for Universal Anomaly Detection of Multi-Organ and Multi-Modal Images