
Authors

Ke Yu, Shantanu Ghosh, Zhexiong Liu, Christopher Deible, Kayhan Batmanghelich

Abstract

Creating a large-scale dataset of abnormality annotations on medical images is a labor-intensive and costly task. Leveraging weak supervision from readily available data such as radiology reports can compensate for the lack of large-scale data for anomaly detection methods. However, most current methods use only image-level pathological observations, failing to utilize the relevant anatomy mentions in reports. Furthermore, Natural Language Processing (NLP)-mined weak labels are noisy due to label sparsity and linguistic ambiguity. We propose an Anatomy-Guided chest X-ray Network (AGXNet) to address these issues of weak annotation. Our framework consists of a cascade of two networks, one responsible for identifying anatomical abnormalities and the second responsible for pathological observations. The critical component in our framework is an anatomy-guided attention module that aids the downstream observation network in focusing on the relevant anatomical regions generated by the anatomy network. We use Positive Unlabeled (PU) learning to account for the fact that a lack of mention does not necessarily imply a negative label. Our quantitative and qualitative results on the MIMIC-CXR dataset demonstrate the effectiveness of AGXNet in disease and anatomical abnormality localization. Experiments on the NIH Chest X-ray dataset show that the learned feature representations are transferable and outperform the baselines in classification and localization tasks.

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16443-9_63

SharedIt: https://rdcu.be/cVRzh

Link to the code repository

https://github.com/batmanlab/AGXNet

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper proposed an anatomy-guided weakly supervised abnormality localization model for chest x-rays. Specifically, the model consists of a cascade of two networks, one for identifying anatomical abnormalities and the other for pathological observations. The model also utilizes an anatomy-guided attention module to guide the observation network to focus on the relevant anatomical regions generated by the anatomy network. Experimental results on two large public chest x-ray datasets show superior performance against other methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The topic of adapting positive-unlabeled learning to chest x-rays is interesting, probably making the contributions useful for medical communities.

    The results appear strong (although some comparisons are missing).

    The paper is well-written.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The novelty is limited. Utilizing reports to aid abnormality localization in chest x-rays has been popular in this area.

    The baselines are neither complete nor state-of-the-art; in particular, some existing works that also utilize medical reports are not considered for comparison.

    Ablation experiments are not complete. For example, this model contains a few hyper-parameters, but many were not discussed in the experiments.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Can be reproduced.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    Please refer to the main weaknesses.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The experiments are comprehensive and the results are strong. But the idea of using reports to aid disease localization in chest x-rays is not novel.

  • Number of papers in your stack

    6

  • What is the ranking of this paper in your review stack?

    5

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered



Review #2

  • Please describe the contribution of the paper

    The authors proposed a weakly supervised disease localization approach for chest X-rays by integrating the classification of anatomical mentions (also extracted from the radiology reports) in addition to the disease/observation mentions. The CAM maps from the anatomy classification branch are further utilized as weight masks of refined regions for the disease classifications. Two large-scale public datasets, i.e., MIMIC-CXR and ChestX-ray8, are employed for the experiments. Radiologists also annotated bounding boxes of diseases in the MIMIC-CXR dataset. Superior results of the proposed framework (in both classification and localization) are reported compared to prior art (e.g., CAM-based localization).
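The CAM-as-weight-mask mechanism described in this summary can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the array shapes, the normalization, and the way the blending weight `beta` enters are all assumptions.

```python
import numpy as np

def anatomy_guided_attention(f_o, cam_a, beta=0.1):
    """Weight an observation feature map by an anatomy CAM (illustrative sketch).

    f_o   : observation feature map, shape (C, H, W)
    cam_a : anatomy class activation map, shape (H, W)
    beta  : residual blending weight (hypothetical formulation)
    """
    # Normalize the CAM to [0, 1] so it acts as a soft spatial mask.
    cam = (cam_a - cam_a.min()) / (cam_a.max() - cam_a.min() + 1e-8)
    # Blend masked features with the originals so that regions outside
    # the anatomy CAM are down-weighted rather than zeroed out.
    return (1.0 - beta) * f_o * cam + beta * f_o

# Tiny usage example with random tensors.
rng = np.random.default_rng(0)
f_o = rng.standard_normal((8, 4, 4))   # 8 channels, 4x4 spatial grid
cam = rng.random((4, 4))
out = anatomy_guided_attention(f_o, cam)
assert out.shape == (8, 4, 4)
```

The residual term `beta * f_o` is one plausible way to keep gradient flow to regions the anatomy branch misses; the paper's actual AGA module may combine the maps differently.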

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The paper is overall well-prepared and easy to follow
    • The authors explicitly model the anatomical mentions in the report for localization and classification purposes
    • Two large-scale datasets are utilized to demonstrate the effectiveness of the proposed AGX module
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    I have a couple of concerns and suggestions as follows:

    1. I found that a directly related reference, [R1], is missing; it tackles the same problem and also extracts the attributes of disease (including anatomical locations) from the report as a form of supervision.
    2. The comparison to prior art in weakly supervised localization is relatively weak: only one previous work, based on vanilla CAM, is included. Other methods, e.g., [R2] (another missing reference), should be included and compared. Also, localization results for other IoU values should be included, at least in the supplementary material; results at IoU 0.5 are especially important.
    3. The PU module for uncertainty learning seems to be effective only in some scenarios, e.g., yielding better results for pneumonia but worse results for pneumothorax on MIMIC-CXR. It would be helpful if the authors could discuss this further.
    4. There are public datasets with bounding boxes/segmentation masks for pneumonia and pneumothorax. It would be more convincing to adopt those datasets for the evaluation, considering the current GT set from MIMIC-CXR is a bit small (a couple of hundred vs. thousands).

    [R1] Bhalodia, R., et al.: Improving Pneumonia Localization via Cross-Attention on Medical Images and Reports. In: MICCAI 2021
    [R2] Li, Z., et al.: Thoracic Disease Identification and Localization with Limited Supervision. In: CVPR 2018

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The results seem to be reasonable and reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    see 5

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Good quality paper with a clear introduction of the method and well-organized experiments, though there are some possible improvements in the comparison study and evaluation data.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    2

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    My decision is not changed after reading the rebuttal or ignoring the supplemental material. The authors were only able to address some of my concerns (due to the page limit), and a minor revision is required regarding the experiments and other aspects. However, the manuscript is overall of good quality and above the borderline, as I stated in the previous review.



Review #3

  • Please describe the contribution of the paper

    This paper proposed the Anatomy-Guided chest X-ray Network (AGXNet), which consists of two networks, an Anatomy Network and an Observation Network, to make use of both anatomy mentions in reports and image-level labels. An anatomy-guided attention (AGA) module was then adopted to bridge these two networks. The proposed method achieves competitive results on two public datasets.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The authors incorporated anatomy mentions into Weakly-Supervised Abnormality Localization in Chest X-rays through an anatomy-guided attention (AGA) module.
    2. Positive Unlabeled (PU) learning was used to alleviate the noise in CXR reports.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. In the experimental part, some state-of-the-art methods need to be compared in Table 1, 3 and 4.
    2. Much important information is missing from the comparative experiment section.
    3. The authors violated the guideline for supplementary submission.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The reproducibility of the paper is credible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    1. The proposed method should be compared with [23].
    2. More state-of-the-art methods need to be compared on NIH Chest X-ray and MIMIC-CXR dataset.
    3. The authors should give more details on how the comparison methods, such as RetinaNet, were trained.
    4. Which part of DenseNet-121 produces the observation feature map f_o?
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    1

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    According to paper submission guidelines, for supplementary, authors should not submit text materials beyond figure and table captions, definition of variables in equations, or detailed proof of a theorem. The authors violated this guideline.

  • Number of papers in your stack

    4

  • What is the ranking of this paper in your review stack?

    4

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    2

  • [Post rebuttal] Please justify your decision

    A submission with more content is obviously unfair to other submissions. A submission with more text provides more information and an opportunity to describe the authors’ ideas and experimental results. The MICCAI submission guidelines for supplementary material clearly specify that authors cannot submit text materials beyond figure and table captions. Although reviewers are under no obligation to review the supplementary material, it potentially leads reviewers to an unfair judgment; I believe that is why MICCAI issued the clear submission guidelines. If the authors argue that the supplementary guidelines are not reasonable, they can ask the MICCAI program committee to modify them. I modified my score from 1 to 2, given that the authors addressed most of my concerns. Still, I argue that strict reviews will benefit the fairness of the MICCAI community.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The work uses anatomy guided attention as part of a two stage network, which process anatomy and pathology data from reports. This is combined with PU Learning. The writing and evaluations seem solid. The reviewers have highlighted the inconsistent impact of PU. They also have pointed to missing state of the art and MICCAI works in the literature review. The authors should also comment on novelty as one of the reviewers has raised a concern on the topic. Note that the reviewers will have to ignore the supplementary material submitted with this paper given it goes beyond what is allowed in guidelines (although this is not ground for rejecting the paper, it does require ignoring the supplement).

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    10




Author Feedback

We thank all reviewers for their constructive feedback. Below, we list the major concerns and address each of them.

R1: Using reports to aid disease localization in chest x-rays is not novel. We make three contributions in this paper. (1) Exploiting the interaction between radiology terms: all previous methods for disease localization that utilized reports relied on the presence/absence of image-level labels (i.e., no interaction between terms), whereas our method utilizes the interaction between pathological and anatomical mentions as a more granular level of information. Incorporating anatomy is particularly useful for detecting pneumothorax, as seen in Tbl. 1, where recall increased by 8% and 6% at IoU=0.1 and 0.25, respectively, and in Fig. 2, where the localization of pneumothorax is more resilient to shortcuts. (2) Handling noise in NLP pipelines: due to the nature of the NLP pipeline and the report-writing process, significant noise is introduced into the weak labels. We employed PU learning to address that noise, which is the second novel aspect of this paper. (3) Transfer learning for higher-level tasks: since anatomy is invariant with respect to domain shift, the anatomical features extracted by our method can be used for transfer learning on higher-level tasks such as disease classification. Tbl. 3 shows that a model trained solely on anatomical features achieves classification performance comparable to the SOTA model (i.e., Rajpurkar et al.).

R2: Inconsistent impact of PU. Our results show that PU learning is clearly effective when disease labels are noisy (e.g., pneumonia). Pneumothorax’s label noise is inherently low, since it is a life-threatening condition whose presence/absence is typically well documented in radiology reports. However, this does not mean that PU learning is harmful for pneumothorax. We have added standard deviations to Tbl. 1 to show that the results of models with PU learning (e.g., recall=0.74 (0.03), precision=0.43 (0.02) at IoU=0.1) and without PU learning (e.g., recall=0.75 (0.01), precision=0.44 (0.01) at IoU=0.1) do not differ significantly.
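For context, PU learning of the kind invoked here typically replaces the standard classification risk with an estimator that treats unlabeled examples as a mixture of positives and negatives. The following is a generic non-negative PU risk sketch (in the style of Kiryo et al.), not the paper's exact loss; the class prior `pi` and the logistic surrogate are assumptions.

```python
import numpy as np

def nn_pu_risk(scores_pos, scores_unl, pi=0.3):
    """Non-negative PU risk estimate (illustrative sketch).

    scores_pos : classifier scores for labeled-positive samples
    scores_unl : classifier scores for unlabeled samples
    pi         : assumed class prior P(y = 1) (hypothetical value)
    """
    def loss(s, y):
        # Logistic (sigmoid cross-entropy) surrogate loss.
        return np.log1p(np.exp(-y * s))

    # Risk on the positive class, weighted by the class prior.
    r_pos = pi * loss(scores_pos, +1).mean()
    # Negative-class risk estimated from unlabeled data, corrected by
    # subtracting the positive contribution; clipped at zero to keep
    # the estimator non-negative.
    r_neg = loss(scores_unl, -1).mean() - pi * loss(scores_pos, -1).mean()
    return r_pos + max(r_neg, 0.0)

risk = nn_pu_risk(np.array([2.0, 3.0]), np.array([-1.0, 0.5]))
```

Unlabeled samples (missing mentions in reports) thus contribute to the loss without being forced to count as negatives, which is the property the rebuttal relies on.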

R1, 2, 3: Missing SOTA methods for comparison. (1) We have added the SOTA semi-supervised method (Ref 1) to Tbl. 3 and Tbl. 4. We want to point out that this method, as well as the other methods suggested by the reviewers (Bhalodia, R. et al. 2021; Li, Z. et al. 2018; Tam, L. et al. 2020), used box annotations for training, while our model was trained only with weak labels. Therefore, directly comparing our method’s localization performance to theirs would be unfair; instead, the SOTA methods that use annotations should be viewed as an upper bound. We have revised the experiment section to clarify our selection criteria for comparison and cited the suggested references. (2) For the MIMIC-CXR dataset, we did compare our method with the SOTA supervised baseline RetinaNet. In fact, the baselines used in our comparison are the same as those used in a closely related MICCAI work (Bhalodia, R. et al. 2021), in which RetinaNet and CAM-based methods are also used for comparison. (3) Since the NIH Chest X-ray dataset does not provide the paired reports needed to train our proposed models, we performed transfer learning on it. The main purpose of the transfer learning experiment is not to achieve SOTA results on the NIH dataset, but rather to demonstrate that our method learns disease-related and transferable features.

R1: Ablation experiments are not complete. We have performed an additional ablation study on the hyperparameter Beta used in the AGA module and showed that Beta=0.1 is the optimal value. We have revised the supplementary material to include this result.

R2: Results on IoU = 0.5 should be included. We have added results on IoU = 0.5 to Tbl. 4.

Ref 1: Han, Y. et al. “Knowledge-Augmented Contrastive Learning for Abnormality Classification and Localization in Chest X-rays with Radiomics using a Feedback Loop.” WACV 2022.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Two of the reviews and meta-reviews are positive, and they state that they are not influenced by the extra material in the supplement. I vote to accept.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    9



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Two reviewers recommended a weak acceptance. One reviewer recommended a strong reject. One of the main reasons for the rejection decision is the violation of the supplementary material guidelines; as discussed, I think this should not be the major reason for rejection.

    However, the paper still has weaknesses, such as incomplete ablation studies and comparisons with related studies in the experiments (R1, R2, R3) and missing details (R2, R3). The authors only partly addressed other points during the rebuttal, promising to improve the manuscript. Even setting aside the violation of the supplementary guideline, I think the paper has slightly more cons than pros, so I recommend rejection.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Reject

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    10



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Two reviewers gave this paper a weak accept and one reviewer gave a strong reject. However, it appears that the basis for this strong rejection is the violation of the supplementary material guidelines. Since we were instructed that this on its own is not grounds for rejection, I have to discount R3 to a large extent. Otherwise, I think the paper’s strengths slightly outweigh its weaknesses as pointed out by R1 and R2.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    8


