
Authors

Chengwei Pan, Gangming Zhao, Junjie Fang, Baolian Qi, Jiaheng Liu, Chaowei Fang, Dingwen Zhang, Jinpeng Li, Yizhou Yu

Abstract

Although deep learning algorithms have been intensively developed for computer-aided tuberculosis diagnosis (CTD), they mainly depend on carefully annotated datasets, leading to much time and resource consumption. Weakly supervised learning (WSL), which leverages coarse-grained labels to accomplish fine-grained tasks, has the potential to solve this problem. In this paper, we first propose a new large-scale tuberculosis (TB) chest X-ray dataset, namely tuberculosis chest X-ray attribute dataset (TBX-Att), and then establish an attribute-assisted weakly supervised framework to classify and localize TB by leveraging the attribute information to overcome the insufficiency of supervision in WSL scenarios. Specifically, first, the TBX-Att dataset contains 2000 X-ray images with seven kinds of attributes for TB relational reasoning, which are annotated by experienced radiologists. It also includes the public TBX11K dataset with 11200 X-ray images to facilitate weakly supervised detection. Second, we exploit a multi-scale feature interaction model for TB area classification and detection with attribute relational reasoning. The proposed model is evaluated on the TBX-Att dataset and will serve as a solid baseline for future research. The code and data will be available at https://github.com/GangmingZhao/tb-attribute-weak-localization.

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16431-6_59

SharedIt: https://rdcu.be/cVD7f

Link to the code repository

https://github.com/GangmingZhao/tb-attribute-weak-localization

Link to the dataset(s)

https://github.com/GangmingZhao/tb-attribute-weak-localization


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper introduces a new large-scale TB dataset for attribute-assisted X-ray diagnosis that helps models conduct weakly supervised TB detection. It also proposes a multi-scale attention-based feature interaction module to enhance TB detection.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    (1) The proposed dataset provides attribute information that can help a model detect TB areas in a weakly supervised manner. This dataset may receive considerable attention in future research if released publicly. (2) The proposed multi-scale feature interaction module effectively leverages the attribute features to conduct self-attention and cross-attention, resulting in better performance. Such a method is useful for the weakly supervised framework and has promising potential for widespread application.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    (1) The components in the feature representation module, such as group convolution [1] and multi-head attention [2], are directly transferred from general image recognition frameworks, which may lack novelty to some extent. [1] Krizhevsky A, Sutskever I, Hinton G E. Imagenet classification with deep convolutional neural networks[J]. Advances in neural information processing systems, 2012, 25. [2] Dosovitskiy A, Beyer L, Kolesnikov A, et al. An image is worth 16x16 words: Transformers for image recognition at scale[J]. arXiv preprint arXiv:2010.11929, 2020.

    (2) The experiments are insufficient to demonstrate the effectiveness of some operations in the module, such as relative position encoding.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    CREDIBLE: I believe that the obtained results can, in principle, be reproduced. Even though key resources (e.g., proofs, code, data) are unavailable at this point, the key details (e.g., proof sketches, experimental setup) are sufficiently well described for an expert to confidently reproduce the main results, if given access to the missing resources.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    (1) In the ablation study, please show the effectiveness of the relative positional encoding adopted in the multi-head attention module. The authors need to discuss why relative positional encoding is needed for TB detection. (2) The loss function for the detection branch is not specified. Please provide it. (3) The influence of the number of levels in the multi-scale feature pyramid is not well studied. Directly applying the attention module to low-level, high-resolution feature maps will cause a heavy computation and memory burden, even though multi-head attention and strided down-sampling are adopted. Please show experimental results under different choices of depth and draw a conclusion about the relation between performance and pyramid level, for example whether the high-resolution low-level feature maps contribute much less than the higher levels.
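
    The cost concern in point (3) can be illustrated with a rough count of attention score entries per head. This is a minimal sketch, not the paper's implementation: the feature-map sizes and the `kv_stride` parameter are hypothetical, and it assumes the stride down-sampling is applied to keys and values only.

```python
def attention_score_entries(height: int, width: int, kv_stride: int = 1) -> int:
    """Number of entries in one head's query-key score matrix.

    Assumption: queries keep full resolution, while keys/values are
    down-sampled by `kv_stride` along each spatial axis.
    """
    num_queries = height * width
    num_keys = (height // kv_stride) * (width // kv_stride)
    return num_queries * num_keys


# Hypothetical pyramid levels: a low-level 128x128 map and a high-level 16x16 map.
low_level_full = attention_score_entries(128, 128)
low_level_strided = attention_score_entries(128, 128, kv_stride=4)
high_level_full = attention_score_entries(16, 16)

print(low_level_full, low_level_strided, high_level_full)
```

    Under these assumptions the score matrix grows quadratically with the number of spatial positions, so even a strided low-level map remains far more expensive than a high-level one.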

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The overall attribute-guided method is novel for TB detection, but the components need more innovative modification for medical image recognition. In addition, the experiments are insufficient to demonstrate the effectiveness of some operations.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered



Review #2

  • Please describe the contribution of the paper

    It proposes a new large-scale tuberculosis (TB) chest X-ray dataset, namely tuberculosis chest X-ray attribute dataset (TBX-Att), and establishes an attribute-assisted weakly supervised framework to classify and localize TB by leveraging the attribute information to overcome the insufficiency of supervision in WSL scenarios.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    It presents the TBX-Att dataset for attribute-assisted X-ray diagnosis of TB. It proposes a method to fuse the attribute information and TB information, including the feature pyramid network, the attribute classifier and the feature interaction module. Collecting a dataset of attribute-annotated TB X-rays involves many difficulties, and the paper combines it with the TBX11K dataset to construct an entirely new attribute-based TB dataset.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1. Fig. 1 and Fig. 2 do not explain how the features are used in each step.
    2. The detection and classification branches are commonly used in previous work, and there are no novel parts in the network.
    3. The method is evaluated on only one dataset, which cannot establish its effectiveness.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The paper is simple with good reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    1. There are some clerical errors, e.g., Loss_seg in Equation 4 has not been defined beforehand.
    2. The validity of the attribute feature representation should be demonstrated.
    3. The method is not compared with enough current work, so it is not convincing.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    1. The proposed dataset is combined with the existing TBX11K dataset. It offers a playground for attribute-assisted X-ray diagnosis for TB.
    2. The proposed model inherits the two-branch design without much novelty.
    3. The experiments are not sufficient.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    5

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered



Review #3

  • Please describe the contribution of the paper

    The paper describes a new chest X-ray (CXR) tuberculosis dataset that contains disease attributes in addition to TB labels. A novel multi-scale feature interaction model for TB attribute detection and classification is also described. The majority of existing TB CXR datasets contain only disease-identifying labels (TB, healthy, sick/no TB). However, there are many other clinical features that may be of interest to clinicians during diagnosis. To address this, the paper presents a TB CXR dataset with attributes (e.g., fibrotic streaks, pulmonary cavitation, etc.) to facilitate computational analysis and reasoning about different TB properties.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The majority of existing TB CXR datasets contain only disease-identifying labels (TB, healthy, sick/no TB). However, there are many other clinical features that may be of interest to clinicians during diagnosis. To address this, the paper presents a TB CXR dataset with attributes (e.g., fibrotic streaks, pulmonary cavitation, etc.) to facilitate computational analysis and reasoning about different TB properties.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    While the motivation is compelling, the paper seems to miss important details about the data collection protocol. For instance, what ages, genders, and other demographic information does the dataset represent? How was the dataset annotated? The paper mentions the TBX11K dataset, but it is not clear how the two datasets are combined.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Most details of the proposed computational model are presented, so it should be possible to reproduce it.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    There were several points that I found unclear/missing.

    I did not understand why the task is framed as weakly supervised learning when the dataset contains both bounding boxes and labels for every data point.

    Also, what is the advantage of the proposed attribute relational reasoning network over more standard region proposal networks (e.g., RCNN, Yolo, etc.)?

    Results in Table 2 also need a lot more description. Is it possible to evaluate the attribute classification and detection separately? Is it also possible to perform analysis on each attribute separately and compare and contrast?

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Overall, the paper presents a compelling limitation of previous work and presents a dataset and an approach to overcome it. However, the evaluation and description of both the method and the dataset is quite limited.

  • Number of papers in your stack

    6

  • What is the ranking of this paper in your review stack?

    6

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    6

  • [Post rebuttal] Please justify your decision

    The key strength of the paper is the addition of attributes (more fine-grained information about diagnosis) to an existing dataset (TBX11K). This information would be extremely valuable to the community, as TBX11K is one of the largest publicly available imaging TB datasets.

    While there are concerns regarding the modeling presented in the paper, the rebuttal has addressed my questions.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper received three mixed reviews and is recommended for rebuttal. Please address the following comments and suggestions adequately and sufficiently in the rebuttal:

    “1. The proposed dataset is combined with the existing TBX11K dataset. It offers a playground for attribute-assisted X-ray diagnosis for TB.
    2. The proposed model inherits the two-branch design without much novelty.
    3. The experiments are not sufficient.”

    “Overall, the paper presents a compelling limitation of previous work and presents a dataset and an approach to overcome it. However, the evaluation and description of both the method and the dataset is quite limited.”

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    12




Author Feedback

Q1-Dataset The recruited subjects were Chinese people between 25 and 55 years old; 64% were men and 36% were women. Most are residents of Zhejiang Province, with a few from other regions in China. Two senior radiologists annotated the images independently, following a list of attributes. If an attribute is present, it is annotated as 1 and otherwise as 0, resulting in a sparse vector of attributes. A sample was retained only if the two radiologists' labels agreed. The above information will be added in the revised version. Regarding the combination of our dataset and TBX11K: our dataset consists of TB attribute annotations, which are used to train the attribute classification branch, while the TBX11K dataset consists of TB localization information, which provides the supervision for training the detection branch. Our dataset with attribute annotations is a supplement to the existing TB dataset, aiming at a novel solution for TB detection (R#2, R#3). We will make it public as soon as possible.
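
The annotation protocol described in Q1 (binary attribute vectors from two independent annotators, kept only on exact agreement) can be sketched as follows. This is an illustrative sketch only: the attribute names beyond the two mentioned in the reviews, the image identifiers, and the helper names `to_vector` and `keep_agreed` are hypothetical and not taken from the paper or its repository.

```python
from typing import Dict, Set, Tuple

# Hypothetical attribute list: only the first two names appear in the reviews;
# the remaining five are placeholders for the other attribute kinds.
ATTRIBUTES = [
    "fibrotic_streaks", "pulmonary_cavitation",
    "attr_3", "attr_4", "attr_5", "attr_6", "attr_7",
]

AttributeVector = Tuple[int, ...]


def to_vector(present: Set[str]) -> AttributeVector:
    """Encode the attributes marked present as a sparse 0/1 vector."""
    return tuple(1 if name in present else 0 for name in ATTRIBUTES)


def keep_agreed(
    annotator_a: Dict[str, AttributeVector],
    annotator_b: Dict[str, AttributeVector],
) -> Dict[str, AttributeVector]:
    """Retain only images whose two attribute vectors are identical."""
    return {
        image_id: vector
        for image_id, vector in annotator_a.items()
        if annotator_b.get(image_id) == vector
    }


labels_a = {
    "img_001": to_vector({"fibrotic_streaks"}),
    "img_002": to_vector({"pulmonary_cavitation"}),
}
labels_b = {
    "img_001": to_vector({"fibrotic_streaks"}),
    "img_002": to_vector(set()),  # the second annotator disagrees on img_002
}
retained = keep_agreed(labels_a, labels_b)
print(retained)
```

In this toy input only `img_001` survives, because the two annotators disagree on `img_002`.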

Q2-Novelty Radiologists usually use clinical features such as attribute information for diagnosis, while the majority of existing TB chest X-ray datasets contain only disease-identifying labels. To address this issue, we present the TBX-Att dataset, which adds attribute labels to expand the existing TBX11K dataset and promote the development of the community. Moreover, we propose a novel multi-scale feature interaction model (R#3), which is novel for TB detection (R#1) and is devised to enhance TB feature representations under the guidance of relational knowledge reasoning. Compared with previous work that uses classification and detection branches in parallel with little or even no interaction, our multi-scale feature fusion module is designed for full information flow and interaction between the two branches. Although group convolution and multi-head attention are widely used in image recognition, we consider them from a new perspective. Group convolutions are used to reduce parameters and are expected to learn the features of each kind of attribute with one corresponding group. In multi-head attention, the generation of Q, K, V (shown in Fig. 3) is carefully designed according to specific conditions and differs from the original formulation. We adapt these techniques to apply them better to new, challenging scenarios.
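
The parameter reduction attributed to group convolution in Q2 follows from the standard weight count of a grouped 2D convolution. The arithmetic below is a sketch under assumed sizes: the channel counts and kernel size are hypothetical, and the choice of seven groups (one per attribute kind) is an assumption based on the description above, not a confirmed detail of the model.

```python
def conv2d_weight_count(in_channels: int, out_channels: int,
                        kernel_size: int, groups: int = 1) -> int:
    """Weight count (bias excluded) of a 2D convolution with `groups` groups.

    Each output channel only sees in_channels / groups input channels.
    """
    if in_channels % groups or out_channels % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (in_channels // groups) * out_channels * kernel_size * kernel_size


# Hypothetical sizes: 224 channels in and out, 3x3 kernel.
dense_weights = conv2d_weight_count(224, 224, 3)
grouped_weights = conv2d_weight_count(224, 224, 3, groups=7)

print(dense_weights, grouped_weights, dense_weights // grouped_weights)
```

With these assumed sizes, splitting the channels into seven groups divides the weight count by seven, and each group can be read as a dedicated filter bank for one attribute kind.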

Q3-Experiment The main motivation of our work is to take advantage of TB attribute information to guide the effective extraction of features for TB detection, which is novel (R#1, R#3). To achieve this, an attribute classification branch with A2-Attn is designed to learn representative attribute features under the supervision of attribute labels. Moreover, a feature interaction module with AT-Attn is used to obtain meaningful feature representations for TB detection under the guidance of relational knowledge reasoning. Unlike [18], we train the two branches simultaneously to exploit the advantage of multi-task learning, rather than dividing the training procedure into two stages, in which the detection branch is trained first and then the backbone network and object detection branch are frozen while the classification branch is trained. Because of the simultaneous training, our experiments focus on the impact of the three carefully designed components on both attribute classification and TB detection. As shown in Table 2, when all three components are used, the highest performance is obtained and exceeds the baseline by a large margin. Using only a pair of components reduces the performance, which implicitly demonstrates the effectiveness of each component. Table 2 also shows an interesting phenomenon: feature interaction improves both tasks, not just TB detection. Moreover, the results verify that the two tasks are complementary, in that higher attribute classification accuracy leads to higher TB detection performance. These findings also confirm our original intention.
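
The simultaneous training described in Q3 amounts to optimizing one combined objective per step instead of two staged ones. The sketch below assumes a simple weighted sum of a detection loss and a multi-label binary cross-entropy over the attributes. The rebuttal does not state the actual loss form, so `joint_loss`, `attr_weight`, and the use of binary cross-entropy are all assumptions made for illustration.

```python
import math
from typing import Sequence


def binary_cross_entropy(probs: Sequence[float], targets: Sequence[int],
                         eps: float = 1e-7) -> float:
    """Mean binary cross-entropy over per-attribute probabilities."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


def joint_loss(detection_loss: float, attr_probs: Sequence[float],
               attr_targets: Sequence[int], attr_weight: float = 1.0) -> float:
    """Assumed multi-task objective: detection loss plus weighted attribute loss."""
    return detection_loss + attr_weight * binary_cross_entropy(attr_probs, attr_targets)


# Toy values: an uninformative attribute classifier (p = 0.5 for all seven
# attributes) and a placeholder detection loss of 1.0.
targets = [1, 0, 0, 0, 0, 0, 0]
loss = joint_loss(1.0, [0.5] * 7, targets)
print(round(loss, 6))
```

Because both terms contribute to the same scalar, gradients from the attribute labels and from the localization labels would update the shared backbone in the same step, which is the multi-task effect the rebuttal appeals to.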




Post-rebuttal Meta-Reviews

Meta-review #1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    After the rebuttal (the authors did a good job of adequately addressing the problems raised), there are at least two acceptances from the reviewers. The AC reviewed the paper and agreed with the following assessment. The overall contributions are sufficient: the paper proposes and validates an important technical extension of attribute reasoning for TBX11K in TB detection using X-rays.

    “The key strength of the paper is the addition of attributes (more fine-grained information about diagnosis) to an existing dataset (TBX11K). This information would be extremely valuable to the community, as TBX11K is one of the largest publicly available imaging TB datasets.”

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    7



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    I don’t think the work is particularly novel. But it is clinically well formulated and comes with a new (promised) dataset for TB. The level of novelty is on par with typical, but not exceptional, MICCAI papers.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    8



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    A successful rebuttal period with convincing results. The work has limited innovation, but it does have merit. Also, the experimental results include clinically useful findings that can benefit society.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    NR


