Authors

Gabriel Bernardino, Anders Jonsson, Filip Loncaric, Pablo-Miki Martí Castellote, Marta Sitges, Patrick Clarysse, Nicolas Duchateau

Abstract

Diagnosis through imaging generally requires the combination of several modalities. Algorithms for data fusion allow merging information from different sources, mostly combining all images in a single step. In contrast, much less attention has been given to the incremental addition of new data descriptors, and the consideration of their costs (which can cover economic costs but also patient comfort and safety).

In this work, we formalise clinical diagnosis of a patient as a sequential process of decisions, each decision being whether to take an additional acquisition or, if there is enough information, to end the examination and produce a diagnosis. We formulate the goodness of a diagnosis process as the classification accuracy minus the cost of the acquired modalities. To obtain a policy, we apply reinforcement learning, a machine learning technique that, based on the data acquired so far, proposes the next modality to incorporate so as to maximise the accuracy/cost trade-off. This policy therefore performs medical diagnosis and patient-wise feature selection simultaneously.
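
The accuracy-minus-cost objective described above can be illustrated with a minimal sketch. It assumes a reward of 1 for a correct diagnosis and 0 otherwise, and a linear weighting of the total acquisition cost; the function name, the `tradeoff` parameter, and the modality names and costs are hypothetical and are not taken from the paper or its repository.

```python
from typing import Dict, Iterable


def episode_return(correct: bool, acquired: Iterable[str],
                   costs: Dict[str, float], tradeoff: float = 1.0) -> float:
    """Return of one diagnostic episode (illustrative assumption):
    1 for a correct final diagnosis, 0 otherwise, minus the weighted
    sum of the costs of all modalities acquired before stopping."""
    total_cost = sum(costs[m] for m in acquired)
    return float(correct) - tradeoff * total_cost


# Hypothetical modalities and costs, for illustration only.
costs = {"ecg": 0.05, "echo": 0.20}
print(episode_return(True, ["ecg"], costs))           # correct after one cheap exam
print(episode_return(False, ["ecg", "echo"], costs))  # wrong despite two exams
```

Under this reading, a policy that stops early with a correct diagnosis scores higher than one that reaches the same diagnosis after acquiring every modality.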

We demonstrate the relevance of this strategy on two binary classification datasets: a subset of a public heart disease database, including 531 samples with 11 scalar features, and a private echocardiographic dataset including signals from 5 standard image sequences used to assess cardiac function (speckle tracking, flow Doppler and tissue Doppler), from 188 patients suffering from hypertension and 60 controls.

Our algorithm allows acquiring only the modalities relevant for diagnosis, avoiding low-information acquisitions, which resulted in both a higher stability of the chosen modalities and a better classification performance under a limited budget.

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16431-6_56

SharedIt: https://rdcu.be/cVD7c

Link to the code repository

https://github.com/creatis-myriad/featureSelectionRL

Link to the dataset(s)

https://archive.ics.uci.edu/ml/datasets/heart+disease


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper introduces an RL-based method to select modalities and input information given a constrained budget.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The motivation and writing are clear. The method is interesting and well supported.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The comparison with related literature is not very extensive, in either theoretical or practical terms. The method rests on a major assumption that each new modality will give information that is beneficial for the diagnosis and that there is no overlap in these potential new bits of information. A causal analysis of the necessity and sufficiency of the inclusion of the modalities is missing.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Code and experimental settings provided - very good

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    This is a very well written and interesting paper, nice work!

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This is a good paper that would benefit the community. Provided there is some extra justification of the limitations of this approach, the paper should be accepted.

  • Number of papers in your stack

    4

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered



Review #3

  • Please describe the contribution of the paper

    This paper works on active modality selection for clinical diagnosis and proposes a reinforcement learning (RL) formulation to maximize the accuracy/cost balance. The proposed Q-value-based RL algorithm can actively select the next modality or end the examination to get the diagnosis for a specific patient. Experiments are conducted on a heart disease dataset and an echocardiographic hypertension dataset and show that the proposed RL algorithm is better than the population-based selection method.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The paper works on an interesting topic that can actively help select the next exam modality or end the examination to get the diagnosis and minimize the cost while maintaining high accuracy. The topic is promising as the decision is made for each individual patient.
    • The decision-making process is formulated into an RL problem and solvable by traditional RL algorithms.
    • The authors validate the proposed method’s effectiveness on two datasets and its superiority over the population-based selection method.
    • The authors show that the proposed decision-making framework can indicate which modality/biomarker is important for diagnosis on a population level.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The scope of the study: The authors simplify the usable modalities to be represented as valued vectors. However, it is non-trivial to get these representation vectors for some common modalities in real practice, such as images.
    • The number of selected modalities is small, which may be solved by heuristics-based methods. Meanwhile, the cost for the modality specified by the authors is arbitrary. Hence, it is trivial to make a comparison.
    • The decision-making process is quite unclear as the method can only get some value numbers for each modality and does not get any meaningful interpretation for a specific patient. The discussion by the authors in 4.3 is result-driven.
    • Given the simplified assumptions, this work is more of a proof-of-concept. The work needs to be carefully designed for more complex real-world problems.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors have provided implementation details with supplementary code. The evaluation is performed on a public dataset and a private dataset. The authors are encouraged to make their data and code available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    • The class predictions given the current state are computed using support vector machine classifiers. However, the authors do not evaluate the classifiers' quality, which may lead to some bias in the final results. Intuitively, the reinforcement learning algorithm can probe these classifiers during training and may learn shortcuts from them. Further, a classifier needs to be learned for each superstate, resulting in many classifiers.
    • In Fig. 1, the authors can show the meaning of each point and explain why the number of points is different for reinforcement learning and the population-based method.
    • In Fig. 2, the authors can explain how the Dice coefficient is computed.
    • The authors can elaborate on why the modality selection can be formulated as a Markov decision process (MDP).
    • Typically, reinforcement learning has a parameter gamma in the MDP formulation, and the authors choose to set the gamma to 1. The authors can give some discussion on it.
    • The authors can give a more precise description of s_{n+1} given s_n and a in section 2.2.
    • The authors can further refine the paper writing and organization, e.g., abstract, notations.
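
On the Fig. 2 comment above: the reviews do not state how the paper computes the Dice coefficient, so the following is only a sketch of one common set-based definition, 2|A ∩ B| / (|A| + |B|), applied to two sets of selected modalities (for example, the selections obtained on two different training folds). The function name, the modality labels, and the convention for two empty sets are assumptions.

```python
from typing import AbstractSet


def dice(a: AbstractSet[str], b: AbstractSet[str]) -> float:
    """Set-based Dice coefficient: 2|A & B| / (|A| + |B|).
    Two empty selections are treated as identical (assumed convention)."""
    if not a and not b:
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))


# Hypothetical selections from two training folds.
fold_1 = {"m1", "m2"}
fold_2 = {"m2", "m3"}
print(dice(fold_1, fold_2))
```

A value of 1 means both folds selected exactly the same modalities, and 0 means the selections are disjoint, which is why such a score can serve as a stability measure.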
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper works on an interesting diagnosis decision-making problem. However, the paper is limited in terms of the scope of modalities and the clarity of the experiments.

  • Number of papers in your stack

    4

  • What is the ranking of this paper in your review stack?

    3

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    I agree the paper works on an interesting topic and is a promising proof-of-concept. Some detailed concerns need to be addressed when applied to more real-world clinical situations, e.g., representation of modality and classifier choices. Overall, it showcases the potential for decision-making with RL.



Review #6

  • Please describe the contribution of the paper

    This work presents a reinforcement learning (RL) strategy for modality selection during diagnosis, which accounts for both diagnosis accuracy and modality-specific acquisition cost. The authors have shown the validity of the proposed approach on two datasets (one public and one private), demonstrating the clinical utility of RL over population-wise feature selection.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Novelty: the authors present a novel RL method that extends their previous strategy (ref. [1]) into a modality- and cost-aware method able to handle high dimensional data, further enhanced with strategies to avoid data sampling imputation. Clinical interpretability: the authors have thoroughly analyzed the results of their strategy with respect to the clinical knowledge of the selected application (hypertension), showing the validity (and clinical utility) of the decisions made by the RL method.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    SOTA comparison: Despite having adequately shown the superiority of their strategy with respect to a simpler population-wise feature selection (both in terms of prediction error and stability, which I acknowledge them for), comparisons with other SOTA approaches were not included (namely those mentioned in Section 1 based on patient-specific feature selection, etc.).

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    No major concern regarding reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    • Please consider increasing the opacity of the plots in Fig. 4 for better visualization.
    • Please comment on the high similarity between GLS curves across individuals of all groups (and why higher importance was given in the third group).
    • Small typos were found throughout the manuscript (e.g. in page 3, twice appears “an MDP” rather than “a MDP”). Please revise it.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The manuscript presents sufficient methodological novelty in a clinically relevant task. It is well written, presents adequate experiments and a great clinical analysis/discussion of the method and its results.

  • Number of papers in your stack

    6

  • What is the ranking of this paper in your review stack?

    2

  • Reviewer confidence

    Somewhat Confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This is an interesting paper that considers both accuracies and costs for active modality selection under a reinforcement learning framework. This work may lead to practical clinical impact when mature.

    There are concerns from the reviewers that need to be clarified:

    1. Are there any non-vector measurements in the datasets? How can this framework be applied to non-vector measurements such as images?
    2. Why are there no comparisons with the SOTA methods, e.g., those mentioned in the Introduction?
    3. What are the limitations of this framework when applied to more complicated real-world problems?
  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    2




Author Feedback

We thank the reviewers for their input. We are glad that the interest of our work and its clinical potential have been highlighted. Although this is a (promising) proof-of-concept that should fit the page limit, we will try to incorporate the comments into the manuscript, especially those asking for clarifications; the suggestions will be considered for future work. Due to space constraints in the rebuttal, we only address major issues here, mostly corresponding to the summary from the metareviewer.

1) Image data: Our data were either vector or scalar values. However, our framework allows using images too, provided one uses dimensionality reduction (such as variational autoencoders) as preprocessing. A second alternative would be to use standardised measurements extracted from the images, as is often done in clinical practice, as concatenated scalar features. Another alternative is to replace both the kernel classifiers and the Q-value estimator functions with CNNs, but this would mean a large computational footprint to train and execute each estimator in our model.

2) SOTA: We agree that our comparison with the SOTA is preliminary, but we are limited by the amount of material we can present in the paper. Actually, there are very few SOTA RL methods for successive data integration, since it is not a typical RL task: most RL tasks are inspired by control situations and involve actions that change the state but keep its dimensionality constant. The competing SOTA methods build a space to represent all possible information, which is used to impute the non-acquired measurements from the already acquired ones, using probability distributions. This has several shortcomings: 1) it relies on computationally expensive sampling; 2) imputation of non-acquired data is a disputable decision (if we were able to impute the data, acquisition would be unnecessary). In contrast, our method uses independent models for each combination of acquired data, avoiding data imputation. Another benefit of our framework is that it ends in a bounded number of steps, which simplifies some of the typical problems encountered in RL (i.e. the need for a discount factor gamma < 1, convergence, use of a data generator, etc.). We believe that the use of a different set of hypotheses will make our method preferable to the SOTA in a range of situations, but a proper analysis of which situations these are is out of the scope of this paper. Here, we focus on presenting the framework and verifying it against a basic baseline to show its utility. From a pragmatic point of view, we did not find an open implementation of [5,11] that could be directly used on high-dimensional features, and since their models use NNs as estimators and sampling-based RL, they are notoriously data-intensive and hard to train.
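
To illustrate why a bounded number of steps removes the need for a discount factor gamma < 1, here is a deliberately simplified sketch. It assumes the state is only the set of acquired modalities, with one known expected accuracy per set, whereas the method described in the rebuttal uses Q-value estimators that depend on the acquired data themselves. All names and numbers are hypothetical. Because every action either stops or adds a modality, backward induction from the full set terminates with gamma = 1.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

STOP = "stop"


def solve(modalities: List[str], costs: Dict[str, float],
          accuracy: Dict[FrozenSet[str], float]
          ) -> Tuple[Dict[FrozenSet[str], float], Dict[FrozenSet[str], str]]:
    """Undiscounted backward induction over sets of acquired modalities.
    Stopping yields the accuracy of the current set; acquiring m costs
    costs[m] and moves to the larger set."""
    value: Dict[FrozenSet[str], float] = {}
    policy: Dict[FrozenSet[str], str] = {}
    # Visit the largest sets first so successor values already exist.
    for size in range(len(modalities), -1, -1):
        for subset in combinations(modalities, size):
            state = frozenset(subset)
            best_action, best_value = STOP, accuracy[state]
            for m in modalities:
                if m in state:
                    continue
                candidate = value[state | {m}] - costs[m]
                if candidate > best_value:
                    best_action, best_value = m, candidate
            value[state] = best_value
            policy[state] = best_action
    return value, policy


# Hypothetical two-modality example.
modalities = ["m1", "m2"]
costs = {"m1": 0.05, "m2": 0.20}
accuracy = {
    frozenset(): 0.50,
    frozenset({"m1"}): 0.80,
    frozenset({"m2"}): 0.70,
    frozenset({"m1", "m2"}): 0.90,
}
value, policy = solve(modalities, costs, accuracy)
print(policy[frozenset()], policy[frozenset({"m1"})])
```

In this toy setting the policy acquires the cheap modality first and then stops, because the accuracy gained by adding the second modality does not cover its cost.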

3) Real-world problems: We used two real datasets (one with real costs), even if they were relatively simple. However, we agree that there is room for improvement to work on more complex cases, in particular: 3.1) Missing modalities at training. It will be difficult to find a dataset in which all modalities were acquired for all patients. Training each estimator with the available data only is possible, but has implications since the data are not missing at random. 3.2) Dealing with a high number of modalities: our framework only works with a reduced number of modalities before suffering an exponential explosion. We will need regularization (to deal with the limited number of samples) and partial graph exploration (to avoid training an exponential number of classifiers). We expect to address this issue in future work.
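
The exponential explosion mentioned in 3.2 can be made concrete with a short sketch. It assumes one estimator per subset of acquired modalities, which gives 2^n subsets for n modalities; treating each of the 11 scalar features of the public dataset as its own modality is an assumption made here for illustration, and the helper name is hypothetical.

```python
from itertools import combinations
from typing import FrozenSet, List, Sequence


def acquisition_states(modalities: Sequence[str]) -> List[FrozenSet[str]]:
    """Enumerate every subset of modalities that could have been acquired."""
    return [frozenset(c)
            for k in range(len(modalities) + 1)
            for c in combinations(modalities, k)]


# 5 image sequences (echocardiographic dataset) versus 11 scalar features
# (public dataset), plus a larger hypothetical case.
for n in (5, 11, 20):
    print(n, "modalities ->", 2 ** n, "subsets")

five = [f"seq{i}" for i in range(5)]  # placeholder names
print(len(acquisition_states(five)))
```

Going from 5 to 20 modalities multiplies the number of subsets by 2^15, which is why partial exploration of the subset graph is proposed as future work.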

4) Short answers: R1: avoid redundant modalities: this is partially addressed via the acquisition costs, so the algorithm prefers fewer acquisitions. R3: patient-specific interpretability: it is possible to interpret each individual decision using gradient-based interpretability methods on the Q-functions, but we did not include this in the paper due to space constraints.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This is an interesting paper that considers both accuracies and costs for active modality selection under a reinforcement learning framework. This work may lead to practical clinical impact when mature.

    In the rebuttal, the authors suggest different ways of handling non-vector (e.g., imaging) data. The differences between the proposed method and existing work are also clarified. Furthermore, the limitations on real-world problems are also presented. Therefore, most major concerns have been clarified.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    2



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Well-motivated and interesting work. All reviewers think it should be accepted (one opinion changed after the strong rebuttal).

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    5



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The paper presents a reinforcement learning approach to modality selection to reach a confirmed diagnosis. While the approach may be generally applicable to other problems in terms of incorporation of resource constraints, it is not an approach I can see being easily adopted in clinical workflows. Often, the scheduling of exams follows its own process and the decision need not be made in real time. Also, many decision support tools can suggest the next test to perform based on information received from previous tests and their findings, as input by the clinician/staff into the tool. Wouldn't that be better to emulate, even in an RL framework?

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Reject

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    8


