
Authors

Nahal Mirzaie, Mohammad V. Sanian, Mohammad H. Rohban

Abstract

The COVID-19 pandemic has prompted a surge in drug repurposing studies. However, many promising hits identified by modern neural networks failed in the preclinical research, which has raised concerns about the reliability of current drug discovery methods. Among studies that explore the therapeutic potential of drugs for COVID-19 treatment is RxRx19a. Its dataset was derived from High Throughput Screening (HTS) experiments conducted by the Recursion biotechnology company. Prior research on hit discovery using this dataset involved learning healthy and infected cells’ morphological features and utilizing this knowledge to estimate contaminated drugged cells’ scores. Nevertheless, models have never seen drugged cells during training, so these cells’ phenotypic features are out of their trained distribution. That being said, model estimations for treatment samples are not trusted in these methods and can lead to false positives. This work offers a first-in-field weakly-supervised drug efficiency estimation pipeline that utilizes the mixup methodology with a confidence score for its predictions. We applied our method to the RxRx19a dataset and showed that consensus between top hits predicted on different representation spaces increases using our confidence method. Further, we demonstrate that our pipeline is robust, stable, and sensitive to drug toxicity.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43993-3_65

SharedIt: https://rdcu.be/dnwOa

Link to the code repository

https://github.com/rohban-lab/Drug-Efficiency-Estimation-with-Confidence-Score

Link to the dataset(s)

http://hpc.sharif.edu:8080/HRCE/

https://doi.org/10.6084/m9.figshare.23723946.v1


Reviews

Review #2

  • Please describe the contribution of the paper

    The authors propose a first-in-field drug efficiency estimation pipeline for COVID-19. In particular, the framework provides a confidence score for its predictions, so that the model is not confident when presented with out-of-distribution input samples. Additionally, a metric to detect the reduction in false-positive outcomes was developed and used to improve confidence in the pipeline.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The exhaustive experiments make the results more robust, and applying this idea to the field should help the development of trustworthy, confident models to progress.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The explanation of Equation 6 needs to match the format and flow of the rest of the work. Is h a representation of the positive and negative controls? If so, please be more explicit in the explanation. Also, what is the function c?

    The work is not completely novel: many added losses have been used for confidence estimation. What is lacking is additional measures to evaluate the models, perhaps calibration, if possible.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    No code is provided; if code were made available, reproducibility would improve.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Further analysis of the tabulated results is required, and the equation mentioned above needs clarification.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The drawbacks mentioned above contributed to the decision. Several methods for estimating confidence are being researched, but evaluation metrics relevant to uncertainty/confidence modelling must be included to give a complete perspective in this field of work.

  • Reviewer confidence

    Somewhat confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    I think the importance of the paper and the subsequent author responses make me favour an accept. Regarding datasets, I still feel that more than one dataset should be used in this type of work, but I am satisfied with the explanation of why only one was used. I am happy to see that the code will be available too, and I therefore weakly accept.



Review #3

  • Please describe the contribution of the paper

    The paper discusses the challenges of drug repurposing studies in the context of COVID-19, where many promising hits identified by neural networks have failed in preclinical research. The authors propose a weakly-supervised drug efficiency estimation pipeline with a confidence score for its predictions, which is applied to the RxRx19a dataset.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper presents a novel pipeline that integrates drug efficiency estimation with cell shape information, addressing the challenge of unreliable drug discovery methods. The proposed pipeline is robust, stable, and sensitive to drug toxicity, offering a promising approach for improving the accuracy of drug repurposing studies. And the data augmentation with weak labels is effective and useful.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    (1) The paper uses a ResNet18 model pre-trained on the ImageNet dataset for representation learning to extract cell features. However, it would be more appropriate to fine-tune the model using a published cell-related dataset instead of RxRx19 to ensure a high-quality feature extractor for embedding, as the domains of the datasets are not the same.

    (2) A critical concern is whether the pipeline can detect the effects of drugs that cause chemical or other non-shape changes in cells. Because the pipeline focuses on cell shape information, it may be less effective at detecting such changes. Further investigation and validation are needed to assess the method’s efficacy in identifying drugs that induce non-shape-related chemical changes in cells.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Good.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    The paper would benefit from providing additional citations or references to establish a solid foundation for the study, particularly in relation to cellular morphological changes. This would help to strengthen the scientific basis of the research and provide a more comprehensive overview of the existing literature in the field.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The data augmentation methods employed in this study are innovative and effective, with the potential for broad applications beyond the current research.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #4

  • Please describe the contribution of the paper

    The authors propose a weakly-supervised drug efficiency estimation pipeline with a confidence score for predicting COVID-19 treatments through drug repurposing. The literature lacks a confidence score for such predictions; the authors propose one here.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Simplicity: The architecture is simple and uses a pipeline that could be applied in other biomedical applications. Originality: They use a weakly-supervised architecture based on embeddings. They propose a confidence hit predictor that performs hit prediction when no ground truth is available.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Limitation in contextualisation: The article lacks coverage of the literature on weakly supervised (few-shot) predictive models, while the biological field is well described. At some points, the article lacks details that the authors could provide in the text. Limitation in validation: Only one dataset is used. Authors should use other biomedical datasets to validate their approach. Limitations in comparison: Only one metric is used (Jaccard). There is an issue with the proposed correlation analysis.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Not certain of its reproducibility (only one dataset, one validation metric, no code furnished).

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Authors should specify the architectures used at each step in Figure 1, highlight their contribution, and give an overview of the three experiments.

    In the fourth cited article, the only data augmentation strategies mentioned are simple transformations (flips, rotations, etc.). Authors should specify the strategy adopted, or rewrite this sentence so that the paragraph on data augmentation is easier to understand.

    The random noise should be specified (Gaussian? speckle?), along with which kind of noise could mimic drug efficiency or a side effect, supported by a reference article. Otherwise, the authors should rewrite the sentence so that the following paragraph is easier to understand.

    The number of images in the X set should be given.

    The authors should explain the confidence loss.

    There is an issue in the statistical analysis. In the text, correlations are computed with Pearson, and in Figure 2 with Spearman. Prime and new scores are two quantitative variables, so a Pearson correlation should be used in Figure 2b. If ordinal qualitative values are used in this part, authors should state so in the text and correct the test used. In Figure 2a, the correlation test is not specified.

    If Figure 2a and Figure 1a of the annexe correspond to the same embedding experiment, authors should indicate the experiment in the legend of Figure 2a.

    The similarity between the top 10 scored drugs is calculated with Jaccard similarity only; for a strong validation, authors should test various metrics.

    Authors should cite the annexes in their work.

    Figure 3 in the annexe contains no legends (DNA, RNA, actin cytoskeleton, Golgi apparatus, and endoplasmic reticulum organelles in cells should be indicated). DEEMD and GAN-DL appear in the legend; which one corresponds to each image experiment? Authors should cite the origin of these architectures in the legend. Second, when were these two architectures used in the authors’ pipeline? Authors should indicate this in the text.

    In Table 1, if the cell type is always HRCE for the top 15 drugs, authors should indicate it in the legend and drop the column. The dataset contains HRCE and VERO tissue types, while the top 15 drugs are the most confident for this cell type. Why is this result not discussed?

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    3

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    See the weaknesses part.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    6

  • [Post rebuttal] Please justify your decision

    The authors replied to all the main issues. If the clarifications are written into the paper, it may be accepted.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper proposes a weakly-supervised drug efficiency estimation pipeline with a confidence score for prediction of a COVID-19 treatment with drug repurposing.

    Key strengths:

    1. A novel confidence hit predictor that works without ground truth is proposed.
    2. The paper addresses a challenging drug efficiency problem and presents promising results.
    3. Exhaustive experiments were conducted.

    Key weaknesses:

    1. Validation on non-shape-related chemical changes in cells is lacking.
    2. Only one dataset is used.
    3. The pretrained model should be trained on a more domain-related dataset such as RxRx19, or at least some ablation study is needed.

    In the rebuttal, please especially clarify the dataset, validation, method description and novelty issues.




Author Feedback

We thank the meta-reviewer (M) and reviewers (R) for their invaluable insights and constructive feedback. We are encouraged that they found our method effective (R3) and useful (R3, R4), our paper well-written (R2, R3, R4), and our exhaustive studies promising for more trustworthy models in drug repurposing (R2, R3). Below, we respond (Re) to the comments.



R2 3.1) Does h represent positive and negative controls? What is the function c? Re) h represents transformed samples (Eq. 4), which include the positive and negative controls as the case gamma=0, alpha=0/1. c is a nonlinear function that maps h to a confidence score (c: H -> [0,1]) indicating proximity to the embedding space of the training samples.

R4 6.5) Explain Eq. 6. Re) In training, prediction probabilities are adjusted by interpolating between the original predictions d and the target probability distribution l. To prevent a trivial solution where c always returns 0, a confidence loss (the negative log of the confidence) is added to the equation.
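
The mechanism described in the reply to R4 6.5 can be illustrated with a minimal sketch. It is not the paper's Eq. 6: the cross-entropy task loss, the weight `lam`, and the function name are assumptions made here for illustration; only the interpolation between d and l and the negative-log confidence penalty come from the reply.

```python
import math


def confidence_adjusted_loss(d, l, c, lam=0.1, eps=1e-12):
    """Illustrative per-sample loss with a confidence penalty.

    d   : predicted probabilities (list of floats)
    l   : target probability distribution (same length as d)
    c   : confidence in [0, 1]
    lam : hypothetical weight of the confidence penalty (assumption)
    """
    # Interpolate: with low confidence the adjusted prediction moves toward the target.
    d_adj = [c * di + (1.0 - c) * li for di, li in zip(d, l)]
    # Task loss on the adjusted prediction (cross-entropy is an assumption here).
    task_loss = -sum(li * math.log(max(pi, eps)) for li, pi in zip(l, d_adj))
    # Negative log of the confidence: grows as c approaches 0,
    # which discourages the trivial solution c = 0.
    conf_loss = -math.log(max(c, eps))
    return task_loss + lam * conf_loss, task_loss, conf_loss
```

With c = 0 the task loss vanishes because the adjusted prediction equals the target, but the penalty term becomes large; with c = 1 the penalty is zero and the task loss depends on d alone.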

R2 3.2) Calibration. Re) Mean confidence (c) in disease score (d) intervals (0-1, step 0.1): 0.9 0.8 0.7 0.5 0.4 0.3 0.4 0.7 0.7 0.9. This shows model uncertainty and high entropy for 0.3<d<0.7, which can be due to the mixup strategy in our data augmentation [arXiv:1905.11001]. The reliability plot for classification of healthy/diseased cells is close to linear (points: (0.5, 0.46) (0.6, 0.62) (0.7, 0.73) (0.8, 0.79) (0.9, 0.91)).
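
A binned summary like the one reported for R2 3.2 could be computed along the following lines. This is a sketch under assumptions: the function name, the treatment of d = 1.0 as part of the last bin, and the use of `None` for empty bins are choices made here, not details from the rebuttal.

```python
def mean_confidence_by_bin(d_scores, c_scores, n_bins=10):
    """Average confidence c within equal-width bins of the disease score d in [0, 1]."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for d, c in zip(d_scores, c_scores):
        # Map d to a bin index; d == 1.0 is folded into the last bin.
        idx = min(int(d * n_bins), n_bins - 1)
        sums[idx] += c
        counts[idx] += 1
    # Empty bins are reported as None rather than 0 to avoid a misleading value.
    return [s / n if n else None for s, n in zip(sums, counts)]
```

A dip in the middle bins of the returned list would correspond to the lower confidence the authors describe for intermediate disease scores.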

R2 6.1) Analysis of results. Re) We performed pathway enrichment analysis for common drug target genes in two settings: without confidence (noC) and with confidence (C). We identified significant enrichment in the Reactome library for the pathways SARS-CoV-2 Infections (p-val=4.0e-09) and Infectious Disease (p-val=1.2e-08) in the C setting, but not in noC.



R3 3.1 M 3.3) Pretrained model should be trained on domain-related data. Re) Deep models trained on ImageNet can extract generic visual features from images, and their application to morphological profiling has been evaluated in several studies [10.1101/085118, 10.1038/s41467-022-28423-4, 10.1101/161422]. Besides, features of models pretrained on the current large-scale bioimage dataset, CytoImageNet, cannot outperform ImageNet-pretrained features [arXiv:2111.11646v2].

R3 3.2 M 3.1) Lack of validation in identifying drugs that induce non-shape-related chemical changes. Re) The model distinguishes healthy from diseased cells. Regardless of a drug’s side effects (shape-related or not), as long as its effects lead to resistance against the pathogen, our model estimates correctly.



R4 3.3) Only one metric is used. Re) Additional metrics, reported as (noC, C): Overlap Coefficient (0.23, 0.33), Dice (0.23, 0.33), Tanimoto (0.10, 0.14), Tversky (a=b=0.5) (0.18, 0.24). All show improvements in the C setting.
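
For readers unfamiliar with these measures, the sketch below gives common set-based definitions applied to two top-hit lists. The rebuttal does not state the exact definitions it used, so these textbook forms are an assumption and are not expected to reproduce the reported values; the function name and the example drug labels are hypothetical.

```python
def set_similarities(hits_a, hits_b, alpha=0.5, beta=0.5):
    """Common set-overlap measures between two collections of top hits."""
    a, b = set(hits_a), set(hits_b)
    if not a or not b:
        raise ValueError("both hit lists must be non-empty")
    inter = len(a & b)
    only_a = len(a - b)
    only_b = len(b - a)
    return {
        # Intersection over union.
        "jaccard": inter / len(a | b),
        # Intersection over the smaller set.
        "overlap": inter / min(len(a), len(b)),
        # Twice the intersection over the summed sizes.
        "dice": 2 * inter / (len(a) + len(b)),
        # Tversky index; in this set form, alpha = beta = 0.5 coincides with Dice.
        "tversky": inter / (inter + alpha * only_a + beta * only_b),
    }
```

For example, `set_similarities(["drug_1", "drug_2", "drug_3", "drug_4"], ["drug_3", "drug_4", "drug_5", "drug_6"])` compares two four-element lists that share two hypothetical drugs.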

R4 3.2 M 3.2) Only one dataset is used. Re) RxRx19a was the largest set of human cellular morphological data when it was published. It has diverse biological perturbations (1672 chemicals, 6-8 dosages, 5 conditions, and 2 tissues), which improves model generalization. Robustness results for predicting an unseen class show that the model didn’t overfit (MSE loss: train 0.016, validation 0.022, test 0.023).

R4 6.6) An issue in correlations. Re) In the stability test, the order of treatments shouldn’t change drastically, so we used Spearman; for the replicate reproducibility test, replicates of the same perturbation must have a linear relationship between their assays, so we used Pearson.
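
The distinction drawn in the reply to R4 6.6 can be seen in a small sketch: Pearson measures linear association, while Spearman is Pearson applied to ranks and therefore only measures whether the ordering is preserved. The helper names and the example data are illustrative assumptions, not taken from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation: strength of the linear relationship."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    var_x = sum((xi - mx) ** 2 for xi in x)
    var_y = sum((yi - my) ** 2 for yi in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

For a monotone but nonlinear pair such as `x = [1, 2, 3, 4, 5]` and `y = [1, 4, 9, 16, 25]`, the ordering is fully preserved, so Spearman reaches its maximum while Pearson stays below it.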
R4 6.3) Specification of random noise. Re) The noise is Gaussian. We tested normal, Laplace, and uniform distributions, but chose normal because of its smoother learning curves. No reference is available since the method is novel.

R4 6.12) Vero result? Re) The Vero dataset is small (32 drugs) compared to HRCE (1672 drugs). Top 3 (drug: c, d): GS-441524 (0.8, 0.7); Hydroxychloroquine Sulfate (0.7, 0.7); Chloroquine (0.5, 0.8).



Thank you for the comments on citations (R3 6, R4 3.1, 6.7, 9) and clarity (R4 6.1, 2, 4, 10, 11). We’ll incorporate them in the final version.

Code on GitHub: miccai2023s3550/code




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper proposes a weakly-supervised drug efficiency estimation pipeline with a confidence score for prediction of a COVID-19 treatment with drug repurposing.

    Key strengths:

    1. A novel confidence hit predictor that works without ground truth is proposed.
    2. The paper addresses a challenging drug efficiency problem and presents promising results.
    3. Exhaustive experiments were conducted.

    Key weaknesses:

    1. Validation on non-shape-related chemical changes in cells is lacking.
    2. Only one dataset is used.
    3. The pretrained model should be trained on a more domain-related dataset such as RxRx19, or at least some ablation study is needed.

    In the rebuttal, some technical details are clarified and source code is provided.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This manuscript presents a confidence-based weakly-supervised learning pipeline for drug efficiency estimation, which uses cellular morphological features from a COVID-19 fluorescence microscopy image dataset. This study aims to tackle the very challenging problem of drug efficiency estimation, and the experimental results are encouraging. The rebuttal has addressed most of the reviewers’ comments, such as those regarding the method description, the dataset used, and the evaluation metrics. All three reviewers suggested acceptance after the rebuttal.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors have satisfactorily responded to the reviewers’ comments. Please incorporate the changes into the final version of the paper.


