
Authors

Leo Milecki, Vicky Kalogeiton, Sylvain Bodard, Dany Anglicheau, Jean-Michel Correas, Marc-Olivier Timsit, Maria Vakalopoulou

Abstract

Renal transplantation appears to be the most effective solution for end-stage renal disease. However, it may lead to renal allograft rejection or dysfunction in 15%-27% of patients in the first 5 years post-transplantation. Obtained from a simple blood test, serum creatinine is the primary clinical indicator of kidney function, used to calculate the Glomerular Filtration Rate. These characteristics motivate the challenging task of predicting serum creatinine early post-transplantation while investigating and exploring its correlation with imaging data. In this paper, we propose a sequential architecture based on transformer encoders to predict renal function 2 years post-transplantation. Our method uses features generated from Dynamic Contrast-Enhanced Magnetic Resonance Imaging from 4 follow-ups during the first year after the transplant surgery. To deal with missing data, a key mask tensor exploiting the dot-product attention mechanism of the transformers is used. Moreover, different contrastive schemes based on cosine similarity distance are proposed to handle the limited amount of available data. Trained on 69 subjects, our best model achieves a 96.3% F1 score and 98.9% ROC AUC in the prediction of the serum creatinine threshold on a separate test set of 20 subjects. Thus, our experiments highlight the relevance of considering sequential imaging data for this task and therefore for the study of chronic dysfunction mechanisms in renal transplantation, setting the path for future research in this area. Our code is available at https://github.com/leomlck/renal_transplant_imaging.
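The key-mask idea described in the abstract can be sketched with a standard key padding mask in PyTorch's `nn.TransformerEncoder` (a minimal illustration with hypothetical dimensions, not the authors' actual implementation; see their repository for that). Follow-up slots flagged as missing receive no attention weight, so the model never trains on imputed values:

```python
import torch
import torch.nn as nn

# Hypothetical dimensions: 4 follow-up exams, feature dimension 512.
D_MODEL, N_FOLLOWUPS = 512, 4

encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=D_MODEL, nhead=8, batch_first=True),
    num_layers=2,
)

# One subject whose second follow-up exam is missing: zero-fill that slot
# and mark it True in the key padding mask so dot-product attention skips it.
features = torch.randn(1, N_FOLLOWUPS, D_MODEL)
features[0, 1] = 0.0
missing = torch.tensor([[False, True, False, False]])  # True = missing exam

out = encoder(features, src_key_padding_mask=missing)
print(out.shape)  # torch.Size([1, 4, 512])
```

Because the mask acts inside attention rather than on the inputs, the scheme is insensitive to which and how many follow-ups are missing, which is the robustness the reviewers highlight below.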

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16449-1_24

SharedIt: https://rdcu.be/cVRU4

Link to the code repository

https://github.com/leomlck/renal_transplant_imaging

Link to the dataset(s)

N/A


Reviews

Review #3

  • Please describe the contribution of the paper
    1. They propose the use of contrastive schemes, generating informative manifolds of DCE MRI exams of patients undergoing renal transplantation. Different self-supervised and weakly-supervised clinically pertinent tasks are explored to generate relevant features using the cosine similarity.
    2. They introduce a transformer-based architecture for forecasting the serum creatinine score, while proposing a tailored method to deal with missing data. In particular, their method uses a key mask tensor that flags the missing data and excludes them from the training of the sequential architecture. Such a design is very robust with respect to the position and number of missing entries, while providing better performance than other popular data imputation strategies.
  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    According to my knowledge, this study is among the first to propose a novel, robust, and clinically relevant framework for forecasting serum creatinine directly from imaging data. It proposes a novel transformer-based architecture tailored to deal with missing data for the challenging task of serum creatinine prediction 2 years post-transplantation using imaging modalities. First, the authors show the significant benefit of contrastive learning schemes for this task; their trained representations outperform common transfer learning and contrastive approaches. Then, a transformer encoder architecture takes as input the sequential per-follow-up features in order to forecast renal transplant function, including a custom method to handle missing data. Their strategy performs better than other commonly used data imputation techniques. These promising results encourage the use of medical imaging through time to assist clinical practice with fast and robust monitoring of kidney transplants.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    It’s hard to reproduce.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Negative. The authors did not share the source code, and the results are limited.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    It is difficult to reproduce. The authors mention many methods, but the results are limited and not fully convincing. More experiments would strengthen the paper.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    According to my knowledge, this study is among the first to propose a novel, robust, and clinically relevant framework for forecasting serum creatinine directly from imaging data.

  • Number of papers in your stack

    7

  • What is the ranking of this paper in your review stack?

    5

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #4

  • Please describe the contribution of the paper

    Their contribution is twofold:

    • They propose the use of contrastive schemes, generating informative manifolds of DCE MRI exams of patients undergoing renal transplantation. Different self-supervised and weakly-supervised clinically pertinent tasks are explored to generate relevant features using the cosine similarity.

    • They present a transformer-based architecture for forecasting the serum creatinine score, while proposing a tailored method to deal with missing data.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    I think this is a novel study for forecasting serum creatinine directly from imaging data. They proposed two contrastive learning schemes to explore meaningful data representations and showed the significant benefit of contrastive learning for this task. They used a transformer encoder to input the sequential per-follow-up features to forecast renal transplant function, including a custom method to handle missing data.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The authors address the challenging task of serum creatinine prediction 2 years post-transplantation; however, the four follow-up exams end at M12, i.e., after one year, not two years as discussed.
    • The authors need to discuss the limitations of the proposed framework and when the proposed system can fail.
    • The authors used ResNet18 to extract a latent representation from the MRI volumes (size = 512×512×[64−88] voxels); however, ResNet18 typically uses an input image of size 224×224, which means a lot of information in the MRI images may be discarded during the training process, and this can affect the training result.

    • The testing sample size (20) is also very limited in this study.
    • The authors used only 10-fold cross-validation; did they try different cross-validation settings, and what was the outcome?
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    I cannot tell whether this method is reproducible, because the framework was not applied to a benchmark dataset.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    • They need to increase the test set.
    • The authors need to discuss the proposed system’s limitations, as well as when the system can fail.
    • The authors could use their own CNN instead of ResNet18, since this pretrained network loses a lot of information in MRI images.
    • The paper does not have enough figures to show how the proposed system works on MRI images, e.g., showing these images during the prediction stage over the two years.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Overall, the paper is interesting but requires a number of modifications and some additional experiments.

  • Number of papers in your stack

    1

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #5

  • Please describe the contribution of the paper

    This paper, “Contrastive Masked Transformers for Forecasting Renal Transplant Function”, presents a methodology for forecasting renal transplant function based on contrastive masked transformers. Overall, the problem is interesting. However, the paper needs more work to be presented at this prestigious conference.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    • Interesting research problem
    • Up-to-date reference list
    • The method outperforms related works

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    • Results are incomplete
    • Missing discussion

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The paper is reproducible

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    1. Comparisons with conference papers are not appropriate; strong journal papers should be used instead.
    2. Visual assessment of the results is missing.
    3. Discussion is missing: how and why does the method outperform others? Limitations should be given, along with why the approach does not work well in certain cases.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The methods and results are sufficient. However, the paper needs more discussion.

  • Number of papers in your stack

    1

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The authors developed and validated a methodology for forecasting renal transplant function based on contrastive masked transformers. The study used contrastive learning schemes to handle limited data and a key-mask mechanism to deal with missing data. The pipeline was trained and tested on 69 and 20 locally acquired patients’ data, respectively. The obtained results documented high performance compared with other common transfer learning and contrastive approaches. In total, the study has strong clinical relevance. The paper is well written and organized, the experiments are well set up, and the results are very promising. A few suggestions for the final version: elaborate more on the significance and limitations of the results. Also, add a summary of the obtained quantitative results and the data used to the abstract. Please list/discuss the rationale behind the specific choice of the 110 μmol/L threshold for the serum creatinine level. I am curious about the parameter choices in Section 3.3: were these set empirically or experimentally? Were the DCE-MRIs collected for the same patients during the four follow-up scans (the numbers differ in the manuscript)? Please give a reference for the LSTM model used for comparison and add a short description of the radiomic features and serum creatinine statistics in Table 1. More clarification about how the data are represented in Tables 1 and 2 should be given in the text and/or the captions (i.e., what the numbers 81, 1(10,8) mean). A visual representation of the predictions for good and bad cases (at different time points, using e.g. boxplots) is highly recommended to be added to the paper (R2 and R3).

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    2




Author Feedback

We would like to thank the reviewers and the area chair for their positive evaluation of our work, and we appreciate their comments and suggestions. While all the reviewers appreciated the novelty and the promising results of our approach, a number of questions were raised during the review process that we would like to address. These questions focus on i) clarity about the data presentation and ii) lack of discussion of the results. Moreover, beyond the hyperparameters already mentioned in the paper (Section 3.3), our source code (for both the contrastive and transformer parts) will be made available in the camera-ready version, ensuring the reproducibility of our method (R3).

  • Starting with the data concerns (AC, R4), the DCE-MRIs were collected from 89 different subjects for the 4 follow-up exams, including missing exams which resulted in respectively 68, 75, 87, and 83 available scans at each follow-up. Concerning the biological target (R4), the aim of this study is to forecast renal transplant function through the serum creatinine at 2 years post-transplantation from the DCE-MRI data acquired during the first year, so there is indeed a gap of 1 year between the last follow-up exam used and the prediction target. In our work, the rationale behind the specific serum creatinine level threshold (AC) was specified by nephrology experts, as a clinically relevant value to assess normal and abnormal renal transplant function at a specific time point.
  • Due to page limits, we did not provide an extensive discussion in the results sections, as pointed out by AC, R4, R5. We will incorporate the reviewers’ suggestions into the camera-ready paper (adding a more detailed description of the tables) and include further discussion of the results and the limitations of the approach. Indeed, our model seems to misclassify cases where the patient’s serum creatinine is stable and close to the used threshold during the first two years post-transplantation. Additional analysis of each subject’s specific clinical information would certainly give some insight into those erroneous cases.
  • Moreover, R4 points out a concern about volume sizes and information loss. Indeed, the initial MRI volume has a size of 512×512×[64−88] voxels. However, using an automatic way to crop a region of interest (ROI) [17] around the renal transplant, we fixed the training/inference input size to the biggest ROI (size=192×144×88 voxels), and the ResNet18 was trained from scratch using this input size and our contrastive schemes. As such, we reduce the data dimensionality while no information about the transplant is discarded.
  • Regarding AC’s question about the parameter choice in Section 3.3, the training hyperparameters (such as the learning rate and the number of epochs) were set empirically during our preliminary experiments, whereas the structural parameters of the model (such as N, h, D_model) were set by grid search using the 10-fold cross-validation.
  • To answer R4’s question about cross-validation, we experimented with both 5-fold and 10-fold cross-validation; the latter resulted in higher performance due to larger training sets (thus preventing overfitting) and provided more models for the ensemble at testing time (thus increasing robustness).
  • Finally, regarding the concerns about the small test set (R4), we want to highlight that, to our knowledge, our dataset is the largest MRI-based kidney transplantation dataset with multiple follow-up exams in the literature. In future work, we will try to incorporate additional external data, something that is not trivial due to the nature of our clinical problem and the absence of available datasets.
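The fixed-size ROI cropping the rebuttal describes (reducing 512×512×[64−88] volumes to a 192×144×88 box around the transplant) can be sketched as follows. This is a minimal NumPy illustration under assumed inputs; `crop_roi` and the center coordinates are hypothetical, and the authors' pipeline uses an automatic ROI detector [17] rather than a manually supplied center:

```python
import numpy as np

ROI_SHAPE = (192, 144, 88)  # largest transplant ROI reported in the rebuttal

def crop_roi(volume: np.ndarray, center: tuple, roi_shape=ROI_SHAPE) -> np.ndarray:
    """Extract a fixed-size box around `center`, zero-padding wherever
    the box extends past the volume border."""
    out = np.zeros(roi_shape, dtype=volume.dtype)
    src, dst = [], []
    for c, r, s in zip(center, roi_shape, volume.shape):
        lo = c - r // 2
        src.append(slice(max(lo, 0), min(lo + r, s)))
        start = max(-lo, 0)
        dst.append(slice(start, start + (min(lo + r, s) - max(lo, 0))))
    out[tuple(dst)] = volume[tuple(src)]
    return out

# A volume at the small end of the reported depth range (64-88 slices).
vol = np.random.rand(512, 512, 70).astype(np.float32)
roi = crop_roi(vol, center=(256, 256, 35))
print(roi.shape)  # (192, 144, 88)
```

Because every subject is mapped to the same box size, the ResNet18 backbone can be trained from scratch on full-resolution transplant voxels, which is the point the rebuttal makes about avoiding information loss.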
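The ensembling of the k fold models at testing time mentioned in the rebuttal can be sketched as soft voting over the per-fold predicted probabilities (a minimal sketch; the function name, threshold, and probabilities are illustrative, not taken from the paper):

```python
import numpy as np

def ensemble_predict(fold_probs: np.ndarray) -> np.ndarray:
    """Soft-voting ensemble: average the positive-class probabilities
    across fold models, then threshold the mean at 0.5."""
    mean_prob = np.mean(fold_probs, axis=0)
    return (mean_prob >= 0.5).astype(int)

# Hypothetical probabilities from 3 fold models on 2 test subjects.
fold_probs = np.array([[0.9, 0.2],
                       [0.8, 0.4],
                       [0.7, 0.1]])
print(ensemble_predict(fold_probs))  # [1 0]
```

With 10-fold cross-validation there are ten such models to average instead of three, which is why the rebuttal argues the larger fold count increases robustness at test time.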


