
Authors

Davood Karimi, Ali Gholipour

Abstract

Deep learning has a great potential for estimating biomarkers in diffusion weighted magnetic resonance imaging (dMRI). Atlases, on the other hand, are a unique tool for modeling the spatio-temporal variability of biomarkers. In this paper, we propose the first framework to exploit both deep learning and atlases for biomarker estimation in dMRI. Our framework relies on non-linear diffusion tensor registration to compute biomarker atlases and to estimate atlas reliability maps. We also use non-linear tensor registration to align the atlas to a subject and to estimate the error of this alignment. We use the biomarker atlas, atlas reliability map, and alignment error map, in addition to the dMRI signal, as inputs to a deep learning model for biomarker estimation. We use our framework to estimate fractional anisotropy and neurite orientation dispersion from down-sampled dMRI data on a test cohort of 70 newborn subjects. Results show that our method significantly outperforms standard estimation methods as well as recent deep learning techniques. Our method is also more robust to higher measurement down-sampling factors. Our study shows that the advantages of deep learning and atlases can be synergistically combined to achieve unprecedented biomarker estimation accuracy in dMRI.
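
The abstract does not include an implementation, so the following is a minimal sketch (not the authors' code) of how the four inputs named above (down-sampled dMRI signal, biomarker atlas, atlas reliability map, and alignment error map) could be concatenated channel-wise and fed to a patch-based 3D network. All names, shapes, and the placeholder convolution are assumptions for illustration only; the paper's actual model is a U-Net.

```python
# Minimal sketch (not the authors' code): channel-wise concatenation of the
# dMRI signal with atlas-derived maps before a patch-based 3D CNN, assuming a
# PyTorch implementation. All names and shapes are hypothetical.
import torch
import torch.nn as nn

n_meas = 12          # number of down-sampled dMRI measurements (e.g., 12 for FA)
patch = 48           # 48^3-voxel patches, as stated in the author rebuttal

# Hypothetical inputs for one patch (batch size 1):
dmri   = torch.rand(1, n_meas, patch, patch, patch)  # down-sampled dMRI signal
atlas  = torch.rand(1, 1, patch, patch, patch)       # aligned biomarker atlas (e.g., FA)
relmap = torch.rand(1, 1, patch, patch, patch)       # atlas reliability map
errmap = torch.rand(1, 1, patch, patch, patch)       # atlas-to-subject alignment error map

x = torch.cat([dmri, atlas, relmap, errmap], dim=1)  # shape: (1, n_meas + 3, 48, 48, 48)

# Any 3D U-Net-like regressor with n_meas + 3 input channels and one output
# channel (the biomarker map) could consume this tensor; a single convolution
# stands in for it here.
model = nn.Conv3d(n_meas + 3, 1, kernel_size=3, padding=1)
biomarker_patch = model(x)                           # shape: (1, 1, 48, 48, 48)
```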

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16431-6_12

SharedIt: https://rdcu.be/cVD4T

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper proposes a novel framework to unify atlases and DL for dMRI biomarker estimation. It was used to estimate fractional anisotropy and neurite orientation dispersion from down-sampled data and achieved superior accuracy.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • extensive comparison with competing methods
    • according to their results, the proposed method is faster and more accurate than existing ones
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • implementations of other DL methods might not be correct
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Although the architecture is simple (U-Net), the full method is rather complex. Without the code, it seems unlikely one can reproduce the experiments. Although the authors mentioned in the reproducibility checklist that the code was made available, they did not mention it in the paper.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    It is never easy to validate implementations of others' methods. It would be better to use the original code/models for comparison purposes.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    7

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Good paper with solid results. Reproducibility could be improved by sharing the code.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    2

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered



Review #2

  • Please describe the contribution of the paper

    This paper proposes a novel method that exploits deep learning together with atlases of brain microstructure parameters. This is used for the estimation of microstructure scalar parameters from diffusion MRI.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper proposes to learn from an atlas in addition to the diffusion images to evaluate on a new set of data microstructure scalar parameters. This approach seems novel to me and results are of interest.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • there is no justification as to why learning from an atlas would be a better idea than learning from the individual images themselves, although the results suggest a benefit (see the tables)
    • in the methods, the atlas construction process is missing important details; overall, the clarity should be improved
    • learning only scalar parameter values seems to be the lower end of what could be achieved; why not learn complete models?
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Nothing to say

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    In addition to the comments above, the paper lacks details in the methods explanation. In particular, is the atlas built in an unbiased way or towards a given reference? If the latter, which one? How are the average tensor images computed, what type of tensor rotation is applied, etc.?

    Also, it seems to me that there are notation mismatches between Section 2.2 and Eq. 2 (what is \bar{T}_k?).

    From a more philosophical point of view, there is no clear justification as to why DL would work better on an atlas than on the input images themselves. Is there also a reason for not considering full models rather than “just” microstructure maps? This is a bit disappointing in terms of novelty. There should be a discussion of those aspects.

    That being said, the results section is well constructed, and the evaluation compares results with and without the atlas, showing improvement.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Overall, I find the paper interesting, but it is brought down somewhat by limited novelty and some missing parts in the methods; hence my recommendation of weak reject.

  • Number of papers in your stack

    3

  • What is the ranking of this paper in your review stack?

    2

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered



Review #3

  • Please describe the contribution of the paper

    The major contribution of this work is a framework that combines atlas-based registration with deep learning for the estimation of FA and orientation dispersion from dMRI, via tensor and NODDI models. Specifically, the authors propose using atlas-registered features together with an atlas reliability map (given by the standard deviation) and a registration error map (a proposed formulation). These three are concatenated with the diffusion signal as inputs to the DL fitting model.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    • The paper contributes a novel approach that leverages atlas-derived information together with the dMRI signal for the estimation of FA/OD scalars. To my knowledge, the idea and the approach are both novel.
    • The authors use dHCP data for atlas construction (230) and have a “pure” test set of 70 subjects across different ages (31 to 45 weeks).
    • Comparison with existing approaches and statistical significance of the results.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    • Limited proof that the outperformance of the proposed framework is due to the atlas.
    • Limited analysis and discussion of the results, e.g., is there any bias in the results with respect to age, or at what point does the atlas stop being useful? (See my detailed comments below.)
    • Questionable choice of registration method (also acknowledged by the authors).

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors state that the training code will be made available, and already publicly available data is used. The parameters of the compared methods are not clearly stated (probably due to space limitations); the authors claim they were chosen according to the original papers' descriptions, but this does not fully support reproducibility of the SOA results. It is unclear whether statistical tests were done for the ablation study.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    Overall, I have found this paper very interesting and, as raised in the strong points, the proposed framework is novel. The paper is clearly written and organized. The proposed framework is methodologically sound, though some choices are unclear to me. The major drawback of the presented work is the lack of discussion and further analysis that would, in my opinion, have been valuable to better illustrate the contribution of this work. I do, of course, acknowledge the limited space of the MICCAI template. Still, I will raise below my concerns/comments that could perhaps help the authors improve their work in the future.

    The authors did very nice work by including several existing methods for comparison and adding statistical tests on the results.

    I was wondering why the authors selected the given downsampling factors, which indeed end with a very low number of measurements overall. It would have been interesting to explore the number of directions, if any, at which the proposed method no longer outperforms the others, or whether the improvement is systematic. For instance, 32 directions is still a plausible setting for newborn data and should be included.

    Equally, this reviewer is very curious to understand whether the results vary with age at scan. It would have been interesting to see boxplots across different ages (also in order to understand whether there are outliers). If the authors have already explored this, a comment could be included.

    Similarly, can the authors discuss whether there is any specific anatomical region where the method outperforms? Given that atlases are less accurate at the GM/WM cortical interface because of large cortical variability, maybe in those areas the atlas prior is less accurate and therefore less relevant? How does the accuracy change for WM structures, where overall these models are most appropriately defined?

    It would also have been interesting to push further the study of the influence of the registration error. The authors could have included rotation/translation errors to see how they influence the performance of the proposed method.

    I do appreciate the ablation study, which is indeed very much needed, but it should be clarified whether the results in Table 3 are statistically significant. It is not obvious what the contribution of each step is, given the values in Table 3; in my opinion, the first two columns seem quite equivalent. Please clarify. Also, why not include n=12 (FA) and n=30 (OD) in that table? Maybe Figure 2, panel (c), could be removed to give more space to the ablation study results.

    As the authors already acknowledge, the choice of diffusion tensor registration can be questioned, since for accurate registration the use of a T2w image seems more appropriate (though T2/dMRI alignment is then needed). If space allows, some more discussion/justification of this registration choice could be included.

    Minor

    • The authors could clarify, from the end of the introduction (page 2), that down-sampled dMRI refers to downsampling of the diffusion gradient directions, not of the spatial resolution.
    • Typo on page 6, 2nd paragraph: “to” appears twice before references 30, 17.
    • Could the authors clarify whether all the methods work at the same spatial resolution?
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I think this is a fair conference paper, given the novel contribution of combining different aspects of atlases with DL for solving a widely practical problem of scalar diffusion and microstructure estimation from dMRI. Despite the limitations and comments I raised, this paper is valuable for discussing during the conference.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The use of deep neural network architectures to estimate biomarkers from sparsely sampled dMRI data is a well-established research topic at MICCAI. This manuscript argues that using an atlas as an additional input can further improve the accuracy of the resulting estimates. Two of the three reviewers view this as a fair and solid conference contribution, while Reviewer #2 was not fully convinced by the authors’ claim that adding an atlas “makes much intuitive sense”. I think their concern could be re-expressed by saying that neural network architectures such as the U-Net do not rely on a per-voxel mapping of measurements to biomarkers, but also account for spatial information. As such, it is not obvious why they should not be able to implicitly learn a spatial prior from the training data that could be equivalent to the information contained in an atlas. Even though the authors demonstrate a (small) practical benefit from using the atlas as an additional input, this might rely on their use of a patch-based neural network, which can be expected to prevent implicit learning of a global spatial prior in the network itself and/or the addition of temporal information via the use of age-specific atlases, which could alternatively be captured by training separate networks (or network branches) for different age groups. I believe authors should be given the opportunity to address this major point in a rebuttal, along with the following minor ones:

    • how exactly were the atlases constructed (R2)?
    • can you comment on performance with more moderate downsampling factors (R3)?
    • can you comment on whether the benefit is most pronounced at specific ages or in specific brain regions (R3)?
    • can you comment on statistical significance of differences in the ablation study (R3)?
    • I personally find Eq.(2) to be counter-intuitive. Given that theta takes on random values if at least one involved tensor has low FA, should we not reduce its impact to zero in those cases to achieve a stable computation? Intuitively, for isotropic tensors, the accuracy of the match is not affected by rotating the tensor. The proposed neg-exponential weight expresses the opposite.
    • given that high values in the “confidence maps” indicate low confidence, wouldn’t it make more sense to refer to them as “uncertainty maps”?
  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    1




Author Feedback

We thank all reviewers.

  • The main point raised by Meta-R and R2 is regarding “justification” for use of atlas. As summarized by Meta-R: “neural nets can exploit spatial information; it isn’t obvious why they shouldn’t learn a spatial prior from training data that could be equivalent to atlas”. The answer is that the atlas is built from biomarker maps estimated with fully-sampled data. The DL model, on the other hand, is trained to estimate biomarkers from under-sampled data. We append the atlas information to under-sampled data to form the input to our model. Moreover, we build the atlas via tensor-based diffeomorphic registration, thereby accurately aligning white matter structures simultaneously across multiple subjects. Although DL models can learn spatial information within a subject’s image, they are not capable of encoding the information obtained via tensor-based diffeomorphic deformation that is used in building the atlas. We have partly described this justification in the paper; we can add one or two sentences to further clarify.
  • Related to the above comment, Meta-R wrote: “use of a patch-based network … can prevent implicit learning of a global spatial prior”. Note that we used 48^3-voxel patches (at 1.2mm resolution) that, given the small newborn brains, cover most of the brain. The use of patches (rather than the entire brain) is common practice because it also acts as data augmentation. Our comparisons with other methods and ablation experiments (Tables 1 and 2) clearly demonstrate that the superiority of our atlas-powered method is not due to the use of patches: not only do the ablation experiments show this for our model, but three of the compared models also use the entire brain as input.
  • Atlas creation (Meta-R and R2): atlas creation itself is not a contribution of our work and we don’t have space to give details; we can include more references that describe all details.
  • More moderate downsampling (R3): Recent DL works use 6 measurements for DTI and ~20 for NODDI. We have already included more moderate downsampling (12 for DTI and 30 for NODDI). Our method achieves superior results with even more moderate downsampling. We don’t have space to add results but we can comment on it briefly.
  • Dependence on age or brain regions (R3): We haven’t observed any dependence of performance on age or region. This is in part evident from example error maps for different ages in Fig 3. We can add a sentence to stress this.
  • Statistical significance in the ablation study (R3): With one exception (FA between the first and second columns), all differences are significant (p<0.01), as computed with a paired t-test (a minimal, illustrative sketch of this test is given after this list). We can add this information to Table 2.
  • Meta-R’s comment on Eq. 2: This equation is meant to capture the registration error. Since this is a tensor-based registration, voxels with isotropic tensors should get higher values, because tensor-based alignment is less reliable for those voxels. Since the major tensor eigenvectors may align by chance (making the first term artificially small), we have added the second term, which gives a higher value to tensors with lower FA (a hedged sketch of such an error metric is given after this list).
  • Estimating full models (R2): Our method can be used for full models as well. Due to space limit we cannot add more results, but we can mention it.
  • We respectfully disagree with R3 that tensor registration is “questionable”. It is well known that tensor-based registration is superior to anatomical (e.g., T2) registration in dMRI as it can align local white matter. As we mentioned in Conclusion, adding anatomical information might improve the alignment further, but our approach is not questionable.
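
The paired t-test mentioned in the statistical-significance item above is standard; the following is a minimal, illustrative sketch (not the authors' code) using scipy.stats.ttest_rel, with placeholder per-subject error values for two hypothetical methods.

```python
# Minimal illustration of a paired t-test over per-subject errors (not the
# authors' code); the error arrays below are random placeholders.
import numpy as np
from scipy.stats import ttest_rel

rng = np.random.default_rng(0)
err_method_a = rng.normal(0.030, 0.005, size=70)  # e.g., per-subject FA error, method A
err_method_b = rng.normal(0.028, 0.005, size=70)  # e.g., per-subject FA error, method B

t_stat, p_value = ttest_rel(err_method_a, err_method_b)
print(f"paired t-test: t = {t_stat:.3f}, p = {p_value:.4f}")
```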
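The exact form of Eq. 2 is not reproduced on this page, so the following is only a hedged sketch of a per-voxel tensor-alignment error in the spirit of the rationale given above: an angle term between principal eigenvectors plus a term that grows as FA decreases. The function name and the specific combination of the two terms are assumptions for illustration, not the paper's formulation.

```python
# Hedged sketch of a per-voxel tensor-alignment error (not the paper's Eq. 2).
import numpy as np

def alignment_error(tensor_subject, tensor_atlas):
    """Angle between principal eigenvectors, plus a penalty that grows as FA drops.

    Both inputs are 3x3 symmetric diffusion tensors.
    """
    def principal_evec_and_fa(T):
        w, v = np.linalg.eigh(T)                  # eigenvalues in ascending order
        w = np.clip(w, 1e-12, None)               # guard against non-positive values
        md = w.mean()
        fa = np.sqrt(1.5 * np.sum((w - md) ** 2) / np.sum(w ** 2))
        return v[:, -1], fa                       # eigenvector of the largest eigenvalue

    e1, fa1 = principal_evec_and_fa(tensor_subject)
    e2, fa2 = principal_evec_and_fa(tensor_atlas)

    # First term: angle between principal directions, in [0, pi/2].
    theta = np.arccos(np.clip(abs(np.dot(e1, e2)), 0.0, 1.0))

    # Second term: larger when either tensor is closer to isotropic, reflecting
    # the rebuttal's point that eigenvector alignment is unreliable at low FA.
    low_fa_penalty = np.exp(-min(fa1, fa2))
    return theta + low_fa_penalty
```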

SUMMARY: As appreciated by all reviewers, our method is entirely novel: no prior work has used spatiotemporal atlases in the context of DL for biomarker estimation. Comparisons and ablation experiments show our method is significantly superior to multiple SOA methods in dMRI biomarker estimation, which is a highly important application. We look forward to sharing our work with the MICCAI community.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Based on the rebuttal, I substantially decreased my previous ranking of this paper. I see no argument why simply making networks age-specific should not be expected to lead to an improvement that is similar to the modest one that is demonstrated here as a benefit of the atlas. Instead, authors argue that the benefit results from the fact that the atlas is generated from fully-sampled data. This does not make sense to me. Why not simply use the same fully-sampled data to train the network, omitting some of the available information from the network inputs but still using them when generating the ground truth for the network output? Starting with the seminal work of Golkov et al., that has been the well-established procedure.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Reject

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    10



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors have addressed the justification for using atlas-based features for deep learning. I also agree with the two reviewers about the merits of this novel method for acceptance.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    6



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper proposes to use a spatiotemporal atlas built with fully-sampled data to guide a DL method to estimate diffusion features in under-sampled neonatal dMRI. The idea is novel, although it might not be the best solution. The rebuttal addressed some concerns, but not the question about the alternative solution of training separate networks for different age groups. Overall, this paper is worth discussing at MICCAI.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    8


