Authors
Hamza Kebiri, Ali Gholipour, Rizhong Lin, Lana Vasung, Davood Karimi, Meritxell Bach Cuadra
Abstract
Diffusion Magnetic Resonance Imaging (dMRI) is a powerful non-invasive method for studying white matter tracts of the brain. However, accurate microstructure estimation with fiber orientation distribution (FOD) using existing computational methods requires a large number of diffusion measurements. In clinical settings, this is often not possible for neonates and fetuses because of increased acquisition times and subject movements. Therefore, methods that can estimate the FOD from reduced measurements are of high practical utility. Here, we exploited deep learning and trained a neural network to directly map dMRI data acquired with as low as six diffusion directions to FODs for neonates and fetuses. We trained the method using target FODs generated from densely-sampled multiple-shell data with the multi-shell multi-tissue constrained spherical deconvolution (MSMT-CSD). Detailed evaluations on independent newborns’ test data show that our method achieved estimation accuracy levels on par with the state-of-the-art methods while reducing the number of required measurements by more than an order of magnitude. Qualitative assessments on two out-of-distribution clinical datasets of fetuses and newborns show the consistency of the estimated FODs and hence the cross-site generalizability of the method.
Link to paper
DOI: https://doi.org/10.1007/978-3-031-43990-2_28
SharedIt: https://rdcu.be/dnwLD
Link to the code repository
https://github.com/Medical-Image-Analysis-Laboratory/Perinatal_fODF_DL_estimation
Link to the dataset(s)
https://www.developingconnectome.org/data-release/second-data-release/
Reviews
Review #1
- Please describe the contribution of the paper
The authors proposed a U-net-based framework to predict spherical harmonics coefficients (and later fiber orientation distribution FOD function) from a limited number of diffusion-weighted images. The authors demonstrated the application in case of newborn and fetus dMRI.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Impressive performance in predicting the FOD with as few as 6 DWIs at a low b-value. The FOD reflects the white matter and often needs a large number of DWIs, particularly at high b-values, for the signal to be sensitive to intra-axonal diffusion. Thus, the performance of the proposed method is remarkable.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- Lack of comparison to other learning-based methods.
- Lack of discussion on how the authors achieved such impressive performance. While the results are impressive, the network architecture is simple. Such discussion gives important insights on why the proposed method performs impressively.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
Authors stated code will be released upon acceptance.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
- Is there any particular reason to use 12 DWIs for fetal instead of 6 DWIs (like in the case of neonate data)? Is this because of the low b-value? If so, what is the lowest number of DWIs the algorithm can accept (for reasonable results)?
- As stated in the manuscript, FOD estimation is vital to tractography. Adding tractography results would highlight the benefit of the proposed method. Note that improvements in FOD estimation might not be equivalent to improvement in tractography as tractography algorithms are designed to ignore spurious/small amplitude FOD.
- The authors should compare the proposed method with other learning-based approaches such as Ref. [18] or [25]. While those approaches use more DWIs than the proposed method, comparison with those approaches using a) more DWIs and b) only 6/12 DWIs will give a more accurate benchmark of the proposed method’s performance.
- MSMT-CSD is a multi-tissue model while CSD is a single tissue (WM) or 2-tissue (WM and CSF) model. Visual comparison of the proposed method (trained from MSMT model) and CSD might not be fair as results from CSD always pool GM as part of WM, creating spurious FOD glyphs. I would suggest comparison with single-shell 3-tissue (SS3T https://3tissue.github.io/doc/ss3t-csd.html)
- How was the subsampling from GT to GT1 and GT2 done? Were the DWIs picked randomly or following a scheme?
- More discussion should be included, as the model is simple yet the results are impressive (which is a good thing). But without discussion and comparison with other learning-based methods, it is hard to evaluate the proposed method. Comparisons with CSA, CSD, and SFM do not give a clear picture, as CSA, CSD, and SFM will always give results different from MSMT (which the proposed method was trained on) because they are different models.
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
5
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Simple method with impressive results. However, lack of comparison with other learning-based approaches (although mentioned in the manuscript) undermines the validity of the work.
- Reviewer confidence
Confident but not absolutely certain
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
5
- [Post rebuttal] Please justify your decision
I disagree with the authors on tractography. Improvements in fODF estimation are not always equivalent to improved tractography, and thus a comparison using tractography is needed. In particular, when compared with the existing learning-based method (provided in the rebuttal), the numerical improvement is not big. On the other hand, the merit of the paper is the impressive performance with just a small number of gradient directions. This is highly useful in a clinical context.
Review #2
- Please describe the contribution of the paper
The long scan time of diffusion MRI is especially problematic for fetal and neonatal brain imaging where subject motion is inevitable. Subsampling the diffusion gradient scheme, the most direct means to reduce scan time in dMRI, reduces effective SNR and limits the scope for more advanced dMRI analysis such as estimating the fiber orientation distribution (FOD). This paper aims to address this issue by developing CNNs that predict the FOD from subsampled fetal and neonatal dMRI scans. The network is trained in neonatal data from the Developing Human Connectome Project and evaluated in clinical neonatal and fetal data.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Reducing the number of dMRI images required is an important problem, not only for perinatal imaging but for pediatric imaging in general.
- The paper includes an extensive quantitative evaluation of the strengths and weaknesses of this approach.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- The predicted FODs in clinical newborns and fetuses, shown in Fig. 3 and Fig. S2, seem to clip at zero in large regions of the brain. The paper suggests that this is a desired effect, and states that CSD overestimated false positive crossing fibers in these regions. However, if that were the case it is very surprising that there is more evidence of early developing WM at 25 weeks GA (fetal; top) than there is at 29 weeks (fetal; bottom). There is well documented evidence of early developing white matter pathways in the fetal brain at this age in the regions shown in this image (Wilson et al., PNAS 2021).
- Training a network to map dMRI to FOD images implies that the network learns the tissue response function(s) otherwise used in (multi-tissue) CSD. Given the dependency of the response function on the b-value, TE and other acquisition settings, there is a risk of bias when using a pre-trained network in data collected with other acquisition settings, such as the clinical data included in this work. I think this potential source of bias warrants a more extensive discussion.
- Please rate the clarity and organization of this paper
Excellent
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
The responses to the reproducibility checklist appear correct.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
I have no additional comments.
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
6
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Interesting paper with some remarks for discussion.
- Reviewer confidence
Very confident
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
N/A
- [Post rebuttal] Please justify your decision
N/A
Review #4
- Please describe the contribution of the paper
The paper uses a U-Net inspired approach to estimate the FODs (fiber orientation distributions) for neonate and fetal diffusion MRI data. Downsampling of the number of gradient directions is explored to allow robust estimates from only 6 and 12 gradient directions for neonates and fetal brains, respectively. Densely sampled, multiple-shell dHCP data (300 measurements across 4 b-values) is used to train and test the network (100 neonates, 68 fetal scans). Results are provided in comparison with three methods (CSD, CSA, SFM) and two independent subsets created from the ground truth dHCP data. Qualitative comparisons are also provided with a clinically obtained dataset of 8 neonates and fetal scans, each.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The paper is well written with many details provided for easy reproducibility. It provides a sound evaluation using both public and clinically acquired data. In general, newborn and fetal scans are more challenging due to the lower SNR and shorter acquisitions, as well as the naturally high variability across developing brains. The results look promising, given the challenging nature of the problem. The novelty is mainly in the application to newborn and fetal population (although some related prior work exists, e.g. Karimi 2021 [17]) and a qualitative comparison with clinically acquired scans.
The paper also quantitatively reports the disagreement between three state-of-the-art (SOTA) methods in their reconstructed FODs and the related error metrics. This highlights the overall issue of reproducibility and robustness required from these methods, especially for newborn and fetal data which is more prone to time constraints and motion artifacts.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
Some of the SOTA methods used for comparison raise a few questions. Since some of these suggestions may require additional experiments, they are better suited for an extended journal paper. However, their absence makes it harder to fully evaluate the novelty and results of the paper:
- For comparison, why was CSD constructed only with the b=2600 data, and not with data from other shells (especially since the clinically acquired test data used for comparison is also at b=1000)?
- Additionally, it would be more meaningful to use other deep learning approaches for a more direct comparison. Lin et al. [25] and Hosseini et al. [13] also downsample to a reduced number of gradient directions, estimate FODs, use deep learning (CNN), and exploit the spatial continuity information by using the neighboring voxels, similar to the approach proposed by the authors. Kopper et al. [22] also use a CNN and downsampling to estimate FODs. Since the primary novelty of the proposed work is the application to neonatal and fetal scans, this comparison is important to understand the differences required in network learning to adapt the prior work to this population and the resulting effects on performance. Karimi 2021 [17] use deep learning for direct parameter estimation in a fetal population. The authors could compare their estimated FA maps with these.
- Please rate the clarity and organization of this paper
Very Good
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
The paper is easy to reproduce - most important details are provided.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
For validation, adding a benchmark tractography strategy and fiber-related error metrics would further strengthen the quantitative comparison between methods and their estimated FODs. Often, tractography is the next logical step after FOD computation for further analysis and group studies. It would be good to see whether, beyond the disagreement in FOD error metrics, the noisy FODs lead to noisier fiber estimation as well, or whether a good tractography algorithm wipes out some of the differences reported across the methods based on FODs.
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
5
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The paper is easy to read and reproduce, and provides very detailed results and evaluation on a challenging population with promising results. The comparison with competing methods has scope for improvement as it misses out on closely related deep learning methods, reducing the available insights into the adaptation of deep learning to the newborn and fetal population.
- Reviewer confidence
Confident but not absolutely certain
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
N/A
- [Post rebuttal] Please justify your decision
N/A
Primary Meta-Review
- Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.
Reviewers agree that this paper addresses an important problem, are impressed by the results, and are generally in favor of acceptance. However, they also point out that the method as such is not very innovative, that the ultimate benefit remains unclear (since no tractography results are presented), and that the experimental comparison does not include the most relevant competitors. This specific combination of concerns has led to substantially lower scores on many other submissions, and I think it is fair to directly compare this submission to those after the rebuttal phase. While answering the reviewer questions to the extent possible, I would encourage prioritizing the following questions:
- If you see novelty in your methodological approach that you believe the reviewers did not fully appreciate, please highlight it.
- Comment on the ability to generalize despite learning a specific response function (R2).
- Comment on plausibility of some of the predicted FODs (R2).
Author Feedback
Novelty: We emphasize that the main contribution of this work lies in the application domain, where our technique allows robust estimation of fiber orientation distribution (FOD) from a drastically lower number of measurements than standard techniques require. However, our model is also technically novel. Our network is built upon the powerful U-Net architecture (Ronneberger et al. 2015), but instead of pooling layers, it exploits strided convolutions and extended skip connections, as well as a large field of view with a patch size of 16x16 (Fig. S1). In fact, these components gave us a significant gain in performance. We acknowledge that, to use our limited paper space most efficiently, we had to organize the paper and allocate enough space to highlight our main contribution in the application domain, which seemingly resulted in an under-representation of the methodological aspect of the work. However, as the reviewers have noted, the work is reproducible, and we will share the model, algorithm, and code.
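As an illustration of the downsampling mentioned above (not taken from the authors' code), a strided convolution shrinks each spatial dimension according to the standard output-size formula. Only the 16x16 patch size comes from the rebuttal; the 3x3 kernel, padding of 1, stride of 2, and three encoder levels in this sketch are assumptions chosen for the example.

```python
def conv_output_size(n, kernel=3, stride=2, padding=1):
    """Spatial size after one convolution: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel) // stride + 1


def encoder_sizes(patch=16, levels=3, **conv_kwargs):
    """Patch side length at each encoder level (hypothetical level count)."""
    sizes = [patch]
    for _ in range(levels):
        sizes.append(conv_output_size(sizes[-1], **conv_kwargs))
    return sizes


if __name__ == "__main__":
    # Kernel, stride, padding and depth are assumed values, not the paper's.
    print(encoder_sizes())
```

Under these assumed settings each strided convolution halves the patch, which is the role a pooling layer would otherwise play, except that the downsampling weights are learned.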
Comparison to SOTA: We trained the MLP in Karimi et al. 2021 (which outperformed Lin et al. 2019) in newborns. MLP performance (agreement rate, AR: 65%, 11% & 4%; angular error, AE: 49°, 39° & 37° for 1-, 2- and 3-fibers respectively; AFD error: 0.67) is lower than that of our model (AR: 78%, 16% & 4%; AE: 13°, 24° & 33° for 1-, 2- and 3-fibers respectively; AFD error: 0.27). These results can be added to the manuscript.
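For readers unfamiliar with the angular error quoted above, the following is a minimal sketch of how the angle between two fiber orientations is commonly computed. It assumes orientations are given as 3D vectors and treats them as antipodally symmetric (a fiber along v is the same as one along -v). The exact peak-matching procedure used in the paper is not described here, and the function name is hypothetical.

```python
import math


def angular_error_deg(u, v):
    """Angle in degrees between two fiber orientations, ignoring sign."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # abs() folds antipodal directions together; min() guards against
    # rounding slightly above 1 before acos.
    cosine = min(1.0, abs(dot) / (norm_u * norm_v))
    return math.degrees(math.acos(cosine))


if __name__ == "__main__":
    print(angular_error_deg((1.0, 0.0, 0.0), (0.0, 1.0, 0.0)))
```

Because of the sign folding, the error always lies between 0° and 90°.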
Response function generalization (R#2): The dHCP and the clinical newborn datasets have the same anatomy (same ages) and were acquired with the same b-value (b1000). As such, the generalization problem is limited. The generalization from preterms to fetuses is more challenging and would require extensive validation in fetuses, including pathology, but this was beyond the scope of this work. That said, the difference in b-values was small (training on b400 and testing on b500), and we did normalize by b0 before training to reduce b-value dependency. Further, the network was trained on preterms (as in Karimi et al. 2021) with age ranges overlapping those of the fetuses, and we show that the generated FODs are coherent with the known anatomy.
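The b0 normalization mentioned above can be sketched as follows. This is a per-voxel illustration only: the function name, the list-based signal layout, and the handling of near-zero b0 values are assumptions, not the authors' implementation.

```python
def normalize_by_b0(dwi_signals, b0, eps=1e-6):
    """Divide each diffusion-weighted signal by the non-weighted (b0) signal.

    Returns zeros when b0 is too small to divide by safely (assumed
    convention for background voxels).
    """
    if b0 <= eps:
        return [0.0 for _ in dwi_signals]
    return [s / b0 for s in dwi_signals]


if __name__ == "__main__":
    # Two hypothetical DWI intensities from one voxel with b0 = 100.
    print(normalize_by_b0([50.0, 25.0], 100.0))
```

Dividing by b0 removes the scanner- and subject-dependent intensity scale, leaving the signal attenuation, which still depends on the b-value itself.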
Fetal FOD plausibility (R#2): It is important to note that FOD visualization does not necessarily correlate with the degree of “fiber maturation”, which is displayed differently in fetal brains (axonal growth, transient wiring) compared to postnatal brains (predominant myelination of fiber pathways). Also, the regions compared in Fig. 2 (top & bottom rows) are not identical, and the visualization of FODs depends on a magnitude parameter (MRview tool). Lastly, based on histological evidence, the subplate zone primarily contains water and extracellular molecules and increases in volume during gestational weeks (GW) 20-30 (Vasung et al. 2016; Kostovic et al. 2020). Only after 33 GW does the size of the subplate water compartment diminish, and the fibers likely become denser. This observation is partly supported by the increasing mean fractional anisotropy (FA) in Wilson et al. 2021.
Tractography (R#1, R#3): Analyzing tractograms can be very interesting. However, tractography lacks a ground truth, so its evaluation remains qualitative and suffers from confounds such as the selected method or the choice of parameters. On the other hand, the correctness of the FODs is necessary to generate a plausible tractography from FODs. Moreover, FODs alone provide valuable local information such as the number of fibers, their orientation, their density (Raffelt et al. 2012, Reisert et al. 2013), the multi-directional anisotropy (Tan et al. 2014), gFA and other rotationally-invariant metrics derived from the SH coefficients.
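As an example of a rotationally invariant metric derived from SH coefficients, generalized fractional anisotropy can be written as gFA = sqrt(1 - c_0^2 / sum_j c_j^2). The sketch below assumes the coefficients are expressed in an orthonormal SH basis with the isotropic (l = 0) term stored first; the function name is hypothetical and the snippet is not taken from the authors' code.

```python
import math


def gfa_from_sh(coeffs):
    """gFA from SH coefficients (orthonormal basis, l=0 term first)."""
    total = sum(c * c for c in coeffs)
    if total == 0.0:
        return 0.0  # empty or all-zero voxel: report no anisotropy
    # max() guards against tiny negative values from rounding.
    return math.sqrt(max(0.0, 1.0 - coeffs[0] ** 2 / total))


if __name__ == "__main__":
    # Purely isotropic function: only the l=0 coefficient is non-zero.
    print(gfa_from_sh([1.0, 0.0, 0.0, 0.0, 0.0, 0.0]))
```

A purely isotropic function gives 0, and the value approaches 1 as the energy moves into the higher-order coefficients.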
Others: 1) CSD (R#3): We used the highest shell (b2600), as recommended in Tournier et al. 2007. 2) Fetal input (R#1): Due to the low SNR of fetal data, 6 fetal measurements did not produce satisfactory results, which is why all directions were used.
Post-rebuttal Meta-Reviews
Meta-review # 1 (Primary)
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
It is clear from the rebuttal that the main contribution of this work is on the application side, and based on the presented results, there is sufficient reviewer support for this paper. Even though I vote for acceptance based on the reviewer feedback, I still do not agree with the authors’ statement in the rebuttal that “the correctness of the FODs is necessary to generate a plausible tractography from FODs”. In my experience, tractography can be surprisingly robust to local errors in FODs, such as spurious peaks in individual voxels, and I still do not find it obvious how large the practical benefit of the presented improvements will really be.
Meta-review #2
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
While some of the weaknesses pointed out by the reviewers remain, especially pertaining to the limited technical novelty, the clinical and practical relevance is there and the results are interesting. I therefore recommend acceptance.
Meta-review #3
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
The paper is technically sound and interesting and addresses an important challenge in fetal and neonatal imaging. It may have limited technical novelty; however, its application in a clinical setting and its good results justify its acceptance as a conference paper.