Collaborative Quantization Embeddings for Intra-Subject Prostate MR Image Registration
Authors
Ziyi Shen, Qianye Yang, Yuming Shen, Francesco Giganti, Vasilis Stavrinides, Richard Fan, Caroline Moore, Mirabela Rusu, Geoffrey Sonn, Philip Torr, Dean Barratt, Yipeng Hu
Abstract
Image registration is useful for quantifying morphological changes in longitudinal MR images from prostate cancer patients. This paper describes a development in improving learning-based registration algorithms for this challenging clinical application, which often comes with highly variable yet limited training data. First, we report that the latent space can be clustered into a much lower dimensional space than that commonly found as bottleneck features at the deep layers of a trained registration network. Based on this observation, we propose a hierarchical quantization method, discretizing the learned feature vectors using a jointly-trained dictionary with a constrained size, in order to improve the generalisation of the registration networks. Furthermore, a novel collaborative dictionary is independently optimised to incorporate additional prior information, such as the segmentation of the gland or other regions of interest, into the latent quantized space. Based on 216 real clinical images from 86 prostate cancer patients, we show the efficacy of both of the designed components. Improved registration accuracy was obtained with statistical significance, in terms of both Dice on the gland and target registration error on corresponding landmarks, the latter of which reached 5.46 mm, an improvement of 28.7% over the baseline without quantization. Experimental results also show that the difference in performance between training and testing data was indeed minimised.
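For readers less familiar with dictionary-based quantization, the following minimal sketch (not the authors' code; names, sizes, and losses are illustrative, in the style of VQ-VAE) shows the core mechanism the abstract describes: encoder features are snapped to their nearest entry in a jointly-trained dictionary, with a straight-through gradient so the registration network remains trainable end-to-end.

import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    """Quantize feature vectors against a jointly-trained dictionary."""
    def __init__(self, num_codes: int = 64, dim: int = 128, beta: float = 0.25):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)  # the learned dictionary
        self.beta = beta  # weight of the commitment term

    def forward(self, z: torch.Tensor):
        # z: (N, dim) flattened encoder features.
        dist = torch.cdist(z, self.codebook.weight)   # (N, num_codes)
        idx = dist.argmin(dim=1)                      # nearest dictionary entry
        z_q = self.codebook(idx)                      # quantized features, (N, dim)
        # Dictionary loss pulls codes toward features; commitment loss does the reverse.
        vq_loss = ((z_q - z.detach()) ** 2).mean() \
                + self.beta * ((z - z_q.detach()) ** 2).mean()
        # Straight-through estimator: gradients flow from z_q back to z.
        z_q = z + (z_q - z).detach()
        return z_q, idx, vq_loss

The paper's hierarchical design (the Dv and Dh dictionaries discussed in the reviews below) would apply such quantizers at more than one feature level, with the separately optimised collaborative dictionary (Dc) carrying segmentation priors; the sketch shows only the single-level mechanism.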
Link to paper
DOI: https://link.springer.com/chapter/10.1007/978-3-031-16446-0_23
SharedIt: https://rdcu.be/cVRS4
Link to the code repository
N/A
Link to the dataset(s)
N/A
Reviews
Review #1
- Please describe the contribution of the paper
This paper proposed a HiCo-Net for prostate MRI registration. The proposed HiCo-Net introduced a hierarchical quantizer with a collaborative dictionary to regularize the global and local feature maps of the CNN to generate a better deformation field. Experimental results showed that the proposed method can outperform other state-of-the-art methods in multiple evaluation metrics.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
This paper is well-written and easy to follow. The network architecture design showed enough technical novelty. The motivation is clearly stated in the introduction section.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
The major problem of this paper is the lack of important experiments, which might undermine the main motivation.
- Please rate the clarity and organization of this paper
Very Good
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
It is easy to reproduce since the authors provide detailed architecture design in the supp file.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
— The authors showed enough ablation studies to prove the effectiveness of the model design. However, when comparing against existing works, they only compared two works that are not SOTA. Please provide more comparison results with the following SOTA works in registration; they are in unsupervised settings but could be adapted to a supervised/weakly-supervised setting with public code:
[1] Balakrishnan, G., et al.: VoxelMorph: a learning framework for deformable medical image registration. (2019)
[2] Ye, M., et al.: DeepTag: An Unsupervised Deep Learning Method for Motion Tracking on Cardiac Tagging Magnetic Resonance Images. (2021)
[3] Liu, L., et al.: Contrastive registration for unsupervised medical image segmentation. (2021)
Minor modifications:
— Please change the name of Section 2.6 from "Training" to "Loss Function", since you have another subsection called "Training" in Section 3.1.
— It is better to modify the paper title so that the hierarchical quantizer idea is included in it.
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
5
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
This is a good submission with clear organization, and enough technical contributions. If the authors solve my major concern above, I will keep my rating.
- Number of papers in your stack
5
- What is the ranking of this paper in your review stack?
2
- Reviewer confidence
Very confident
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
6
- [Post rebuttal] Please justify your decision
The authors solved my main concern regarding the comparison with SOTA methods. Hence, I would like to stick with my original acceptance ranking.
Review #3
- Please describe the contribution of the paper
This paper describes an approach for (prostate) image registration. Conjecturing that deep networks for this task are overparameterized and that the feature space can be clustered into a few groups, the authors devise a vector quantization approach that improves performance.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The paper is very well-written (relating to language variety, broad content organization, level of detail of explanation and reasoning).
The authors’ approach is innovative, well-motivated, well-documented and thoroughly validated. As the approach is complex, the authors provide an ablation over method components, allowing the reader insight into how the different components work together and influence performance.
The method outperforms other approaches for this task.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
I don’t have major weaknesses to list.
- Please rate the clarity and organization of this paper
Excellent
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
The method’s architecture is reproducible. However, as the method is learning-based, it is only fully reproducible with access to the processing pipeline and the model parameters. The dataset seems to be private.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
As is apparent from my other comments, I enjoyed reading this paper very much. I liked the overall organization, the motivation with t-SNE embeddings, and the way quantization was introduced and built into the approach. Following the methods section, I expected to see an ablation, which is exactly what the authors provided. In addition, their method outperformed other approaches, which is a huge plus. I hope that this work finds its way into a journal at a later point.
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
8
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The paper is well-written. The approach is well-motivated, well-described, and innovative. The evaluation includes an ablation of method components and a favorable comparison against reference methods. For me, this paper checks all the boxes.
- Number of papers in your stack
4
- What is the ranking of this paper in your review stack?
1
- Reviewer confidence
Very confident
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
Not Answered
- [Post rebuttal] Please justify your decision
Not Answered
Review #2
- Please describe the contribution of the paper
This paper proposes a hierarchical collaborative quantization embedding for the image registration network to address the potential overparameterization problem. The authors also validate their proposed method on a private dataset.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The authors attempt to address a quite challenging but important topic: the potential overfitting problem in image registration.
- The idea of using collaborative and hierarchical quantization methods to reduce overparameterization is quite interesting, and the formulation seems to be quite novel.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- Format violations. The authors clearly modify the template in several places to gain extra writing space: 1) in Section 2, the vertical spacing below the section title "Method" and the following subsections, including "2.2 Model overview" etc., is clearly modified and reduced, which is strictly prohibited; 2) text is wrapped around Fig. 1, Fig. 3, and Fig. 4, which is strictly prohibited as well; 3) the supplementary file also exceeds the limit by 3 pages.
- The effectiveness of the proposed framework (collaborative + hierarchical) is not well-supported by the experiments. 1) From Table 1, the combination of Dv, Dh, Dc yields lower Dice conformance compared with Dv plus Dc, which seems unable to support the effectiveness of the proposed hierarchical quantization. 2) Interestingly, the vanilla quantization (Dv) alone also yields better Dice and CD scores than the proposed combination (Dv, Dh, Dc), which leaves the effectiveness of the combination of Dh and Dc in need of further exploration. 3) The proposed method also does not show superior performance compared to Dv plus Dh in terms of DSC and CD in Table 1, leaving the effectiveness of the pre-trained Dc questionable.
- Potential mis-selection of baseline models. From Table 2, the NiftyReg method yields even worse Dice conformance than no registration in Table 1. This shows either that NiftyReg doesn't work on the dataset or that the authors didn't implement it correctly. This also disqualifies it from the "state-of-the-art" methods, making the argument of superiority less supported.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
The method is validated on a private dataset and thus it is hard to comment on the reproducibility given the marginal improvement.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
- The authors might consider including the VAE and its variants in the literature review, as they share similar goals with the discrete quantization methods.
- Please double-check the typos and grammar throughout the paper (e.g. Section 2.2, decoder G()->D()?).
- Please add detailed support when making strong claims. For instance, "However, deep registration models are over-parameterized..." is not supported by any reference.
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
2
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The recommendation is based on the format violations and the main weaknesses listed above.
- Number of papers in your stack
4
- What is the ranking of this paper in your review stack?
4
- Reviewer confidence
Very confident
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
2
- [Post rebuttal] Please justify your decision
I thank the authors for the response. However, my concerns are not addressed.
- The "marginal gap" argument that the authors use in the rebuttal to explain why the proposed combination is less effective than other combinations also applies to their own claimed improvement. As the method is only tested on a private dataset, its reproducibility remains questionable.
- The format violations alone (e.g. the supplementary exceeds the limit by 3 pages, reduced vertical spacing, and wrapped text) should directly lead to desk rejection according to the conference guidelines.
Review #4
- Please describe the contribution of the paper
The authors present a prostate MR image registration algorithm based on a feature quantization framework and also introduce a collaborative embedding. The experiments were performed on 216 clinical prostate images and the results indicate a 28.7% decrease in the target registration error.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The main strength of the paper lies in the fact that it uses vector quantization to perform a weakly supervised image registration task. The paper is very well written and easy to follow.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
The major weakness of the paper is that it lacks timing comparisons for the various algorithms evaluated.
- Please rate the clarity and organization of this paper
Excellent
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
The authors have mentioned that the source code will be provided.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
The authors should compare the run times of the various algorithms evaluated.
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
7
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The paper is very well written and the analysis seems thorough. The evaluation is also very convincing. The timing aspect of the analysis was a major reason for my score.
- Number of papers in your stack
7
- What is the ranking of this paper in your review stack?
1
- Reviewer confidence
Very confident
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
7
- [Post rebuttal] Please justify your decision
The authors answered my queries and I will stick with my original rating.
Primary Meta-Review
- Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.
Collaborative Quantization Embeddings for Intra-Subject Prostate MR Image Registration
This submission tackles image registration by improving the latent space representations of deformation fields. The originality resides in quantizing this bottleneck space with dictionaries of learned feature vectors, enabled by clustering such a space. Evaluation is on a private dataset of prostate imaging. The clustered dictionary could be an original twist on exploiting a latent space in registration, but the paper may be missing experiments to fully appreciate the novelty. The reviewers' scores range widely, from high appreciation to important doubts that need to be addressed in a rebuttal:
- Quantization: the concept of exploiting the bottleneck layers and quantizing them could indeed be related to the VAE approaches used in learning-based registration (R2). Is there a reason for missing out on these approaches and choosing a baseline from 2010 or a U-Net?
- Validation: the choice and results of the baselines may indicate a possible confusion (R2), with one state-of-the-art method producing severely inferior Dice scores compared to no registration. Ambiguities in the ablation study may also require further clarification. I would also add that, beyond the Dice score, the quality of the deformation field should be evaluated as typically done in registration, for instance using the Jacobian maps of the deformation field.
- Reproducibility: the reviewers indicate the use of a private dataset (R1,2,3,4), while there exist public prostate datasets such as the NCI-ISBI challenge.
- Formatting issue: while all reviewers appreciated the paper as well-written, it appears there are important space squeezes in the produced pdf (R2); given the space limitation, which section would be removed while not jeopardizing the submission?
- What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).
9
Author Feedback
We thank the reviewers for their valuable comments and apologize for the format issues and typos in the paper. They will be revised in the final version. Specifically, Fig. 3 will be moved to the supplementary material, as Sec. 2.5 is already self-contained. In addition, the comparisons suggested by the reviewers will be added to Table 2. We will reduce the size of the figures and tables in the supplementary material to respect the page limit.
R1.1 Comparison with SotA. In revision, we will include the three methods under our weakly-supervised setting. However, we claim our original baselines also represent the SotA by producing on-par performance. The results yield: VM: DSC=0.76±0.08, CD=8.84±3.15, MSE=0.05±0.01, TRE=8.83±5.14; Tag: DSC=0.82±0.08, CD=7.59±2.90, MSE=0.05±0.01, TRE=7.45±4.81; Contrastive: DSC=0.85±0.11, CD=4.97±2.40, MSE=0.05±0.01, TRE=8.21±4.40. Here, VoxelMorph is a 3D UNet based on its original paper's setting, while our base model without quantization is an adapted, deeper 3D UNet, of which the results are reported in Table 2.
R1.2 The minor issues will be resolved in the revision.
R2.1 The formatting issues and typos will be revised in the final version (see the top of this rebuttal).
R2.2 Effectiveness. It is correct that our full model did not achieve the highest Dice score in all ablation studies in Table 1, though with a marginal difference. Importantly, TREs on independent landmarks are widely considered the more important metric for this clinical registration application, while other metrics such as MSE and Dice are provided mainly for reference purposes, as they were used as part of the combined loss. Compared with an unquantized baseline, our model obtained significantly better TRE results, which is consistent with our observations on the qualitative results (Fig. 5).
- Dh enhances the abstraction ability of low-level network features, benefiting local metrics such as MSE and TRE. This is consistent with our results of [DvDc vs. all] and [Dv vs. DvDh].
- Dc improves the model's semantic awareness of the organ, leading to better global alignment ability in terms of DSC and CD (see [Dv vs. DvDc]).
- The DSC drop in [DvDc vs. all] shows a trade-off between losses, though the gap is marginal. The proposed model, with the current metric-importance setting in this paper, considers the specific lesion, the landmarks, and the organ deformation jointly.
R2.3 NiftyReg obtains undesired results. We agree with R2 that applying classical methods can be challenging, for example in tuning their large number of parameters for our real-world clinical data, which have high variance and diversity. We do not claim that these results, represented by NiftyReg, are SOTA or optimal; they are provided as typical examples from these classical methods. We will clarify this in revision.
R2.4 Link to VAE. We stress that our HiCo-Net is functionally different from a VAE. A VAE is a generative model, while HiCo-Net is for regression. We focus on learning discrete bottleneck features, while a VAE learns stochastic continuous features for generation purposes. Our model also has no direct link to the VAE since we are not formulating a probabilistic model of the DDF. The only relation is that both the VAE and HiCo-Net have an encoder-decoder structure. If one trains a registration UNet with a VAE-like reparametrization trick, its inference reduces to a conventional UNet forward pass, while we always keep quantization active during inference. We report the performance of a VAE-like model with our backbone as follows: DSC=0.86±0.02, CD=3.62±2.1, MSE=0.04±0.01, TRE=7.62±3.9.
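To make the inference-time distinction in R2.4 concrete, here is a hypothetical sketch (reusing the VectorQuantizer sketch given after the abstract; the encoder/decoder names are illustrative, not from the paper): a VAE-like registration network drops the sampling step at test time and reduces to a plain encoder-decoder pass, whereas the quantized model still snaps features to dictionary entries.

import torch

@torch.no_grad()
def vae_like_inference(encoder, decoder, x):
    mu, _logvar = encoder(x)  # the reparametrization trick is dropped at test time,
    return decoder(mu)        # so inference reduces to conventional forwarding

@torch.no_grad()
def quantized_inference(encoder, quantizer, decoder, x):
    z = encoder(x)
    z_q, _idx, _loss = quantizer(z)  # quantization stays active during inference:
    return decoder(z_q)              # features are snapped to dictionary entries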
R2.5 Statement support. Our statement on over-parameterization is mainly supported by Fig. 1, and it has also been frequently reported in the VAE and other compression/quantization literature. We will add a reference to clarify.
R4.1 Computational time. We provide the running times of the methods, VoxelMorph (3D UNet), DeepTag, Contrastive21, VAE-like, our UNet, and Ours, as: 0.69s, 1.95s, 0.31s, 0.72s, 0.62s, and 0.68s, respectively.
Post-rebuttal Meta-Reviews
Meta-review # 1 (Primary)
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
Collaborative Quantization Embeddings for Intra-Subject Prostate MR Image Registration
The rebuttal has partially addressed the main comments. The remaining major concern is that the rebuttal proposes to add three new methods for comparison with new results, to defer one main algorithmic figure to the appendix, and to further squeeze already small figures and tables. The supplementary material is currently 5 pages, beyond the allowed 2 pages, and is said to be gaining further supplementary content. Some minor vspace adjustments were used initially, but the rebuttal indicates that the authors will need severe major changes. The rebuttal is, therefore, unsatisfactory. This is independent of the scientific merit of the paper, which is indeed encouraged for resubmission in a clearer form. For the reasons mentioned, the recommendation is towards Rejection.
- After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.
Reject
- What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).
16
Meta-review #2
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
There are still concerns about the comparison with other methods regarding the significance of the performance improvement. While there is room to improve further, the rebuttal clarified the proposed method's novelty compared to related methods (e.g., VAE). The final manuscript should be significantly reformatted as suggested.
- After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.
Accept
- What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).
6
Meta-review #3
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
I believe the paper is very borderline. I am less concerned with the formatting; in terms of scientific merit, there seems to be an overall leaning towards acceptance, this being an interesting paper albeit with some flaws. The authors promise substantial additions, but these have helped clarify the issues raised.
The authors must make these changes for the paper to be accepted, but I think the scientific merits are more important.
- After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.
Accept
- What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).
6