Authors

Alexander Bigalke, Mattias P. Heinrich

Abstract

Point cloud-based medical registration promises increased computational efficiency, robustness to intensity shifts, and anonymity preservation but is limited by the inefficacy of unsupervised learning with similarity metrics. Supervised training on synthetic deformations is an alternative but, in turn, suffers from the domain gap to the real domain. In this work, we aim to tackle this gap through domain adaptation. Self-training with the Mean Teacher is an established approach to this problem but is impaired by the inherent noise of the pseudo labels from the teacher. As a remedy, we present a denoised teacher-student paradigm for point cloud registration, comprising two complementary denoising strategies. First, we propose to filter pseudo labels based on the Chamfer distances of teacher and student registrations, thus preventing detrimental supervision by the teacher. Second, we make the teacher dynamically synthesize novel training pairs with noise-free labels by warping its moving inputs with the predicted deformations. Evaluation is performed for inhale-to-exhale registration of lung vessel trees on the public PVT dataset under two domain shifts. Our method surpasses the baseline Mean Teacher by 13.5/62.8%, consistently outperforms diverse competitors, and sets a new state-of-the-art accuracy (TRE=2.31mm). Code is available at https://github.com/multimodallearning/denoised_mt_pcd_reg.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43999-5_63

SharedIt: https://rdcu.be/dnwxg

Link to the code repository

https://github.com/multimodallearning/denoised_mt_pcd_reg


Link to the dataset(s)

https://github.com/uncbiag/robot

https://med.emory.edu/departments/radiation-oncology/research-laboratories/deformable-image-registration/downloads-and-reference-data/copdgene.html


Reviews

Review #4

  • Please describe the contribution of the paper

    This paper proposed a novel training scheme for point cloud registration that uses a labelled source dataset (supervised learning) and an unlabelled target dataset (unsupervised learning and domain adaptation). Compared with the existing training scheme, the new scheme introduces a Chamfer-distance criterion to filter out less accurate teacher registrations and utilizes teacher predictions to synthesize new data pairs that aid training. The results demonstrate that the proposed training scheme sets a new state-of-the-art accuracy.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • This paper presented a novel training scheme for point cloud registration. The first contribution is to adopt the Chamfer distance to measure the quality of the student and teacher predictions and to filter out teacher predictions that are less accurate. The second contribution is that, during Mean Teacher training, additional synthetic data pairs are added by warping the moving cloud with the teacher predictions to obtain a synthetic fixed cloud. The proposed scheme is shown to set a new state of the art on the test set when compared with various baselines and existing methods.
    • This paper is very well organized, and all technical details are clearly and accurately presented. The code is also published, which makes the reproducibility excellent.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Additional proof-reading can be beneficial to fix some minor errors in writing. For example, “For further implementation details, we refer to our public code.”
    • Although the proposed training scheme is shown to be useful and can potentially benefit many similar tasks, there is no theoretical or model contribution in this paper, which makes the overall contribution less significant.
  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    This paper clearly presented the technical details of the method and experiments. The code is made public, and the dataset used in the experiments is publicly available. The reproducibility is excellent.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    • Additional proof-reading can be beneficial to fix some minor errors in writing.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    7

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper presented a very useful training scheme for the point cloud registration problem, utilizing both supervised learning and unsupervised Mean Teacher learning. The proposed scheme is shown to be the state of the art in the experiments. The paper is well organized, and the reproducibility is excellent. The only weakness is that the contribution is less significant in the absence of a theoretical or model contribution. Therefore, I recommend strong accept for this paper.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    This paper addresses the problem of point cloud registration with deep learning. The method is evaluated on the registration of lung vessels extracted from chest CT. The method is founded on the Mean Teacher approach, where two networks are trained simultaneously. The networks take the roles of teacher and student, and both collaborate to improve the performance of the student, which ultimately provides the final model. The problem with the Mean Teacher approach is that the input provided by the teacher is usually not accurate and may ruin the training process. Therefore, the authors propose two different improvements that, combined, result in an improvement of the accuracy of the models. First, the authors propose to select the most accurate labels from the source set. Second, the authors propose an automatic generation of new labels using the teacher knowledge. Domain shift is a potential problem in the proposed method, but the authors showed that the combination of the Mean Teacher with the proposed improvements mitigates it.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The application is interesting and relevant to the MICCAI community.

    • The proposed approach seems to be the correct direction towards the solution of the problem.

    • The results show improvement with respect to the state of the art.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The proposed method is heavily based on the previously proposed Mean Teacher algorithm.

    • The proposed improvements are limited to a better selection and generation of imprecise inputs.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The datasets used for evaluation are publicly available.

    The authors provided the code and models in an anonymous GitHub repo. However, the details given in the paper regarding the use of the different datasets are confusing to me and I do not feel able to exactly reproduce the training phase.

    For the evaluation, the authors provided a clear description of the metrics and their tendencies. Statistical significance was stated when needed.

    The average runtime of the testing phase was provided. However, it is also important to know the runtime of the training phase and the memory footprint; these figures are not provided.

    The clinical significance of the method can be inferred from the introduction. However, the proposed method needs further validation on additional datasets before moving toward clinical application.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    The addressed problem is relevant to the MICCAI community, and the approach selected by the authors seems to be a right way towards the solution of the problem. It will be interesting to witness the maturation of the proposed approach and how it competes with image-based methods in different applications.

    For those not familiar with the Mean Teacher approach, the manuscript is hard to follow, and some of the implementation details are not clear. The paper would benefit from complementary explanations toward a better understanding of the use of the source and target datasets and the use of real and simulated data during training.

    I believe that, according to MICCAI scoring, this is a fair paper with weaknesses slightly outweighing the merits. My impression is that the proposed improvements to the Mean Teacher approach are limited to a better selection of the inputs. In the following, the authors can find several observations that may help improve the quality of the manuscript.

    Introduction. “… as established for dense image registration [4] [17] …” The citations refer to TransMorph and LapIRN. The authors claim that these methods are ineffective, as confirmed by their experiments, and with this claim cite [20], which is a NeurIPS 2021 paper. TransMorph is a 2022 paper, while LapIRN is a 2020 paper that has not been intended/used for the application of interest. I believe that the citations are not very appropriate here.

    Methods. Problem setup and standard Mean Teacher. For readers not familiar with the Mean Teacher paradigm, it can take a while to understand the use of the source and target sets. I believe this should be better explained here. In addition, the mention of generating the source samples on the fly is not clear. I believe these are the samples proposed in Section 2.3, right? It is confusing, and the manuscript would benefit from a better explanation of who is who in the whole process.
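
    For readers unfamiliar with the paradigm: in a standard Mean Teacher setup, the teacher's weights are an exponential moving average (EMA) of the student's. A minimal PyTorch sketch with hypothetical module names (a generic illustration, not the paper's implementation):

```python
import torch

@torch.no_grad()
def ema_update(teacher, student, alpha=0.999):
    # teacher <- alpha * teacher + (1 - alpha) * student, parameter by parameter
    for t_p, s_p in zip(teacher.parameters(), student.parameters()):
        t_p.mul_(alpha).add_(s_p, alpha=1 - alpha)
```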

    Experiments. Experimental setup. Datasets. I find it confusing how the real dataset corresponds to the different datasets involved in the method. Maybe a table with the quantities and the use of each dataset would help the reader understand how the data was used.

    Experiments. Experimental setup. Implementation details. Could the authors elaborate more on the use of the multi-scale losses? They are mentioned here for the first time.

    Experiments. Results. Providing the relative improvement is fine, but it should come with the exact reference to the table for a better interpretation of the results. E.g., where does the -59.8 / -55.0 come from?

    Experiments. Results. The authors included the SDlogJ metric for assessing the “quality” of the resulting transformation. This metric was proposed as a standard in the Learn2Reg framework. However, the official Learn2Reg code clamps the negative Jacobians. The metric measures the amount of deformation of the transformations, and the assumption is that methods with a lower SDlogJ are better. In my humble opinion, there is no relationship between this metric and the overall quality of the transformation; I could spend pages and pages explaining my reasons. I believe it would be more informative to show the maximum and minimum achieved Jacobian and the percentage of negative Jacobians.
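
    For concreteness, a minimal NumPy sketch of the suggested statistics for a dense 3D displacement field (a generic illustration, not Learn2Reg's official code; the clipping below mirrors the clamping of negative Jacobians mentioned above):

```python
import numpy as np

def jacobian_stats(disp):
    """SDlogJ, min/max Jacobian determinant, and % of negative Jacobians
    for a displacement field `disp` of shape (3, D, H, W) in voxel units."""
    # Spatial gradients of each component: grads[i][j] = d(disp_i)/d(x_j)
    grads = [np.gradient(disp[i], axis=(0, 1, 2)) for i in range(3)]
    J = np.stack([np.stack(g, axis=0) for g in grads])   # (3, 3, D, H, W)
    J = J.transpose(2, 3, 4, 0, 1) + np.eye(3)           # Jacobian of x + u(x)
    det = np.linalg.det(J)
    neg_percent = 100.0 * np.mean(det <= 0)
    sd_log_j = np.std(np.log(np.clip(det, 1e-6, None)))  # clamped, as in Learn2Reg
    return sd_log_j, det.min(), det.max(), neg_percent
```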

    Conclusions. The authors mention TREs of 7.98 and 4.99 for VoxelMorph and LapIRN. Are these results fairly comparable with the results shown in this article? I believe the fairness of this comparison should be discussed.

    Experiments. Conclusions. The authors mention that the best traditional method achieves a TRE of 0.83 mm. Again, is this 0.83 comparable with the results shown in this article? Why did the authors not include the comparison with this method in Table 1? Is the only reason not to use or improve upon the traditional approach the few minutes it takes to perform a registration? Why is it not worthwhile to wait for better registration results with a traditional method? I believe interesting insights would arise from reflecting on these questions.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    As I said, I believe that, according to MICCAI scoring, this is a fair paper with weaknesses slightly outweighing the merits. My impression is that the method is heavily based on the previously proposed Mean Teacher algorithm, and the proposed improvements to the Mean Teacher approach are limited to a better selection of the inputs.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #1

  • Please describe the contribution of the paper

    This paper proposed two improvements for domain adaptive point cloud registration to increase accuracy. Specifically, built on top of the Mean Teacher paradigm, it uses the Chamfer distance to filter pseudo labels, preventing detrimental supervision, and uses the teacher to synthesize noise-free displacement labels for training.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    (1) This paper is well-motivated and structured; (2) The ideas are solid; (3) experiments are comprehensive and insightful.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    No

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Reproducibility is feasible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    The reviewer would suggest exploring higher-density point cloud registration. The current 8k setting is not really challenging and not a game changer.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Based on the relevance of the topic, the presentation of the paper, the novelty of the methods, and the comprehensive experimental results, this paper is ready for acceptance.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper received mixed reviews. All reviewers agreed that the proposed idea is of high interest to the MICCAI community. However, there were major issues regarding the significance of the proposed work. Additionally, reviewer R2 pointed out a potential limitation of the methodology, which was not discussed in the current paper. The authors are strongly encouraged to address all questions and concerns raised by the reviewers in a revised manuscript, with special emphasis on the points raised by R2.

    Please also address carefully the differences between this paper and paper #1171, “Unsupervised 3D registration through optimization-guided cyclical self-training”.




Author Feedback

We thank the reviewers for their thoughtful comments and the appreciation of our “well-organized” and “accurately presented” (R4) work, assessed as “the correct direction towards the solution” of an “interesting application” (R2), as demonstrated in “comprehensive and insightful experiments” (R1).

Reviewer 2 raised two major concerns. C1: “The proposed method is heavily based on the previously proposed Mean Teacher”. The two methodological contributions of our work, whose novelty was explicitly appreciated by R1, are indeed extensions of the Mean Teacher algorithm. But while R2 did not specify why this is a weakness, we believe this is rather a strength. Specifically, the Mean Teacher is among the most popular and effective methods for domain adaptation, and mitigating its limitations, as realized by our work, is – in our opinion – a highly significant contribution.

C2: “The proposed improvements are limited to a better selection and generation of imprecise inputs.” Our work presents two orthogonal improvements of the Mean Teacher: filtering pseudo labels and dynamically synthesizing input samples with precisely known displacements. Both methods are novel and have proven highly effective in the experiments. We do not see a weakness in this aspect but rather a significant advancement of a popular and widespread training strategy. Moreover, R2 may have misunderstood several subtle but decisive aspects of the two methods. First, our method does not select “the most accurate labels from the source set” (summary by R2) but rather those labels by the teacher that are more accurate than the current student’s predictions. Second, our method does not “generate imprecise inputs” or “new labels” (summary by R2) but dynamically synthesizes novel input pairs with precisely known labels.
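
To make the distinction concrete, here is a minimal sketch of the two strategies (hypothetical names and simplified losses, not our exact implementation; see the public repository for details). It assumes student/teacher networks that map a (moving, fixed) pair of point clouds of shape (N, 3) to per-point displacements:

```python
import torch

def chamfer_distance(a, b):
    # Symmetric Chamfer distance between point clouds a (N, 3) and b (M, 3)
    d = torch.cdist(a, b)
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

def denoised_mean_teacher_step(student, teacher, moving, fixed):
    with torch.no_grad():
        pseudo_disp = teacher(moving, fixed)  # teacher pseudo label
    student_disp = student(moving, fixed)

    # 1) Pseudo-label filtering: supervise with the teacher's label only if its
    #    registration aligns the clouds better than the student's current one.
    cd_teacher = chamfer_distance(moving + pseudo_disp, fixed)
    cd_student = chamfer_distance(moving + student_disp.detach(), fixed)
    loss_pseudo = 0.0
    if cd_teacher < cd_student:
        loss_pseudo = ((student_disp - pseudo_disp) ** 2).sum(-1).mean()

    # 2) Dynamic synthesis: warping the moving cloud with the teacher's prediction
    #    yields a novel input pair whose ground-truth displacement is known exactly.
    synth_fixed = moving + pseudo_disp
    loss_synth = ((student(moving, synth_fixed) - pseudo_disp) ** 2).sum(-1).mean()

    return loss_pseudo + loss_synth
```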

Minor concerns of R2: C3: R2 does “not feel able to reproduce the training phase”. Our code repository provides detailed instructions on how to process data and perform training. We will gladly respond to any specific issues.

C4: Inappropriate citations of [4,17]. R2 may have misunderstood the sentence. The sentence states that unsupervised learning with similarity metrics is established for image registration, e.g. implemented in [4,17], but shown ineffective for deformable point cloud registration [20]. We will reformulate the sentence to prevent misunderstandings.

Reviewer 3: C5: The contribution is less significant without theoretical or model contribution. We agree that theoretical and model contributions are indispensable to advancing medical image computing. In our view, however, these are not the only contributions that make a work significant. Developing training strategies to address semi-/unsupervised learning or domain adaptation, as done in our work, is equally important. As detailed in our responses to C1/C2, our work proposes two novel and effective strategies to improve one of the most common approaches to semi-supervised learning and domain adaptation (the Mean Teacher), which is, in our view, a significant contribution to the field.

Meta-reviewer: Submission #1168 vs #1171. The submissions significantly differ in the addressed problem (domain adaptive point cloud registration vs unsupervised registration focusing on images), used datasets (lung vessel trees from PVT vs abdomen CT and Förstner keypoints from COPD), and proposed method. While #1168 addresses noisy pseudo labels in the Mean Teacher paradigm by a novel filtering strategy & data synthesis, #1171 introduces cyclical self-training as a novel unsupervised learning paradigm. Specifically, #1171 addresses missing initial labels and noisy pseudo labels by combining a feature network with a regularizing differentiable optimizer and pseudo label refinement.


