
Authors

Yang Liu, Shi Gu

Abstract

The registration of pathological images plays an important role in medical applications. Despite its significance, most researchers in this field primarily focus on the registration of normal tissue to normal tissue. The negative impacts of focal tissue, such as the loss of spatial correspondence information and the abnormal distortion of tissue, are rarely considered. In this paper, we propose a novel unsupervised approach for pathological image registration by incorporating segmentation and inpainting. The registration, segmentation, and inpainting modules are trained simultaneously in a co-learning manner so that the segmentation of the focal area and the registration of inpainted pairs can improve collaboratively. Overall, the registration of pathological images is achieved in a completely unsupervised learning framework. Experimental results on multiple datasets, including Magnetic Resonance Imaging (MRI) of T1 sequences, demonstrate the efficacy of our proposed method. Our results show that our method can accurately achieve the registration of pathological images and identify lesions even in challenging imaging modalities. Our unsupervised approach offers a promising solution for the efficient and cost-effective registration of pathological images. Our code is available at \url{https://github.com/brain-intelligence-lab/GIRNet}.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43999-5_51

SharedIt: https://rdcu.be/dnww4

Link to the code repository

https://github.com/brain-intelligence-lab/GIRNet

Link to the dataset(s)

https://www.med.upenn.edu/cbica/brats-reg-challenge

https://www.oasis-brains.org/#data

https://www.med.upenn.edu/cbica/brats2020/data.html


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper presents a novel unsupervised approach for pathological image registration by incorporating segmentation and inpainting. A deep learning architecture is proposed in which registration, segmentation, and inpainting modules are trained simultaneously in a co-learning manner. Mutual information is used as the matching score for the pathological region. In this way, the registration of the pathological images is achieved in an unsupervised manner.

    The approach is based on existing works [ref 22,25] but it was well adapted for the medical data and the inpainting+registration task.

    Two types of experiments are conducted: (1) atlas-based registration using ICBM 152 template as atlas (2) longitudinal alignment - pre- and post- operative images.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper is well written and addresses an important problem with a good solution. The contribution is incremental based on existing works on natural images, but it required adaptation to medical data and the task of unsupervised segmentation.

    • the paper is well written and has a good section on related works and motivation for the new approach
    • the method is carefully formulated
    • registration evaluation with synthetic data that has a correct mapping of pathological regions to healthy brain regions. This is important as ground truth for registration is not available in real data.
    • both registration and segmentation are evaluated: (registration) the method is compared with three deep-learning-based and two traditional methods of image registration; (segmentation) evaluated on the BraTS dataset
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • ablation study: the method is complex and involves three modules that need to be trained jointly. The ablation study mainly evaluates the effectiveness of the MI-based loss on the segmentation task. I felt the ablation study should include more aspects that motivate the effectiveness of the formulation and the importance of all modules. Why is registration accuracy not evaluated in the ablation study?

    • the use of only one atlas could be somewhat limiting; the work on multi-atlas registration has clearly shown the advantage of using several atlases; did the authors consider using several atlases?

    • training data alignment Pg 6: “We reoriented all MRI scans of the T1 sequence to the RAS orientation with a resolution of 1mm x 1mm x 1mm and align the images to atlas using FreeSurfer [20]” I would recommend clarifying this step and how sensitive the approach is to the accuracy of this registration. Is this only an affine alignment? Please clarify.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The method is quite complex but I believe it is reproducible. Some details will have to be taken from the works describing the related methods used.
    Not sure if the synthetic data is fully reproducible or will be made available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    • more details on the training data alignment (see above)
    • ablation study (see comments above)

    Details: Fig 2,3 text is small

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I consider it a good paper with sufficient novelty and an important medical imaging problem to address. The paper is well written and clear.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The paper presents GIR-Net, a tri-net framework for joint non-linear registration, segmentation, and inpainting the tumor region on brain MR images. The authors validate the method on registering pseudo brain tumor T1w MRI to the ICBM template, and registering pre-operative and follow-up images from the BraTS-Reg 2022 dataset.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • It is promising to advance the field of deep learning-based registration of medical images with pathologies by a joint unsupervised estimation of non-correspondence regions, if the effectiveness of the method can be compared with existing deep learning-based works like [1].
    • The initiative of validating the algorithm in two different applications is good, if the evaluation design can be improved.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Lack of clarity and discussion of key differences over relevant methods. The method is described in a way that does not clearly distinguish it from the most relevant algorithms, such as the works in [1, 7, 8, 17, 19]. The challenges in existing works are “The non-correspondence detection approach is very sensitive to the dataset [1]”, and “it is necessary to incorporate both a data-independent segmentation module and a modality-adaptive inpainting module into the registration pipeline”. “DIRAC [17] jointly estimates regions with absent correspondence and bidirectional deformation fields”. But the proposed method neither solves the above problems nor is clearly distinguished from them: “To overcome this challenge, we propose a tri-net collaborative learning framework that simultaneously updates the registration, segmentation, and inpainting networks.”
    • I miss the rationale of the evaluation design (Section 3, atlas-based registration; Fig. 2) and a discussion of the results (Fig. 2, Fig. 3). First, in principle, methods tailored for registering images with tumor non-correspondences are expected to outperform plain registration methods on tumor data, as the authors claimed in the Introduction. However, they (DIRAC, DRAMMS) are the worst, and much worse than the methods that did not consider lesions at all (VoxelMorph). Also, the plain methods already achieved similarly good “performance” as the proposed method. Could the authors comment on the design choices and their impact on the results? Second, since there exist deep learning-based methods for registering images with pathologies and joint unsupervised segmentation (e.g., [1]), it would be more appropriate to compare with them instead of plain deep learning methods like VoxelMorph.
    • The description of the method and experiments is not clear enough to fully understand it. For instance, in InpNet (Fig. 1 and Section 2.1), the terms foreground and background are confusing: “InpNet takes the background (foreground) cropped by SegNet and image T ◦ ϕTS warped by RegNet as input and outputs foreground (background) with a normal appearance”. Is the lesion in the source image the foreground? Then why name the lesion mask (i.e., M; lesion=1, normal tissue=0) as Background in Fig. 1? In RegNet, what does θ(S) with a hat bar indicate (Equation 1)? What is the exact mathematical definition of the loss Lsym (“Lsym is the registration loss of SymNet [15]”)? And which operation does ◦ denote (if it is a warping in T ◦ ϕTS, what is it in Fθ = Ω ◦ M, and in Equation 2: θ(S) ◦ D[.])? How was the mean deformation error (MDE) calculated (Section 3)? And how was GIR(CIRDM) implemented? (Is the SegNet trained supervised and separately, besides replacing RegNet with CIR-DM [16]? Then it would be better to present all the different implementations in the method section.)
  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The description of the method and experiments is not clear enough to replicate it.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    • Pathology/pathological images often refer to digital pathology images, which are images of specimen slides. Given the context of this paper, it would be more accurate to use the term brain MR images with pathology (or similar) throughout the paper.
    • Include a reference for MMI, if available.
    • In the experiment of Longitudinal registration, it is not clear if the follow-up scans post-operative, and if there are radio- and chemo-therapy effects on the scan.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    3

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The major factors are the limited clarity of the paper and the validity of the evaluation design.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper presents an unsupervised approach for pathological image registration by incorporating segmentation and inpainting networks. The three modules are trained jointly in an adversarial manner. The method is evaluated on atlas registration and longitudinal registration on pathological brain MRI.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The proposed approach to facilitate registration with unsupervised adversarial segmentation and inpainting is, as far as I know, novel. The paper is very well written and easy to follow. The evaluation is comprehensively performed on both registration and segmentation.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    In the experiment, the proposed approach to simulate pseudo pathological brain MRI does not seem to faithfully reflect deformations caused by the lesions, which limits the reliability of the test.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The data used in the paper is all freely available data (provided by others, not the authors) which helps to reproduce their results and/or compare alternative methods. Details on network training or data simulation are NOT provided. The code used in the paper does not appear to be made available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    • Details on how the pseudo dataset is generated are not provided (even in the supplementary material). It can be seen in Appendix Fig. 2 that the two images are not fully aligned and the deformation caused by the lesion is not well addressed. How is the alignment achieved? Is it rigid or non-rigid? What registration method? Would this underlying approach of generating samples create bias in the DVF space?
    • The abbreviation GIR is not defined.
    • In Table 1, the “GT” can be a bit misleading and can be changed to, e.g., supervised.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Overall the paper is interesting and innovative, for both registration and segmentation. The method seems to produce good results that are better than alternative methods. However, I think an improved simulation and more details on network training (or code) would enhance the paper.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    On the positive side, the presented work is novel and addresses an important problem in medical image analysis, i.e., image registration in the presence of pathologies. Additionally, the paper is fairly well written and includes two evaluation tasks and multiple baseline methods. On the downside, there is consensus that there are some weaknesses that need to be addressed. First and foremost, there is a lack of clarity regarding different aspects of the paper, including methodological details (R#2) and details regarding the experimental setting (all reviewers). In line with R#2, I also find it surprising that DRAMMS and DIRAC perform worse than more standard approaches (see Fig. 2), though the results in Fig. 3 seem to be in the right direction. DRAMMS’ ability to handle missing correspondences has been demonstrated on multiple occasions. Providing the parameters used for these methods might help with understanding why and would improve the reproducibility of the paper. Sharing code is also encouraged for the same reason. Lastly, it is not clear why [1] was not included as a baseline. Comparisons to [1] seem appropriate. Discussing these points is important for finalizing the decision about the paper.




Author Feedback

We appreciate the insightful feedback and suggestions. In response, we have thoroughly considered each comment and respond carefully as follows. R1-Ablation study and the use of only one atlas: Our method relies on the collaborative learning amongst the three modules. The absence or deficiency of any of these modules can potentially disrupt the training process. Thus, our ablation strategy is to vary the form of the losses rather than remove whole modules. Specifically, our ablation study is devised to evaluate how the performance of SegNet and InpNet affects the outcome of RegNet by 1) varying SegLoss with or without ground truth; 2) varying InpLoss with or without HM. The resultant registration accuracy is reported in Fig. 2 and Fig. 3. Regarding the use of the atlas, the suggestion is helpful, and we will examine its impact in later versions. Yet we want to clarify that the main motivation of our method here is to address the difficult case of aligning preoperative and postoperative pairs, which makes our experimental design a bit different from atlas-centered registration. R1, R3-Details on data alignment and pseudo-dataset: We used mri_robust_register in FreeSurfer, a standard affine transform [15], to align the data. The synthetic data shown in Supplement Fig. 2 is entirely reproducible. For the pseudo-dataset, we generated it based on paper [1]. We employed the mri_robust_register tool to register OASIS (healthy) and BraTS (lesion, including a mask M) to the atlas. Next, we performed histogram equalization to match the intensities of BraTS (B) to OASIS (O). We combined images O and HE(B) to generate an image with a lesion (P) using the mask M: P = HE(B) · M + O · (1 - M). This sampling procedure helped avoid generating additional bias in the DVF space.
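The compositing step described in the rebuttal can be sketched as follows. This is a minimal illustration of P = HE(B) · M + O · (1 - M), not the authors' code: the function name and array conventions are assumptions, and a simple quantile-mapping match stands in for the histogram equalization step of the FreeSurfer-based pipeline.

```python
import numpy as np

def composite_pseudo_lesion(oasis_img, brats_img, lesion_mask):
    """Blend a histogram-matched BraTS lesion into a healthy OASIS scan.

    Implements P = HE(B) * M + O * (1 - M), where HE is approximated by
    matching B's intensity distribution to O's. All inputs are assumed
    co-registered 3D arrays of the same shape; lesion_mask is binary
    (lesion = 1, normal tissue = 0).
    """
    # Quantile-mapping stand-in for histogram equalization: each voxel of
    # B is replaced by the O intensity with the same rank.
    b_flat = brats_img.ravel()
    o_sorted = np.sort(oasis_img.ravel())
    ranks = np.argsort(np.argsort(b_flat))          # rank of each voxel in B
    idx = ranks * (o_sorted.size - 1) // (b_flat.size - 1)
    he_b = o_sorted[idx].reshape(brats_img.shape)

    # Composite: lesion voxels from HE(B), everything else from O.
    return he_b * lesion_mask + oasis_img * (1 - lesion_mask)
```

Because the lesion intensities are drawn from the healthy image's own histogram, the pseudo-lesion image stays within the intensity range of the target domain, which is one way to limit the bias mentioned in the rebuttal.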
R2, meta-R – Model Comparison: In the atlas-based experiment, where a domain difference exists between the pseudo and atlas data, DIRAC failed to correctly register the pair, and DRAMMS worked slightly better but also with a high MRE. In the longitudinal registration, where the pre-operative and post-operative scans were generated under the same conditions, both DIRAC and DRAMMS serve as good baselines (Fig. 3). While there exist co-registration methods, most of them require supervised segmentation. Regarding NCRNet [1], we recognize its valuable attempt on a similar topic, and we approach the problem with a different infrastructure. In practice, our decision stemmed from the effort of fine-tuning the hyperparameters. The loss function of [1] has three hyperparameters, and it was difficult to identify suitable parameters for different datasets, even with precision up to 7 decimal places. The resulting mask consistently covered the entire brain. We demonstrated its sensitivity using the Dice coefficient in Table 1 and will supplement the experimental data of [1] if required. R2-Clarification on terms and method description: In Fig. 1, Foreground and Background denote the two branches of InpNet. We use the operation “◦” to denote function composition in T ◦ ϕTS, following the mathematical convention. To avoid potential confusion with its use in DL notation, we will replace it with another symbol such as ‘·’ to denote multiplication. L_sym is the registration loss of SymNet [15]. The Mean Deformation Error (MDE) is the average Euclidean distance between the coordinates of the deformation field and the gold standard over the areas of interest [11]. R2-Implementation of GIR(CIRDM): Implementing GIR(CIRDM) merely replaces RegNet’s bidirectional registration with CIR-DM [16]. No supervised training is required. R3-Abbreviation GIR and GT in Table 1: GIR stands for generation, inpainting, and registration. To avoid misleading readers, we will change “GT” to “Supervised” in Table 1.
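The MDE definition given in the response (average Euclidean distance between predicted and gold-standard deformations over an area of interest) can be sketched as follows; the function name and the (3, D, H, W) displacement-field layout are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def mean_deformation_error(pred_dvf, gt_dvf, roi_mask):
    """Mean Euclidean distance between predicted and gold-standard
    displacement vectors, averaged over a region of interest.

    pred_dvf, gt_dvf: arrays of shape (3, D, H, W) holding the x/y/z
    displacement components; roi_mask: binary array of shape (D, H, W).
    """
    diff = pred_dvf - gt_dvf                     # per-voxel vector error
    dist = np.sqrt((diff ** 2).sum(axis=0))      # Euclidean norm per voxel
    return dist[roi_mask.astype(bool)].mean()    # average inside the ROI
```

For example, a predicted field that is uniformly offset from the gold standard by (3, 4, 0) voxels yields an MDE of 5 over any region of interest.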
We believe that we’ve tried our best to clarify and hope our responses addressed the reviewers’ concerns. We will incorporate these clarifications and improvements in the revised manuscript.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The paper proposes an interesting and novel approach for registering images in the presence of pathologies. A major concern was the experimental validation. Specifically, the results in Fig. 2 are counter-intuitive, showing that methods designed to handle missing correspondences performed worse than more standard approaches. In Fig. 3, the results are in the right direction, i.e., methods designed to handle missing correspondences indeed performed better than standard approaches. This suggests that the creation of the pseudo-dataset might be problematic. Additionally, Fig. 3 suggests that the proposed method performs worse than baselines (i.e., DRAMMS) both near and far from the tumor. These observations were not addressed in the rebuttal. They are particularly important given that the main motivation of the proposed approach is to handle aligning the preoperative and postoperative pair. The results limit my support for the paper.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The method received some praise for adapting a recent method for combined registration, inpainting, and segmentation from computer vision to medical imaging. Overall ratings are mixed and the rebuttal did not result in any change. It is a little hard to follow the experimental comparisons, so I was doubtful at first whether they are fair and demonstrate a realistic scenario. It should also be considered that the second reviewer was quite critical in that regard. It is fine to first evaluate a model on synthetically generated pairs (that largely follow the assumptions made by the method itself) as a sanity check, but then achieving the best score compared to more generic approaches (DRAMMS) that were not trained on such idealised data is expected. Subsequently, the proposed method slightly trails DRAMMS on BraTS-Reg for target registration error and only improves over DIRAC in terms of tumour segmentation. I think overall the joint solution constitutes a valuable contribution to the field of medical image registration and recommend acceptance. But I am a little worried that the second finding (a large improvement in unsupervised tumour segmentation) seems linked to the fact that GIRNet is pre-trained on BraTS segmentation masks.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Overall, there are serious concerns about the paper that remain after rebuttal, but there are also interesting merits that the reviewers seem to value. There was no discussion after rebuttal, unfortunately.

    I took a quick look at the paper, and while I agree with some concerns that the actual contribution is unclear given the many registration frameworks that jointly perform registration and segmentation nowadays and could be applied to this case, I think the paper has sufficient engineering development and an interesting application and could lead to interesting discussion at the conference. Overall, both I and the average scores lean towards accept, although it is close.


