
Authors

Nilesh Kumar, Prashnna K. Gyawali, Sandesh Ghimire, Linwei Wang

Abstract

Obtaining labelled data in medical image segmentation is challenging due to the need for pixel-level annotations by experts. Recent works have shown that augmenting the object of interest with deformable transformations can help mitigate this challenge. However, these transformations have been learned globally for the image, limiting their transferability across datasets or applicability in problems where image alignment is difficult. While object-centric augmentations provide a great opportunity to overcome these issues, existing works are only focused on position and random transformations without considering shape variations of the objects. To this end, we propose a novel object-centric data augmentation model that is able to learn the shape variations for the objects of interest and augment the object in place without modifying the rest of the image. We demonstrated its effectiveness in improving kidney tumour segmentation when leveraging shape variations learned both from within the same dataset and transferred from external datasets.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43895-0_24

SharedIt: https://rdcu.be/dnwx6

Link to the code repository

https://github.com/nileshkumar0726/Learning_Transformations

Link to the dataset(s)

https://competitions.codalab.org/competitions/17094

https://kits19.grand-challenge.org/data/


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper presents a method to learn object-specific diffeomorphic shape transformations for data augmentation. They propose a VAE model to learn transformations between pairs of tumor patches in a locally affine framework. The VAE can be sampled to generate object-specific deformations during training, and these augmentations can be applied online. They demonstrate improved performance in kidney and liver tumor segmentation.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Generative model to sample object-specific transformations online
    • Diffeomorphic formulation by modeling transformations as piecewise affine
    • Simple strategy for object-centric learning, and online method to produce realistic object-specific transformations
    • Statistically significant improvement in experimental results
    • Method is well described and reproducible
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Small improvement compared to TumorCP in larger data regimes
    • No ablation study testing limits of learned deformations
    • Proposed improvements are minimal when not combined with TumorCP
  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The implementation and experimental details are described clearly and the authors promise to publish code.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    This paper is well written and very clear. The authors present a method to learn a generative model for object-centric diffeomorphic transformations for data augmentation. Their generative model is a VAE trained to learn transformations from one tumor shape to another.

    The transformation model is made diffeomorphic through a piecewise-affine formulation. They propose a simple strategy for training by extracting bounding boxes around tumors, and a simple and effective strategy for applying this at training time that avoids distorting the surrounding regions of the image.
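
    For concreteness, a minimal sketch of this bounding-box patch-extraction step, assuming 2D numpy arrays and a binary tumor mask; the function name and padding margin are illustrative stand-ins, not the authors' code:

        import numpy as np

        def extract_tumor_patch(image, mask, pad=4):
            # Crop a padded bounding box around the tumor; `pad` is an
            # assumed margin so some surrounding context is kept.
            ys, xs = np.nonzero(mask)
            y0, y1 = max(ys.min() - pad, 0), min(ys.max() + pad + 1, mask.shape[0])
            x0, x1 = max(xs.min() - pad, 0), min(xs.max() + pad + 1, mask.shape[1])
            return image[y0:y1, x0:x1], mask[y0:y1, x0:x1]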

    The experimental results improve those of the TumorCP baseline, which copies and augments tumors in images.

    I have minor weaknesses to point out. The main one is that when compared with TumorCP, this method often leads to worse performance, indicating that this type of augmentation strategy may not generalize well to other tasks. When using the authors’ proposed augmentation with TumorCP, performance improves, albeit slightly (1 dice point) compared to TumorCP in the large data regime.

    It would have been interesting to perform an ablation study and see what the effect of distorting tumors (x_tgt) is on the deformation model. It is possible that the VAE is not learning large enough variations to improve training generalization.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    7

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper presents a novel and simple idea for object-centric augmentation. They propose a diffeomorphic generative model to sample transformations, and propose simple ways to use this in training. Their results improve a state-of-the-art augmentation strategy for tumor segmentation.

  • Reviewer confidence

    Somewhat confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The paper presents a method for data augmentation based on a VAE that is trained to generate diffeomorphic displacements. The network is trained on pairs of tumor patches, extracted from label masks, and learns to transform the source patch into the target. During training of a segmentation network, data augmentation is then done by sampling from the VAE to generate a transformation that is applied to the tumor region of the training image(s). To avoid deforming the full image, the displacement is made small outside of the tumor region.
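
    To make this pipeline concrete, a minimal PyTorch sketch of the in-place augmentation step, under stated assumptions: `decoder` is a stand-in for the trained VAE decoder, the decoded displacement is assumed to be in normalized grid units, and the exponential falloff is one plausible way to keep the displacement small outside the tumor (the paper's exact attenuation may differ):

        import torch
        import torch.nn.functional as F
        from scipy.ndimage import distance_transform_edt

        def augment_in_place(image, tumor_mask, decoder, latent_dim=32, falloff=8.0):
            # image, tumor_mask: (1, 1, H, W) tensors; decoder: trained VAE
            # decoder mapping a latent code to a displacement field (1, 2, H, W)
            # in normalized [-1, 1] grid units (an assumption for this sketch).
            z = torch.randn(1, latent_dim)   # sample a new shape variation
            disp = decoder(z)

            # Attenuate the displacement with distance from the tumor so the
            # rest of the image stays (almost) unchanged.
            dist = distance_transform_edt((tumor_mask[0, 0] == 0).numpy())
            weight = torch.exp(-torch.from_numpy(dist).float() / falloff)
            disp = disp * weight[None, None]

            # Warp image and mask with identity grid + displacement.
            H, W = image.shape[-2:]
            ys, xs = torch.meshgrid(torch.linspace(-1, 1, H),
                                    torch.linspace(-1, 1, W), indexing="ij")
            grid = torch.stack((xs, ys), dim=-1).unsqueeze(0) + disp.permute(0, 2, 3, 1)
            warped_image = F.grid_sample(image, grid, align_corners=True)
            warped_mask = F.grid_sample(tumor_mask, grid, mode="nearest",
                                        align_corners=True)
            return warped_image, warped_mask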

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This is a well written paper! Its main strength is the contribution of a generative model, which allows for augmenting particular objects in images with diffeomorphic transformations while keeping the rest of the image unchanged. Although the paper applies the model to tumors, the method is general, so the model should be learnable on any labeled dataset. Since the patches are kept small, sampling the model during training should also be fast. Additionally, the validation was performed on multiple organs (liver and kidney) and tested for domain adaptation (within-data vs. cross-data), which is important for real-world use of the proposed model. I also appreciate the use of statistical tests for significance.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The major weaknesses of the paper, which if addressed would increase my rating, are:
    • The model is 2D, which means it is not applicable to augmenting data in the real-world application of training segmentation networks on volumetric images. I am not sure why the authors did not implement their model in 3D already in this submission?
    • In the experiments section, compare learning diffeomorphic transformations with the generative model and applying them to the tumor regions against simply applying random nonlinear warps (without prior knowledge of object shape), for example using https://docs.monai.io/en/stable/transforms.html#rand2delasticd (a usage sketch follows this list). This is important because methods like SynthSeg have shown that augmentations do not need to be realistic in order to be effective for training networks.
    • The methods section dedicates close to one page to describing the paper's use of a particular diffeomorphism, motivated by C1 diffeomorphisms giving "better DNN training". However, this is not validated in any ablation study, which would be easy to do by replacing the current integration with, e.g., scaling and squaring (readily available). Furthermore, regarding "the integration can be done via a specialized solver [5]", what is the computational complexity of this solver?
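
    For reference, the suggested random-warp baseline could be set up with MONAI roughly as follows; parameter values are illustrative, not tuned, and the dictionary keys are assumptions about how the data is stored:

        from monai.transforms import Rand2DElasticd

        # Joint elastic warp of image and label so the segmentation stays
        # consistent with the warped image; no knowledge of object shape used.
        rand_warp = Rand2DElasticd(
            keys=["image", "label"],
            spacing=(30, 30),              # control-point spacing in pixels
            magnitude_range=(1, 5),        # displacement magnitude range
            prob=0.5,
            mode=["bilinear", "nearest"],  # nearest-neighbour for the label map
        )

        augmented = rand_warp({"image": image_2d, "label": label_2d})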

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Good.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Please correct/clarify:
    • "A must-have ingredient for training a deep neural network (DNN) is a large number of labelled data": I am not sure about this; many methods now exist for training models on smaller datasets, e.g., self-supervised learning.
    • I would replace "DA" with "data augmentation"; it makes the paper more readable.
    • The literature review might want to mention generative models trained on entire images as another promising method for data augmentation, although these models require large compute.
    • Under the Generative Modeling section, perhaps mention (and cite?) that this model is known as a VAE.
    • It is not clear to me how the Euclidean distance is used to select tumor pairs.
    • Fig. 2: Please clarify what the top row in (b) is (the reconstruction?). Additionally, it would be nice if a larger field of view were shown in the image patches, as this would give a better idea of how the displacement affects the regions surrounding the tumors.
    • Why is the baseline model not also augmented with random nonlinear warps?
    • I would remove the asterisk from the top right column of Table 1 and just mention the statistical test in the text; currently, it can confuse the reader as being a footnote.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The method is 2D only, and randomly deforming the tumours was not included in the experiments.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    The rebuttal addressed my concerns. As promised, I have increased my rating of the paper to weak accept.



Review #4

  • Please describe the contribution of the paper

    This paper proposes an object-centric data augmentation method that is able to learn the shape variations for the objects of interest and augment the object in place without modifying the rest of the image. Specifically, the authors design a generative model to learn the object-centric diffeomorphism. Then they use this model to generate diverse augmentations of different instances of tumours in place. The authors have evaluated their method on two tumor segmentation datasets, and the results show its effectiveness.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Generally, the paper is well written and organized, easy to follow and comprehend.
    • The design of generative modeling and online augmentation strategy is interesting.
    • The within-data and cross-data evaluation on two tumor segmentation datasets of different organs is helpful and supportive.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Insufficient analysis of the underlying reason why the cross-data performance is better than the within-data performance, which is unexpected, especially when the proposed method is introduced.
    • As shown in Table 1, the proposed Diffeo alone adds only a slight benefit on top of the baseline compared with TumorCP. That being said, the proposed method is limited to being a supplement to TumorCP.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The paper is reproducible since the authors have agreed to release code upon acceptance.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    • The analysis of the surprising results is not convincing. I suggest the authors provide more justification and discussion of the transferability in the Methods section; readers may then find it easier to understand the results and analysis presented in the Results section.
    • More discussion of the training/implementation efficiency compared with TumorCP would add value.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper proposes an efficient method incorporating a generative model to perform data augmentation for liver tumor segmentation. With extensive evaluation, the method shows improvement when integrated with TumorCP. However, more analysis and clarification of the interesting results are necessary.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper received a mixture of positive and negative feedback. It is noteworthy that all reviewers acknowledged the novelty and significance of the proposed idea. The authors are strongly encouraged to address all questions and concerns raised by the reviewers during the rebuttal phase, with special emphasis on the points raised by R2.




Author Feedback

We appreciate the constructive comments and support for our work. We provide major clarifications below.

Random warping augmentations (R2): We added experiments testing augmentation with random elastic warps instead of learned shape variations. The average Dice across all experiments (different data percentages) decreased from 0.678 to 0.637. We speculate that while unrealistic transformations work for whole images, they may be problematic when only specific local objects in an image are augmented.

Choice of transformations and specialized solver (R2): The choice of C1 diffeomorphisms was based on their robustness in recovering from wrong transformations during gradient-based optimization [1]. The chosen solver produces faster and more accurate results than a generic ODE solver. Specifically, the cost of this solver is O(C1) + O(C2 × number of integration steps), where C1 is the cost of the matrix exponentials for the cells the image is divided into and C2 is the dimensionality of the image. We will add these details to make the motivation more explicit. As suggested by R2, we also experimented with using scaling-and-squaring layers for integration. The experimental results showed that the learned transformations suffered from losing texture information in tumors, leading to bad reconstructions and generations overall.
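
For reference, the generic scaling-and-squaring integrator of a stationary velocity field (the alternative R2 suggested) can be sketched in a few lines of PyTorch. This is the standard construction, not the authors' implementation, and assumes the field is expressed in normalized grid units:

    import torch
    import torch.nn.functional as F

    def scaling_and_squaring(velocity, n_steps=7):
        # velocity: (1, 2, H, W) stationary velocity field in normalized
        # grid units. Scale the field down by 2**n_steps, then compose
        # the small deformation with itself n_steps times.
        disp = velocity / (2 ** n_steps)
        H, W = velocity.shape[-2:]
        ys, xs = torch.meshgrid(torch.linspace(-1, 1, H),
                                torch.linspace(-1, 1, W), indexing="ij")
        identity = torch.stack((xs, ys), dim=-1).unsqueeze(0)  # (1, H, W, 2)
        for _ in range(n_steps):
            # phi <- phi o phi, i.e. disp(x) + disp(x + disp(x))
            grid = identity + disp.permute(0, 2, 3, 1)
            disp = disp + F.grid_sample(disp, grid, align_corners=True)
        return disp  # integrated displacement field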

3D implementation (R2): The presented method is applicable to 3D segmentation. In a simple fashion, we can apply the 2D augmentation to each slice of the tumor volume. In a more advanced fashion, we can learn the shape variations of the tumors in 3D. We started experiments with the former but were not able to obtain results in time within the rebuttal period due to heavy training costs. We are happy to add these results to the final version of the manuscript.

Cross-data vs. within-data augmentations (R3): We believe that the better cross-data augmentation results are due to two factors. First, learning of the within-data augmentations is limited to the percentage of the training set used for segmentation; the number of objects to learn transformations from is thus greater in the cross-data setting. Second, the transformations present in the cross-data setting are completely unseen by the segmentation network, which helps generate more diverse samples. Note that, as the transformations are learned as variations in object shapes, they can be transferred easily across datasets. We will add this reasoning to better explain the cross-augmentation results.

Computational efficiency comparison (R3): We acknowledge that our augmentations add overhead as they need to be generated using an extra neural network, compared to just pasting tumors onto an organ in TumorCP. Quantitative comparisons of the computational cost of these two augmentations will be added to the final manuscript. Note that this overhead occurs only at training time; once trained, the segmentation network no longer requires the extra network.

Distorting target images to learn richer transformations (R1): Distorting x_tgt may yield more expressive augmentations, but it also runs the risk of producing unrealistic augmentations that can harm segmentation performance. This can be observed in the experiments with random warps (rebuttal point 1).

Minor modifications: We will address the rest of the minor comments in the final manuscript, including adding the VAE citation in the methods section, expanding the caption of Fig. 2(b) to make it clearer, and modifying the statement about the limited-data problem in neural network training to reflect advances in self-supervised learning. The images in Fig. 2(b) are different generations from a single tumor patch.

[1] Detlefsen, N.S., Freifeld, O., Hauberg, S.: Deep diffeomorphic transformer networks. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2018)




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors have adequately addressed and clarified the reviewers' major questions. With the score raised by R2, this paper receives consistent recommendations for acceptance at MICCAI.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    There is consensus among reviewers that this work contributes a meaningful strategy for creating realistic augmentations that benefit medical image segmentation models. The authors' rebuttal addresses the remaining concerns and adds a useful ablation; I recommend acceptance.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper presents a novel generative approach to improve (object-centric) augmentation for tumor segmentation. Reviewers appreciated the novelty and the strength of the validation. However, there were concerns about the general applicability of the method to volumetric images, its computational complexity, and the experimental results. The rebuttal satisfactorily addresses these concerns: the authors have provided sufficient clarification on random warping augmentations and promised to include further results on 3D data in the camera-ready version. The recommendation is to accept. It would be useful to also carefully discuss the results in comparison with TumorCP in the camera-ready submission.


