
Authors

Yang Li, Beiji Zou, Yulan Dai, Chengzhang Zhu, Fan Yang, Xin Li, Harrison X. Bai, Zhicheng Jiao

Abstract

In this paper, we address the domain shift in the cross CT-MR liver segmentation task through a latent space investigation. Domain adaptation between modalities is of significant importance in clinical practice, as different diagnostic procedures require different imaging modalities, such as CT and MR. Thus, training a convolutional neural network (CNN) on one modality may not be sufficient for application to another. Most domain adaptation methods need the data and ground truths of both the source and target domains during training. Unlike these techniques, we propose a zero-shot bidirectional cross-modality liver segmentation method by investigating a parameter-free latent space through prior knowledge from CT and MR images. Experiments on CHAOS, a subset of LiTS, and the local TACE dataset demonstrate that our method deals well with CNN failure caused by domain shift and yields promising segmentation results.

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16440-8_59

SharedIt: https://rdcu.be/cVRwM

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper addresses the problem of cross modality learning from the perspective of abdominal MR/CT segmentation. A nominal intensity model for MRI and CT seeks to create an invariant latent space. The target is for zero shot cross-modality learning.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • use of modern transformer approaches
    • integration of a clever transform model to adapt target intensity
    • solid descriptions of latent spaces
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The writing quality renders much of the text very difficult to parse.
    • Lack of baseline zero-shot approaches
    • numerically varied / heterogeneous zero-shot performance
    • unclear degree of domain specific information embodied in the transfer function
    • lack of robust baselines for core intramodality model
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The reproducibility of the paper is adequate with open datasets. Code / models do not appear to be shared.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    The lack of baselines and alternative approaches renders this approach interesting, but not well connected with the literature. It is uncertain whether these approaches would be superior to established techniques.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The driving factor is the lack of inclusion of baselines for zero shot learning.

  • Number of papers in your stack

    4

  • What is the ranking of this paper in your review stack?

    4

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    4

  • [Post rebuttal] Please justify your decision

    The baseline methods discussed in the paper and mentioned in the rebuttal were not actually evaluated.



Review #2

  • Please describe the contribution of the paper

    The authors propose a zero-shot bidirectional cross-modality liver segmentation method by investigating a parameter-free latent space through prior knowledge from CT and MR images, which addresses the domain shift in the cross CT-MR liver segmentation task. The evaluation is done on a variety of datasets. The structure of the manuscript is clear. This is an interesting and good paper.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The authors propose a zero-shot bidirectional cross-modality liver segmentation method by investigating a parameter-free latent space through prior knowledge from CT and MR images, which addresses the domain shift in the cross CT-MR liver segmentation task. The evaluation is done on a variety of datasets. The structure of the manuscript is clear. This is an interesting and good paper.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The evaluation is not adequate. More experiments should be done to compare the proposed method with other related works.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The idea is clear. It is easy to reproduce.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    The evaluation is not adequate. More experiments should be done to compare the proposed method with other related works.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The authors propose a zero-shot bidirectional cross-modality liver segmentation method by investigating a parameter-free latent space through prior knowledge from CT and MR images, which addresses the domain shift in the cross CT-MR liver segmentation task. The evaluation is done on a variety of datasets. The structure of the manuscript is clear. This is an interesting and good paper.

  • Number of papers in your stack

    3

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered



Review #3

  • Please describe the contribution of the paper

    1. This work provides a new paradigm for the task of cross-modality liver segmentation: solving the problem of modality transfer and domain shift through a parameter-free latent feature space. 2. Based on prior knowledge of liver intensity information, a bidirectional cross-modality latent feature converter is proposed to project CT and MR images into a common space.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper proposes a novel way to solve the cross-modality segmentation puzzle: its main focus is on the feature commonality of images from different modalities, without relying on deep learning models, whereas most existing research is devoted to optimizing neural network models. Through inherent prior knowledge (the liver intensity distribution), a bidirectional parameter-free latent feature space is found. When one modality is unlabeled, cross-modality segmentation is achieved by using the data of the other modality.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The comparative experiment is not sufficient: Table 1 lists the performance of the current backbone on single-modality, cross-modality and cross-site images, and shows that the backbone performs moderately well in the single-modality case and poorly in the cross-modality and cross-site cases. However, this experiment only shows a defect of the current backbone; it cannot prove that most existing methods fail at cross-modality and cross-site image segmentation.
    2. The comparative experiment cannot fully explain the problem: In Table 1, for the CT -> CT experiment of backbone setting 1, Dice decreased from 97.08 to 85.07 after the test data was replaced, which seems acceptable. However, in the MR -> MR experiment, Dice decreased from 87.03 to 41.09 after the test data was replaced, which does not indicate whether the problem lies with the model or with the data itself.
    3. LST bidirectional problem: The paper states that LST is a bidirectional latent feature space whose feature mapping is the common feature closest to the different modalities. However, in the third experiment in Table 2 (IG kernel setting 3), there is a large gap between the results of CT -> MR and MR -> CT on the same dataset. This seems to fail to reflect the bidirectional nature of LST, since there should be no large difference in results within the same latent space.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The model architecture combines the LST method proposed in this paper with a common backbone. The datasets used are public. For the proposed LST method, the paper gives the specific calculation formula and a brief derivation, so it is not difficult to reproduce. On the whole, the paper has high reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
    1. Add further comparison experiments, and use the common shortcomings observed across different backbones to show that the problem the article sets out to solve exists in many current studies. Using only one backbone does not effectively indicate whether the problem lies in the backbone itself or is common to existing studies.
    2. Optimize the common feature representation of the proposed latent space to better balance the features of the two modalities. The current latent space formed from the prior knowledge of liver intensity seems to be more biased towards the feature representation of CT images and less expressive for MR images, which makes the segmentation from CT to MR to a greater extent than the segmentation from MR to CT under the same conditions, and also the segmentation from CT to CT to a greater extent than the segmentation from MR to MR.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    For recommendation 1: The paper uses a single backbone network to demonstrate the difficulty of the cross-modality problem, which is weak evidence if it is the only experiment, and the experimental results it produces may also reflect a deficiency of the backbone model itself. In addition, the backbone is not one of the best backbones available (e.g., nnUNet, Swin Transformer). For recommendation 2: The important innovation of the article is to propose a latent space that maps the features of two different modalities so that their difference is minimized, achieving better cross-modal segmentation results. The experimental results of the article show a large difference between the results of CT to MR and MR to CT, and this difference cannot effectively explain the commonality of the features of the proposed latent space for the two modalities, and it can be found from the results that the latent space is more biased to the representatio

  • Number of papers in your stack

    4

  • What is the ranking of this paper in your review stack?

    2

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    6

  • [Post rebuttal] Please justify your decision

    The work on zero-shot cross-modality liver segmentation is interesting to researchers. The rebuttal is convincing.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper addresses the problem of zero-shot cross-modality learning for abdominal MR/CT segmentation. The goal is to learn an invariant latent space for capturing the intensity contrast between liver and normal parenchyma on MRI and CT. The approach of combining analytical intensity transformation with transformers is interesting. However, the gains in accuracy that this method could achieve over established state-of-the-art techniques are unclear and should be discussed. Also, the results indicate discrepancies in the bi-directional modeling, raising questions about the reliability of the invariant latent space embedding. Please explain how the bi-directional modeling helps in this regard. The utility and impact of the approach for zero-shot learning could be better presented with a comparison of results against baseline and established methods and a discussion of the advantages of this approach over existing methods. Overall, this is a technically interesting paper, but its impact and utility for zero-shot learning are unclear because of the lack of sufficient experiments comparing it with baseline models and established techniques. This should be addressed in the rebuttal.

  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    2




Author Feedback

We really appreciate the valuable comments from all reviewers. Here are our responses.

  1. Reviewers#1,#2,#3: Discussing and comparing the proposed method with SOTA. To our knowledge, there are few works studying zero-shot cross-modality liver segmentation, since it is an emerging setting and more challenging than related settings because the target domain is unseen during training. The only two similar works we have found are Pham et al. (2020) [12], which was discussed and compared in the manuscript, and Zhou et al. (2022) [Ref1], accepted and last revised by CVPR on 2 and 28 Mar 2022. As MICCAI’s guideline states, “arXiv papers are not considered prior work since they have not been peer-reviewed. Therefore, citations to those papers are not required and reviewers are asked to not penalize a paper that fails to cite an arXiv submission.” We therefore did not cite [Ref1] before. Pham et al. [12] did single-direction zero-shot MR to CT liver segmentation. Zhou et al. [Ref1] found the source-similar/dissimilar images and did bi-directional cross-modality segmentation, including liver segmentation. Compared with them: (1) Our method is bi-directional and parameter-free in obtaining a domain-invariant latent space, which provides new insight into zero-shot cross-modality segmentation; (2) We construct a new MR dataset, TACE, for more thorough validation of our method; (3) Liver Dice on CHAOS shows that ours (86.25%) is superior to [12] in MR to CT, and ours (86.25% MR to CT, 71.99% CT to MR) outperforms [Ref1].
  2. Reviewer#1: About the baseline. As there was no uniform benchmark for this topic, we first built a baseline in Table 1, which shows that the CNN fails if the target and source data come from different modalities/sites. Then we validated LST; the results are in Tables 2 and 3, which prove that our method bridging domain gap with common space is effective to addre
  3. Reviewer#3: The discrepancies in the bi-directional modeling. As Table 2 shows, for CT to MR, the Dice on CHAOS and TACE MR is 71.99% (+67.67%) and 73.24% (+35.68%); for MR to CT, the Dice on CHAOS and LiTS CT is 86.25% (+73.01%) and 83.17% (+39.93%). Though there are discrepancies in accuracy, the overall trends of the growth are similar, which proves the reliability of the invariant latent space in bi-directional modeling.
  4. Reviewer#3: Validating the method with another backbone and clarifying whether the decrease in performance is due to the model or the data itself. We used the suggested nnUNet under the same experimental settings as in the manuscript. The results are: CT to MR, the Dice of the original and LST-transferred CHAOS MR is 3.92% and 65.39% (+61.47%); MR to CT, the Dice of the original and transferred CHAOS CT is 22.26% and 38.89% (+16.63%); CT to CT, the Dice of the original and LST-transferred LiTS CT is 90.86% and 92.18% (+1.32%); MR to MR, the Dice of the original and LST-transferred TACE MR is 22.51% and 53.54% (+21.03%). Besides, for CT to CT and MR to MR, after replacing the test data, the Dice decreased from 97.23% to 90.86% and from 90.28% to 22.51%. From the results, we conclude: (1) The decrease in performance is caused by the domain shift in the data itself, since different protocols, especially MR protocols, yield different intensity distributions, textures and boundary properties of the organs (Fig. 1), while the success of a CNN mainly relies on the same or similar distributions of train and test data [5,12]; (2) Our LST, which mines a common space to reduce domain shift, is effective; (3) Our backbone is stronger and more robust than nnUNet in zero-shot cross-site/modality liver segmentation.
  5. Reviewer#1: Writing quality. We carefully revise the writing without increasing the length.
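
For readers checking the figures quoted in the responses above, the Dice coefficient and the bracketed gains can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names `dice_score` and `dice_gain` and the toy masks are hypothetical, and masks are assumed to be flat sequences of 0/1 labels.

```python
def dice_score(pred, target):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary masks.

    Both masks are flat sequences of 0/1 values of equal length
    (an assumption made for this sketch).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Both masks are empty; treating this as perfect agreement is a convention.
        return 1.0
    return 2.0 * intersection / total


def dice_gain(before_pct, after_pct):
    """Gain in percentage points between two Dice values given in percent."""
    return round(after_pct - before_pct, 2)


# Toy masks: 3 foreground pixels each, 2 of them shared.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
print(dice_score(pred, target))

# nnUNet, CT to MR on CHAOS, using the two values quoted in response 4.
print(dice_gain(3.92, 65.39))
```

The gain is expressed in percentage points (a plain difference of the two Dice percentages), which is how the bracketed values in responses 3 and 4 appear to be reported.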

Finally, during our investigation, we found that this topic really needs an experimental benchmark and a large-scale dataset. We sincerely hope that our work contributes to the field and draws more researchers to this topic.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors addressed most of the reviewers’ concerns, although the baseline methods discussed in the paper were not evaluated. Overall, the idea of zero-shot cross-modality liver segmentation is interesting, and the idea of learning an invariant latent space for capturing the intensity contrast between liver and normal parenchyma on MRI and CT using transformers is novel. The authors promised to perform experiments comparing their approach to nnUNet, and showed results of the comparison in the rebuttal. Given that this experiment has already been performed and requires a minimal change to the paper, this might be considered a minor revision to get the paper accepted to the conference proceedings.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    7



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper presents zero-shot cross-modality learning for abdominal MR/CT segmentation. The principle is to learn an invariant latent space for capturing the intensity contrast between the liver and the normal parenchyma in MRI and CT, which then aids segmentation. The approach of combining analytical intensity transformation with transformers is interesting. The rebuttal answers most questions. Despite some weaknesses, the method has potential. My proposal is therefore “Acceptance”.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    7



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors here present an interesting approach to zero-shot cross-modality segmentation that could be of interest to the community. All reviewers felt that there were not enough comparisons to alternative approaches, which I agree is a major weakness. In their rebuttal, the authors compared two other approaches for the liver, but (1) their search for alternatives should not be limited to just the liver, as general-purpose zero-shot cross-modality approaches would seem to be just as applicable; and (2) it is hard to evaluate these additional experiments since there is no actual second round of review. Thus, it is difficult to incorporate these new experiments in the final rating: it would have been much better if such numbers had been included in the original manuscript so reviewers could properly assess them.

    Additionally, I found writing clarity in both the manuscript and the rebuttal to be very poor. This severely weakens the work, and I think more effort in this regard is required before it can meet the MICCAI bar.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Reject

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    11


