
Authors

Dan Yin, Wei Huang, Zhiwei Xiong, Xuejin Chen

Abstract

Unsupervised domain adaptation (UDA) has gained great popularity in mitochondria segmentation, aiming to improve the adaptability of models from the labeled source domain to the unlabeled target domain via domain alignment. However, existing UDA methods only focus on aligning domains on the prediction level, while ignoring the feature space containing more adequate information than the predictions. In this paper, we propose a class-aware domain adaptation method for mitochondria segmentation on the feature level, which relies on the prototype representation to achieve more fine-grained alignment. In particular, we first extract the feature centroids of classes from the source domain as prototypes. Leveraging the extracted prototypes as a bridge, we constrain that features belonging to the same class but from different domains are pulled closer to each other, achieving the class-aware alignment. Meanwhile, we derive a segmentation prediction directly from feature space based on the distance between target features and source prototypes. By incorporating a pseudo label to supervise the learning of this prediction, the feature distribution gap across domains is further reduced. Furthermore, to take full advantage of the potential of target domain, we propose an intra-domain consistency constraint to maintain consistent predictions of samples perturbed differently from the target image. Extensive experiments on different datasets demonstrate the superiority of our proposed method over existing UDA methods.
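The prototype idea in the abstract can be sketched in a few lines. This is a minimal illustration only, not the authors' implementation: it assumes per-pixel feature vectors stored as plain Python lists, two classes (background and mitochondria), cosine similarity as the distance measure, and a hypothetical `temperature` parameter for the softmax. All function names are invented for the example.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)

def class_prototypes(features, labels, num_classes=2):
    """Average the source feature vectors of each class to get its centroid."""
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for feat, label in zip(features, labels):
        counts[label] += 1
        for d, value in enumerate(feat):
            sums[label][d] += value
    # Classes absent from the batch keep a zero prototype in this sketch.
    return [[s / max(c, 1) for s in row] for row, c in zip(sums, counts)]

def prototype_prediction(feature, prototypes, temperature=0.1):
    """Softmax over cosine similarities to each prototype (assumed form)."""
    logits = [cosine_similarity(feature, p) / temperature for p in prototypes]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy source pixels: class 0 = background, class 1 = mitochondria.
source_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.2, 0.8]]
source_labels = [0, 0, 1, 1]
prototypes = class_prototypes(source_features, source_labels)

# A target pixel feature that lies closer to the mitochondria prototype.
target_probs = prototype_prediction([0.1, 0.9], prototypes)
predicted_class = target_probs.index(max(target_probs))
```

In this toy setting the target feature is assigned to the class whose source prototype it is most similar to, which is the sense in which a segmentation prediction can be derived directly from the feature space.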

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43901-8_23

SharedIt: https://rdcu.be/dnwC7

Link to the code repository

https://github.com/Danyin813/CAFA

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper proposes a new method for unsupervised domain adaptation for segmenting mitochondria in EM. The method is based on a combination of different loss functions. The main idea is to compute class prototypes based on the model features and align these prototypes between source and target domain. This is extended by pseudo-labeling using different views of the same input in the target domain.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper introduces a new formulation for unsupervised domain adaptation that uses class prototypes derived from the network features and aligns them between source and target domain by minimizing the cosine distance between representatives of the same class and maximizing it between representatives of different classes. This prototype-based loss is augmented by pseudo-labeling and a consistency loss in the target domain. This idea is sound and at its core is described well. The results appear to show an improvement compared to related methods (though it is hard to judge these results given the missing description of most of the experimental set-up, see weaknesses). The paper includes an ablation study that shows the contribution of the different loss terms.
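    The alignment objective summarized above can be illustrated with a small sketch. The exact loss in the paper is not reproduced in this review, so the form below (own-class cosine distance minus the mean cosine distance to the other prototypes) is an assumption made for illustration, and the names `cosine_distance` and `alignment_loss` are hypothetical.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity; 0 for parallel vectors, 1 for orthogonal ones."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v + 1e-8)

def alignment_loss(features, classes, prototypes):
    """Pull each feature toward its class prototype, push it from the others.

    Assumed form: mean over features of
    (distance to own prototype) - (mean distance to other prototypes).
    """
    total = 0.0
    for feat, cls in zip(features, classes):
        pull = cosine_distance(feat, prototypes[cls])
        others = [cosine_distance(feat, p)
                  for k, p in enumerate(prototypes) if k != cls]
        push = sum(others) / len(others)
        total += pull - push
    return total / len(features)

# Two class prototypes and two features, each close to one prototype.
prototypes = [[1.0, 0.0], [0.0, 1.0]]
features = [[0.9, 0.1], [0.1, 0.9]]

aligned = alignment_loss(features, [0, 1], prototypes)     # correct classes
misaligned = alignment_loss(features, [1, 0], prototypes)  # swapped classes
```

    With these toy values the loss is lower when each feature is paired with the prototype it actually resembles, which is the behaviour a class-aware alignment term is meant to reward.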

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The paper reads like it is in an unfinished state and is missing key information on the problem set-up, network architecture and experimental results:

    • What architecture is used? This is not mentioned at all in the main text. The supplementary contains an architecture sketch that looks like a UNet variant. However, this architecture is not described in sufficient detail to understand the logic behind it: what is “m” in “X_i+m”? What exactly are the different outputs “P_i”, “P_i+1” and “R_i”? How do these outputs relate to the loss functions that you optimize?
    • Which features exactly are taken into account to compute the representatives? Just the features from the last layer before the segmentation head, or multiple layers?
    • Is the architecture 2D or 3D? Again, this is not clear from the main text; based on the supplementary figure it seems to be a 2D network with multiple input slices, but this is not explained in any detail.
    • Where exactly do the baseline results come from and how do you ensure that the comparison with your method is fair? Do you take the results reported in the other publications, do you rerun the code provided by them or do you reimplement these methods yourself? Without a clear description of this it is not possible to judge whether the reported improvement over previous methods is real or whether it is just due to favorable conditions for your own method.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    This paper is not reproducible as important information about the basic set-up and implementation is missing. See weaknesses for details. The authors do not provide any code and also do not mention whether they intend to make code public if the paper was to be accepted.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Provide the missing information about the problem set-up, architecture and experiments. See weaknesses for details.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    2

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper is not ready for publication. It does not provide the necessary information to judge the quality of the proposed method. It might be a strong new domain adaptation method, but in the current form it cannot be published. The review / rebuttal period is not sufficient to fix these flaws and re-evaluate the paper.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    4

  • [Post rebuttal] Please justify your decision

    The authors have addressed some of the points I raised in my review. Given the response, I have raised my score to a weak reject. I believe that some work is still necessary to ensure reproducibility and a correct description of the method. I am uncertain whether it is possible to address this in the revised version, and I would like to stress again that the current version should not be published because key details are lacking in the description of the network and experimental set-up. Hence I would still recommend rejecting the paper and resubmitting a revised version to a later conference. If the area chairs decide to accept the paper after all, I strongly urge the authors to extend the method description with the crucial method details (network architecture, meaning of terms, experimental set-up and details on baseline experiments), because the paper is not reproducible otherwise.



Review #2

  • Please describe the contribution of the paper

    This paper proposes a class-aware domain alignment method in the feature space for unsupervised domain adaptive (UDA) mitochondria segmentation in electron microscopy (EM) images. The method relies on prototype representation to achieve fine-grained feature alignment. The contributions of this work can be summarized as follows:

    The authors present a class-aware feature alignment method for domain adaptive mitochondria segmentation, which is to align source and target domains on the feature level in UDA for EM mitochondria segmentation. The class-aware feature alignment is based on source prototypes, representing class knowledge from the feature space.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The main strengths of the paper include:

    Novel formulation: The class-aware alignment method the authors propose is unique, as it moves away from traditional methods that align domains at the prediction level and instead leverages feature space information for improved results.

    Demonstration of clinical feasibility: While the paper focuses primarily on the methodological aspects and evaluation of the proposed class-aware alignment method for unsupervised domain adaptation in mitochondria segmentation, it does not directly address the clinical feasibility of the approach.

    Original use of data: The paper presents an original way to use data by proposing class-aware alignment in the feature space. The method extracts source prototypes representing class knowledge, serving as a bridge for alignment between source and target domains. This fine-grained alignment enables the model to learn more effectively from the source domain and improve its performance on the target domain. Furthermore, the authors incorporate an intra-domain consistency constraint, which maintains consistent predictions of samples perturbed differently from the target image, further exploiting the potential information within the target domain. Overall, the approach allows the model to adapt better to the target domain while leveraging the valuable information from the source domain.

    Evaluation: The paper presents a thorough evaluation of the proposed method, including comparisons with several baseline models and an ablation study to assess the contribution of each loss term.

    The proposed class-aware alignment method for unsupervised domain adaptation in mitochondria segmentation offers an innovative approach compared to existing methods, such as UALR, DA-VSN, and DA-ISC. The key difference lies in the focus on class-aware alignment in the feature space, which has the potential to improve performance over methods that only align output predictions or rely solely on adversarial training. Furthermore, the explicit incorporation of class-aware information, such as class prototypes and intra-domain consistency loss, sets it apart from other methods, such as DA-ISC, which does not include explicit class-aware alignment. Additionally, the proposed method introduces an intra-domain consistency constraint to further exploit knowledge within the target domain, a feature not present in either DA-VSN or DA-ISC. This approach enables more effective use of the abundant knowledge and information in the target domain, which can further improve the domain adaptation performance.
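    The intra-domain consistency constraint mentioned above can be pictured with the following sketch. It is not taken from the paper: the stand-in model, the Gaussian-noise perturbation, and the mean-squared-difference penalty are all assumptions chosen to keep the example small, and every name is hypothetical.

```python
import math
import random

def toy_model(pixels, weight=4.0, bias=-2.0):
    """Stand-in segmentation model: per-pixel sigmoid on intensity."""
    return [1.0 / (1.0 + math.exp(-(weight * p + bias))) for p in pixels]

def perturb(pixels, rng, noise_std=0.05):
    """Add Gaussian intensity noise; one possible perturbation among many."""
    return [p + rng.gauss(0.0, noise_std) for p in pixels]

def consistency_loss(pred_a, pred_b):
    """Mean squared difference between two foreground-probability maps."""
    return sum((a - b) ** 2 for a, b in zip(pred_a, pred_b)) / len(pred_a)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
target_pixels = [0.1, 0.4, 0.6, 0.9]  # unlabeled target-domain intensities

# Two differently perturbed views of the same target sample.
view_a = perturb(target_pixels, rng)
view_b = perturb(target_pixels, rng)

# No labels are needed: the penalty only compares the two predictions.
loss = consistency_loss(toy_model(view_a), toy_model(view_b))
```

    Because the penalty depends only on the agreement between two predictions for the same target sample, it can be computed on the unlabeled target domain, which is why such a term is described as exploiting target-domain information.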

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The paper itself does not explicitly mention the weaknesses of the proposed method. However, after analyzing the paper, some potential weaknesses can be identified:

    Limited comparison with other methods: The paper compares the proposed method with DA-VSN and DA-ISC, but it would be useful to include more comparisons with other existing UDA methods for a more comprehensive evaluation.

    Lack of detailed analysis of the class-aware alignment technique: While the proposed class-aware feature alignment is novel, the paper could benefit from a more in-depth analysis of the impact of different components of the method. For instance, it would be helpful to study the effects of the intra-domain consistency constraint and the pseudo-labeling method separately.

    Clinical feasibility: Although the paper demonstrates the method’s effectiveness on EM dataset benchmarks, it is not clear if the method has been evaluated in real clinical scenarios. Assessing the performance of the proposed method in real-world applications would strengthen its clinical relevance.

    Generalizability: The proposed method is focused on mitochondria segmentation in EM images. It is unclear how well the method would generalize to other types of medical images or other segmentation tasks. Further exploration and evaluation of the method’s applicability in other contexts would be beneficial.

    Including more visualizations and examples in the paper would aid readers in understanding the proposed method and its results. Visualizing the class-aware alignment, intra-domain consistency constraint effects on features and segmentation outputs, and comparisons with other methods could provide valuable insights into the method’s effectiveness.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Detailed methodology: The authors provide a comprehensive description of their proposed method, including the formulation of various loss functions and the overall architecture. This information would be helpful for researchers looking to implement the method on their own.

    Implementation details: The authors have described the implementation details, including the patch size, optimizer, learning rate, and the number of training iterations. They also provide the balancing weights and thresholds used in their loss functions. This information should be helpful for reproducing the experiments.

    Dataset: The authors use publicly available datasets, Lucchi and MitoEM, which can be accessed by other researchers.

    Evaluation metrics: The authors use standard evaluation metrics (e.g., Intersection over Union) to assess the performance of their method, which would allow for a fair comparison with other methods and make it easier to reproduce the evaluation.

     

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    The method is innovative and has shown promising results on EM dataset benchmarks. However, there are some areas where the paper could be improved to provide a more comprehensive understanding of the proposed method and its potential. Here are my detailed and constructive comments for your consideration:

    While the paper demonstrates the effectiveness of the proposed method on EM dataset benchmarks, it is not clear if the method has been evaluated in real clinical scenarios. Assessing the performance of the proposed method in real-world applications would strengthen its clinical relevance and applicability. The proposed method is focused on mitochondria segmentation in EM images. It would be valuable to explore and evaluate the method’s applicability in other contexts, such as other types of medical images or other segmentation tasks. This would help readers understand the potential scope of your method’s utility.

     

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The authors claim that their proposed class-aware alignment method is the first attempt to align source and target domains on the feature level in unsupervised domain adaptation (UDA) specifically for electron microscopy (EM) mitochondria segmentation. This novelty is good. However, the clarity needs improving. The evaluation includes comparisons with multiple state-of-the-art domain adaptation methods, such as Advent, UALR, DAMT-Net, DA-VSN, and DA-ISC. By comparing the performance of the proposed method against these well-established approaches, the authors demonstrate the effectiveness and advantages of their method. It needs some more polishing.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    Overall the paper proposes a good method, but the experiments are insufficient and the implementation is not described in detail. The rebuttal has provided some further details that may alleviate these issues, but I will keep my original rating.



Review #3

  • Please describe the contribution of the paper

    The paper addresses the domain shift challenge in electron microscopy (EM) images, a common issue resulting from various imaging devices and collection processes. Unsupervised domain adaptation (UDA) has shown potential in segmenting unlabeled data. The authors propose a class-aware feature alignment method that incorporates distance-based alignment, pseudo-labeling learning, and intra-domain consistency constraints to achieve multi-class feature alignment for superior mitochondria segmentation.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The performance of the proposed learning strategy demonstrates its ability to transfer knowledge to unlabeled data, addressing a critical issue in medical image segmentation where data annotation is scarce.

    2. The proposed method effectively aligns domains on both prediction and feature levels.

    3. The ablation study and experiments provide evidence for the functionality of the proposed method.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The paper would benefit from a comparison of the proposed method with other generative models (e.g., VAE, diffusion models) to assess the capability of domain transfer.

    2. A discussion on the performance when using each constraint individually would help to further understand the intra-relationship between different constraints during domain shift processing.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The method can be reproduced from the description given in the paper.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    In the prototype extraction section, please clarify the meaning of the centroid of each class in the feature space. Why did you choose to use only the centroid instead of all features? Providing insight into the rationale behind this decision would help readers better understand your approach.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    1. The motivation and innovation of the proposed design.

    2. The experiments and performance of the proposed method.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    There are mixed opinions. Two reviewers are in favor of accepting the paper, acknowledging the novelty of the method and the sufficient ablation studies, while one reviewer has raised several concerns. Specifically, some important details regarding the method design and the experimental implementation are missing. These issues should be addressed in the rebuttal.




Author Feedback

We thank all reviewers for their valuable suggestions and positive comments. We apologize for missing some details about network architecture due to the space limit. We will add these details in the revision. The code will be released upon acceptance.

R #1 Unfinished state and missing key information. We respectfully disagree. 1) We elaborate on the contributions and method in the Introduction and Method sections. 2) We have compared with various UDA mitochondria segmentation methods in Table 1 and Table 2. 3) We further performed ablation studies in Table 3 (main body), Table 1 and Table 2 (supplementary material). 4) We will provide more details about network architecture and experiments in the revision.

Network details. We apologize for the missing details. 1) Our network architecture follows the baseline DA-ISC [7], as claimed in Fig. 2 (supplementary material). 2) We will provide more details about the architecture in the revision according to your valuable suggestions.

The definition of notations. Thank you for the nice suggestions. 1) The definitions of P and R in Fig. 2 (supplementary material) follow DA-ISC, where R_i is the residual of the predictions of two input slices, and m denotes the interval between two input slices. Note that P and R exist in both source and target domains. R is ignored in the paper for simplicity. 2) There is a clerical error in the supplementary: P_i+1 should be P_i+m. We will correct it in the revision.

Loss function. In eq. 7, L^s_seg follows DA-ISC to optimize all the segmentation outputs (P_s and R_s) in the source domain. In the target domain, L_cf is the cross entropy between P_t, R_t and their pseudo labels.

Which features are used? We just use the features from the single last layer before the segmentation head.

2D or 3D network? It is a 2D network.

Fair comparison. 1) The results of DAMT-Net, UALR, DA-VSN and DA-ISC are adopted from DA-ISC. 2) To make a fair comparison, we use the same experimental settings and evaluation metrics following DA-ISC. In addition, we reproduce another general UDA segmentation method Advent [19] under the same setting. 3) We will describe the baseline methods in detail in the revision.

R #2 Limited comparison. We perform an adequate comparison with a state-of-the-art method for UDA mitochondrial segmentation (DA-ISC) and Advent, UALR, DAMT-Net, DA-VSN. In addition, we further reproduce the latest UDA segmentation method [TPS, ECCV 2022] on MitoEM. In MitoEM-R -> MitoEM-H, the mAP of TPS is 90.5 while ours is 92.8.

Detailed analysis. Table 3 has validated the effectiveness of each loss term sufficiently. We will also provide more visualization analysis in the revision.

Clinical feasibility and generalizability. Thank you for the valuable suggestions. Due to the space limit, we only experiment on EM images. For future work, we will extend our method to more medical datasets, such as BraTS 2021, to validate the generalizability of our proposed method.

More visualizations and examples. Thanks for the nice suggestions and we will provide more feature and segmentation visualizations on our project page for each loss.

R #3 Comparison with other generative models. Since there are no VAE-based or diffusion-based generative methods for EM image segmentation, we do not compare with such methods. Due to the space limit, we will reproduce DDA [Back to the source, CVPR 2023] in UDA EM image segmentation for a more comprehensive comparison.

Discussion about the performance. We have conducted thorough ablation experiments on the effectiveness of each loss term in Table 3 (main body) and Table 2 (supplementary material), and we provided a comprehensive analysis in Section 3 (Ablation Study for Loss Functions).

The meaning of the centroid of each class in the feature space. Following [8, 22], we represent the prototype as the centroid of each class. The centroid represents the class knowledge better, with less noise than the raw features, and reduces the computing space.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors’ rebuttal has provided good responses and addressed the reviewers’ concerns regarding the implementation details and comparison studies. The final version should further address R1’s concerns regarding reproducibility by including more comprehensive implementation details.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This paper introduces a class-aware domain alignment method in the feature space, specifically designed for mitochondria segmentation in EM data. The proposed approach effectively utilizes limited labeled data from a source domain to improve the segmentation quality of unlabeled target data. During the initial review phase, diverse opinions were expressed, with the main criticism revolving around insufficient details and incompleteness of the manuscript. While the authors successfully addressed most of these concerns in the rebuttal, R1 remains strongly skeptical about the paper’s maturity for publication. However, taking into account the assessments of the other two reviewers as well as my own evaluation, it is apparent that the paper possesses merits worthy of publication. Therefore, I recommend accepting this paper.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    As this work has received mixed scores, and reviewers do not seem very enthusiastic about the paper, I have read the paper to form a personal impression of this work. After reading the manuscript, as well as the reviewers’ comments and the authors’ rebuttal, I find that this submission still needs major improvement, which might not be feasible for the camera-ready version. In particular, two important concerns that arose after considering all these points are the novelty (in particular the actual differences with existing works, e.g., [8]) and the unjustified choices in the empirical validation, particularly related to the compared methods. For the novelty, I found that there are many similarities with [8] (for example, the inter- and intra-class constraints are the same as the contrast adaptation step in [8], with the only difference that one uses cross-entropy while the proposed method employs a cosine similarity). Furthermore, [8] also integrates a pseudo supervision mechanism. Thus, differences with respect to prior literature should have been better highlighted to position the proposed method. Related to this, I found the claim ‘for the first time, we propose the class-aware alignment for DA on mitochondria segmentation in the feature space’ misleading, as this is a straightforward application of an existing method to a specific application. Concerning the choice of compared methods, I side with the reviewers in that the evaluation is weak, and additional, more closely related approaches should have been considered. For example, three out of the five methods ([6],[19]) compared are not from the medical field and are somewhat outdated (from 2019 and 2021, which makes me wonder why, if general methods are selected, these are not state-of-the-art), and one on the same task is also somewhat outdated ([16]).
    In summary, I found that this work needs better positioning with respect to existing methods, particularly highlighting its main differences, and should include a more comprehensive evaluation, properly motivating the choice of selected approaches, rather than random methods. Given all these arguments, together with the lack of enthusiasm found in the scores/reviews, I wonder whether this approach will trigger interesting discussions at the conference, and thus recommend its rejection.


