Authors

Chenyu You, Weicheng Dai, Yifei Min, Lawrence Staib, James S. Duncan

Abstract

Integrating high-level semantically correlated contents and low-level anatomical features is of central importance in medical image segmentation. Towards this end, recent deep learning-based medical segmentation methods have shown great promise in better modeling such information. However, convolution operators for medical segmentation typically operate on regular grids, which inherently blur the high-frequency regions, i.e., boundary regions. In this work, we propose MORSE, a generic implicit neural rendering framework designed at an anatomical level to assist learning in medical image segmentation. Our method is motivated by the fact that implicit neural representation has been shown to be more effective in fitting complex signals and solving computer graphics problems than discrete grid-based representation. The core of our approach is to formulate medical image segmentation as a rendering problem in an end-to-end manner. Specifically, we continuously align the coarse segmentation prediction with the ambiguous coordinate-based point representations and aggregate these features to adaptively refine the boundary region. To parallelly optimize multi-scale pixel-level features, we leverage the idea from Mixture-of-Expert (MoE) to design and train our MORSE with a stochastic gating mechanism. Our experiments demonstrate that MORSE can work well with different medical segmentation backbones, consistently achieving competitive performance improvements in both 2D and 3D supervised medical segmentation methods. We also theoretically analyze the superiority of MORSE.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43898-1_54

SharedIt: https://rdcu.be/dnwBO

Link to the code repository

https://github.com/charlesyou999648/MORSE

Link to the dataset(s)

N/A

Reviews

Review #3

Please describe the contribution of the paper

The authors propose MORSE, which formulates medical image segmentation as a rendering problem to better model both high-level semantic content and low-level anatomical features in medical images, allowing it to better preserve high-frequency boundary regions. The core technical novelties of MORSE are 1) continuously aligning the coarse segmentation prediction with point representations and aggregating these features to adaptively refine the boundary region; and 2) leveraging a stochastic gating mechanism inspired by the Mixture-of-Experts (MoE) approach to optimize multi-scale pixel-level features in parallel. The authors perform extensive experiments demonstrating that MORSE achieves competitive performance improvements in both 2D and 3D supervised medical segmentation.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The paper is well motivated, with good novelty and experimental support;
- The injection of coordinate-based neural representations for anatomical rendering in a continuous space is a good technical contribution;
- Mixture-of-experts (MoE) is a fair technical contribution for optimizing multi-scale features;
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- The term “Anatomical Rendering” is misleading and may lead to overselling. After reading the abstract and introduction, it sounds as if the authors have found a way to inject an anatomical prior to guide the rendering in continuous space, which would be inspirational. However, after reading through Section 2.2, it turns out that no anatomical prior is used;
- Is the Positional Encoding in this work different from the one used in transformers? If so, what are the differences?
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The authors provided sufficient implementation details; code will be released upon publication.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
- The claim of “Anatomical” in the core technical contribution is misleading; consider clarifying it;
- In the introduction, in the paragraph introducing MoE, the phrase “a spanning set of the target function class” is difficult to follow and may be worth more explanation.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

6
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper is well motivated, with sufficient novelty and experimental support.
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #1

Please describe the contribution of the paper

This paper proposes an implicit neural rendering framework (MORSE), which formulates medical image segmentation as an end-to-end rendering problem and uses MoE and stochastic gating to optimize multi-scale features in parallel.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Formulating medical image segmentation as an end-to-end rendering problem is relatively new.
- It’s good to show the advantage of MoE for specializing feature maps and improving performance.
- The ablation study to some extent validates the effectiveness of the proposed module.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- The novelty of the paper is limited. The PointRend [3] approach implicitly formulates image segmentation as an end-to-end rendering problem. What differentiates the methodology proffered herein from PointRend [3]? More illustration and analysis are needed to show the advantage of the proposed algorithm over the compared algorithms.
- It would be prudent to incorporate additional specifics pertaining to the proposed methodology to facilitate comprehension, such as: “The entire model M is trained end-to-end using the supervised segmentation loss Lsup [22] (i.e., equal combination of cross-entropy loss and dice loss).”
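
For readers unfamiliar with the loss quoted in the last point, a minimal sketch of an equal combination of cross-entropy and Dice loss follows. It is an illustration only, not the authors' implementation: the function names, the 0.5/0.5 weighting, the soft Dice form and the flat per-pixel input layout are all assumptions.

```python
import math

def cross_entropy(probs, labels, eps=1e-7):
    """Mean negative log-probability of the true class over all pixels."""
    return -sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels)) / len(labels)

def dice_loss(probs, labels, num_classes, eps=1e-7):
    """One minus the soft Dice score, averaged over classes."""
    total = 0.0
    for c in range(num_classes):
        # Overlap between predicted probability mass and the true mask of class c.
        inter = sum(p[c] for p, y in zip(probs, labels) if y == c)
        denom = sum(p[c] for p in probs) + sum(1 for y in labels if y == c)
        total += (2.0 * inter + eps) / (denom + eps)
    return 1.0 - total / num_classes

def supervised_loss(probs, labels, num_classes):
    """Equal-weight combination; the 0.5/0.5 split is an assumed reading of 'equal'."""
    return 0.5 * cross_entropy(probs, labels) + 0.5 * dice_loss(probs, labels, num_classes)

# probs[i][c] is the predicted probability of class c at pixel i.
good = supervised_loss([[1.0, 0.0], [0.0, 1.0]], [0, 1], num_classes=2)
bad = supervised_loss([[0.0, 1.0], [1.0, 0.0]], [0, 1], num_classes=2)
```

A perfect prediction drives both terms to zero, while a fully wrong one is penalized by both, which is why the two are often paired for segmentation.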
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The authors assert that all comparative evaluations were conducted using the released open-source implementation of the proposed method. However, insufficient details are provided regarding methodologies employed to ensure fairness of comparisons between the novel technique and other alternative approaches examined.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

Please check part 5 and part 7 for more detailed information.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The novelty of the paper is limited. The expression of the paper needs improvement. The experimental results need more illustration and analysis.
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #2

Please describe the contribution of the paper

The paper presents MORSE, a novel implicit neural rendering framework for medical image segmentation that aims to improve segmentation quality by adaptively composing coordinate-wise point features and rectifying uncertain anatomical regions. The framework leverages the concept of Mixture-of-Experts (MoE) with a stochastic gating mechanism to optimize multi-scale pixel-level features. The authors demonstrate the effectiveness of MORSE in conjunction with various medical segmentation backbones, achieving competitive performance improvements for both 2D and 3D supervised medical segmentation methods. Additionally, the paper provides a theoretical analysis to support the superiority of the proposed approach.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. The paper presents a novel implicit neural rendering framework MORSE, which formulates medical image segmentation as a rendering problem in an end-to-end manner. This unique approach addresses the limitations of traditional grid-based representations and improves the segmentation quality by adaptively composing coordinate-wise point features and rectifying uncertain anatomical regions.
2. The paper provides a thorough theoretical analysis that verifies the expressiveness of the proposed INR-based model, adding credibility to the methodology and its potential impact on the field.
3. The authors conduct extensive experiments, demonstrating that MORSE consistently improves the performance compared to 2D and 3D state-of-the-art CNN- and Transformer-based approaches. This thorough evaluation showcases the effectiveness of the proposed method in a variety of scenarios and backbones.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
1. The authors mention the potential of implicit neural representations (INRs) in computer vision and graphics, but they do not provide a comprehensive comparison between MORSE and other INR-based methods in the medical image segmentation domain. Such a comparison could provide more insights into the advantages of the proposed method over existing INR-based techniques.
2. The paper does not adequately address the potential drawbacks or limitations of the proposed method.
3. The decoder part of the model structure is drawn incorrectly.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The authors provide some details.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

Comprehensive comparison with INR-based methods: The authors should include a more thorough comparison of their proposed method (MORSE) with existing INR-based methods in the medical image segmentation domain. This would help to better demonstrate the novelty and effectiveness of MORSE compared to other methods in the field.

Discuss potential drawbacks and limitations: The authors should provide a more in-depth discussion of potential drawbacks or limitations of the proposed method.

Correct the model structure: The authors should revise the incorrect depiction of the decoder part of the model structure in the paper.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

4
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

See both the strengths and weaknesses sections.
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

This paper presents a novel image segmentation method that utilizes neural rendering and Mixture-of-Experts (MoE) concepts. All reviewers acknowledged the innovative approach of formulating image segmentation as a rendering problem, along with the effective utilization of MoE to optimize multi-scale features. While there are some concerns regarding the absence of comparisons with existing INR-based methods, the merits of this work outweigh its weaknesses. Consequently, I recommend accepting this paper.

Author Feedback

We thank the reviewers (R1, R2, R3, Meta-R) for recognizing the novelty and application value of our research, as well as for providing constructive comments. We would like to address the concerns and suggestions as follows:

To R1: Thanks!

novelty: Our major contribution and novelty is a new implicit neural rendering framework. This framework has fine-grained control of segmentation quality by adaptively composing INRs (i.e., coordinate-wise point features) and rectifying uncertain anatomical regions. Furthermore, we illustrate the advantage of adopting a mixture-of-experts (MoE) approach that endows the model with better specialization of feature maps for improving performance, which is not considered in [3]. Our current experiments show that our method consistently improves performance compared to 2D and 3D SOTA CNN- and Transformer-based approaches. Our theoretical analysis also demonstrates the effectiveness of MORSE.

specifics pertaining: Thanks for pointing this out. We will revise this based on your suggestions.

Implementation: We have routinely done this based on the previous sampling method [8] and MoE design [13]. We also confirm our promise to release our source code and pre-trained models to the community.

To R2: Thanks!

comprehensive comparison: To the best of our knowledge, MORSE is the first work that incorporates the idea of implicit neural representations (INRs) (coordinate-based information) to address multi-class medical image segmentation. Meanwhile, we compared against other general computer vision INR methods in Tables 1 and 2.

potential limitations and discussion: Thanks for pointing this out. The potential limitations are two-fold. First, our MoE method is based on multi-task learning, and it will lead to a relatively high computational cost. We plan to optimize the training. Moreover, our approach should improve transparency in rectifying medical segmentation maps. We will revise this based on your suggestions.

decoder figure: Thanks for pointing this out. We will fix the direction of the arrows.

To R3: Thanks!

“Anatomical Rendering” is misleading: Thank you! We aim to have fine-grained control of segmentation quality, i.e., to adaptively compose coordinate-wise point features and rectify uncertain anatomical regions. In practice, we encode the sampled coordinate-wise point features into a continuous space, and then align position and anatomical features with respect to the continuous coordinate.

Positional Encoding vs transformer: Yes. Our Positional Encoding, based on Eq. 3, includes a parameter matrix (i.e., w_1, w_L) that encodes coordinates into a higher-dimensional feature to capture high-frequency signals. The parameter matrix is trainable, whereas the encoding used in the transformer has no such trainable matrix.
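
To make the distinction concrete, here is a rough sketch of a coordinate encoding whose frequencies come from a parameter matrix. Eq. 3 is not reproduced on this page, so the sinusoidal layout, the function name `positional_encoding` and the example weight values are assumptions for illustration only.

```python
import math

def positional_encoding(coord, weights):
    """Map a low-dimensional coordinate to 2 * L features.

    coord:   list of floats, e.g. a normalized (x, y) position.
    weights: L rows, each the same length as coord. In a real model these
             would be trainable parameters updated by gradient descent.
    """
    features = []
    for w in weights:
        # Project the coordinate onto one row of the matrix, then take sin and cos.
        phase = sum(wi * ci for wi, ci in zip(w, coord))
        features.extend([math.sin(phase), math.cos(phase)])
    return features

# Hypothetical initial values. The fixed sinusoidal encoding of the original
# Transformer would keep its frequencies constant instead of learning them.
weights = [[1.0, 0.0], [0.0, 1.0], [4.0, 4.0]]
encoded = positional_encoding([0.25, 0.5], weights)
```

Rows with larger entries produce faster-varying features, which is one way such an encoding can represent high-frequency detail near boundaries.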

term of MoE: IAR represents a coordinate-based subnetwork as a functional combination of bases sampled from the pixel/voxel-level features [19, 20], and MORSE assembles a group of coordinate-based subnetworks that are tuned to span the desired function space [13].
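
As a loose illustration of combining several expert subnetworks through a stochastic gate, the sketch below uses noisy top-k gating, a common MoE scheme. The gating mechanism actually used in MORSE is not specified on this page, so the noise model, the choice of k, and the names `stochastic_gate` and `mix_experts` are assumptions.

```python
import math
import random

def stochastic_gate(logits, k, noise_std=1.0, rng=random):
    """Sparse mixture weights: add Gaussian noise, keep the top k, softmax over those."""
    noisy = [z + rng.gauss(0.0, noise_std) for z in logits]
    top = sorted(range(len(noisy)), key=lambda i: noisy[i], reverse=True)[:k]
    peak = max(noisy[i] for i in top)  # subtract the max for numerical stability
    exps = {i: math.exp(noisy[i] - peak) for i in top}
    norm = sum(exps.values())
    return [exps[i] / norm if i in exps else 0.0 for i in range(len(logits))]

def mix_experts(expert_outputs, gate_weights):
    """Weighted sum of per-expert feature vectors."""
    dim = len(expert_outputs[0])
    return [sum(g * out[d] for g, out in zip(gate_weights, expert_outputs))
            for d in range(dim)]

rng = random.Random(0)  # seeded so the example is repeatable
gates = stochastic_gate([0.2, 1.5, -0.3, 0.9], k=2, rng=rng)
mixed = mix_experts([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]], gates)
```

Because the noise changes which experts are selected from step to step, each expert sees a different share of the inputs during training, which is one route to the specialization the authors describe.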

Overall, we’ve made a substantial revision to the paper, which addresses all the issues, with emphasis on the clarity of the explanation of our method. We hope that the changes will put this work in better shape for publication. Please feel free to contact us with any further concerns.

Implicit Anatomical Rendering for Medical Image Segmentation with Stochastic Experts