Authors

Linrui Dai, Wenhui Lei, Xiaofan Zhang

Abstract

As research interests in medical image analysis become increasingly fine-grained, the cost for extensive annotation also rises. One feasible way to reduce the cost is to annotate with coarse-grained superclass labels while using limited fine-grained annotations as a complement. In this way, fine-grained data learning is assisted by ample coarse annotations. Recent studies in classification tasks have adopted this method to achieve satisfactory results. However, there is a lack of research on efficient learning of fine-grained subclasses in semantic segmentation tasks. In this paper, we propose a novel approach that leverages the hierarchical structure of categories to design network architecture. Meanwhile, a task-driven data generation method is presented to make it easier for the network to recognize different subclass categories. Specifically, we introduce a Prior Concatenation module that enhances confidence in subclass segmentation by concatenating predicted logits from the superclass classifier, a Separate Normalization module that stretches the intra-class distance within the same superclass to facilitate subclass segmentation, and a HierarchicalMix model that generates high-quality pseudo labels for unlabeled samples by fusing only similar superclass regions from labeled and unlabeled images. Our experiments on the BraTS2021 and ACDC datasets demonstrate that our approach achieves comparable accuracy to a model trained with full subclass annotations, with limited subclass annotations and sufficient superclass annotations. Our approach offers a promising solution for efficient fine-grained subclass segmentation in medical images. Our code is publicly available.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43895-0_25

SharedIt: https://rdcu.be/dnwyi

Link to the code repository

https://github.com/OvO1111/EfficientSubclassLearning

Link to the dataset(s)

https://www.creatis.insa-lyon.fr/Challenge/acdc/databases.html

http://braintumorsegmentation.org/

Reviews

Review #1

Please describe the contribution of the paper

The manuscript introduces a hierarchical approach to conduct segmentation of multiple structures with complete foreground/background but incomplete structure subclass labels. The method involves superclass logit sharing, multi-task learning with a combined superclass and subclass loss, and a local GuidedMix augmentation. Experiments are carried out on two public datasets, ACDC and BraTS2021, where comparison with previous methods and ablation studies are conducted. Lower and upper bounds are set.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. Addresses a legitimate hierarchical segmentation problem in image analysis.
2. Modules are proposed with clear purposes and integrated properly to solve the targeted problem.
3. Complete ablation study and comparison with weak and strong baselines.
4. Overall the manuscript was written with care and the code sharing and reproducibility (usage of public dataset) were handled properly.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
1. A more precise problem statement may be needed.
2. HierarchicalMix apparently contributes most of the performance improvement, which makes the value of the other modules marginal.
3. Missed opportunities for deeper discussion of sensitivity to the data/task and of label efficiency.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The experiments are conducted on public datasets. Code is shared via anonymous.4open.science. Overall it is considered very reproducible.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
1. Fine-grained problems are often associated with limited knowledge of subclass information. If a strong prior, or even the number of subclasses, is required, the value of the method could be substantially different. Strictly speaking, the method does not solve a fine-grained problem but rather a few-shot hierarchical segmentation problem with complete superclass labels.
2. HierarchicalMix seems to be the major driving force behind the improved segmentation performance. As one of the key innovations, HierarchicalMix effectively merges labels with similar semantic meaning in radiology images. If the locality requirement is removed, does the method still generalize and outperform GuidedMix? One such application could be multi-class cell segmentation on histopathology images.
3. BraTS2021 results improve significantly as the amount of labeled data increases, while the addition of the other modules does not bring substantial changes. This behavior differs from the ACDC results and needs discussion.
4. How adding volumes with full subclass annotations brings performance toward the upper bound is worth studying.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Overall, the manuscript has just sufficient technical novelty. The experiments are complete and presented with good clarity. With a more precise problem statement, I could see the manuscript being scored as 6: accept.
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #2

Please describe the contribution of the paper

The paper presents a segmentation method for precise sub-class segmentation that can be trained using only a limited number of samples with sub-class ground truth annotations and ample samples with super-class ground truth annotations. The proposed architecture incorporates multiple components, including prior concatenation, separate normalization for foreground and background feature maps, and hierarchical mixing of foreground regions between samples with sub-class and super-class annotations to leverage the available super-class annotations. The method is evaluated against several baselines, including two semi-supervised learning approaches on two medical image segmentation datasets (ACDC and BraTS), and demonstrates superior performance.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

○ The paper is well-written and easy to follow.
○ The idea of effectively leveraging existing superclass information to learn to segment sub-classes in medical images has not been extensively explored in previous literature, despite a large number of works in semi-supervised approaches. The authors have presented an interesting approach named Hierarchical mixing, inspired by Guided-mix [1], to mixup only the foreground superclass regions of finely annotated and coarsely annotated samples, which encourages the model to focus on sub-class regions.
○ The proposed method is evaluated on two standard segmentation datasets, ACDC and BraTS 2021, showing a significant improvement over the baseline method (semi-supervised approaches), particularly in ACDC. Interestingly, the model’s performance is close to that of U-Net, which is trained with all the available sub-class annotations, despite using only a few fine-grained sub-class annotations in the presence of sufficient coarse superclass annotations.

Ref: [1] Tu, P., Huang, Y., Zheng, F., He, Z., Cao, L., Shao, L.: Guidedmix-net: Semisupervised semantic segmentation by using labeled images as reference. Proceedings of the AAAI Conference on Artificial Intelligence 36, 2379–2387 (06 2022)
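
To make the foreground-only mixing idea described in this review concrete, the following is a minimal NumPy sketch. It is not the authors' HierarchicalMix implementation: the function name `foreground_mix`, the blending weight `lam`, and the choice to blend only where both superclass foreground masks overlap are illustrative assumptions, and pseudo-label generation is left out entirely.

```python
import numpy as np


def foreground_mix(img_fine, img_coarse, fg_fine, fg_coarse, lam=0.5):
    """Blend two images only where both superclass foreground masks agree.

    Hypothetical helper for illustration, not the paper's method.
    img_fine / img_coarse: arrays of identical shape (e.g. H x W).
    fg_fine / fg_coarse:   boolean superclass foreground masks, same shape.
    lam:                   assumed blending weight given to the finely
                           annotated image inside the shared foreground.
    """
    fg_fine = np.asarray(fg_fine, dtype=bool)
    fg_coarse = np.asarray(fg_coarse, dtype=bool)
    overlap = fg_fine & fg_coarse

    # Start from the coarsely annotated image and leave its background as is.
    mixed = np.asarray(img_coarse, dtype=float).copy()
    fine = np.asarray(img_fine, dtype=float)
    mixed[overlap] = lam * fine[overlap] + (1.0 - lam) * mixed[overlap]
    return mixed, overlap
```

Restricting the blend to the shared foreground keeps background pixels of the coarsely annotated image unchanged, which is one plausible reading of "mixup only the foreground superclass regions".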
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

○ While the paper acknowledges the potential benefits of leveraging available coarse-grained information for fine-grained analysis, the authors could have provided a more detailed justification for the need to do so in medical imaging, along with relevant clinical applications and related works. Such information could help readers better understand the motivation behind the proposed approach.
○ Furthermore, the results for the BraTS dataset show a high standard deviation in the dice score (shown in supplementary pdf), indicating that the results are not statistically significant, as confirmed by the p-value.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

The authors have provided a link to the source code in the paper itself and used two publicly available datasets. Based on the description, the work should be reproducible.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

○ In the first paragraph of the Introduction section, the authors could strengthen the claim that “the research focus has shifted towards finer-grained categories” by providing relevant citations to support it.
○ It would have been helpful if the authors had evaluated the approach of first training a U-Net using examples with only super-classes and then fine-tuning the model on samples with sub-classes by replacing the end layer. This could have served as an additional baseline and provided a stronger case for why joint training is necessary. It would have also been interesting if this approach could incorporate samples without any annotations.
○ The authors could have provided a stronger justification for using prior concatenation in the architecture, as the results did not show significant benefits of using it.

○ Minor: In the Method Problem section, the authors could also define z for clarity. In the Prior Concatenation section, “sub-class class” should read “sub-class”. In the Hierarchical Mix section, how did the authors determine high confidence when assigning a valid subclass label? If it is a threshold, please mention it in the experiment details. The authors did not report the alpha factor value chosen in the implementation details. Also, the authors could include a short description of how gradients were cut in back-propagation in the implementation details.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

While the paper is well-written and well-structured, and the proposed method has demonstrated good results when compared to baselines in two datasets, there is room for improvement in highlighting the real-world clinical motivation behind the research. Specifically, the ability to learn from coarse-grain classes to accurately segment fine-grain classes is crucial in clinical settings, and the author could have emphasized this aspect more effectively in the paper.
Reviewer confidence

Confident but not absolutely certain
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #3

Please describe the contribution of the paper

The paper proposes an approach for fine-grained subclass segmentation in medical images. The authors leverage the hierarchical structure of categories to design a network architecture that includes a Prior Concatenation module to enhance confidence in subclass segmentation, a Separate Normalization module to stretch the intra-class distance within the same superclass, and another model to generate pseudo labels for unlabeled samples by fusing similar superclass regions from labeled and unlabeled images.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. A good domain specific improvement to GuidedMix-Net.
2. The augmentation strategy makes sense for few-shot segmentation tasks.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
1. It is unclear how the proposed method compares with other data augmentation strategies and with image editing / inpainting methods for augmenting images.
2. HierarchicalMix is a small improvement to GuidedMix-Net.
3. No ablation study compares the proposed method with GuidedMix-Net.
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

Not sure about the reproducibility.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
1. The authors need to explain what advantages their method has over mixup and other data augmentation strategies. They should also compare with image editing / inpainting methods for augmenting images.
2. The authors should provide more details on the experimental settings and the motivation.
Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

5
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The authors make their code public and it is easy for reviewers to check their results.
Reviewer confidence

Very confident
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Primary Meta-Review

Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

The paper presents a segmentation method that formulates the problem as a superclass-subclass hierarchical segmentation problem with limited labels. Evaluation is conducted on two public datasets. Overall, the reviewers recognize the design and performance of the proposed method, but also raise many questions, which should be clarified when preparing the updated version.

Author Feedback

We thank all reviewers for the time spent reading our work. The following is our feedback on some of your concerns and suggestions.

Reviewer #1: Weakness 1: We apologize for the difficulty in reading the problem statement; we will work on articulating it in a journal version. Weaknesses 2-3: Indeed, HierarchicalMix may seem to provide more of a performance boost than the other modules on the chosen datasets, but unfortunately we cannot confirm whether there is a dataset bias at play. We are currently working on transferring the algorithm to other datasets, and we hope to provide a more comprehensive discussion of data sensitivity as well as label efficiency in a journal version. Comments 2-4: We are aware that many ways of interpreting our experimental results are left unexplored, but due to the page limit, we are unable to cover every detail. We will work on refining our experiments and discussion in a journal version while taking into account your valuable suggestions.

Reviewer #2: Weakness 1: We apologize for the confusion. The main motivation behind the coarse-to-fine architecture is to put available coarse segmentations into use, should the research focus shift towards finer structures. Instead of re-labeling the entire dataset, we are able to effectively reuse the available coarse labels and only label a minimal amount of fine structures additionally. By doing this, we save the time and resources needed. We will try to add more detailed explanations in a journal version. Weakness 2: The unsatisfactory performance stability on the BraTS2021 dataset may be attributed to the nature of BraTS2021: not all of its samples have all three subclass labels, which may undermine our experiment to some extent. We will provide a more detailed analysis in a journal version should this phenomenon persist on other datasets. Comment 1: Thanks for pointing this out; we will add the corresponding citations in the camera-ready version. Comment 2: We actually tried the method of first training a U-Net with only superclasses and then fine-tuning it. The results are generally worse than both ours and those of the two semi-supervised baselines, but due to the page limit, we were unable to display them. Comment 3: The Prior Concatenation module works intuitively by appending prior knowledge to the feature map before subclass classification. But this effect may be washed out by Separate Normalization, since if that module does its job well, the prior knowledge may already be captured by the foreground branch.

The minor problems will be fixed in the camera-ready version, and the gradients were cut by using detached tensors. Thank you for pointing these out.
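
As a rough illustration of the Prior Concatenation idea described in this response (appending superclass predictions to the feature map before subclass classification), here is a minimal NumPy sketch. It is not the authors' code: the function name `subclass_logits_with_prior`, the tensor shapes, and the use of a 1x1 linear classifier are assumptions made for illustration. NumPy has no autograd, so the explicit copy only stands in for the detached tensor mentioned above.

```python
import numpy as np


def subclass_logits_with_prior(features, super_logits, weight, bias):
    """Concatenate superclass logits onto features, then classify per pixel.

    Hypothetical helper for illustration, not the paper's module.
    features:     (C, H, W) feature map from the shared backbone.
    super_logits: (S, H, W) logits from the superclass classifier.
    weight:       (K, C + S) weights of an assumed 1x1 subclass classifier.
    bias:         (K,) bias of that classifier.
    Returns:      (K, H, W) subclass logits.
    """
    # Stand-in for a detached tensor: in an autodiff framework the prior
    # would be detached so the subclass loss does not update the
    # superclass head through this path.
    prior = np.array(super_logits, copy=True)

    # Channel-wise concatenation: (C + S, H, W).
    stacked = np.concatenate([features, prior], axis=0)

    # A 1x1 convolution is a per-pixel linear map over channels.
    return np.einsum("kc,chw->khw", weight, stacked) + bias[:, None, None]
```

With this layout the subclass classifier sees the superclass evidence as extra input channels, which is consistent with the remark that a well-functioning foreground branch may make the prior partly redundant.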

Reviewer #3: Weaknesses 1-3 and Comment 1: Thanks for your advice. We are aware of the importance of substantiating the superiority of our data augmentation process, and we may provide a more detailed comparison with other data augmentation methods in a journal version. Comment 2: We apologize for the confusion. The main motivation behind the coarse-to-fine architecture is to put available coarse segmentations into use, should the research focus shift towards finer structures. Instead of re-labeling the entire dataset, we are able to effectively reuse the available coarse labels and only label a minimal amount of fine structures additionally. By doing this, we save the time and resources needed. We will try to add more detailed explanations in a journal version. As for the experimental setting, we are going to include more details in the supplementary materials.


Efficient Subclass Segmentation in Medical Images