
Authors

Kai Ren, Ke Zou, Xianjie Liu, Yidi Chen, Xuedong Yuan, Xiaojing Shen, Meng Wang, Huazhu Fu

Abstract

Classification and segmentation are crucial in medical image analysis as they enable accurate diagnosis and disease monitoring. However, current methods often prioritize mutually learned features and shared model parameters while neglecting the reliability of features and predictions. In this paper, we propose a novel Uncertainty-informed Mutual Learning (UML) framework for reliable and interpretable medical image analysis. UML introduces reliability to joint classification and segmentation tasks, leveraging mutual learning with uncertainty to improve performance. To achieve this, we first use evidential deep learning to provide image-level and pixel-wise confidences. Then, an uncertainty navigator is constructed to better use mutual features and generate segmentation results. In addition, an uncertainty instructor is proposed to screen reliable masks for classification. Overall, UML produces confidence estimates for the features and predictions of each task (classification and segmentation). Experiments on public datasets demonstrate that UML outperforms existing methods in terms of both accuracy and robustness. UML has the potential to drive the development of more reliable and explainable medical image analysis models.
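The image-level and pixel-wise confidences described in the abstract come from evidential deep learning, in which network outputs are treated as evidence for a Dirichlet distribution and the leftover belief mass is the uncertainty. A minimal sketch of that idea for a hypothetical two-class output (the softplus evidence mapping is one common choice, not necessarily the paper's exact formulation, and the logit values are illustrative):

```python
import math

def evidential_uncertainty(logits):
    """Map raw network outputs to Dirichlet parameters and an uncertainty
    mass, following subjective logic: alpha_k = evidence_k + 1,
    S = sum(alpha), expected probability p_k = alpha_k / S, and
    uncertainty mass u = K / S for K classes."""
    # Softplus keeps the evidence non-negative.
    evidence = [math.log1p(math.exp(z)) for z in logits]
    alpha = [e + 1.0 for e in evidence]
    S = sum(alpha)
    K = len(alpha)
    prob = [a / S for a in alpha]  # expected class probabilities
    u = K / S                      # leftover (uncommitted) belief mass
    return prob, u

# Confident prediction: large evidence concentrates the Dirichlet.
p_conf, u_conf = evidential_uncertainty([8.0, -4.0])
# Ambiguous prediction: near-zero evidence leaves the mass uncommitted.
p_amb, u_amb = evidential_uncertainty([-6.0, -6.0])
```

Large evidence shrinks u toward 0, while near-zero evidence leaves u near 1; it is this per-image and per-pixel u that lets the framework flag unreliable masks and labels.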

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43901-8_4

SharedIt: https://rdcu.be/dnwCE

Link to the code repository

https://github.com/KarryRen/UML

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper presents a joint classification and segmentation method for medical image analysis. The network uses image-level and pixel-wise uncertainty to improve the downstream tasks.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    - The paper is well written and presents its ideas clearly.
    - The authors evaluate on a public dataset, which allows comparison with other methods.
    - The use of uncertainty to enhance the performance of the network is a notable contribution.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Segmentation is itself a form of pixel-wise classification, so it can benefit from incorporating uncertainty from the classification process. Since the same binary task is unlikely to extract identical features, the proposed method effectively involves two ensembles, with the uncertainty of one incorporated into the other.

    It would be helpful to see information on the computational expense of the network, as well as the training and testing time.

    Other uncertainty methods, such as using ensembles of segmentation networks, are known to be time inefficient. How does the proposed method compare to using ensembles and adding uncertainty in the final layers of the segmentation process?

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Looks good

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    For Table 3, my recommendation would be to use the segmentation results (UN) as the baseline and then evaluate the performance of the model when incorporating UI. Segmentation is the downstream task, and classification may add some information to the final decision.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is written nicely, but the multi-task combination of segmentation and classification is not strongly motivated: when a segmentation is available, an additional binary classification is not strictly necessary. It would have been nice if different classes (stages of lesion) had been considered.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    7

  • [Post rebuttal] Please justify your decision

    I would like to thank the authors for their modifications; I vote for acceptance of the paper.



Review #2

  • Please describe the contribution of the paper

    The paper presents a multi-task (segmentation/classification) network that mutually learns features with uncertainty guidance.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    It has certain novelty, and the results show significant improvement over other methods.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The uncertainty estimation is based on an average of the probabilistic output, which may not be a valid uncertainty estimate; this needs further justification. A model can produce high probabilities in decision making without being reliable or confident.

    • For the SPY1 dataset, the training/testing split should be performed at the subject level rather than the 2D image level; otherwise the reported performance is over-optimistic. Also, pCR is only relevant to the tumour site, and hence needs to be considered in 3D rather than 2D.
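The subject-level leakage the reviewer warns about is avoided by splitting on subject IDs before collecting 2D slices, so that all slices of one subject land in the same partition. A pure-Python sketch (the subject IDs, slice indices, and function name are illustrative, not from the paper):

```python
import random

def subject_level_split(slices, test_fraction=0.2, seed=0):
    """Split a list of (subject_id, slice) pairs so that every slice of a
    given subject ends up in the same partition, preventing cross-split
    leakage between training and testing."""
    subjects = sorted({sid for sid, _ in slices})
    rng = random.Random(seed)
    rng.shuffle(subjects)
    n_test = max(1, int(len(subjects) * test_fraction))
    test_ids = set(subjects[:n_test])
    train = [s for s in slices if s[0] not in test_ids]
    test = [s for s in slices if s[0] in test_ids]
    return train, test

# Example: 3 subjects with several 2D slices each.
data = [("P1", 0), ("P1", 1), ("P2", 0), ("P2", 1), ("P3", 0)]
train, test = subject_level_split(data, test_fraction=0.34)
```

A slice-level random split would very likely place slices of the same subject on both sides, inflating test metrics; splitting on IDs keeps the two sets subject-disjoint by construction.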

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The network structure is quite complex, which makes it difficult to reimplement and to reproduce the results from scratch.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    • Please justify the use of subjective logic for uncertainty estimation.

    • For the SPY1 dataset, the training/testing split should be performed at the subject level rather than the 2D image level; otherwise the reported performance is over-optimistic. Also, pCR is only relevant to the tumour site, and hence needs to be considered in 3D rather than 2D. Please clarify the related details.

    • For the REFUGE dataset, why is the data divided equally into training, validation, and testing sets?

    • Please perform statistical tests when claiming that one method is better than another.
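For the statistical-test request, a paired permutation test on per-case scores is a simple distribution-free option (the authors' rebuttal reports p-values, though the exact test used is not stated). A sketch with made-up per-case Dice scores, not the paper's results:

```python
import random

def paired_permutation_test(a, b, n_perm=10000, seed=0):
    """Two-sided paired permutation test: randomly flip the sign of each
    per-case difference and count how often the permuted mean difference
    is at least as extreme as the observed one."""
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs) / len(diffs))
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        permuted = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(sum(permuted) / len(permuted)) >= observed:
            hits += 1
    # Add-one smoothing avoids reporting an impossible p of exactly 0.
    return (hits + 1) / (n_perm + 1)

# Made-up per-case Dice scores for two methods on the same test cases.
method_a = [0.86, 0.84, 0.88, 0.83, 0.87, 0.85, 0.86, 0.84]
method_b = [0.80, 0.79, 0.83, 0.78, 0.82, 0.80, 0.81, 0.79]
p = paired_permutation_test(method_a, method_b)
```

Because the test only permutes the observed paired differences, it makes no normality assumption; a Wilcoxon signed-rank test would be a standard alternative.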

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The method has certain novelty, but requires more thorough evaluations.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    In this paper, the authors propose a follow-up work to TBraTS [27]: joint medical image segmentation and classification that integrates uncertainty estimation (evidential deep learning). In the proposed framework, an uncertainty-informed mutual learning (UML) network is introduced and used to provide pixel-wise and image-level uncertainty for segmentation and classification, respectively. In the experimental part, the proposed method shows a clear improvement over TBraTS [27] and a previous evidential method [16].

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    I think the joint learning of segmentation and classification could improve the final results for both applications. Moreover, the proposed method may be the first to jointly utilize pixel-wise and image-level uncertainty.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. What exactly are BCS and ECS in the experimental part? I cannot find their original articles in the text. Because these two methods are the key comparisons, their experimental results cannot strongly support the effectiveness of the proposed method without detailed information about them.

    2. The proposed method obtains better performance than TBraTS [27], but it is a follow-up work to TBraTS [27] from the same group; the authors should therefore also compare with similar methods from other groups.

    3. Transformer-based medical image segmentation has achieved good results on different kinds of medical data. Because the authors aim to show a “general” approach for medical images, is it possible to compare with some representative approaches, e.g., UNETR (Transformers for 3D Medical Image Segmentation), or other related ones?

    4. Because the proposed work could also be treated as a “two-task” framework, is it possible to compare with some multi-task framework, e.g., Pelvic Organ Segmentation Using Distinctive Curve Guided Fully Convolutional Networks?

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors do not release their code.

    Its predecessor, TBraTS [27], releases code for segmentation with uncertainty estimation.

    The training of the joint framework, i.e., first the MUT computation, then the segmentation loss, and finally the classification loss, may make the whole training process complicated.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    I rate the paper as “weak reject”, not because it is bad work, but because, as noted in the weaknesses section, the experimental part has flaws; I hope the authors can respond in the rebuttal.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    In this paper, the experimental settings are not strong and comprehensive enough to support the effectiveness of the method.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    4

  • [Post rebuttal] Please justify your decision

    Thanks for the explanation. This is an interesting paper, but I still have two concerns.

    1. Please clearly state in the revised version that BCS is based on the framework of DSI [25] and utilizes two encoders and decoders to accomplish the tasks, and that BCS is in fact the baseline framework of the proposed method.

    In your rebuttal, you reported another result for DSI: DSI/0.837 (0.722)/0.653 (0.314)/1.33e-60 (1.71e-114). What is the difference in network structure between BCS and DSI?

    2. The details of ECS are not clear. You said that “ECS (Evidential deep learning based Classification and Segmentation framework) represents an alternative variant that focuses on evidential-based uncertainty estimation. ECS builds upon BCS by integrating a deep evidential learning uncertainty estimation module into its framework.” Yet, in the text, you also said “we adopt evidential deep learning [15, 27] to simultaneously estimate the uncertainty of both to estimate image-level and pixel-wise uncertainty”. Thus, what is the actual difference between the proposed method and ECS?

    Why are the classification results (0.765, 0.810) in Table 3 the same as ECS's results (with sigma = 0.01) in Table 2?

    In sum, the question “What exactly are BCS and ECS in the experimental part?” is not well answered. Moreover, based on the above two points, BCS, ECS, TBraTS, and the proposed method in Tables 1 to 3 may amount to a series of combined experiments on your own methods.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The authors propose an uncertainty-aware multi-task network for joint classification and segmentation. Concurrently quantifying pixel-wise and image-level uncertainty and integrating them to enhance performance and reliability in mutual learning is the primary novelty of the paper. The proposed method needs clearer justification and description, e.g., regarding the estimation of uncertainty and the experimental setup. The color of the lines in Fig. 1 also needs to be corrected.




Author Feedback

We thank the reviewers for their high-quality reviews and constructive comments. We are happy to learn that all reviewers appreciate our motivation and novelty. Below we provide point-by-point responses to the comments, which will be integrated into the final version.

Computational cost:

Method                 FLOPs (G)   Time (ms)
BCS                    115.49      68
DSI                    73.93       51
Ensemble               577.45      261
Dropout                115.49      259
ECS                    115.72      63
Ours (ECS+MD+UI+UN)    160.87      73

Our method is more computationally efficient than ensemble and dropout methods, while still being comparable in terms of complexity and test time to other methods.

Ablation:

Method                 ACC     Dice_disc
BCS                    0.723   0.802
ECS                    0.775   0.782
Ours (MD + UN)         0.813   0.836
Ours (MD + UN + UC)    0.853   0.855

The above ablation experiments verify the effectiveness of each component.

SOTA comparison (values in parentheses are under noise):

Method      ACC             Dice_disc       p-value
DSI         0.837 (0.722)   0.653 (0.314)   1.33e-60 (1.71e-114)
Ensemble    0.690 (0.470)   0.826 (0.665)   8.96e-06 (2.24e-06)
Dropout     0.702 (0.480)   0.810 (0.666)   1.57e-12 (1.45e-07)
TransUNet   -               0.633 (0.629)   4.651e-63 (6.997e-13)
UNETR       -               0.697 (0.650)   8.15e-59 (1.81e-11)
Ours        0.853 (0.733)   0.855 (0.744)   -

The results demonstrate that our method outperforms other methods, even in the presence of noise.




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    We thank the authors for their efforts in addressing the concerns, especially those on methodological details and comparisons with more SOTA methods. The current version of this paper looks solid to me.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The paper proposes joint learning of a segmentation and a classification task by leveraging mutual learning of pixel-wise and image-level uncertainty. The reviews highlighted the merit of such an approach but mostly objected to its validation choices, questioning the lack of a multi-task comparison, insufficient statistical tests, and unclear novelty over previous work. The rebuttal addressed most concerns, but the novelty over previous work remains unclear to one reviewer.

    Despite this confusion, the conceptual approach of the work has merit. For all these reasons, and situating the work with respect to the other submissions, the recommendation is towards acceptance.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors have tried to respond to the reviewers' comments and the paper may now be acceptable, but I have some remaining concerns, including the details of ECS and the results presented.


