Authors

Yimu Pan, Tongan Cai, Manas Mehta, Alison D. Gernand, Jeffery A. Goldstein, Leena Mithal, Delia Mwinyelle, Kelly Gallagher, James Z. Wang

Abstract

The placenta is a valuable organ that can aid in understanding adverse events during a pregnancy and predicting adverse events after birth. However, manual pathological examination and report generation is laborious and resource-intensive. Limitations in diagnostic performance and model efficiency have impeded previous attempts to automate placenta analysis. This study presents a novel framework for the automatic analysis of placenta images that aims to improve accuracy and efficiency. Building on previous vision-language contrastive learning (VLC) methods, we propose two enhancements, namely Pathology Report Feature Recomposition and Distributional Feature Recomposition, which increase representation robustness and mitigate feature suppression. In addition, we deploy efficient neural networks as image encoders to achieve model compression and inference acceleration. Experiments demonstrate that the proposed approach outperforms previous work in both performance and efficiency by significant margins. The benefits of our method, including enhanced efficacy and deployability, may have significant implications for reproductive healthcare, particularly in rural areas or low- and middle-income countries.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43987-2_12

SharedIt: https://rdcu.be/dnwJu

Link to the code repository

N/A

Link to the dataset(s)

N/A


Reviews

Review #1

  • Please describe the contribution of the paper

    The primary contribution of this work is the proposal to divide the placenta analysis report into multiple segments and encode each segment independently, thereby ensuring that all tokens in the report are fully considered. Experimental results on benchmark datasets demonstrate the effectiveness of the proposed method.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. To utilize all the tokens in the placenta analysis report for the downstream task, the authors propose to split the report into multiple segments and encode them independently.
    2. Instead of using sum or mean pooling, the authors propose a distributional feature recomposition approach for aggregating the information in each segment of the report.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. From a technical perspective, the authors address the problem of encoding long documents. However, the most recent foundation models can already handle long documents; for example, GPT-3 can process 4k tokens, and GPT-4 can handle 32k tokens. It is not clear to the reviewer whether this problem remains of interest to the research community.
    2. It seems to the reviewer that the technical contribution of this work is limited.
    3. The advantage of distributional feature recomposition over the widely used sum or mean pooling is not clear to the reviewer. The authors are encouraged to better support this design via theoretical analysis or empirical results.
    4. The authors mention that there are only 52 testing images for the iPad task. Reliable conclusions may not be derivable from such a small testing set.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    It seems to the reviewer that the authors are not going to open source the code and processed dataset. Thus, the reproducibility could be improved.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    This paper is well written and well organized. However, the reviewer has concerns, primarily about the technical contribution reported in this paper:

    1. The authors address the problem of encoding long documents. However, the most recent foundation models can already handle long documents; for example, GPT-3 can process 4k tokens, and GPT-4 can handle 32k tokens. It is not clear to the reviewer whether this problem remains of interest to the research community.
    2. The advantage of distributional feature recomposition over the widely used sum or mean pooling is not clear to the reviewer. The authors are encouraged to better support this design via theoretical analysis or empirical results.
    3. The authors mention that there are only 52 testing images for the iPad task. Reliable conclusions may not be derivable from such a small testing set.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The reviewer believes that the technical contribution presented in this paper is not sufficient for publication at this time.

  • Reviewer confidence

    Somewhat confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    In this work, the authors propose a vision-language contrastive learning framework for placenta image analysis. The authors propose to decompose the placenta pathology report into a set, and to estimate a stable high-dimensional vector space by fitting and resampling from a normal distribution.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. I think the ideas of representing a pathology report as a set and of distributional feature recomposition are novel.
    2. The empirical performance of the proposed method is better than that of the baseline methods.
    3. The writing is good and easy to follow.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. I think some details are missing. For example, how do you decompose the placenta pathology report into a set? Do you split by a fixed length (if so, what is the length) or by sentence?
    2. Splitting reports into a set can cause issues. For example, the ordering information of the items in the set can be lost.
    3. The authors propose to use samples to estimate the normal distribution in feature space. However, because the feature space is high-dimensional and the sample size is limited, the estimation may not be accurate. Do you assume an independent normal distribution (i.e., zero covariance)?
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors state that they won’t release the code for this work in the checklist.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. The authors could provide more details about their method. For example, how do they split the report? What is the sample size for the estimation of the normal distribution?
    2. In Table 2, the authors could provide efficiency metrics for the other baseline methods, including ConVIRT.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I think the idea of representing the report as a set and estimating the distribution is novel. However, some issues exist: for example, the ordering of segments may be lost, and the estimation of a high-dimensional distribution with a limited number of samples is not reliable.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This work presents an automatic placenta analysis framework that improves performance and efficiency. The framework introduces Feature Recomposition and Distributional Feature Recomposition techniques for capturing features from pathology reports of variable lengths and generating robust, distribution-aware representations. The proposed framework can accommodate architectures of different sizes, resulting in better-performing models that are faster and smaller, with clear performance advantages over previous work. These improvements have the potential to promote the clinical deployment of automated placenta analysis, particularly in low-resource communities.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper is well written and provides an extensive evaluation of the representation learning framework. It is easy to follow what is happening and to understand the reasons for the design choices (both architectural and loss-related). The proposed method improves data efficiency and performs decently even with lightweight models.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The purpose of the iPad dataset is weakly motivated; the study may need further discussion of this dataset, or another, larger dataset serving as external validation. Details and discussion of qualitative prediction results are lacking.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The models and training hyperparameters are reported in detail. Code is not available. Reproducibility is limited without the dataset from Pan et al.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    It would be better to include a more professional dataset as external validation to prove the robustness and generalizability.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    7

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The method is interesting, particularly with joint image-language contrastive learning. The data efficiency is improved compared with baseline methods. My initial rating is primarily based on the novelty of this work and the concerns about reproducibility.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    7

  • [Post rebuttal] Please justify your decision

    The authors’ rebuttal addressed my concerns; I recommend acceptance.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The authors propose a vision-language method for the analysis of post-birth placenta images and reports for better diagnostics. The main contributions are the decomposition of the placenta report into sets to allow for variable report length and a robust feature representation using a normal distribution. The reviewers agree that the paper is clearly written and that the proposed framework is novel. However, they expressed concern about the small test dataset. They also questioned the importance of the contributions, as a detailed analysis of, e.g., the distributional feature recomposition is missing. In the rebuttal, the authors are expected to provide more motivation and discussion about the importance of their contributions and to provide more details about the methods (cf. R#1, R#2). Additionally, qualitative results and the limitations of the test dataset should be discussed.




Author Feedback

We want to extend our gratitude to reviewers for their detailed reviews and valuable feedback. We’re glad to see that everyone found our manuscript well-written and -organized. We hope this document will address any concerns and miscommunications. The manuscript will be updated accordingly.

R1 6.1/9.1 Solving the issue of lengthy input is one contribution. Using newer text models without performing feature recomposition does not solve the feature suppression problem. We mentioned in the paper that “this [prior] method suffers from two critical issues.” GPT may handle the first issue of “lengthy report”, but not the second issue of feature suppression, as “the encoder can ignore certain placental features.” Our feature recomposition method overcomes both challenges, regardless of the text encoder.

6.2 Besides simultaneously addressing the feature suppression and variable report length problems, our method is easy to implement and generalizable to other pathology reports. Its simplicity and generalizability add to its technical merit, making it potentially more widely applicable than complex alternatives.

6.3/9.2 Our proposed feature recomposition contains an aggregation step over the set of features using mean pooling (Sec 3.2). The advantages of distribution estimation are evidenced through the examples in Fig. 2 and Sec 3.3. Note that sets V_1 and V_3 will be identical if we simply use mean pooling without distribution estimation, leading to inferior performance, as shown in the ablation study.
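
The point about mean pooling can be illustrated with a small sketch. The two toy feature sets below are hypothetical and are not the V_1 and V_3 of Fig. 2; the function names are also illustrative. The sketch only shows that two different sets can share the same mean-pooled vector while differing in their per-dimension spread, which is the kind of information a distribution estimate could retain.

```python
import numpy as np

def mean_pool(features):
    """Aggregate a set of feature vectors (rows) by averaging."""
    return features.mean(axis=0)

def per_dimension_std(features):
    """Per-dimension standard deviation of a set of feature vectors."""
    return features.std(axis=0)

# Two hypothetical sets of 2-D segment features (illustrative values only).
tight = np.array([[1.0, 1.0], [1.0, 1.0]])
spread = np.array([[0.0, 2.0], [2.0, 0.0]])

# Mean pooling maps both sets to the same vector ...
same_mean = np.allclose(mean_pool(tight), mean_pool(spread))

# ... while their spreads differ, so a mean-plus-variance summary separates them.
same_std = np.allclose(per_dimension_std(tight), per_dimension_std(spread))
```

Here `same_mean` is true and `same_std` is false, so a summary based on the mean alone cannot tell the two sets apart.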

6.4/9.3 We acknowledge the concern about the small size of the iPad dataset. It’s crucial to understand that our study doesn’t primarily depend on this dataset for its conclusions. It was chosen to illustrate the robustness under different conditions. Our principal conclusions (i.e., effectiveness based on the ablation study and efficiency) are driven by a much larger dataset (random subsets from 2,811 images) detailed in the Appendix and in Pan et al.’s work.

R2 6.1/9.1 We described the splitting method in the paper as “each t_i \in T represents a distinct placental feature.” In Fig. 1, we showed an example pathology report in which each black dot represents a distinct placental feature. We will clarify this point in the method section by adding explanations such as “each t is shown as a separate sentence in the example pathology report in Fig. 1.”
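
As a rough sketch of this kind of splitting, the snippet below breaks a bulleted report into one text item per line. The report text, the bullet characters, and the function name `split_report` are all hypothetical; the actual preprocessing used in the paper is not described here and may differ.

```python
def split_report(report):
    """Split a bulleted report into a list of items, one per non-empty line.

    Assumes (hypothetically) that each line holds one item and may start
    with a bullet character such as "•", "-", or "*".
    """
    items = []
    for line in report.splitlines():
        text = line.strip().lstrip("•-* ").strip()
        if text:
            items.append(text)
    return items

# Placeholder report, not taken from the paper's data.
example_report = """
• item one is present
• item two is absent
"""

items = split_report(example_report)
```

Each element of `items` would then be encoded independently by the text encoder before aggregation.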

6.2/9.1 As shown in the pathology report in Fig. 1, each item only represents the presence or absence of certain placental features; the order does not affect the integrity of the pathology report.

6.3/9.1 Although the dimension is high, the possible space the text features occupy is much smaller since we only have a limited number of placental features. The central limit theorem tells us that the mean of means adheres to a normal distribution. Given that we sampled sufficiently (400 times) and our training dataset has n>10,000 samples, our estimates should be reliable. We assume zero covariance for simplicity. Despite the simplicity, the effectiveness of this estimation is demonstrated by the ablation study.
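
A minimal sketch of one way to fit and resample such a zero-covariance (diagonal) normal distribution is given below. Only the 400 draws and the zero-covariance assumption come from the response above; the subset size, sampling without replacement, the random toy features, and the function names are assumptions made for illustration and may not match the paper's procedure.

```python
import numpy as np

def fit_diagonal_normal(features, num_draws=400, subset_size=4, seed=0):
    """Estimate a per-dimension mean and std from means of random subsets.

    features: array of shape (n_items, dim), one row per report item.
    subset_size and sampling without replacement are assumptions.
    """
    rng = np.random.default_rng(seed)
    n = features.shape[0]
    k = min(subset_size, n)
    subset_means = np.stack([
        features[rng.choice(n, size=k, replace=False)].mean(axis=0)
        for _ in range(num_draws)
    ])
    # Zero covariance: keep only a mean and a std for each dimension.
    return subset_means.mean(axis=0), subset_means.std(axis=0)

def resample(mu, sigma, seed=0):
    """Draw one feature vector from N(mu, diag(sigma**2))."""
    rng = np.random.default_rng(seed)
    return rng.normal(mu, sigma)

# Toy stand-in for encoded report items: 8 items, 16 dimensions.
features = np.random.default_rng(1).normal(size=(8, 16))
mu, sigma = fit_diagonal_normal(features)
sample = resample(mu, sigma)
```

Storing only a mean and a standard deviation per dimension keeps the number of estimated parameters linear in the feature dimension, which is one reason a diagonal model is often preferred when samples are limited.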

9.2 ConVIRT and Pan et al. (i.e., all previous VLC methods, which follow ConVIRT) differ only in the design of their loss functions, and thus they have the same training and testing efficiency. Similar performance improvement can be expected if we use other text and image encoders with our training method.

R3 6/9 In addition to the above response to R1 6.4 on the purpose of the iPad dataset, we agree on the importance of larger datasets, and this is a future direction. Due to space constraints, we presented some qualitative examples and explanations in the Appendix. We will extend the discussion in the revision.




Post-rebuttal Meta-Reviews

Meta-review #1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    Unfortunately, R#1 and R#2 did not respond to the rebuttal. However, I agree with R#3 that the authors addressed the main concerns in the rebuttal sufficiently. Overall, the paper is well-written and the method combining imaging and language data is interesting. Therefore, I will recommend acceptance.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The paper received mixed initial reviews. The authors addressed the concerns in the rebuttal. Only R3 posted a post-rebuttal response. The AC acknowledged that the paper made contributions to the development of vision-language contrastive learning. The weaknesses do not outweigh its merits. More clarity and performance improvements may be necessary before its publication.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    I think the concern about the small size of the test dataset remains. Also, in their rebuttal, the authors have asserted that their method can be generalized to other pathology reports, but evidence to support this claim is currently lacking in the paper.


