
Authors

Benjamin Gallusser, Max Stieber, Martin Weigert

Abstract

State-of-the-art object detection and segmentation methods for microscopy images rely on supervised machine learning, which requires laborious manual annotation of training data. Here we present a self-supervised method based on time arrow prediction pre-training that learns dense image representations from raw, unlabeled live-cell microscopy videos. Our method builds upon the task of predicting the correct order of time-flipped image regions via a single-image feature extractor followed by a time arrow prediction head that operates on the fused features. We show that the resulting dense representations capture inherently time-asymmetric biological processes such as cell divisions on a pixel-level. We furthermore demonstrate the utility of these representations on several live-cell microscopy datasets for detection and segmentation of dividing cells, as well as for cell state classification. Our method outperforms supervised methods, particularly when only limited ground truth annotations are available as is commonly the case in practice. We provide code at https://github.com/weigertlab/tarrow.
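The pretext task described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation (the linked repository holds that): the function name `make_tap_sample`, its parameters, and the binary label convention are assumptions made for this sketch. It crops the same image region from two frames that are `delta_t` frames apart, flips their order at random, and returns the flip as the label a time arrow prediction head would have to recover.

```python
import numpy as np

def make_tap_sample(video, t, delta_t, patch_size, rng):
    """Return (pair, label) for one time arrow prediction sample.

    video: array of shape (frames, height, width).
    label 0 = crops are in the original temporal order, 1 = order was flipped.
    All names here are hypothetical and chosen for illustration only.
    """
    n_frames, height, width = video.shape
    if not 0 <= t < n_frames - delta_t:
        raise ValueError("t and t + delta_t must both index valid frames")
    # Pick one random spatial region and use it for both time points.
    y = int(rng.integers(0, height - patch_size + 1))
    x = int(rng.integers(0, width - patch_size + 1))
    pair = np.stack([
        video[t, y:y + patch_size, x:x + patch_size],
        video[t + delta_t, y:y + patch_size, x:x + patch_size],
    ])
    # Flip the temporal order with probability 0.5; the flip is the target.
    label = int(rng.integers(0, 2))
    if label == 1:
        pair = pair[::-1].copy()
    return pair, label
```

In a training loop, each crop of `pair` would pass through the same single-image feature extractor before the features are fused and classified; that part is omitted here.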



Link to paper

DOI: https://doi.org/10.1007/978-3-031-43993-3_52

SharedIt: https://rdcu.be/dnwNX

Link to the code repository

https://github.com/weigertlab/tarrow

Link to the dataset(s)

HELA: http://data.celltrackingchallenge.net/training-datasets/Fluo-N2DL-HeLa.zip http://data.celltrackingchallenge.net/test-datasets/Fluo-N2DL-HeLa.zip

MDCK: https://rdr.ucl.ac.uk/articles/dataset/Cell_tracking_reference_dataset/16595978

YEAST: https://zenodo.org/record/6795124


Reviews

Review #2

  • Please describe the contribution of the paper

    This paper uses time arrow prediction as a pretext task for self-supervised dense representation learning.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper uses time arrow prediction as a pretext task for self-supervised dense representation learning. The authors evaluated the proposed pretext task and the downstream image-level and pixel-level tasks on several datasets. Experimental results for both the pretext task and the downstream tasks are provided.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1. No comparisons with methods that use time arrow prediction as a pretext task, such as refs. 15, 14, 30, 22, 2, 11 mentioned in the Introduction.
    2. Experimental results for the 4 time arrow prediction pretext tasks are incomplete.
    3. The Augmentations and Implementation details parts should be placed in the Experiments section.
  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    It seems that the authors intend to release the code

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    The Augmentations and Implementation details parts should be placed in the Experiments section.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    See strengths and weaknesses.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #1

  • Please describe the contribution of the paper

    The paper addresses a specific, important use case (automated annotation of live cell videos). It leverages previously untapped (to my knowledge) domain constraints to tackle the problem.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper is strongly-grounded in a valuable use case (annotation of live cell videos). I believe it is well grounded in the literature.

    It leverages time-directional aspects of the problem, as well as others, to shape a solution. This is to my mind far more effective than trying to apply a generic system without regard for the domain-specific problems and shortcuts, but it is a surprisingly rare approach.

    It is well and clearly written.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    A couple items are unclear (see comments below).

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Good

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Abstract: “subsequent time arrow prediction head”, “dense representations”. These are unclear to me.

    Introduction: great.

    Pg 3 top: briefly explain “permutation-equivariant”.

    Method paragraph 1: why logits (pre-softmax layer)? Are these better than a softmax, and do they not need to sum to 1?

    The L_decorr equations: Perhaps put them on their own lines (with eqn numbers) for readability. The loss function does not minimize A_ij (i != j), which you would need to decorrelate. Is this a typo? It looks like the j in the summation of the A_ij = … equation is overloaded (unless I am missing the boat on this equation).

    (Last line): thanks for the brief explanation of heuristic choices. Very useful to the reader.

    Equation at bottom of page 3: I regret that this eqn was incomprehensible to me. I don’t understand what all the indices are doing (and are they correct?). Also please number the eqns.

    Page 4: Augmentations: well-described grounding in the use case. Description of data sets: Can you show sample images of each, to orient the reader?

    Page 5: … points, suggesting that the amount of data augmentation is key … new paragraph at “cues. Next…”

    …global drift or cell growth… Is this cue useful for segmenting cells against a background (i.e. for certain use cases)?

    Page 6: Figure 3: you could perhaps crop the big images and reduce the number of thumbnails, to make everything bigger.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    They leverage domain-specific structure in a novel way to improve performance in an important use case.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    In this paper the authors propose a self-supervised representation scheme for live cell videos based on the concept of time arrow prediction. The proposed method, referred to as TAP in the manuscript, is used to analyze four different cellular processes.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The concept of time arrow for video analysis has been used in processing of natural image sequences. This paper introduces this concept into the processing of cellular image sequences. Overall the paper is logically organized, and the results seem promising.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Writing of the paper is difficult to follow. For Section 2, it would be much clearer if the algorithm can be more clearly organized, presumably in pseudocode. The result sections are also wordy and messy.

  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    It is indicated in the abstract that the code will be made openly accessible. But the code is not submitted along with the manuscript.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Validation of the proposed TAP method should be more clearly presented. At present the presentation is wordy and messy. Comparison with competing methods such as conventional video tracking methods is limited.

    In the Experiments section, first paragraph, it seems that the setting of delta_t needs to be tuned for different videos. How critical is this parameter? What is the overall intuition that should be considered in setting this parameter?

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Although the paper has substantial weaknesses, advocating the concept of self-supervised representation learning in biological video analysis should be encouraged.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The reviewers were overall enthusiastic about the application area of this paper: annotation of live cell videos, taking the concept of time arrow prediction to the domain of live microscopy data. The key strengths of the paper are its 1) clear writing, 2) thorough evaluations and 3) promising results on real data. However, some weaknesses were noted which the authors should fix before the final submission. Notably, the method can be described more clearly using pseudocode or a clearer schematic (several reviewers commented on the lack of clarity of the math). With these fixed, the paper should be a fine addition to the MICCAI community.




Author Feedback

We thank all reviewers for their detailed feedback and helpful comments. For the camera-ready version, we have made the following changes to the paper to include the suggestions of the reviewers:

  • We appreciate the comments on potential improvement of method clarity (R1-R3, Meta-Review). To address them, we extended Figure 1, fixed notation ambiguities, adjusted for overloaded indices, added additional guiding remarks, and numbered the equations.
  • The URL to the public GitHub repository is now de-anonymized (R2, R3).
  • Regarding comparison against existing time arrow prediction methods (R2, R3): We would like to clarify that these methods are specifically designed for natural videos and the predominant image-level task of action recognition (i.e. they produce image-level features rather than dense features). This makes applying these methods to our dense downstream tasks for live-cell microscopy images non-straightforward.
  • We moved the implementation details to the experiments section (R3).
  • We adjusted the wording of the abstract (R1).
  • We extended the discussion on the domain-informed choice of the delta t parameter (R3). We choose delta t according to the biological processes of interest, and empirically found that the corresponding attribution maps are meaningful. Investigating the general effect of delta t on downstream tasks is interesting but left for future work.
  • To improve readability of Figure 3, we now provide high-resolution images and have increased the font size of multiple text fields (R1).
  • Regarding sample images for the presented datasets, we now explicitly refer the reader to the examples in an existing figure (R1).


