Authors
Víctor M. Batlle, José M. M. Montiel, Pascal Fua, Juan D. Tardós
Abstract
We propose a new approach to 3D reconstruction from sequences of images acquired by monocular endoscopes. It is based on two key insights. First, endoluminal cavities are watertight, a property naturally enforced by modeling them in terms of a signed distance function. Second, the scene illumination is variable. It comes from the endoscope’s light sources and decays with the inverse of the squared distance to the surface. To exploit these insights, we build on NeuS, a neural implicit surface reconstruction technique with an outstanding capability to learn appearance and an SDF surface model from multiple views, but currently limited to scenes with static illumination. To remove this limitation and exploit the relation between pixel brightness and depth, we modify the NeuS architecture to explicitly account for it and introduce a calibrated photometric model of the endoscope’s camera and light source.
Our method is the first one to produce watertight reconstructions of whole colon sections. We demonstrate excellent accuracy on phantom imagery. Remarkably, the watertight prior combined with illumination decline allows us to complete the reconstruction of unseen portions of the surface with acceptable accuracy, paving the way to automatic quality assessment of cancer-screening explorations by measuring the global percentage of observed mucosa.
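For orientation, the illumination-decline cue described in the abstract can be summarized in a single relation (notation introduced here, not taken from the paper):

```latex
% Sketch of the inverse-square illumination cue (symbols are ours, not the paper's):
% a surface point at distance d from the endoscope's light source receives
% brightness that falls off with the squared distance,
\[
  I(\mathbf{x}) \;\propto\; \frac{L_e}{d(\mathbf{x})^{2}},
\]
% so brighter pixels correspond to closer surface points. The paper's full
% photometric model (its Eq. 5) additionally calibrates terms such as gamma
% correction, the fish-eye lens and the light-source emission pattern.
```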
Link to paper
DOI: https://doi.org/10.1007/978-3-031-43999-5_48
SharedIt: https://rdcu.be/dnww1
Link to the code repository
N/A
Link to the dataset(s)
https://doi.org/10.7303/syn26707219
Reviews
Review #1
- Please describe the contribution of the paper
The proposed LightNeuS adds two modifications to NeuS: (a) using illumination decline as a depth cue and (b) an endoscope photometric model. LightNeuS is compared with NeuS on four sequences of the C3VD dataset and achieves better results.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
Observing the strong correlation between pixel brightness and distance to the camera in the endoscope’s images, the proposed LightNeuS exploits this correlation for neural-network self-supervision. The distance t is provided to the renderer, which takes illumination decline into account. Incorporating this illumination decline improves performance compared to the conventional NeuS model. In addition, a photometric model of the endoscope is used (Eq. 5) to account for factors such as gamma correction, the fish-eye lens and the light sources.
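As an illustration of this mechanism, here is a minimal, hypothetical sketch (notation and names invented here, not the authors' code or the paper's exact Eq. 5) of how a per-sample distance t and a simple gamma term could enter the predicted pixel value:

```python
import torch

def shade_sample(albedo, t, L_e=1.0, gamma=2.2):
    """Hypothetical shading of one ray sample (not the authors' code).

    albedo : color predicted by the network at the sample, shape (..., 3)
    t      : distance from the light source (co-located with the camera)
             to the sample, shape (..., 1)
    L_e    : emitted radiance, assumed constant here; the paper instead
             calibrates an angle-dependent emission pattern
    gamma  : camera gamma correction
    """
    irradiance = L_e / (t ** 2)                    # inverse-square illumination decline
    radiance = albedo * irradiance                 # reflected light reaching the camera
    return torch.clamp(radiance, 0.0, 1.0) ** (1.0 / gamma)  # gamma-corrected pixel value
```

Because the shaded value depends explicitly on t, the photometric loss against the observed pixel supervises depth as well as appearance.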
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
In a real-world application, one needs to estimate depth (the value t). Assuming depth is not available, how does the algorithm perform if depth has to be estimated? In other words, how robust is the performance when t has errors?
The proposed LightNeuS model incorporates two modifications to NeuS: using illumination decline as a depth cue and using the endoscope photometric model. An ablation study is needed to evaluate the contribution of each of these two modifications.
The proposed endoscope photometric model is simple compared to other models (see https://arxiv.org/pdf/2204.09083.pdf). The authors need to compare the performance of LightNeuS using their photometric model vs. other models.
There are 22 sequences in the dataset; why are only 4 sequences used in the experiments? The authors should report results for all 22 sequences.
Note that only two textures, 1 and 4, are used in the evaluation. Why? How does LightNeuS perform on textures 2 and 3?
20 frames are extracted uniformly over the duration of each video and used for training. However, the number of frames per sequence differs (Cecum 1a: 276 frames, Descending 4a: 148 frames, Transcending 1a: 61 frames, Transcending 4a: 382 frames). Although the ratio of training frames to total frames is largest for Transcending 1a (20/61), it has the worst result compared to the other sequences (for the surveyed case). This is counterintuitive. Please explain.
How does the algorithm perform when one selects more frames for training? Given a particular video, what is the rule for choosing the number of training frames?
Please explain why NeuS fails on the sequences Descending 4a, Transcending 1a and Transcending 4a.
For online application, assuming that the algorithm trains the LightNeuS model on the first N frames, how does one update the model when receiving the next M frames?
A few English usage issues:
- Page 2. “…the renderer explicitly uses it reproduce …” should be “…the renderer explicitly uses it to reproduce…”
- Page 7. There is an extra space after “(MAE)”
- Page 8. “…neighbouring areas that where not observed.” should be “…neighbouring areas that were not observed.”
- Please rate the clarity and organization of this paper
Very Good
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
Limited: parameters for network training are missing, and parameters for the photometric model are missing.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
The authors propose the LightNeuS model, incorporating two modifications to the NeuS model: a depth cue based on illumination decline and a photometric model. Evaluated on 4 sequences, LightNeuS performs better than NeuS. As outlined in Section 6 (weaknesses of the paper), there are several issues; please see that section for detailed feedback.
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
4
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The proposed approach of using illumination decline as a depth cue is unrealistic since, given a video sequence, one needs to estimate depth. Since the color is inversely proportional to the square of the depth t (Eq. 4), any error in depth estimation has a large effect on the color. This issue must be studied to determine the robustness of the proposed algorithm.
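As a rough first-order check of this sensitivity argument (a sketch added here, not part of the review):

```latex
% First-order sensitivity of the inverse-square relation: with I \propto 1/t^2,
% differentiating gives
\[
  \frac{\delta I}{I} \;\approx\; -\,2\,\frac{\delta t}{t},
\]
% i.e. a 5% relative error in the depth t changes the predicted brightness by
% roughly 10%, which is why the photometric residual is so sensitive to depth.
```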
Other issues on the experiment setup are pointed out in Section 6.
- Reviewer confidence
Very confident
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
5
- [Post rebuttal] Please justify your decision
The authors’ rebuttal cleared up some of the questions that I had.
Review #2
- Please describe the contribution of the paper
This paper presents a new approach to 3D reconstruction from sequences of images acquired by monocular endoscopes, which is inspired by NeuS [25]. They validate the proposed method by four sequences of the C3VD dataset [4], covering different sections of the colon anatomy.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- This method demonstrates that neural radiance fields can be used to obtain accurate dense reconstructions of colon sections.
- They modify the NeuS architecture to introduce a calibrated photometric model of the endoscope’s camera and light source.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- Evaluation and comparison are a bit weak.
More detailed comments are given in the following Sec. 9.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
Seems okay.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
- Fig. 1 does not clearly show the major difference between NeuS and LightNeuS. Only t and loss are different?
- In the experiments, it seems only 4 data examples are used. Is this sufficient for training as well as for justifying the feasibility of this method? How many frames are in each video?
- The method is only compared with NeuS and misses other 3D reconstruction methods for comparison, such as the existing methods mentioned in the related work section, e.g., [17, 14, 20, 22, 12, 13].
- How does NeuS perform on the other video sequences?
- There is no clinical evaluation to justify the real accuracy and practical value of this method.
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
5
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Overall, this paper proposes a method for dense multi-view 3D reconstruction from endoscopic images. The idea is interesting and timely. My only concern is that the evaluation is a bit weak and limited.
- Reviewer confidence
Confident but not absolutely certain
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
N/A
- [Post rebuttal] Please justify your decision
N/A
Review #3
- Please describe the contribution of the paper
This paper proposes a neural-field-based method for 3D reconstruction in colorectal endoscopy. The key contributions of the paper are 1) an endoscope light-source model that uses illumination decline as a depth cue and 2) a more realistic photometric model. Combining the newly proposed model with neural-field-based deep-learning reconstruction, the paper achieves state-of-the-art accuracy in colorectal structure reconstruction.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The major strength of the paper is its novel formulation of the endoscopy illumination. By taking physical insight into consideration, the algorithm seems to significantly improve the reconstruction quality without resorting to complex neural-network architecture design. This is a smart strategy for biomedical applications because biomedical images are typically noisy and the training data is limited. By incorporating physical insight, higher quality and better robustness could be achieved.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
The main weakness of the paper is that the algorithm is tested on a small dataset. More experiments might be needed to validate the strengths of the method compared with other methods. Also, to me it does not seem simple to find a path toward making the current framework an online reconstruction approach, which will affect the deployment of the method in real clinical applications.
- Please rate the clarity and organization of this paper
Very Good
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
The reproducibility of this paper is acceptable. With enough details, it should be easy for a domain expert to reproduce the results.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
- The validation experiment is limited. It would be good for the authors to show more reconstruction examples, and especially to list one or two cases where the algorithm might fail. That would help better analyze the robustness of the algorithm.
- For the future work, it would be good to discuss in more detail how to extend the algorithm to an online version.
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
6
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
I recommend accepting the paper for its novel forward modeling of the endoscopy illumination source and its deep insight into light decay. Although this idea seems easy, it could be very helpful for practical applications.
- Reviewer confidence
Very confident
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
6
- [Post rebuttal] Please justify your decision
The authors addressed my minor concerns. I’ll keep my rating.
Primary Meta-Review
- Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.
The paper presents a new approach to 3D reconstruction from sequences of images acquired by monocular endoscopes. The proposed LightNeuS model incorporates two modifications to NeuS: using illumination decline as a depth cue and using an endoscope photometric model. The algorithm is compared with NeuS on four sequences in the C3VD dataset, and better results are achieved. The key contribution of the paper is the novel formulation of endoscopy illumination, which significantly improves the reconstruction quality without complex neural network architecture design. However, the proposed photometric model is simple compared to other models, and an ablation study is needed to evaluate its contribution. The evaluation is limited, and more experiments might be necessary to validate the method’s strengths compared to other methods. The algorithm’s robustness when depth estimation has errors needs to be studied, and the authors should report results for all 22 sequences in the dataset. Overall, the paper proposes a method that is interesting and timely, and the novel formulation of the endoscopy illumination source and light decay insights are helpful for practical applications.
Author Feedback
We thank the reviewers for their encouraging comments on the interest and novelty (R2,R3), timeliness (R2), practical usefulness (R3), and for pointing out parts that need a better explanation (R1,R2,R3).
R1: True depth is provided to the renderer — No, that is a critical misunderstanding. Our method estimates depth as part of the NeuS optimization: NeuS training selects a pixel from an image and samples points along its projecting ray. In NeuS, the color of the sampled points depends only on the viewing direction. For any such point, the volumetric renderer predicts the pixel color and subtracts it from the observed color to compute a training loss. In LightNeuS, we explicitly feed the renderer the distance from each of these sampled points to the light source, so the renderer can exploit the inverse-square illumination decline. This makes the minimization problem better posed and, thus, the automated depth estimation more reliable.
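For illustration, here is a minimal, hypothetical sketch (a simplification written here, not the authors' implementation) of a NeuS-style rendering step in which each sample's color is attenuated by the inverse-square distance to the light source before compositing along the ray:

```python
import torch

def render_ray(colors, weights, t_vals, L_e=1.0):
    """Hypothetical LightNeuS-style ray rendering (illustrative only).

    colors  : per-sample colors predicted by the network, shape (S, 3)
    weights : per-sample NeuS compositing weights derived from the SDF, shape (S,)
    t_vals  : per-sample distances from the (co-located) light source, shape (S,)
    """
    attenuation = L_e / (t_vals ** 2)                      # inverse-square illumination decline
    lit_colors = colors * attenuation.unsqueeze(-1)        # illumination applied per sample
    pixel = (weights.unsqueeze(-1) * lit_colors).sum(dim=0)  # standard volumetric compositing
    return pixel

# The photometric loss |pixel - observed_pixel| now penalizes wrong distances t,
# because t directly changes the rendered brightness.
```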
R1: Illumination decline and photometric model ablation — LightNeuS cannot ablate the endoscope’s photometric model: It is needed to compute the illumination of each sampled point. The only ablation possible is already in the paper: Removing both things together, that is, using standard NeuS. And it fails dramatically.
R1,R2: Performance of NeuS — Classical NeRF/NeuS assume constant illumination. The strong light changes typical of endoscopy fatally mislead the method. We only report numerical results of NeuS in one sequence because in all the rest, the SDF diverges and ends up blown out of the rendering volume, providing no result at all.
R2: Only t and loss are different? — Yes! Just adding illumination decline is the “seemingly easy” but “very helpful” and “smart strategy” (R3) that allows neural rendering to obtain reconstructions in endoscopy, with “higher quality and better robustness” (R3).
R1: The photometric model is simpler than others (see arxiv) — The reviewer suggests comparing it to the model in [3], but that is indeed the model we use in the experiments: “L_e is not constant…and can be calibrated [16,3]” (Sec 3.2). We will further clarify it.
R1,R2: Only 4 sequences; R3: Small dataset; R1: Textures 2 and 3 — NeuS evaluation is costly and we only had one GPU at our disposal. We focused our efforts on “covering different sections of the colon anatomy” (R2) with their different topologies: cul-de-sac (cecum), straight and curved tube (transverse, descending). Still, after evaluating all 22 sequences, the mean error hardly changed for sequences where the camera moved at least 1cm (2.80mm). The other four, smaller trajectories (<1cm) lack parallax and their mean error is higher (8.23mm). The mean errors of textures 1 (3.42mm) and 4 (2.33mm) are similar to those of textures 3 (3.13mm) and 2 (2.44mm). Thus, our experiments showcase the main contribution: exploiting illumination decline makes neural rendering feasible in endoscopy.
R2,R3: Comparison with other methods — Most 3D methods don’t provide code [14,22], don’t evaluate in biomedical environments [17,20], or don’t report reconstruction accuracy [12,13]. Unfortunately, the effort required to run these methods and to make them work in endoscopy is very large.
R1,R2: Frames for training — We follow the NeuS-paper approach of using a few informative frames per scene, as separated as possible, by sampling each video uniformly. Some C3VD videos have many more images, but the endoscope was moved very slowly, resulting in comparable total motion. Using more images would significantly slow down training without providing benefits, as the baseline between neighboring images would be too small.
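For concreteness, uniform frame selection of this kind can be written in a few lines (an illustrative sketch; the paper does not give this code):

```python
import numpy as np

def select_frames(num_frames_in_video, num_train_frames=20):
    """Pick training frames spread uniformly over the video (illustrative only)."""
    idx = np.linspace(0, num_frames_in_video - 1, num_train_frames)
    return np.unique(idx.round().astype(int))

# e.g. select_frames(276) returns 20 roughly evenly spaced frame indices
```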
R1,R3: Online, R2,R3: Clinical application — The current method could be used offline for post-exploration coverage analysis and endoscopist training. The new NeuS2 converges in minutes, enabling automatic coverage reporting. Real-time reconstruction would allow recovery of missing regions during the exploration, but this will require further research.
Post-rebuttal Meta-Reviews
Meta-review # 1 (Primary)
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
The paper presents an interesting approach for 3D reconstruction from sequences of images acquired by monocular endoscopes. The rebuttal has adequately addressed the reviewers’ concerns, with one reviewer raising their score from weak reject to weak accept. Authors are encouraged to add clarifications based on reviewers’ comments to their camera ready.
Meta-review #2
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
This paper proposes a neural-field-based method for 3D reconstruction in colorectal endoscopy. Key strengths:
- novel model using neural radiance fields to obtain accurate dense reconstructions of colon sections.
- good performance
Key weaknesses:
- missing technical details
- lack of comparison with existing methods
The rebuttal clarifies some technical issues and explains why it is hard to compare with existing methods.
Meta-review #3
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
The authors provided a nice rebuttal that successfully convinced R1, who was the only reviewer to recommend rejection in the initial review phase. All reviewers consistently recommended accepting the paper in the post-rebuttal evaluations, so I am happy to recommend accepting the paper for publication at MICCAI 2023.