
Authors

Long Bai, Tong Chen, Yanan Wu, An Wang, Mobarakol Islam, Hongliang Ren

Abstract

Wireless capsule endoscopy (WCE) is a painless and non-invasive diagnostic tool for gastrointestinal (GI) diseases. However, due to GI anatomical constraints and hardware manufacturing limitations, WCE vision signals may suffer from insufficient illumination, leading to a complicated screening and examination procedure. Deep learning-based low-light image enhancement (LLIE) in the medical field gradually attracts researchers. Given the exuberant development of the denoising diffusion probabilistic model (DDPM) in computer vision, we introduce a WCE LLIE framework based on the multi-scale convolutional neural network (CNN) and reverse diffusion process. The multi-scale design allows models to preserve high-resolution representation and context information from low-resolution, while the curved wavelet attention (CWA) block is proposed for high-frequency and local feature learning. Moreover, we combine the reverse diffusion procedure to optimize the shallow output further and generate images highly approximate to real ones. The proposed method is compared with eleven state-of-the-art (SOTA) LLIE methods and significantly outperforms quantitatively and qualitatively. The superior performance on GI disease segmentation further demonstrates the clinical potential of our proposed model. Our code is publicly accessible at github.com/longbai1006/LLCaps.

Link to paper

DOI: https://doi.org/10.1007/978-3-031-43999-5_4

SharedIt: https://rdcu.be/dnwje

Link to the code repository

https://github.com/longbai1006/LLCaps

Link to the dataset(s)

https://mycuhk-my.sharepoint.com/:u:/g/personal/1155161502_link_cuhk_edu_hk/EYtX3vMBWE1KizB1scvGOkgBzG4JW5SjTMAnJuxZTUAwdg?e=KJk1k2

https://mycuhk-my.sharepoint.com/:u:/g/personal/1155161502_link_cuhk_edu_hk/EZ_Dz7G4J4hBpDKn3YPng6cByGmdGt1z2Qd51fZsmv6DoA?e=aj6KlO


Reviews

Review #1

  • Please describe the contribution of the paper

    The objective of this paper is to introduce a novel technique for enhancing low-light endoscopy images utilizing a combination of Convolutional Neural Network (CNN), Reverse Diffusion Module, and Curve Wavelet Attention Mechanism. This study validates the proposed method on two publicly available datasets and demonstrates its effectiveness in recovering low-light capsule endoscopy images. The outcomes suggest that the proposed method shows promising performance in improving the quality of low-light endoscopy images, thus addressing a critical challenge in the field of endoscopy.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper is well-written and addresses one of the most challenging issues in endoscopy imaging. The proposed approach employs a combination of various modules, such as reverse diffusion, wavelet transform, and curve attention, to enhance the quality of endoscopy images. The application of these techniques results in significant improvements in the image quality, which highlights the potential of the proposed approach to address the critical issues associated with low-light endoscopy images.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Although the study incorporates a satisfactory number of experiments and ablation studies, the authors could compare the proposed approach against alternative methods, such as those previously applied to capsule endoscopy for low-light image recovery or other image enhancement techniques. In Section 2.2, the authors should reference prior work on the curve attention mechanism to provide context for the proposed method. To clarify how relevant the synthetic data are to real clinical scenarios, Section 3.1 should briefly describe the similarities between the synthetic data and actual clinical cases encountered in endoscopy practice.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The authors provided a link to the GitHub repo.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    Please refer to Section 6.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is well structured and addresses one of the basic challenges in capsule endoscopy with a satisfactory level of experiments.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The authors proposed a multi-scale residual network called LLCaps, which is integrated with the curved wavelet attention (CWA) block and further optimized with reverse diffusion process in an end-to-end manner, in order to illuminate low-light wireless capsule endoscopy (WCE). The experimental results demonstrate the superiority of the proposed LLCaps in image quality improvement of WCE, which also benefits the lesion segmentation task.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This work is advanced in technology, where the novel curved wavelet attention (CWA) block is incorporated and the most popular reverse diffusion model is adopted to optimize the proposed LLCaps in a simple and ingenious way. In addition, the application in the lesion segmentation task also demonstrates the clinical potential of low-light wireless capsule endoscopy (WCE) enhancement via their LLCaps.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The proposed LLCaps has not been verified on real-world low-light wireless capsule endoscopy (WCE) images, which limits the demonstration of clinical feasibility.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    The method is described clearly and a link to the code implementation has been provided in the Supplementary Materials, which makes the paper readily reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html
    1. The following two sentences from the second paragraph in Section 2.2 seem to express the same meaning. If so, I suggest removing one of them. 1) Spatial attention exploits the inter-spatial dependencies of convolutional features. 2) The spatial attention (SA) layer exploits the interspatial dependencies of convolutional features [28].
    2. The Curved Attention (CurveA) layer is hard to understand from its formulation (especially Eq. 1) together with Fig. 2 (b) (especially the Feature Rescaling step).
    3. The experiments were only conducted on synthetic low-light images based on two publicly accessible datasets. This work lacks an assessment on real-world low-light wireless capsule endoscopy (WCE) images, which limits the demonstration of clinical feasibility.
    4. How were the heatmaps obtained? Are they error distribution maps between the results and ground truth, projected into a particular color space?
    5. In Table 2, the results of a baseline that removes the wavelet transform, the CurveA layer, and the reverse diffusion branch altogether should be added.
    6. In Fig. 2 of the Supplementary Materials, original images and their enhanced versions should be provided.
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The technical novelty, rich experiments and comprehensive assessments led me to my overall score for this paper.

  • Reviewer confidence

    Very confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper presents a deep learning model for low-light image enhancement (LLIE) using curved wavelet attention (CWA) and reverse diffusion. This paper compares its network with 10 other methods on 2 public datasets.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    • This work combines CWA and reverse diffusion, which leads to increased performance in the LLIE task compared with previous methods. • The downstream segmentation task shows the clinical usefulness of the proposed deep learning model.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    • Cross-validation should be carried out in order to prevent model overfitting. The performance improvement reported in Table 1 may be caused by overfitting. • The low-light images are manually generated. How is the model performance when applied to real low-light images acquired in clinical settings?

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    This work is reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2023/en/REVIEWER-GUIDELINES.html

    • The authors claim the CWA module can extract high-frequency information which helps the LLIE task. Thus, the frequency analysis of the reconstructed images needs to be carried out to justify this claim.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This work presents a deep learning model combining CWA and reverse diffusion for LLIE, which leads to improved performance compared with existing methods. However, the experiments are not carried out with cross-validation, so there is a risk of overfitting. Besides, the training data are manually generated and may differ from clinical data.

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    6

  • [Post rebuttal] Please justify your decision

    All my questions have been addressed.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    This paper presents an approach for low-light capsule endoscopy enhancement. Strengths: 1) The paper is generally well written and easy to follow. 2) The experimental results show some improvement, though based on synthetic data. Weaknesses: 1) The algorithm is validated on synthetic low-light images instead of real-world data. This weakens the justification, as real-world low-light images might have much more complex degradation. Some results from real-world images are necessary. 2) How is the downstream RLE segmentation done? A description of the algorithm is completely missing.




Author Feedback

We thank the reviewers (R) for their critical assessment, insightful suggestions, and overall positive ratings (6 6 5). We also thank the meta-reviewer (MR) for granting us the opportunity to clarify some major critiques.

Real-world dataset (MR, R1, R2, R3): Unfortunately, we currently cannot find any public WCE dataset with paired GT and low-light images. Therefore, we follow [12, 17] to generate GT and low-light image pairs, which can simulate real image scenes and features under low illumination. Furthermore, we added an external validation on 100 real images selected from the Kvasir-Capsule dataset. These images have low brightness and were not included in our original experiments. Because GT is unavailable, we evaluate with no-reference metrics, and our method yields the best results.

Metrics|LPIPS, PIQE
LIME|0.3498, 26.41
DUAL|0.3305, 25.47
Zero|0.6723, 21.47
Enlight|0.4796, 34.75
LLFlow|0.3712, 35.67
HWM|0.5089, 35.37
MIR|0.3485, 34.28
SNR|0.3992, 26.82
MIRv2|0.3341, 41.24
DDPM|0.5069, 43.64
Ours|0.3082, 20.67

How is RLE segmentation done (MR): We employed the widely-used UNet for RLE segmentation tasks and showed the superiority of our LLCaps in enhancing data quality and benefitting downstream applications. We conduct the segmentation on the enhanced images from the LLIE test set of RLE dataset. Since segmentation is not our focus, we only provided overall experimental settings in Section 3.2, with details available in the released code.

Prior work on CurveA and a clearer explanation (R1, R2): We have cited references [6, 32], which discuss the curve mechanism; we further develop the curve into an attention mechanism. Eq. 1 corresponds to the white area in Fig. 2(b), and the ILn(c) in Eq. 1 corresponds to the ‘fn’ in Fig. 2(b). Curve Estimation estimates the pixel-wise curve parameter ‘Curve n’. Feature Rescaling rescales the input feature into [0, 1]: because we want to learn concave-down curves, both the curve parameter and the input feature should lie within [0, 1]. We will edit the figure and explanation for better clarity.
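Eq. 1 of the paper is not reproduced on this page. As a hedged illustration only, a quadratic curve of the kind commonly used in curve-based enhancement shows why both the feature and the curve parameter are constrained to [0, 1]; the symbols x and \alpha are illustrative and are not taken from the paper.

```latex
% Illustrative quadratic curve (an assumption, not necessarily the paper's Eq. 1)
f(x;\alpha) = x + \alpha\, x\,(1 - x), \qquad x \in [0,1],\ \alpha \in [0,1]

% Concave down, since the second derivative is non-positive:
\frac{\partial^2 f}{\partial x^2} = -2\alpha \le 0

% Monotone and range-preserving on [0,1]:
\frac{\partial f}{\partial x} = 1 + \alpha\,(1 - 2x) \ge 1 - \alpha \ge 0,
\qquad f(0;\alpha) = 0, \quad f(1;\alpha) = 1
```

Under these constraints the curve brightens mid-range values (f(x) >= x) while keeping the output inside [0, 1]. If x or \alpha left that interval, the non-negative slope bound above would no longer be guaranteed.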

Explore alternative methods (R1): We added StillGAN (Ma et al., TMI 2021) for comparison, with the results below (same column order as Table 1): 28.28, 91.30, 0.1302|26.38, 83.33, 0.1860|58.32, 71.56, 55.02

How were the heat maps obtained (R2): We take the difference between the enhanced image and GT to get the error map, and then multiply it by a coefficient to amplify the error. Finally, we map the grayscale error image to the pseudo-color space using ‘COLORMAP_JET’ in OpenCV. We also include the heatmap code in the anonymous repo.

Ablation without all branches (R2): We add the ablation results without all 3 branches: PSNR 31.12, SSIM 94.96, LPIPS 0.0793. The lower results further show the effectiveness of our branches.

Cross-validation to prevent the potential of overfitting (R3): As suggested, we have conducted additional 3-fold cross-validation, and our method still outperforms the others. The results will be updated in our final version, and the average results of the top-3 methods are shown below (with the same column order as in Table 1):

MIRv2|31.67, 95.22, 0.0486|32.85, 92.69, 0.0781|58.95, 70.26, 57.73
SNR|30.32, 94.92, 0.0521|27.73, 88.44, 0.1094|63.14, 75.07, 53.71
LLCaps|35.24, 96.34, 0.0374|33.18, 93.34, 0.0721|66.47, 78.47, 44.37
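For readers unfamiliar with the protocol, a 3-fold split can be sketched as below. This is a generic illustration using only the standard library; the function name `k_fold_splits`, the seed, and the shuffling strategy are assumptions and do not describe how the authors partitioned their data.

```python
import random


def k_fold_splits(n_samples, k=3, seed=0):
    """Yield (train_indices, val_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    # Shuffle once with a fixed seed so the folds are reproducible.
    random.Random(seed).shuffle(indices)

    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]

    for i in range(k):
        val = folds[i]
        # Training set is every fold except the held-out one.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val
```

Each sample is held out exactly once, and the reported figure is the average of the metric over the k held-out folds.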

Frequency analysis to show CWA’s ability to extract high-frequency information (R3): We conduct an ablation study on the CWA block using average gradient (AG). Higher mean & var of AG denote richer details. When removing the wavelet transform or the CWA block, the mean & var drop greatly, showing the effectiveness of CWA in extracting features.

AG|Mean (x10^5), Var (x10^10)
DDPM|2.19, 0.50
MIRv2|3.51, 1.86
LLCaps w/o CWA|3.54, 0.99
LLCaps w/o CA|3.85, 2.01
LLCaps w/o WT|3.55, 1.93
LLCaps|3.87, 2.09
GT|3.95, 2.19
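The rebuttal does not state which definition of average gradient was used. The sketch below follows one common definition (mean of the root-mean-square of horizontal and vertical forward differences) and is an assumption. Its scale will not necessarily match the figures reported above, and the function name `average_gradient` is hypothetical.

```python
import numpy as np


def average_gradient(gray):
    """Average gradient of a 2-D grayscale image (one common definition)."""
    img = np.asarray(gray, dtype=np.float64)

    # Forward differences, cropped to the shared (H-1, W-1) region.
    dx = img[:-1, 1:] - img[:-1, :-1]   # horizontal change
    dy = img[1:, :-1] - img[:-1, :-1]   # vertical change

    # Root-mean-square of the two directional gradients, averaged over pixels.
    return float(np.mean(np.sqrt((dx ** 2 + dy ** 2) / 2.0)))
```

A mean and variance like those reported could then be computed by applying `np.mean` and `np.var` to the per-image values over a test set.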

We will also fix the minor problems for better clarity. All the suggestions will be considered and incorporated into the final manuscript.




Post-rebuttal Meta-Reviews

Meta-review #1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The paper received favorable scores in the first round, but with some major concerns about the lack of results on real-world datasets. In the rebuttal, the authors have provided some results which seem reasonable. The authors shall include the results on real-world data in the final paper.



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors have addressed the comments satisfactorily and have validated their methods on an external dataset with real images. Additional experimental results have also been provided. Therefore, I think the paper can be accepted.



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    For this paper, the only reviewer who had originally voted for Weak Accept raised their score to Accept, in line with the other two reviewers. The primary meta-reviewer also found the response letter satisfactory. I went through the original reviews and the response letter, and found no reason to object to acceptance. It is very hard to obtain ground-truth data for improving illumination of endoscopies, and the authors made a reasonable effort to improve their evaluation protocol.


