Authors
Kai-Ni Wang, Yuting He, Shuaishuai Zhuang, Juzheng Miao, Xiaopu He, Ping Zhou, Guanyu Yang, Guang-Quan Zhou, Shuo Li
Abstract
Reliable automatic classification of colonoscopy images is of great significance in assessing the stage of colonic lesions and formulating appropriate treatment plans. However, it is challenging due to uneven brightness, location variability, inter-class similarity, and intra-class dissimilarity, affecting the classification accuracy. To address the above issues, we propose a Fourier-based Frequency Complex Network (FFCNet) for colon disease classification in this study. Specifically, FFCNet is a novel complex network that enables the combination of complex convolutional networks with frequency learning to overcome the loss of phase information caused by real convolution operations. Also, our Fourier transform transfers the average brightness of an image to a point in the spectrum (the DC component), alleviating the effects of uneven brightness by decoupling image content and brightness. Moreover, the image patch scrambling module in FFCNet generates random local spectral blocks, empowering the network to learn long-range and local disease-specific features and improving the discriminative ability of hard samples. We evaluated the proposed FFCNet on an in-house dataset with 2568 colonoscopy images, showing that our method outperforms previous state-of-the-art methods with an accuracy of 86.35%, which is 4.46% higher than the backbone. The project page with code is available at https://github.com/soleilssss/FFCNet.
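The abstract's statement that the Fourier transform moves the average brightness into the DC component can be checked numerically. The following is a minimal sketch and is not taken from the paper or its repository: the random 8x8 array is a stand-in for a grayscale image, and only a uniform brightness offset is shown. Spatially uneven illumination would also affect low non-DC frequencies.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((8, 8))   # stand-in for a grayscale image (assumption)
brighter = image + 0.5       # same content with a uniform brightness offset

spec = np.fft.fft2(image)
spec_brighter = np.fft.fft2(brighter)

# With NumPy's unnormalized DFT, the DC term equals the pixel sum,
# i.e. the mean brightness multiplied by the number of pixels.
dc_matches_mean = np.isclose(spec[0, 0].real, image.mean() * image.size)

# A uniform offset changes only the DC coefficient.
diff = np.abs(spec_brighter - spec)
dc_shift = diff[0, 0]        # 0.5 * 64 pixels
diff[0, 0] = 0.0
non_dc_change = diff.max()   # numerically zero

print(dc_matches_mean, dc_shift, non_dc_change)
```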
Link to paper
DOI: https://link.springer.com/chapter/10.1007/978-3-031-16437-8_8
SharedIt: https://rdcu.be/cVRsU
Link to the code repository
https://github.com/soleilssss/FFCNet
Link to the dataset(s)
N/A
Reviews
Review #1
- Please describe the contribution of the paper
This paper presents a Fourier-based method to classify colonoscopy images. The proposed framework dices input images, calculates the DFT of each patch, and then applies complex convolution, ReLU, and batch normalization to perform the classification. Images are divided into four classes: normal, polyps, adenomas, and cancers. The results show superior performance in comparison to SOTA works in the colonoscopy domain.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- A complex network is applied to colonoscopy images for classification, which is supposed to deal with brightness/specular reflections in images.
- Images are sliced so that local information can be obtained for better classification through the network.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
Using a complex CNN is not novel enough. In addition, the main claim of the paper is the ability of this network to deal with brightness imbalance in colonoscopy images, yet no visual results are presented to show that the network learns to ignore those areas during classification.
- Please rate the clarity and organization of this paper
Very Good
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
The method is explained well and is reproducible
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
Dealing with colonoscopy images is very challenging, and the method proposed in this work can be useful.
- Yet, regarding image classification: (i) it is better to consider a sequence of frames; (ii) flat lesions are among the polyps that need more attention.
- There should be a section demonstrating the effectiveness of this network through methods such as Grad-CAM.
- The method should be compared against more advanced networks, such as transformers that have the FFT as an embedded feature.
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
4
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The paper is well written, but I am not convinced that it performs better in comparison to more advanced networks such as transformers or CoAtNet.
- Number of papers in your stack
4
- What is the ranking of this paper in your review stack?
4
- Reviewer confidence
Very confident
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
5
- [Post rebuttal] Please justify your decision
The authors provided some results regarding the performance of FFCNet in comparison to other networks, which is satisfactory. I think t-SNE by itself is not enough to show how this network can deal with brightness, as this was one of the main claims in this paper.
Review #4
- Please describe the contribution of the paper
This paper presents a frequency learning method for automatic colon disease classification, featuring the Patching Shuffling Module and a complex network.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The motivation for utilising frequency learning for colon disease classification is well stated and interesting.
- Patching before the DFT may alleviate the lack of local features in frequency learning and generates a smaller numerical distribution in the frequency domain, which may improve training stability.
- They carry out extensive experiments to justify the improvement of the proposed method over baselines and the contributions of critical designs.
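The point about patching producing a smaller numerical distribution in the frequency domain can be illustrated with a short sketch. This is not the authors' implementation: the helper name `patch_spectra`, the 64x64 random image, and the 16x16 patch size are assumptions chosen for illustration. With an unnormalized DFT of non-negative pixels, the largest coefficient is bounded by the pixel sum, so smaller tiles give smaller magnitudes.

```python
import numpy as np

def patch_spectra(image, patch):
    """Split a 2-D image into non-overlapping patch x patch tiles and DFT each tile."""
    h, w = image.shape
    tiles = image.reshape(h // patch, patch, w // patch, patch).swapaxes(1, 2)
    return np.fft.fft2(tiles, axes=(-2, -1))   # shape: (h/patch, w/patch, patch, patch)

rng = np.random.default_rng(0)
image = rng.random((64, 64))                   # stand-in image with values in [0, 1)

full_max = np.abs(np.fft.fft2(image)).max()         # dominated by the DC term (sum of 4096 pixels)
patch_max = np.abs(patch_spectra(image, 16)).max()  # bounded by the sum of 256 pixels

print(full_max, patch_max)
```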
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- There have been many works treating complex data as double-channel input to a network, so as to mine information from both the real and imaginary parts, for example, in the fast MRI reconstruction task [1]. As these works are neither compared nor cited, the difference or improvement brought by the complex network design in this paper is unclear. [1] Image reconstruction by domain-transform manifold learning
- The random shuffling operation in PSM is somewhat confusing. If the patches are randomly arranged, i.e. the channel index of the input is irrelevant to position, how would the network model the relationship between different patches of the input? From my perspective, the kernel for each channel would eventually be equivalent in this way. And if the experimental results are correct, then it means this task requires only local features within a patch.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
Easy to reproduce.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
- Illustrate the advantages of the proposed complex network over previous methods that are utilized to mine complex information. Supplementing an ablation study on it would be even better.
- Add some visualization results to justify the improvement of the proposed method. For example, what kind of hard samples will be better recognized in the proposed method.
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
4
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The paper is well-motivated, proposes a method for automatic colon disease classification, and demonstrates good performance. However, the methodology part is not clearly illustrated and the technical innovation is not convincing.
- Number of papers in your stack
4
- What is the ranking of this paper in your review stack?
3
- Reviewer confidence
Confident but not absolutely certain
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
Not Answered
- [Post rebuttal] Please justify your decision
Not Answered
Review #5
- Please describe the contribution of the paper
The authors propose a frequency-domain complex-number CNN for colon disease classification in colonoscopy images. By splitting the data into real and imaginary parts, the proposed approach can alleviate the effects of uneven brightness by decoupling image content and brightness.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Overall the approach is rather interesting from a technical novelty point of view. It is exciting to see a method pushing into other areas beyond just CNNs and Transformers, even if the performance isn’t particularly spectacular (similar to some simple CNNs).
- The experiments and ablations are thorough and can be helpful in understanding this fairly novel approach when trying to apply similar techniques to other domains.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- The baselines are a bit weak and no related methods for colonoscopy classification are provided.
- There are a number of approaches which perform pixel and patch shuffles (e.g. “PatchShuffle Regularization”) and none of them are cited. This claimed novelty is a stretch at best and these other works should be cited. I would really like to see why the proposed approach is different or better than these similar shuffling methods.
- The authors claim in the conclusion that they introduce complex convolutions. These have been known in the literature for years (e.g. “On Complex Valued Convolutional Neural Networks” and many works since then). Be careful not to overstate your novelty. You are not introducing complex networks for the first time. But you are one of the first to apply complex CNNs in this way to this type of problem, and it is interesting.
- Please rate the clarity and organization of this paper
Very Good
- Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance
Good, code will be released.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html
See weaknesses.
- Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making
5
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Overall I am happy to see an interesting and new approach to a decently well studied problem. Even if the results are not particularly groundbreaking and there are some issues with overclaimed novelty, I would still favor a very weak acceptance just to get a conversation going in the community about possible uses of complex neural networks through a Fourier decoupling procedure such as this.
- Number of papers in your stack
5
- What is the ranking of this paper in your review stack?
2
- Reviewer confidence
Confident but not absolutely certain
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
5
- [Post rebuttal] Please justify your decision
This was a bit tough. I thought about this for a long time, but I’m going to leave my original rating. I have several issues with this paper that I do not feel were well addressed by the rebuttal, but it’s refreshing to see a tangential approach which brings back some of the FFT techniques that tried to break into deep learning a number of years ago but were never successful, and I think the community can benefit from this aspect.
A main contribution is this shuffling technique, but other shuffling techniques exist, and they’re becoming fairly common in the computer vision space. The feel I got from the rebuttal was that they were not going to cite or compare with any of these techniques and just dismissed the one I referenced as “oh, that won’t work”. To not even mention these other techniques and not compare with them makes it very hard for me to put any sort of accept on this. That’s not good science. Sure, your shuffling works, as shown by your ablation, but whether it works better than other shuffling techniques, and why, is completely ignored. Further, the only comparison with other colon disease classification techniques being buried in the supplementary is deeply concerning. Based on these comments and the rebuttal, I was going to lower my rating to weak reject.
That being said, the authors now make a new claim that their technique is the first complex network which can be trained directly in the frequency domain. If this is true (unfortunately my knowledge is a little limited here), then this pushes me back up to weak accept again. But this should be clearly stated in the paper, and other complex approaches should be cited and compared against. The lack of respect for prior work shown by this paper is deeply concerning.
So tl;dr I really like the direction, motivation, and I think the community would really benefit from considering this approach. But as far as good science, referencing and comparing with prior techniques, this paper is severely lacking.
Primary Meta-Review
- Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.
Summary & Contribution: This paper proposes a Fourier-based method to classify colonoscopy images called FFCNet. The main motivation of this work is to increase accuracy by overcoming challenges that limit current methods, such as uneven brightness and inter-class similarity. The authors use a novel complex network that combines complex CNNs with frequency learning. FFCNet has two main components: 1) a patch scrambling module that obtains a complex spectrogram and 2) a frequency-domain complex network that replaces convolutions, ReLU, and batch normalisation with their complex versions to extract richer feature information. The model is evaluated on an in-house dataset of 2568 images, and the results show increased accuracy. The main contribution of this paper is the automatic classification of colon diseases with a framework that combines complex CNN and frequency learning.
Key strengths:
- Reviewers agree that the use of frequency learning and complex CNN for classification in colonoscopy is novel and interesting.
- Strong evaluation and ablation studies to justify the increase in accuracy
Key weaknesses:
- The use of complex CNN itself is not novel (the application to colonoscopy is), and it is not fully clear what is the improvement and/or novelty presented in the work as some works in the literature have been missed and not discussed.
Evaluation & Justification: Reviewers agree that the use of complex CNN for colonoscopy classification is novel and interesting. However, it is not fully clear what is the real contribution of this work to make a decision.
If a rebuttal is submitted, please do consider all reviewers comments. In particular, please clarify and discuss what is the main contribution and novelty of the work.
- What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).
5
Author Feedback
Thanks to the meta-reviewers and all reviewers for their very positive appreciation of our work. -R5 appreciates that our work is ‘interesting from a technical novelty point of view’ and ‘beyond just CNNs and Transformers’. -R2 agrees that ‘The motivation of utilizing frequency learning for colon diseases classification is well stated and interesting’. -R1 believes that we eliminate image brightness imbalance and ‘show superior performances to SOTA works’. All the constructive suggestions will be adopted and the writing will be checked carefully in the final version.
- Clinical impact (-R1&R2&R5): For the first time, our work realizes the classification of colon diseases according to the stage of their development, providing clinical decision support for the level of colon disease and formulating corresponding treatment plans.
- Details of our contribution (-R1&R2&R5): Our proposed FFCNet is the first such model that can be trained directly in the frequency domain. 1) The proposed PSM module enables compressing the numerical range of the spectrogram to overcome the difficulty of training directly from the spectrum. Spatially stitching of spectral patches brings additional positional and local information into the frequency domain. 2) We develop a novel complex network for direct frequency domain learning to avoid the loss of phase information caused by the real-valued operations of current networks. Model convolution kernels, blocks, and architectures are modified to complex operations allowing us to extract richer feature information.
- Why PSM improves (-R2&R5): Our PSM is effective because of three innovative designs. 1) Patch Fourier compresses the numerical distribution in the spectrum, improving training stability. 2) The spectrum patches are spliced on the space to retain the image location and local information. 3) Shuffle operations guide the network to focus on local features and reduce the risk of model overfitting. The improvement also has been proved in the experimental results. We shuffle the positions of the patches with a certain probability, and the paper mentioned by -R5 reorders the pixels in the patches, which breaks local features.
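The distinction drawn in the rebuttal, between shuffling patch positions and reordering pixels inside each patch, can be sketched as follows. This is a hedged illustration and not code from the FFCNet repository: all function names, the patch size, and the `prob` parameter are hypothetical, and the rebuttal does not state here whether shuffling is applied before or after the patch-wise DFT, so the functions simply operate on a generic 2-D array.

```python
import numpy as np

def to_tiles(image, patch):
    """View a 2-D array as a grid of non-overlapping patch x patch tiles."""
    h, w = image.shape
    return image.reshape(h // patch, patch, w // patch, patch).swapaxes(1, 2)

def from_tiles(tiles):
    """Inverse of to_tiles: stitch a (gh, gw, p, p) grid back into a 2-D array."""
    gh, gw, p, _ = tiles.shape
    return tiles.swapaxes(1, 2).reshape(gh * p, gw * p)

def shuffle_patch_positions(image, patch, prob, rng):
    """With probability `prob`, permute tile positions; tile contents stay intact."""
    tiles = to_tiles(image, patch)
    gh, gw, p, _ = tiles.shape
    flat = tiles.reshape(gh * gw, p, p)
    if rng.random() < prob:
        flat = flat[rng.permutation(gh * gw)]
    return from_tiles(flat.reshape(gh, gw, p, p))

def shuffle_pixels_within_patches(image, patch, rng):
    """Reorder pixels inside every tile; tile positions stay fixed, local structure is lost."""
    tiles = to_tiles(image, patch)
    gh, gw, p, _ = tiles.shape
    flat = tiles.reshape(gh, gw, p * p).copy()
    for i in range(gh):
        for j in range(gw):
            flat[i, j] = rng.permutation(flat[i, j])
    return from_tiles(flat.reshape(gh, gw, p, p))

rng = np.random.default_rng(0)
image = np.arange(64, dtype=float).reshape(8, 8)
moved = shuffle_patch_positions(image, patch=4, prob=1.0, rng=rng)
scrambled = shuffle_pixels_within_patches(image, patch=4, rng=rng)
```

In the first function every output tile is an unchanged copy of some input tile, so local features survive. In the second, each tile keeps its pixel values but loses their arrangement.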
- Differences in our complex model (-R1&R2): Unlike existing complex networks, all components of our frequency complex network are complex versions, including convolution kernels, blocks, and architecture, so we can directly learn frequency information in the spectrum. The novel dual-branch architecture of the real and imaginary parts is combined with the complex convolution kernel to avoid the loss of the phase information of the spectrum. Our network splits the real and imaginary parts of complex numbers as two maps for complex operations, while the papers with complex data mentioned by -R2 use the real and imaginary parts as different channels for real operations.
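The contrast between complex operations on separate real and imaginary maps and real operations on two channels rests on the complex product rule, (a + ib)(c + id) = (ac - bd) + i(ad + bc). The sketch below is not the authors' layer: the function name `complex_linear` is hypothetical, and a linear map stands in for a 1x1 convolution. It shows that computing a complex-weighted map from two real-valued branches reproduces native complex arithmetic.

```python
import numpy as np

def complex_linear(x_re, x_im, w_re, w_im):
    """Complex-weighted linear map computed from separate real and imaginary branches."""
    out_re = x_re @ w_re - x_im @ w_im   # real part: ac - bd
    out_im = x_re @ w_im + x_im @ w_re   # imaginary part: ad + bc
    return out_re, out_im

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))
w = rng.standard_normal((3, 2)) + 1j * rng.standard_normal((3, 2))

out_re, out_im = complex_linear(x.real, x.imag, w.real, w.imag)
reference = x @ w                        # NumPy's native complex product

print(np.allclose(out_re, reference.real), np.allclose(out_im, reference.imag))
```

A generic real-valued layer on two stacked channels would learn four independent weight blocks instead of the two shared ones (`w_re`, `w_im`) used here, so it is not constrained to act as a complex multiplication.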
- Comparison with more advanced networks (-R1&R2): Thanks for the suggestion. The comparison with your suggested network also indicates the superiority of our FFCNet (Average: 86.35%): a joint CNN and transformer network (CoAtNet, 85.72%), a CNN network with added frequency modules (Fast Fourier Convolution, 81.57%), a transformer network with added frequency modules (GFNet, 84.81%), and other complex networks (k-Space Deep Learning for Accelerated MRI, 83.28%). FFCNet outperforms them without requiring pre-training, indicating that the designed model is more efficient.
- Comparison with the colonoscopy classification method (-R5): Thanks for the suggestion. Table 1 (in the appendix) shows that FFCNet yields better performance than other colon classification methods on each metric.
- Visualization (-R1&R2): Thanks for the suggestion. The newly added Grad-CAM and magnitude map visualizations indeed demonstrate the feature extraction and brightness suppression capabilities of the model. The t-SNE (Fig.2 in the appendix) shows that the method enables distinguishing similar features between classes.
Post-rebuttal Meta-Reviews
Meta-review # 1 (Primary)
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
The main contribution of this work is the automatic classification of colon diseases with a framework that combines complex CNN and frequency learning.
Key strengths:
- Reviewers agree that the use of frequency learning and complex CNN for classification in colonoscopy is novel and interesting from a technical point of view.
- Strong evaluation and ablation studies to justify the increase in accuracy
Key weaknesses:
- Evaluation on a single dataset
- The use of complex CNN itself is not novel (the application to colonoscopy is), and it is not fully clear what is the improvement and/or novelty presented in the work as some works in the literature have been missed and not discussed.
Review comments & Scores: Reviewers commented on the novelty of the approach, the effectiveness of the method and a comparison with other advanced models. After rebuttal, reviewers’ ratings slightly increased.
Rebuttal: Authors were asked to specifically discuss the main contributions and novelty of this work, as it was not fully clear. The authors have also stated that this is the first work that can be trained in the frequency domain, which is novel and interesting and can be potentially adapted to other medical imaging problems. They also provide further evidence that their model is more efficient.
Evaluation & Justification: I agree here with R5, that despite the lack of a fair literature review, the idea of this work is interesting and the MICCAI community would benefit from it. I would like to ask the authors to carefully review the references to add any missing work.
- After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.
Accept
- What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).
10
Meta-review #2
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
For the problem of endoscopic image classification, the paper proposes an approach exploiting complex CNNs and the frequency domain. This type of treatment, although not unique, is not very common and it might be interesting for the community. I support the acceptance of the paper despite some validation weaknesses remaining after rebuttal.
- After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.
Accept
- What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).
5
Meta-review #3
- Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.
The authors have satisfactorily addressed all major concerns in the rebuttal.
- After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.
Accept
- What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).
8