
Authors

Pengxin Yu, Haoyue Zhang, Han Kang, Wen Tang, Corey W. Arnold, Rongguo Zhang

Abstract

In clinical practice, anisotropic volumetric medical images with low through-plane resolution are commonly used due to short acquisition time and lower storage cost. Nevertheless, the coarse resolution may lead to difficulties in medical diagnosis by either physicians or computer-aided diagnosis algorithms. Deep learning-based volumetric super-resolution (SR) methods are feasible ways to improve resolution, with convolutional neural networks (CNN) at their core. Despite recent progress, these methods are limited by inherent properties of convolution operators, which ignore content relevance and cannot effectively model long-range dependencies. In addition, most of the existing methods use pseudo-paired volumes for training and evaluation, where pseudo low-resolution (LR) volumes are generated by a simple degradation of their high-resolution (HR) counterparts. However, the domain gap between pseudo- and real-LR volumes leads to the poor performance of these methods in practice. In this paper, we build the first public real-paired dataset RPLHR-CT as a benchmark for volumetric SR, and provide baseline results by re-implementing four state-of-the-art CNN-based methods. Considering the inherent shortcoming of CNN, we also propose a transformer volumetric super-resolution network (TVSRN) based on attention mechanisms, dispensing with convolutions entirely. This is the first research to use a pure transformer for CT volumetric SR. The experimental results show that TVSRN significantly outperforms all baselines on both PSNR and SSIM. Moreover, the TVSRN method achieves a better trade-off between the image quality, the number of parameters, and the running time. Data and code are available at https://github.com/smilenaxx/RPLHR-CT.

Link to paper

DOI: https://link.springer.com/chapter/10.1007/978-3-031-16446-0_33

SharedIt: https://rdcu.be/cVRTs

Link to the code repository

https://github.com/smilenaxx/RPLHR-CT

Link to the dataset(s)

https://github.com/smilenaxx/RPLHR-CT


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper introduces a dataset for volumetric SR and concurrently proposes a transformer network for super resolution. The data is evaluated for multiple network architectures, and an experiment for the domain gap and an ablation study for the proposed network is presented.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The authors aim at creating a public real-paired dataset for volumetric SR and provide a benchmark.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The paper aims at two objectives at the same time (introducing a dataset and proposing a network), which results in neither of them being explained in a sufficient manner.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Data and code are available at https://github.com/*.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    Further Questions:

    • It is not clear to me why modeling long-range (!) dependencies is an issue for volumetric SR. Can you elaborate on this argument for the transformer network a bit more?
    • For the dataset analysis you say that “We use PSNR and SSIM to access the changes in the similarity of three slice-pairs”. However, PSNR should be computed between a noise-free image and a noisy representation, not as a measure of image quality between CT sets. Can you explain the reasoning behind this measurement?
    • Can you elaborate a bit more on how slices are matched between the thick and thin CTs? How have you handled motion in the dataset, etc.?
    • Where is the external test set coming from? Is it also public? Are the parameters the same? Are the patients the same?
    • Benchmark: I was a bit confused that two of the benchmark algorithms were changed (cite: “For ResVox, the noise reduction part is removed. For MPU-Net, we do not use the multi-stream architecture due to the lack of available lung masks.”). By removing parts of an algorithm, the network is changed and it is no longer the originally proposed algorithm. Can you comment on this?

    Structure of the paper:

    • There are a lot of graphics in the paper (and supplementary material), but most of them are not mentioned/explained in the paper, and the captions are too short. Therefore, the reader has to figure out the findings on their own. (Examples: Fig. 1, “Summary of our RPLHR-CT dataset”, is not exactly what I see there. Fig. 2: what are the colorful boxes in B) TAB?)
    • The abbreviation in the title (RPLHR) is never introduced!
  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    5

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I had the feeling that a lot of points are mixed up and unclear to the reader because of the broad range of points the paper tries to make. A pure description of the dataset, baselines, and domain-gap analysis on the one hand, or of the proposed network and its subsequent analysis on the other, would have been more appropriate considering the page limit.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    2

  • Reviewer confidence

    Somewhat Confident

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Not Answered

  • [Post rebuttal] Please justify your decision

    Not Answered



Review #2

  • Please describe the contribution of the paper

    In this paper, the authors address the problem of super-resolution of CT images in the through-plane (height) dimension. The main contributions of the work are the following: a new dataset of 250 real-paired thin and thick CTs, and a new transformer-based super-resolution model. The authors compare the model’s performance with state-of-the-art models and perform an ablation study of the model’s main components.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    A new publicly available dataset for super-resolution benchmarking is a valuable contribution for the medical imaging community.

    Authors demonstrated the superiority of the transformer-based model compared to the standard convolutional models in terms of PSNR and SSIM.

    Ablation study is present.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) The importance of the “medium-sized” dataset is hard to comprehend. The authors mention the work of Li et al. [1], which also collected 880 real pairs. What is the main difference from that dataset?

    2) It is challenging to distinguish ResVox, SAINT, and TVSRN in terms of SSIM in Figure 3. The statistical significance tests need more detail to really show the benefit of the proposed model. A table with confidence intervals for these results could be placed in the supplementary material to help the reader.

    3) Table 1 shows the importance of the chosen architecture, suggesting a performance boost compared to the ViT. However, the remaining modifications differ only in the third digit of SSIM, so the improvement is at best marginal. Perhaps other image quality metrics are needed to support the authors’ claims.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Looks OK. The code is submitted but I did not launch it.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    1) Provide more image quality metric comparisons. 2) Some bar plots from Figure 3 would be very valuable in table form in the supplementary material.

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    4

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The authors promise to present a new dataset for benchmarking, yet it is not clear how the dataset differs from, or improves on, that of Li et al. [1] (880 images vs. 250). The authors should elaborate on the advantages of their data.

    The authors showed the superiority of the model w.r.t. SOTA models. However, it is not clear which parts of the model were important; e.g., TVSRN-Encoder is worse than TVSRN by only 0.002 in terms of SSIM – really marginal, which precludes me from giving a higher score. The authors ought to provide stronger support for why their model’s architecture is advantageous. The phrase “TVSRN outperforms existing algorithms significantly.” needs to be justified.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    3

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    5

  • [Post rebuttal] Please justify your decision

    The authors wrote a well-structured rebuttal that made me appreciate the value of the dataset better than in the first round of reviewing. I increase my score accordingly.

    If the acceptance decision is made, the authors should keep their promise and implement the changes (I still think writing could be improved).

    Also, the claim of the first SR transformer should be made explicit as CT-only, and the MRI reference should be added. The reader would definitely appreciate an explanation of the difference between these works (e.g., CT vs. MRI volumes?).



Review #3

  • Please describe the contribution of the paper

    1) The authors developed a public real-paired volume dataset, RPLHR-CT which contains real paired thin-CTs (slice thickness 1mm) and thick-CTs (slice thickness 5mm) of 250 patients. RPLHR-CT is the first benchmark for volumetric SR, which enables fair comparison between different methods. 2) This work explored the potential of transformer for volumetric SR and proposed a novel transformer volumetric super-resolution network to alleviate the inherent shortcomings of convolutional operations, i.e., the issue of long-range dependencies. Besides, the proposed TVSRN network achieves a better trade-off between image quality, the number of parameters, and running time. 3) The authors re-implement and benchmark state-of-the-art CNN-based volumetric SR algorithms developed for CT. This indicates that the work provides some benchmark comparison and reference for the community of volumetric CT image SR.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1) Artificial pseudo LR/HR training samples lead to poor generalization of SR models in real application scenarios. The RPLHR-CT dataset presented in this paper consists of real LR/HR image pairs, narrowing the domain gap between handcrafted samples and real samples. 2) The writing is great and easy to follow and understand. Moreover, the work is technically novel for two reasons: (1) the first public real-paired dataset, RPLHR-CT, as a benchmark for volumetric SR; (2) the TVSRN based on the volumetric transformer. 3) The authors re-implemented several state-of-the-art methods and tested them on the proposed RPLHR-CT dataset, providing a benchmark comparison and reference for volumetric CT SR.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) Patch examples of the RPLHR-CT dataset could be supplied in the supplementary material; likewise, a visual comparison between RPLHR-CT and other datasets is recommended. 2) TVSRN should be tested on other datasets to further demonstrate its robustness. 3) For the visual comparison in Fig. 4, it would be better to give quantitative results such as PSNR or SSIM. Because the visual differences between the compared methods are not easy to observe, they would be more intuitively displayed through quantitative indexes.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Note, that authors have filled out a reproducibility checklist upon submission. Please be aware that authors are not required to meet all criteria on the checklist - for instance, providing code and data is a plus, but not a requirement for acceptance

    Good reproducibility

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review: https://conferences.miccai.org/2022/en/REVIEWER-GUIDELINES.html

    None

  • Rate the paper on a scale of 1-8, 8 being the strongest (8-5: accept; 4-1: reject). Spreading the score helps create a distribution for decision-making

    6

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Good novelty, adequate work, and great contribution to the community.

  • Number of papers in your stack

    5

  • What is the ranking of this paper in your review stack?

    1

  • Reviewer confidence

    Confident but not absolutely certain

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    7

  • [Post rebuttal] Please justify your decision

    This paper built the first public real-paired dataset RPLHR-CT as a benchmark for volumetric SR, and provided some benchmark comparison and reference for the community. Personally, I think the contribution is significant, although small problems exist.




Primary Meta-Review

  • Please provide your assessment of this work, taking into account all reviews. Summarize the key strengths and weaknesses of the paper and justify your recommendation. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. In case of an invitation for rebuttal, clarify which points are important to address in the rebuttal.

    The contributions of this work are twofold: a new dataset (250 paired thin-thick CTs) and a transformer-based CT super-resolution model. Concerning the dataset, this seems to be the first benchmark for volumetric SR. The methodological novelty comes from bringing transformers to this specific super-resolution problem. These two contributions are important and interesting for the community, but the paper lacks clarity in some respects. The experimental results include a measurement of the domain gap, comparisons against SOTA algorithms, and an ablation study. The re-implementation of SOTA methods provides a benchmark for future work and shows improvements w.r.t. SOTA. Some results could be further analyzed and discussed.

    Comments and questions to address in the rebuttal/ revised version of the paper:

    • clarify the difference to Li et al.’s dataset (880 paired images)
    • clearly state the methodological novelty: is this the first paper to use transformers for CT superresolution?
    • the sota comparisons based on the SSIM are claimed to be significant, but scores and std bars (fig 3) could lead to thinking otherwise. Provide more details about the significance computation.
    • In the ablation studies only the choice between vit or swin transformer seems relevant. Other components do not seem to make a difference. Comment.
  • What is the ranking of this paper in your stack? Use a number between 1 (best paper in your stack) and n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    4




Author Feedback

We would like to thank all the reviewers and meta-reviewer for their instructive comments. Our responses are as follows.

[Q] Difference to Li’s dataset. (Meta Reviewer, R2)
[A] Our dataset is the first open-source real-paired dataset for volumetric Super-Resolution (SR). Li’s dataset, which is larger, is very similar to ours in scanning anatomy, data reconstruction, etc. However, that dataset is not publicly available, which is why we chose to release this benchmark for the community.

[Q] Clearly state the methodological novelty. (MR, R1)
[A] We argue that this is the first research to use a fully Transformer architecture for CT volumetric SR. In our model, long-range dependencies are modeled through the Transformer to exploit non-local similarities of anatomical structures. For example, the ends of pulmonary vessels are often distributed across multiple different areas, yet they are similar and can provide mutual cues in volumetric SR. In a related field, Feng et al. proposed a combination of CNN and Transformer to achieve 2D SR for MRI (Feng et al., “Task transformer network for joint MRI reconstruction and super-resolution.” MICCAI 2021).

[Q] More details about the significance computation, especially for SSIM. (MR, R2)
[A] We agree. To better show the difference in performance between methods, especially for SSIM, we will add 95% CIs to Supplementary Material Table 1. In addition, we will add sample-by-sample performance scatterplots in the Supplementary Material to provide more detailed comparison results.

[Q] Some components don’t make a difference in the ablation studies. (MR, R2)
[A] In the ablation studies, we first compared two landmark transformers in CV (ViT & Swin) to determine the backbone. The results show that the TVSRN-Encoder with the Swin backbone not only outperforms the ViT backbone by a large margin but also outperforms other CNN-based SOTA methods. On this basis, the other two components still further improve performance (Decoder: PSNR +0.133, SSIM +0.001; TAB: PSNR +0.112, SSIM +0.001) while the number of parameters remains essentially unchanged (−0.02M, +0.17M). We performed one-sided Wilcoxon signed-rank tests to verify that the performance improvement brought by each component is significant. We will add the p-values to Manuscript Table 1. Furthermore, we will add sample-by-sample performance scatterplots in the Supplementary Material to further illustrate the effectiveness of individual components.

[Q] Details of the external test set. (R1, R3)
[A] Results on our external test set show that TVSRN is robust and achieves significantly higher PSNR and SSIM. Due to space limitations, we put the details of the external test set and the experimental results in Supplementary Material Table 1. The external test set comes from a different institution than RPLHR and cannot be made public for privacy reasons.

[Q] Why benchmark algorithms are changed. (R1)
[A] ResVox [6] is designed for low-resolution, low-dose CT, with SR and denoising modules. Our goal is only SR, so the denoising module is removed. The multi-stream component of MPU-Net [13] relies on lung masks, which means additional labeling costs. To ensure that the various methods are evaluated under the same conditions, we implement a version of MPU-Net that does not require lung masks for comparison.

[Q] Difference between the thick and thin CT, and the PSNR metric. (R1)
[A] As described in Sec. 2.1, we emphasize that the thin and thick CTs were reconstructed from the same raw data, so both reconstructions are spatially aligned. PSNR is commonly used to compare a reference image to some reconstruction of the reference [6,13,17,18]. Thus, we treat the thin CT as the reference (“noise-free”) image and the thick CT as the reconstruction (“noisy representation”) and use PSNR to measure similarity.

[Q] Other revisions. (R1, R2, R3)
[A] We will modify the legends, abbreviations, and other information according to the reviewers’ suggestions.
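The evaluation protocol the rebuttal describes — treating the thin CT as the noise-free reference for PSNR, and checking per-case score differences with a one-sided Wilcoxon signed-rank test — can be sketched roughly as follows. This is only an illustrative sketch using NumPy/SciPy; the array shapes, data range, and per-case scores below are made-up values, not numbers from the paper:

```python
import numpy as np
from scipy.stats import wilcoxon

def psnr(reference, reconstruction, data_range):
    """PSNR, treating the thin CT as the noise-free reference and the
    thick CT (or an SR output) as its noisy reconstruction."""
    diff = reference.astype(np.float64) - reconstruction.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical volumes
    return 10.0 * np.log10(data_range ** 2 / mse)

# Hypothetical per-case PSNR scores for a baseline and a proposed method.
baseline = np.array([30.1, 31.4, 29.8, 32.0, 30.7, 31.1])
proposed = baseline + np.array([0.10, 0.25, 0.15, 0.30, 0.20, 0.05])

# One-sided Wilcoxon signed-rank test on the paired per-case scores:
# alternative="greater" asks whether `proposed` is significantly higher.
stat, p_value = wilcoxon(proposed, baseline, alternative="greater")
```

With paired per-case scores like these, even small but consistent improvements (all differences positive) can be significant, which is why the authors can report significant gains despite SSIM deltas in the third decimal.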




Post-rebuttal Meta-Reviews

Meta-review # 1 (Primary)

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The rebuttal has clarified the questions about the importance of the dataset, the methodological novelty, and the significance of the results.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    1



Meta-review #2

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    The authors have addressed the questions on novelty, the ablation studies, and the SOTA comparisons, which enabled reviewers to increase their scores after the rebuttal. Therefore, I see value in the paper for the MICCAI community and recommend acceptance.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    4



Meta-review #3

  • Please provide your assessment of the paper taking all information into account, including rebuttal. Highlight the key strengths and weaknesses of the paper, clarify how you reconciled contrasting review comments and scores, indicate if concerns were successfully addressed in the rebuttal, and provide a clear justification of your decision. If you disagree with some of the (meta)reviewer statements, you can indicate so in your meta-review. Please make sure that the authors, program chairs, and the public can understand the reason for your decision.

    This study makes two contributions: a novel dataset (250 paired thin-thick CTs) and a transformer-based CT super-resolution model. In terms of the dataset, this appears to be the first volumetric SR benchmark. Bringing transformers to this specific super-resolution problem provides methodological originality. These two contributions are significant and intriguing to the community, yet the article was unclear in certain areas before the rebuttal. The domain gap was measured, SOTA algorithms were compared, and an ablation study was performed. The re-implementation of SOTA methodologies serves as a standard for future work and demonstrates improvements over SOTA.

    The authors have written a good rebuttal, and two reviewers raised their scores, bringing the scores up to 5/5/7. I am inclined to accept this work because it is top-ranked in my stack after the rebuttal and its scientific contributions are significant.

  • After you have reviewed the rebuttal, please provide your final rating based on all reviews and the authors’ rebuttal.

    Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    2


