Reference-based diffusion model for super-resolution
https://doi.org/10.17586/2226-1494-2025-25-2-321-327
Abstract
This article is devoted to digital image processing algorithms, specifically the super-resolution task. Various image restoration methods based on deep learning are actively being developed and applied to problems such as inpainting, denoising, and super-resolution. One important class of super-resolution methods is reference-based super-resolution, which restores missing information in the main image using reference images. Methods of this class are mainly represented by convolutional neural networks, which are widely used in computer vision. Despite the significant achievements of existing methods, they share a notable drawback: image areas not represented in the reference image often have worse quality than the rest of the image, which is clearly visible to the observer. In addition to convolutional neural networks, diffusion models are actively used in image restoration problems. They are capable of generating images with high quality and diverse fine details but suffer from a lack of fidelity between the generated details and the real ones. The aim of this work is to improve the quality of the reference-based image restoration method using a diffusion model. A hybrid architecture for the diffusion model's denoising neural network is proposed, consisting of three main blocks: a base denoising module, a reference-based module, and a fusion module that generates the final result. Three models were trained and evaluated on the Large-Scale Multi-Reference (LMR) dataset: a diffusion model, a reference-based convolutional neural network, and the proposed hybrid model. Based on the test results, a qualitative (visual) and quantitative comparison of the three models was performed.
The hybrid model demonstrated higher image quality, clarity, and consistency than the convolutional neural network using references, and better restoration of real details than the diffusion model. In the quantitative evaluation, the hybrid model also outperformed both pure models. The results of this work can be used to increase the resolution of arbitrary images using reference information.
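The abstract describes a three-block hybrid denoiser: a base denoising module, a reference-based module, and a fusion module producing the final result. The paper's internals are not reproduced here, so the sketch below is purely illustrative: the toy `base_denoise`, `reference_transfer`, and `fuse` functions and the per-pixel `confidence` map are assumptions, not the authors' implementation. It only shows the general fusion idea of weighting the reference-based branch where the reference matches and falling back to the base denoiser elsewhere.

```python
import numpy as np

def base_denoise(noisy: np.ndarray) -> np.ndarray:
    """Toy stand-in for the base denoising module of the diffusion model."""
    return noisy * 0.9  # illustrative only: damp the noisy signal

def reference_transfer(noisy: np.ndarray, ref: np.ndarray) -> np.ndarray:
    """Toy stand-in for the reference-based module transferring ref detail."""
    return 0.5 * noisy + 0.5 * ref  # illustrative only: blend with reference

def fuse(base_out: np.ndarray, ref_out: np.ndarray,
         confidence: np.ndarray) -> np.ndarray:
    """Fusion module sketch: per-pixel convex combination of the two branches,
    weighted by a confidence map in [0, 1] (high where the reference is
    relevant, low in areas not represented in the reference)."""
    return confidence * ref_out + (1.0 - confidence) * base_out

rng = np.random.default_rng(0)
noisy = rng.normal(size=(8, 8))          # noisy latent at one diffusion step
ref = rng.normal(size=(8, 8))            # aligned reference features
conf = rng.random(size=(8, 8))           # hypothetical matching confidence

out = fuse(base_denoise(noisy), reference_transfer(noisy, ref), conf)
print(out.shape)  # (8, 8)
```

With `confidence` equal to zero everywhere, the output reduces to the base denoiser's result, which mirrors the motivation in the abstract: regions absent from the reference should still be handled gracefully by the diffusion branch rather than degrading visibly.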
About the Authors
A. K. Denisov
Russian Federation
Aleksei K. Denisov — Assistant.
Saint Petersburg, 197101; Scopus ID: 57210698353
S. V. Bykovskii
Russian Federation
Sergei V. Bykovskii — PhD, Associate Professor.
Saint Petersburg, 197101; Scopus ID: 57216469537
P. V. Kustarev
Russian Federation
Pavel V. Kustarev — PhD, Dean.
Saint Petersburg, 197101; Scopus ID: 35317916600
References
1. Dong C., Loy C.C., He K., Tang X. Image Super-Resolution using deep convolutional networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, vol. 38, no. 2, pp. 295–307. https://doi.org/10.1109/TPAMI.2015.2439281
2. Kim J., Lee J.K., Lee K.M. Accurate image Super-Resolution using very deep convolutional networks. Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 1646– 1654. https://doi.org/10.1109/CVPR.2016.182
3. Lim B., Son S., Kim H., Nah S., Lee K.M. Enhanced deep residual networks for single image Super-Resolution. Proc. of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2017, pp. 1132–1140. https://doi.org/10.1109/CVPRW.2017.151
4. Ledig C., Theis L., Huszár F., Caballero J., Cunningham A., Acosta A., Aitken A., Tejani A., Totz J., Wang Z., Shi W. Photo-realistic single image Super-Resolution using a generative adversarial network. Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 105–114. https://doi.org/10.1109/CVPR.2017.19
5. Wang X., Xie L., Dong C., Shan Y. Real-ESRGAN: training real-world blind Super-Resolution with pure synthetic data. Proc. of the IEEE/ CVF International Conference on Computer Vision Workshops (ICCVW), 2021, pp. 1905–1914. https://doi.org/10.1109/ICCVW54120.2021.00217
6. Zhang Z., Wang Z., Lin Z., Qi H. Image Super-Resolution by neural texture transfer. Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 7974–7983. https://doi.org/10.1109/CVPR.2019.00817
7. Jiang Y., Chan K.C.K., Wang X., Loy C.C., Liu Z. Robust Reference-based Super-Resolution via C2-Matching. Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 2103–2112. https://doi.org/10.1109/CVPR46437.2021.00214
8. Cao J., Liang J., Zhang K., Li Y., Zhang Y., Wang W., Van Gool L. Reference-based image Super-Resolution with deformable attention transformer. Lecture Notes in Computer Science, 2022, vol. 13678, pp. 325–342. https://doi.org/10.1007/978-3-031-19797-0_19
9. Zhang L., Li X., He D., Li F., Ding E., Zhang Z. LMR: a large-scale multi-reference dataset for Reference-based Super-Resolution. Proc. of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 13072–13081. https://doi.org/10.1109/ICCV51070.2023.01206
10. Li G., Xing W., Zhao L., Lan Z., Sun J., Zhang Z., Zhang Q., Lin H., Lin Z. Self-Reference image Super-Resolution via pre-trained diffusion large model and window adjustable transformer. Proc. of the 31st ACM International Conference on Multimedia, 2023, pp. 7981–7992. https://doi.org/10.1145/3581783.3611866
11. Ho J., Jain A., Abbeel P. Denoising diffusion probabilistic models. arXiv, 2020, arXiv:2006.11239. https://doi.org/10.48550/arXiv.2006.11239
12. Song J., Meng C., Ermon S. Denoising diffusion implicit models. arXiv, 2020, arXiv:2010.02502. https://doi.org/10.48550/arXiv.2010.02502
13. Rombach R., Blattmann A., Lorenz D., Esser P., Ommer B. High-Resolution image synthesis with latent diffusion models. Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 10674–10685. https://doi.org/10.1109/CVPR52688.2022.01042
14. Li H., Yang Y., Chang M., Chen S., Feng H., Xu Z., Li Q., Chen Y. SRDiff: Single Image Super-Resolution with diffusion probabilistic models. Neurocomputing, 2022, vol. 479, pp. 47–59. https://doi.org/10.1016/j.neucom.2022.01.029
15. Yu F., Gu J., Li Z., Liu J., Kong X., Wang X., He J., Qiao Y., Dong C. Scaling Up to Excellence: practicing model scaling for photorealistic image restoration in the wild. Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 25669–25680. https://doi.org/10.1109/CVPR52733.2024.02425
16. Zhang R., Isola P., Efros A.A., Shechtman E., Wang O. The unreasonable effectiveness of deep features as a perceptual metric. Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 586–595. https://doi.org/10.1109/CVPR.2018.00068
17. Wang J., Chan K.C.K., Loy C.C. Exploring CLIP for assessing the look and feel of images. Proc. of the 37th AAAI Conference on Artificial Intelligence, 2023, vol. 37, no. 2. pp. 2555–2563. https://doi.org/10.1609/aaai.v37i2.25353
18. Heusel M., Ramsauer H., Unterthiner T., Nessler B., Hochreiter S. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. Proc. of the 31st International Conference on Neural Information Processing Systems (NIPS '17), 2017, pp. 6629–6640.
For citations:
Denisov A.K., Bykovskii S.V., Kustarev P.V. Reference-based diffusion model for super-resolution. Scientific and Technical Journal of Information Technologies, Mechanics and Optics. 2025;25(2):321-327. (In Russ.) https://doi.org/10.17586/2226-1494-2025-25-2-321-327