Automation of complex text CAPTCHA recognition using conditional generative adversarial networks

A. S. Zadorozhnyy; A. A. Korepanova; M. V. Abramov; A. A. Sabrekov

doi:10.17586/2226-1494-2024-24-1-90-100

Abstract

With the rapid development of Internet technologies, network security problems continue to worsen. One of the most common methods of maintaining security and preventing malicious attacks is CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart). A CAPTCHA usually presents a simple task that must be completed to gain access, such as entering a word displayed in an image or solving a basic arithmetic equation. The text type remains the most widely used. In recent years, advances in computer vision, and in neural networks in particular, have reduced the resistance of text CAPTCHA to automated solving. However, the security and recognition resistance of complex CAPTCHA containing heavy noise and distortion is still insufficiently studied. This study examines a CAPTCHA whose distinctive feature is a large number of different distortions, with each individual image using its own set of distortions, so that even a human cannot always recognize what the image shows. The purpose of this work is to assess the security of sites that use text-type CAPTCHA by testing its resistance to an automated solution. This testing will be used for the subsequent development of recommendations for improving the effectiveness of protection mechanisms. The result of the work is an implemented synthetic-data generator and discriminator of the CGAN architecture, as well as a decoder program, which is a trained convolutional neural network that solves this type of CAPTCHA. The recognition accuracy of the model constructed in the article was 63% on an initially very limited data set, which demonstrates the information security risks faced by sites that use a similar type of CAPTCHA.
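The abstract reports a single recognition accuracy figure without stating how it is measured. As a minimal sketch, assuming the decoder returns one string per image, the Python below contrasts two common definitions: exact-match accuracy over whole CAPTCHA strings and per-character accuracy. The function names and the sample strings are hypothetical and are not taken from the article.

```python
from typing import Sequence


def exact_match_accuracy(predicted: Sequence[str], expected: Sequence[str]) -> float:
    """Share of CAPTCHAs whose whole string was decoded correctly."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        return 0.0
    hits = sum(p == e for p, e in zip(predicted, expected))
    return hits / len(expected)


def character_accuracy(predicted: Sequence[str], expected: Sequence[str]) -> float:
    """Share of character positions decoded correctly.

    Positions are compared left to right; any length mismatch between a
    prediction and its label counts as extra errors.
    """
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    total = sum(max(len(p), len(e)) for p, e in zip(predicted, expected))
    if total == 0:
        return 0.0
    hits = sum(a == b for p, e in zip(predicted, expected) for a, b in zip(p, e))
    return hits / total


# Hypothetical labels and decoder outputs, for illustration only.
expected = ["k7Xp2", "9fQaZ", "mm3Tr", "B0dWe"]
predicted = ["k7Xp2", "9fQaZ", "mn3Tr", "B0dVVe"]

whole = exact_match_accuracy(predicted, expected)
per_char = character_accuracy(predicted, expected)
print(f"exact match: {whole:.2f}, per character: {per_char:.2f}")
```

Exact-match accuracy is the stricter of the two, since a single wrong character fails the whole CAPTCHA, so the same decoder can score noticeably lower on it than on the per-character measure.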

Keywords

text-based CAPTCHAs, deep learning, conditional generative adversarial network, CGAN, CNN, information security

About the Authors

A. S. Zadorozhnyy

St. Petersburg State University (SPbSU)
Russian Federation

Alexander S. Zadorozhnyy — Student

Saint Petersburg, 199034

A. A. Korepanova

Saint Petersburg Federal Research Center of the Russian Academy of Sciences
Russian Federation

Anastasia A. Korepanova — Junior Researcher

Saint Petersburg, 199178

sc 57218191916

M. V. Abramov

Saint Petersburg Federal Research Center of the Russian Academy of Sciences
Russian Federation

Maxim V. Abramov — PhD, Senior Researcher

Saint Petersburg, 199178

sc 56938320500

A. A. Sabrekov

Saint Petersburg Federal Research Center of the Russian Academy of Sciences
Russian Federation

Artem A. Sabrekov — Junior Researcher

Saint Petersburg, 199178

sc 56938320500

For citations:

Zadorozhnyy A.S., Korepanova A.A., Abramov M.V., Sabrekov A.A. Automation of complex text CAPTCHA recognition using conditional generative adversarial networks. Scientific and Technical Journal of Information Technologies, Mechanics and Optics. 2024;24(1):90-100. (In Russ.) https://doi.org/10.17586/2226-1494-2024-24-1-90-100

This work is licensed under a Creative Commons Attribution 4.0 License.

ISSN 2226-1494 (Print)
ISSN 2500-0373 (Online)
