Scientific and Technical Journal of Information Technologies, Mechanics and Optics

A method for optimizing neural networks based on structural distillation using a genetic algorithm

https://doi.org/10.17586/2226-1494-2024-24-5-770-778

Abstract

As neural networks grow more complex, their parameter counts and computational requirements increase, which complicates deploying and operating artificial intelligence systems on edge devices. Structural distillation can significantly reduce the resource cost of running a neural network. The paper presents a method for optimizing neural networks that combines the advantages of structural distillation and a genetic algorithm. Unlike evolutionary approaches applied to neural architecture search or to distillation, the proposed method encodes not only the parameters of the neural network but also the connections between neurons when forming candidate distillation variants. The experimental study was conducted on the VGG16 and ResNet18 models using the CIFAR-10 dataset. It is shown that structural distillation makes it possible to reduce the size of neural networks while preserving their generalization ability, and that the genetic algorithm efficiently searches for optimal distillation variants with respect to both structural complexity and performance. The results demonstrate the effectiveness of the proposed method in reducing network size and improving performance with an acceptable loss of quality.
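The article itself provides no source listings, so the following Python sketch is only a minimal illustration of the idea summarized above, under stated assumptions: a genome carries keep/drop bits for both neurons (channels) and the connections between them, and a genetic algorithm evolves distillation variants under a fitness that trades accuracy against retained size. The Genome class, the toy layer widths, and the evaluate stub are all hypothetical; a real implementation would build the pruned student, distill it from the teacher, and measure accuracy on CIFAR-10.

import random
from dataclasses import dataclass

import numpy as np

@dataclass
class Genome:
    # One bit per neuron (channel) in each prunable layer: 1 = keep, 0 = drop.
    neurons: list       # list of 1-D np.ndarray of {0, 1}
    # One bit per inter-layer connection: connections[k][i, j] links
    # neuron i of layer k to neuron j of layer k + 1.
    connections: list   # list of 2-D np.ndarray of {0, 1}

def random_genome(layer_sizes, keep_prob=0.8):
    """Sample an initial distillation variant."""
    neurons = [(np.random.rand(n) < keep_prob).astype(np.int8) for n in layer_sizes]
    connections = [(np.random.rand(a, b) < keep_prob).astype(np.int8)
                   for a, b in zip(layer_sizes, layer_sizes[1:])]
    return Genome(neurons, connections)

def size_fraction(g):
    """Fraction of the original connections that the variant retains."""
    kept = sum(int(c.sum()) for c in g.connections)
    total = sum(c.size for c in g.connections)
    return kept / total

def evaluate(g):
    """Hypothetical fitness stub: accuracy minus a size penalty.
    A real implementation would assemble the pruned sub-network,
    distill it from the teacher, and measure CIFAR-10 test accuracy."""
    fake_accuracy = 0.9 * size_fraction(g) + 0.1 * random.random()
    return fake_accuracy - 0.5 * size_fraction(g)

def crossover(a, b):
    """Uniform crossover applied to both neuron and connection bits."""
    def mix(x, y):
        m = np.random.rand(*x.shape) < 0.5
        return np.where(m, x, y).astype(np.int8)
    return Genome([mix(x, y) for x, y in zip(a.neurons, b.neurons)],
                  [mix(x, y) for x, y in zip(a.connections, b.connections)])

def mutate(g, rate=0.01):
    """Flip each keep/drop bit with small probability."""
    for arr in g.neurons + g.connections:
        flips = np.random.rand(*arr.shape) < rate
        arr[flips] ^= 1
    return g

def genetic_search(layer_sizes, pop=20, generations=30):
    population = [random_genome(layer_sizes) for _ in range(pop)]
    for _ in range(generations):
        scored = sorted(population, key=evaluate, reverse=True)
        elite = scored[:pop // 4]                    # keep the best quarter
        children = [mutate(crossover(*random.sample(elite, 2)))
                    for _ in range(pop - len(elite))]
        population = elite + children
    return max(population, key=evaluate)

best = genetic_search([64, 128, 256])  # toy layer widths, not VGG16's real ones
print(f"best variant keeps {size_fraction(best):.0%} of connections")

Encoding connection bits alongside neuron bits is what the abstract singles out as the difference from architecture-only evolutionary search: two variants may keep the same neurons yet wire them differently, which widens the space of distillation variants the genetic algorithm can explore.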

About the Authors

V. N. Kuzmin
Mozhaisky Military Aerospace Academy
Russian Federation

Vladimir N. Kuzmin - D.Sc. (Military Science), Professor, Leading Researcher

Saint Petersburg, 197198



A. B. Menisov
Mozhaisky Military Aerospace Academy
Russian Federation

Artem B. Menisov - PhD, Doctoral Student

Saint Petersburg, 197198



T. R. Sabirov
Mozhaisky Military Aerospace Academy
Russian Federation

Timur R. Sabirov - PhD, Senior Lecturer

Saint Petersburg, 197198




For citations:

Kuzmin V.N., Menisov A.B., Sabirov T.R. A method for optimizing neural networks based on structural distillation using a genetic algorithm. Scientific and Technical Journal of Information Technologies, Mechanics and Optics. 2024;24(5):770-778. (In Russ.) https://doi.org/10.17586/2226-1494-2024-24-5-770-778



This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2226-1494 (Print)
ISSN 2500-0373 (Online)