Scientific and Technical Journal of Information Technologies, Mechanics and Optics

Multi-agent adaptive routing by multi-head-attention-based twin agents using reinforcement learning

https://doi.org/10.17586/2226-1494-2022-22-6-1178-1186

Abstract

A condition typical of packet routing, cargo transportation, and flow control problems is the variability of the underlying graph. Adaptive routing algorithms based on reinforcement learning are designed to solve the routing problem under this condition. However, when the graph changes significantly, existing routing algorithms require complete retraining. To handle this challenge, we propose a novel method based on multi-agent modeling with twin agents, for which a new neural network architecture with multi-head self-attention is proposed and pre-trained within the multi-view learning paradigm. An agent in this paradigm takes a vertex as its input; twins of the main agent are placed at the vertices of the graph and select the neighbor to which the object should be transferred. We carried out a comparative analysis with the existing DQN-LE-routing multi-agent routing algorithm at two stages: pre-training and simulation. In both cases, runs with topology changes during testing or simulation were considered. Experiments have shown that the proposed adaptability enhancement method provides global adaptability, increasing delivery time by only 14.5 % after global changes occur. The proposed method can be used to solve routing problems with complex path evaluation functions and dynamically changing graph topologies, for example, in transport logistics and for managing conveyor belts in production.
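As a hedged illustration of the mechanism sketched in the abstract (a twin agent placed at a vertex scores its current neighbors with multi-head attention and forwards the object to the highest-scoring one), the following minimal PyTorch sketch may help. It is not the authors' DQN-LE-routing or twin-agent implementation; the class name, feature dimensions, and scoring scheme are illustrative assumptions.

```python
# Minimal sketch (assumed design, not the paper's architecture): an agent
# at a vertex embeds its neighbors, attends over them with multi-head
# attention, and emits one Q-value per neighbor for next-hop selection.
import torch
import torch.nn as nn

class AttentionRoutingAgent(nn.Module):
    def __init__(self, node_feat_dim: int = 16, embed_dim: int = 32, num_heads: int = 4):
        super().__init__()
        self.embed = nn.Linear(node_feat_dim, embed_dim)   # per-vertex embedding
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.q_head = nn.Linear(embed_dim, 1)              # Q-value per neighbor

    def forward(self, current: torch.Tensor, neighbors: torch.Tensor) -> torch.Tensor:
        # current:   (batch, node_feat_dim)    vertex currently holding the object
        # neighbors: (batch, k, node_feat_dim) its k neighbors
        query = self.embed(current).unsqueeze(1)   # (batch, 1, embed_dim)
        keys = self.embed(neighbors)               # (batch, k, embed_dim)
        context, _ = self.attn(query, keys, keys)  # attend over the neighbors
        scores = self.q_head(keys + context)       # condition each neighbor on context
        return scores.squeeze(-1)                  # (batch, k) Q-values

agent = AttentionRoutingAgent()
current = torch.randn(1, 16)        # toy features of the current vertex
neighbors = torch.randn(1, 5, 16)   # toy features of five neighbors
q_values = agent(current, neighbors)
next_hop = q_values.argmax(dim=-1)  # greedy choice of the next hop
print(q_values.shape, next_hop)
```

In a full system, such Q-values would be trained with a DQN-style temporal-difference loss on delivery time, and the same ("twin") weights would be shared by the agent copies at every vertex, which is what allows the policy to keep working when the topology changes.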

About the Authors

T. A. Gribanov
ITMO University
Russian Federation

Timofey A. Gribanov – Student

Saint Petersburg, 197101



A. A. Filchenkov
ITMO University
Russian Federation

Andrey A. Filchenkov – PhD (Physics & Mathematics), Engineer

Saint Petersburg, 197101

Scopus ID: 55507568200



A. A. Azarov
ITMO University; North-West Institute of Management – branch of the Russian Presidential Academy of National Economy and Public Administration
Russian Federation

Artur A. Azarov – PhD, Researcher; Deputy Director

Saint Petersburg, 197101;

Saint Petersburg, 199178

Scopus ID: 56938354700



A. A. Shalyto
ITMO University
Russian Federation

Anatoly A. Shalyto – D. Sc., Professor, Chief Researcher

Saint Petersburg, 197101

Scopus ID: 56131789500



References

1. Toth P., Vigo D. An overview of vehicle routing problems. The Vehicle Routing Problem. SIAM, 2002, pp. 1–26. https://doi.org/10.1137/1.9780898718515.ch1

2. Vutukury S., Garcia-Luna-Aceves J.J. MDVA: A distance-vector multipath routing protocol. Proc. of the 20th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM), 2001, vol. 1, pp. 557–564. https://doi.org/10.1109/INFCOM.2001.916780

3. Clausen T., Jacquet P. Optimized link state routing protocol (OLSR). IETF RFC 3626, 2003. https://doi.org/10.17487/RFC3626

4. Sweda T.M., Dolinskaya I.S., Klabjan D. Adaptive routing and recharging policies for electric vehicles. Transportation Science, 2017, vol. 51, no. 4, pp. 1326–1348. https://doi.org/10.1287/trsc.2016.0724

5. Puthal M.K., Singh V., Gaur M.S., Laxmi V. C-Routing: An adaptive hierarchical NoC routing methodology. Proc. of the 2011 IEEE/IFIP 19th International Conference on VLSI and System-on-Chip, 2011, pp. 392–397. https://doi.org/10.1109/VLSISoC.2011.6081616

6. Zeng S., Xu X., Chen Y. Multi-agent reinforcement learning for adaptive routing: A hybrid method using eligibility traces. Proc. of the 16th IEEE International Conference on Control & Automation (ICCA’20), 2020, pp. 1332–1339. https://doi.org/10.1109/ICCA51439.2020.9264518

7. Ibrahim A.M., Yau K.L.A., Chong Y.W., Wu C. Applications of multiagent deep reinforcement learning: models and algorithms. Applied Sciences, 2021, vol. 11, no. 22, pp. 10870. https://doi.org/10.3390/app112210870

8. Bono G., Dibangoye J.S., Simonin O., Matignon L., Pereyron F. Solving multi-agent routing problems using deep attention mechanisms. IEEE Transactions on Intelligent Transportation Systems, 2021, vol. 22, no. 12, pp. 7804–7813. https://doi.org/10.1109/TITS.2020.3009289

9. Kang Y., Wang X., Lan Z. Q-adaptive: A multi-agent reinforcement learning based routing on dragonfly network. Proc. of the 30th International Symposium on High-Performance Parallel and Distributed Computing, 2021, pp. 189–200. https://doi.org/10.1145/3431379.3460650

10. Choi S., Yeung D.Y. Predictive Q-routing: A memory-based reinforcement learning approach to adaptive traffic control. Advances in Neural Information Processing Systems, 1995, vol. 8, pp. 945–951.

11. Watkins C.J., Dayan P. Q-learning. Machine Learning, 1992, vol. 8, no. 3, pp. 279–292. https://doi.org/10.1023/A:1022676722315

12. Mnih V., Kavukcuoglu K., Silver D., Graves A., Antonoglou I., Wierstra D., Riedmiller M. Playing Atari with deep reinforcement learning. arXiv, 2013, arXiv:1312.5602. https://doi.org/10.48550/arXiv.1312.5602

13. Mukhutdinov D., Filchenkov A., Shalyto A., Vyatkin V. Multi-agent deep learning for simultaneous optimization for time and energy in distributed routing system. Future Generation Computer Systems, 2019, vol. 94, pp. 587–600. https://doi.org/10.1016/j.future.2018.12.037

14. Gao B., Pavel L. On the properties of the softmax function with application in game theory and reinforcement learning. arXiv, 2017, arXiv:1704.00805. https://doi.org/10.48550/arXiv.1704.00805

15. Mukhutdinov D. Decentralized conveyor system control algorithm using multi-agent reinforcement learning methods. MSc Dissertation. St. Petersburg, ITMO University, 2019, 92 p. Available at: http://is.ifmo.ru/diploma-theses/2019/2_5458464771026191430.pdf (accessed: 01.10.2022). (in Russian)

16. Belkin M., Niyogi P. Laplacian eigenmaps and spectral techniques for embedding and clustering. Advances in Neural Information Processing Systems, 2001, pp. 585–591. https://doi.org/10.7551/mitpress/1120.003.0080

17. Benea M.T., Florea A.M., Seghrouchni A.E.F. CAmI: An agent-oriented language for the collective development of AmI environments. Proc. of the 20th International Conference on Control Systems and Computer Science (CSCS), 2015, pp. 749–756. https://doi.org/10.1109/CSCS.2015.136

18. Wang Y., Yao Q., Kwok J.T., Ni L.M. Generalizing from a few examples: A survey on few-shot learning. ACM Computing Surveys, 2020, vol. 53, no. 3, pp. 63. https://doi.org/10.1145/3386252

19. Liu J., Chen S., Wang B., Zhang J., Li N., Xu T. Attention as relation: learning supervised multi-head self-attention for relation extraction. Proc. of the 29th International Joint Conference on Artificial Intelligence (IJCAI), 2020, pp. 3787–3793. https://doi.org/10.24963/ijcai.2020/524

20. Sola J., Sevilla J. Importance of input data normalization for the application of neural networks to complex industrial problems. IEEE Transactions on Nuclear Science, 1997, vol. 44, no. 3, pp. 1464–1468. https://doi.org/10.1109/23.589532

21. Baldi P., Sadowski P.J. Understanding dropout. Advances in Neural Information Processing Systems, 2013, vol. 26, pp. 26–35.


For citation:

Gribanov T.A., Filchenkov A.A., Azarov A.A., Shalyto A.A. Multi-agent adaptive routing by multi-head-attention-based twin agents using reinforcement learning. Scientific and Technical Journal of Information Technologies, Mechanics and Optics. 2022;22(6):1178–1186. (In Russian) https://doi.org/10.17586/2226-1494-2022-22-6-1178-1186



This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2226-1494 (Print)
ISSN 2500-0373 (Online)