DOI: https://doi.org/10.64539/sjcs.v1i2.2025.316

Keywords: UAV Swarm Coordination, Graph Attention Networks, Hybrid Reinforcement Learning, Swarm Intelligence Optimization, Control Barrier Functions

Abstract
Autonomous UAV swarms face fundamental coordination challenges under communication delays, dynamic obstacles, and noisy sensing, and existing centralized or heuristic-based solutions address these challenges inadequately. To close this gap, this paper proposes a Graph Attention Network (GAT)-based hybrid reinforcement learning and swarm intelligence framework that enables communication-aware, decentralized cooperation among UAVs. The framework combines multi-agent reinforcement learning with PSO, ACO, Differential Evolution, flocking behavior, Control Barrier Function-based safety correction, and GAT-inspired adaptive graph communication encoding. Simulations over 18 episodes with 24 UAVs show improvements of 32% in reward, 27% in coverage, and 40% in collision reduction relative to a classical greedy baseline. These findings confirm that the proposed hybrid GAT-RL architecture significantly improves the scalability, safety, and real-time responsiveness of UAV swarms, offering a path toward large-scale autonomous aerial coordination.
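The GAT-inspired communication encoding mentioned above can be sketched as single-head attention over each UAV's neighborhood graph. This is a minimal illustrative sketch, not the paper's implementation: the function name, feature dimensions, and the use of plain NumPy with a LeakyReLU-scored attention vector are all assumptions chosen to mirror the original GAT formulation.

```python
import numpy as np

def gat_encode(states, adjacency, W, a):
    """One round of GAT-style message passing over a UAV swarm.

    states:    (n, d) per-UAV feature matrix
    adjacency: (n, n) 0/1 communication graph (1 = active link)
    W:         (d, h) shared linear projection
    a:         (2*h,) attention scoring vector
    Returns an (n, h) matrix of communication-aware embeddings.
    """
    n = states.shape[0]
    z = states @ W                            # shared projection of all features
    out = np.zeros((n, W.shape[1]))
    for i in range(n):
        # self-loop plus communicating neighbors
        nbrs = [j for j in range(n) if adjacency[i, j] or j == i]
        # attention logits over the neighborhood (LeakyReLU, as in GAT)
        logits = np.array([np.concatenate([z[i], z[j]]) @ a for j in nbrs])
        logits = np.where(logits > 0, logits, 0.2 * logits)
        # softmax normalization of attention weights
        alpha = np.exp(logits - logits.max())
        alpha /= alpha.sum()
        # weighted aggregation of neighbor embeddings
        out[i] = sum(w * z[j] for w, j in zip(alpha, nbrs))
    return out
```

Because each UAV attends only to its current communication neighbors, the encoding stays decentralized and adapts automatically as links appear or drop.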




