A Hybrid Adaptive Gradient-Based Sled Dog Optimizer for Enhanced Robotic Decision-Making in Industrial Applications

Mohammad Rustom Al Nasar* (American University in the Emirates, United Arab Emirates)
*corresponding author

Abstract


As autonomous robotic systems are increasingly deployed in industrial applications, there is a growing need for efficient, automated decision-making that can operate in complex environments with large action spaces. Reinforcement learning (RL) offers an effective way to train robotic agents, yet conventional RL techniques often suffer from slow and unstable policy learning, poor convergence, and a weak exploration-exploitation balance. To address these problems, this paper develops a hybrid optimization approach that combines reinforcement learning, deep learning, and metaheuristic optimization for more robust robotic control and adaptability. The approach uses a Deep Q-Network with experience replay for policy learning, while an Adaptive Gradient-Based Sled Dog Optimizer refines and optimizes decision-making. Epsilon-greedy selection combined with a Noisy Network provides hybrid exploration-exploitation, which accelerates learning. The effectiveness of the proposed method was validated against five existing methods, namely Conservative Q-Learning, Behavior Regularized Actor-Critic, Implicit Q-Learning, Twin Delayed Deep Deterministic Policy Gradient, and Soft Actor-Critic, on three benchmark robotic suites: MuJoCo, D4RL, and the OpenAI Gym Robotics Suite. The results show that the proposed approach consistently outperformed the baselines in accuracy, precision, recall, stability, convergence speed, and generalization. The performance improvements were confirmed by statistical validation, including confidence-interval analysis and p-value computation.
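
The abstract outlines three interacting components: a Deep Q-Network trained from an experience-replay buffer, hybrid epsilon-greedy/noisy exploration, and a metaheuristic refinement stage. The following is a minimal, illustrative PyTorch sketch of how such a loop could be wired together; it is not the authors' implementation. The function and parameter names (select_action, dqn_update, sled_dog_refine, fitness_fn), the network sizes, and all hyperparameters are hypothetical, Gaussian noise on the Q-values stands in for a full Noisy Network layer, and the random-perturbation refinement is only a placeholder for the Adaptive Gradient-Based Sled Dog Optimizer's actual update rules.

import random

import torch
import torch.nn as nn

class QNet(nn.Module):
    """Small Q-value network mapping an observation to per-action values."""
    def __init__(self, obs_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, x):
        return self.net(x)

def select_action(qnet, obs, epsilon, n_actions, noise_std=0.05):
    # Hybrid exploration: epsilon-greedy choice, with Gaussian noise added to
    # the Q-values as a simplified stand-in for NoisyNet-style exploration.
    if random.random() < epsilon:
        return random.randrange(n_actions)
    with torch.no_grad():
        q = qnet(torch.as_tensor(obs, dtype=torch.float32))
        q = q + noise_std * torch.randn_like(q)
    return int(q.argmax())

def dqn_update(qnet, target_net, optimizer, batch, gamma=0.99):
    # One gradient step on a minibatch sampled from an experience-replay buffer.
    obs, act, rew, nxt, done = batch  # tensors of shape (B, ...) from the buffer
    q = qnet(obs).gather(1, act.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = rew + gamma * (1.0 - done) * target_net(nxt).max(dim=1).values
    loss = nn.functional.smooth_l1_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)

def sled_dog_refine(qnet, fitness_fn, pop_size=8, step=0.01):
    # Hypothetical metaheuristic refinement: evaluate a small population of
    # perturbed weight vectors and keep the fittest candidate (e.g., the one
    # with the highest evaluation-episode return).
    base = nn.utils.parameters_to_vector(qnet.parameters()).detach()
    best_vec, best_fit = base, fitness_fn(qnet)
    for _ in range(pop_size):
        cand = base + step * torch.randn_like(base)
        nn.utils.vector_to_parameters(cand, qnet.parameters())
        fit = fitness_fn(qnet)
        if fit > best_fit:
            best_vec, best_fit = cand, fit
    nn.utils.vector_to_parameters(best_vec, qnet.parameters())
    return best_fit

In such a scheme, dqn_update would run on every environment step, while a refinement call like sled_dog_refine would run periodically to nudge the policy weights toward higher-return regions that gradient descent alone may converge to slowly.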

Keywords


Artificial Intelligence; Autonomous Systems; Machine Learning; Metaheuristic Optimization; Reinforcement Learning

   

DOI

https://doi.org/10.31763/ijrcs.v5i2.1788
      





Copyright (c) 2025 Mohammad Rustom Al Nasar

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

 



International Journal of Robotics and Control Systems
e-ISSN: 2775-2658
Website: https://pubs2.ascee.org/index.php/IJRCS
Email: ijrcs@ascee.org
Organized by: Association for Scientific Computing Electronics and Engineering (ASCEE); Peneliti Teknologi Teknik Indonesia; Department of Electrical Engineering, Universitas Ahmad Dahlan; and Kuliah Teknik Elektro
Published by: Association for Scientific Computing Electronics and Engineering (ASCEE)
Office: Jalan Janti, Karangjambe 130B, Banguntapan, Bantul, Daerah Istimewa Yogyakarta, Indonesia