Dynamic Ball Balancing Using Deep Deterministic Policy Gradient (DDPG)-Controlled Robotic Arm for Precision Automation

(1) * K Vijaya Lakshmi (Deemed to be University, India)
(2) M Manimozhi (Vellore Institute of Technology, India)
(3) J Vimala Kumari (Deemed to be University, India)
*corresponding author

Abstract

This paper presents a reinforcement learning (RL)-based solution for dynamic ball balancing using a robotic arm controlled by the Deep Deterministic Policy Gradient (DDPG) algorithm. The problem addressed is maintaining ball stability under external disturbances in automated manufacturing, and the proposed solution enables adaptive, precise control on flat surfaces. The research contribution is a comparative evaluation of the DDPG and Soft Actor-Critic (SAC) algorithms for trajectory control and stabilization. A simulated environment is used to train the RL agents across multiple initial ball positions, and key performance metrics (settling time, rise time, overshoot, and steady-state error) are analyzed. Results show that DDPG outperforms SAC, producing smoother trajectories, roughly 25% faster settling times, and significantly lower overshoot and steady-state errors. Visual analysis confirms that DDPG consistently drives the ball to the center with minimal deviation. These findings highlight DDPG's advantages in control accuracy and stability. In conclusion, the DDPG-based approach proves highly effective for precision automation tasks where fast, stable, and reliable control is essential.
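
At the heart of this comparison is DDPG's actor-critic update: a deterministic actor maps the observed state to a continuous action, while a critic learns Q-values and supplies the gradient along which the actor improves. The sketch below illustrates that core update in Python with PyTorch. It is a minimal illustration only, not the authors' agent: the observation/action dimensions, network sizes, and hyperparameters (gamma, tau, learning rates) are placeholder assumptions for a ball-on-plate-style task.

    import copy
    import torch
    import torch.nn as nn

    # Hypothetical interface: observation = [ball x, y, vx, vy, plate pitch, roll],
    # action = two plate-tilt commands in [-1, 1]. Not specified by the paper.
    obs_dim, act_dim = 6, 2

    actor = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                          nn.Linear(64, act_dim), nn.Tanh())
    critic = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.ReLU(),
                           nn.Linear(64, 1))
    # Target networks start as copies and track the online networks slowly.
    actor_tgt, critic_tgt = copy.deepcopy(actor), copy.deepcopy(critic)
    actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-3)
    critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

    def ddpg_update(obs, act, rew, next_obs, done, gamma=0.99, tau=0.005):
        """One DDPG step on a replay-buffer minibatch (all args are float
        tensors; `done` is 1.0 for terminal transitions, else 0.0)."""
        # Critic: regress Q(s, a) toward the bootstrapped TD target computed
        # with the *target* actor and critic for stability.
        with torch.no_grad():
            next_q = critic_tgt(torch.cat([next_obs, actor_tgt(next_obs)], dim=-1))
            target_q = rew + gamma * (1.0 - done) * next_q
        q = critic(torch.cat([obs, act], dim=-1))
        critic_loss = nn.functional.mse_loss(q, target_q)
        critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

        # Actor: deterministic policy gradient -- ascend Q along the actor's action.
        actor_loss = -critic(torch.cat([obs, actor(obs)], dim=-1)).mean()
        actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

        # Polyak-average the target networks toward the online ones.
        for net, tgt in ((actor, actor_tgt), (critic, critic_tgt)):
            for p, tp in zip(net.parameters(), tgt.parameters()):
                tp.data.mul_(1.0 - tau).add_(p.data, alpha=tau)

The reported comparison rests on four step-response statistics. The second sketch shows one plausible way to extract them from a logged ball-position trace; the 2% settling band, the 10-to-90% rise-time definition, and the final-10%-of-samples window for steady-state error are assumptions of this sketch, not definitions taken from the paper.

    import numpy as np

    def step_response_metrics(t, y, target=0.0, tol=0.02):
        """Settling time, rise time, overshoot, and steady-state error for a
        trace y(t) converging toward `target` from a nonzero initial offset."""
        t, y = np.asarray(t, float), np.asarray(y, float)
        err = y - target
        e0 = err[0]

        # Settling time: first instant after which |error| stays inside the
        # +/- tol*|e0| band (returns t[-1] if the trace never settles).
        outside = np.flatnonzero(np.abs(err) > tol * abs(e0))
        settling = t[0] if outside.size == 0 else t[min(outside[-1] + 1, len(t) - 1)]

        # Rise time: time to go from 10% to 90% of the initial error removed.
        progress = 1.0 - err / e0
        rise = t[np.argmax(progress >= 0.9)] - t[np.argmax(progress >= 0.1)]

        # Overshoot: largest excursion past the target, as a % of |e0|.
        overshoot = 100.0 * max(0.0, float(np.max(-np.sign(e0) * err))) / abs(e0)

        # Steady-state error: mean |error| over the final 10% of samples.
        sse = float(np.mean(np.abs(err[-max(1, len(err) // 10):])))

        return {"settling_time": settling, "rise_time": rise,
                "overshoot_pct": overshoot, "steady_state_error": sse}

Running both agents from the same set of initial ball positions and passing each logged trace through such a function is one straightforward way to reproduce a DDPG-versus-SAC table of the four metrics.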

Keywords

Robotic Arm; Adaptive Control; Reinforcement Learning; DDPG; SAC; Automation

DOI

https://doi.org/10.31763/ijrcs.v5i3.1892



Copyright (c) 2025 Vijaya Lakshmi Korupu

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

 

