Adaptive Neural Networks Based Robust Output Feedback Controllers for Nonlinear Systems

The performance of a nonlinear control system subject to uncertainty can be enhanced by combining robust output-feedback control with an adaptive artificial neural network. This paper applies neural-network-based output feedback control to a nonlinear system. The Two Wheel Mobile Robot (TWMR) is treated as a multi-body dynamic system. The nonlinear swing-up problem is handled by designing an adaptive neural network, which is trained using a modified conventional controller called Linear Quadratic Optimal State Estimator with Integral Control (LQOSEIC). The nonlinear TWMR is first stabilized using this robust output feedback controller, which allows a linearized model to serve as a model reference for the original nonlinear system. However, it works only over a limited range of operation and fails if the plant characteristics are unknown or uncertain. An adaptive neural network is used to overcome this problem. The adaptive neural controller is trained offline using LQOSEIC to obtain the initial weights of the neurons in the network's hidden layers. After training, the LQOSEIC is replaced by the adaptive neural controller. The main advantage of a neuro-controller is its ability to update the weights online depending on the error signal. If any disturbances or uncertainties arise within the nonlinear system, the neuro-controller is able to handle them because online learning compensates for the effect of unpredictable conditions. The proposed adaptive neural network improves control performance and ensures the robust stability of the closed-loop control system. Finally, numerical simulations are used to demonstrate the efficacy of the proposed controllers.


Introduction
Adaptive neural network control is used in cases where little prior knowledge of the plant is available. The basic idea of an adaptive neural network is the on-line estimation of the plant or controller parameters [1], [2]. The Two Wheel Mobile Robot (TWMR), commonly modeled as an inverted pendulum on a cart, is a well-known nonlinear control problem. It is a non-minimum phase, unstable, and underactuated system. The task of balancing a pole on a moving cart is a common benchmark problem for evaluating control algorithms.
Although the TWMR is a well-studied control problem, its identification and control remain challenging [3], [4], [5], [6], [7], [8]. The nonlinear behavior of real systems is extremely difficult to represent as an analytical model. The linearized model can still be used to analyze the behavior of the TWMR in some instances, although it is only valid for small nonlinearities. Nonlinear systems, on the other hand, can be stabilized using a modified conventional controller. Such controllers allow a linearized system to act as a model reference for a nonlinear system, but they are only valid over a limited range of operation and fail if the system's characteristics are unknown or variable.
One of the goals of this paper is to show the different techniques that may be used to construct a nonlinear control system, such as Adaptive Neural Network (ANN) based robust output feedback control. One of the most important characteristics of neural networks is their ability to adapt: artificial neural networks that adapt to changing surroundings are known as adaptive artificial neural networks. Neural networks have been used effectively in a wide range of applications, including nonlinear system identification and control. Different control techniques for the TWMR, including classical, optimal, and artificial intelligence techniques, have been addressed in [9], [6], [7], [10], [11]. The stability of adaptive neural networks is well studied in [1], [2]. Furthermore, robust stability has been generalized to include fractional-order derivatives [12], [13], [14], [15].
The rest of this paper is organized as follows. The model of the TWMR is described in Section 2. Section 3 presents the control design approach. Section 4 provides the main contribution of this paper: neural networks in process modeling and control. Finally, Section 5 concludes the paper.

Description Model of the TWMR
The TWMR is an open-loop system that is inherently nonlinear and single-input multi-output (SIMO). The system input is the control voltage, and the system outputs are the cart position and angle. Fig. 1 shows the mechanical structure of the TWMR. The three main parts that make up the machine are:
• The wheels: move the system backward or forward to balance the body of the system.
• The chassis: holds the motors, circuits, and any other parts required by the system.
• The pendulum: the part of the system that causes the instability and needs to be stabilized. This part may be hidden inside the chassis.
The high nonlinearity and lack of stability make the system well suited to testing prototype controllers. Traditional linear controllers are unable to represent and regulate a nonlinear system such as the TWMR.
The output data of the unstable system do not provide adequate information about the system. Therefore, feedback controllers are designed to stabilize the system before its parameters are identified [16], [17]. The pole can move parallel to the track in the vertical plane. A force $F$, parallel to the track, can be applied by the controller to the cart. The position of the TWMR is denoted by $x$ (in meters), and the angle between the body and the vertical is $\theta$ (in radians). The velocity $\dot{x}$ is in meters per second and the angular velocity $\dot{\theta}$ is in radians per second. The system's dynamical equations have to be derived before the plant model can be constructed in Simulink [18], [19].
where $V_a$ is the applied voltage, $R_a$ is the armature resistance, $r$ is the wheel radius in m, $M_w$ is the mass of the wheel in kg, $M_p$ is the mass of the body in kg, $I_p$ is the inertia of the body and $I_w$ is the inertia of the wheel, both in kg·m², $l$ is the distance to the body's center of mass in m, $K_m$ is the motor torque constant in Nm/A, and $K_e$ is the back-EMF constant in Vs/rad. The system can then be written in state-space form. Next, the linearized model is obtained by linearizing Equations (1) and (2) around the equilibrium point, which is represented by $\theta_p = \pi + \varphi$, where $\varphi$ is the deviation of the tilt angle from the upright nominal position [20], [21]. To obtain the state-space representation of the system, the resulting linearized Equations (4) and (5) are rearranged, which yields the following state-space model.
together with the output equation for $y$, where $\alpha = I_p \beta + 2 M_p l^2 \left( M_w + \frac{I_w}{r^2} \right)$ and $\beta = 2 M_w + \frac{2 I_w}{r^2} + M_p$. Fig. 2 shows the numerical simulation of the nonlinear system. The simulation is carried out with the Runge-Kutta method using the following values: $r = 0.2$ m, $M_w = 3.5$ kg, $M_p = 85$ kg, $I_w = 0.07$ kg·m², $I_p = 68.05$ kg·m², $l = 1.7$ m, $K_m = 0.87$ Nm/A, $K_e = 0.098$ Vs/rad. The system has four eigenvalues, $\lambda = 0, 0, 5.24$, and $-5.24$. Hence, the system is unstable because of the positive eigenvalue. It can also be noticed in Fig. 2 that a small input signal produces a very rapid increase in the output. The system therefore needs a robust controller, which is designed in the following sections.
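As a quick numerical check of the definitions of α and β, the two constants can be evaluated from the listed parameter values. The sketch below is illustrative only: the helper names `alpha_beta` and `is_unstable` are hypothetical, and the eigenvalue list simply repeats the values reported in the text rather than recomputing them from the system matrix.

```python
def alpha_beta(r, M_w, M_p, I_w, I_p, l):
    """Return (alpha, beta) as defined for the linearized TWMR model."""
    beta = 2.0 * M_w + 2.0 * I_w / r**2 + M_p
    alpha = I_p * beta + 2.0 * M_p * l**2 * (M_w + I_w / r**2)
    return alpha, beta


def is_unstable(eigenvalues):
    """A continuous-time linear system is unstable if any eigenvalue has a positive real part."""
    return any(complex(ev).real > 0.0 for ev in eigenvalues)


# Parameter values as listed in the text.
alpha, beta = alpha_beta(r=0.2, M_w=3.5, M_p=85.0, I_w=0.07, I_p=68.05, l=1.7)

# Eigenvalues as reported in the text (not recomputed here).
reported_eigenvalues = [0.0, 0.0, 5.24, -5.24]

print(f"beta = {beta:.3f}, alpha = {alpha:.3f}")
print("unstable:", is_unstable(reported_eigenvalues))
```

The single eigenvalue at 5.24 is enough to make the open-loop system unstable, which is why a stabilizing controller is designed before any identification is attempted.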

Control Design Approach
The study and design of control systems focus on three basic goals: achieving stability, obtaining the desired transient response, and reducing steady-state errors. In this paper, we consider these objectives in order to balance the TWMR and perfect its trajectory tracking. In addition, there are several types of performance indices, such as those of the minimum-time problem and the regulator problem [21].

Modified Conventional Control Design
A state feedback controller usually has one major disadvantage: it produces a significant steady-state error. Integral Control (IC) or a Reference Input Signal (RIS) is therefore used to remove this steady-state error. The LQRIC is obtained by combining the LQR with the IC, and the ESRIS is obtained by combining Eigen-assignment (ES) with the RIS. Some state variables may not be available or may be too costly to measure. In that case, they can be estimated using an observer, as long as the system is controllable and observable. The observer can be designed using two different methods: Place Estimation (PE) and Optimal State Estimation (OSE). The basic idea is then to create an observer-based controller by linking ESRIS with PE to form ESRISPE and, similarly, linking LQRIC with OSE to form LQRICOSE [22], [18]. The purpose of these controllers is to make the linearized system act as a model reference for the nonlinear system, as shown in Fig. 3.
The linearized TWMR is described by its state-space model. The error $e$ is introduced as a new state for integral control, where $y$ is the system output and $r$ is the desired reference. The system is completely state controllable, and the state feedback control (SFC) can be written in terms of the gains $K_a$ of the LQR with integral control (LQRIC), which consist of the LQR gains $K_d$ and the integral gain $K_{IC}$. The LQR approach is based on the minimization of a quadratic cost function $J$ weighted by $Q$ and $R$, where $Q$ is a symmetric positive semi-definite matrix and $R$ is a symmetric positive definite matrix [4], [23].
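For orientation, an LQR design with integral action of the kind described here is usually written as follows. This is a sketch in a standard textbook form, not a transcription of the paper's numbered equations; the augmented-state symbols $x_a$, $A_a$, $B_a$ and the sign convention $\dot{e} = r - y$ are assumptions.

```latex
\begin{aligned}
\dot{x} &= A x + B u, \qquad y = C x, \\
\dot{e} &= r - y = r - C x, \\
\dot{x}_a &= \underbrace{\begin{bmatrix} A & 0 \\ -C & 0 \end{bmatrix}}_{A_a} x_a
           + \underbrace{\begin{bmatrix} B \\ 0 \end{bmatrix}}_{B_a} u
           + \begin{bmatrix} 0 \\ I \end{bmatrix} r,
\qquad x_a = \begin{bmatrix} x \\ e \end{bmatrix}, \\
u &= -K_a x_a = -K_d\, x - K_{IC}\, e, \\
J &= \int_0^{\infty} \left( x_a^{T} Q\, x_a + u^{T} R\, u \right) dt .
\end{aligned}
```

With this convention, the sign of the integral gain $K_{IC}$ is absorbed into the gain itself, so $K_a = [\,K_d \;\; K_{IC}\,]$ is obtained in a single LQR computation on the pair $(A_a, B_a)$.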
The gain matrix $K_a$ is obtained from the solution of the associated Riccati equation. Next, we design the Eigen-assignment with reference input signal (ESRIS) controller. Its control signal combines state feedback with a scaled reference input, which determines the closed-loop system, where $K_s$ is the gain of the ES and $K_{RIS}$ is the feed-forward scaling factor (reference input signal).
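The role of the Riccati equation in computing an LQR gain can be illustrated on a scalar plant, where the equation has a closed-form solution. The sketch below is a minimal example under assumed values: the plant $\dot{x} = a x + b u$ borrows $a = 5.24$ from the unstable eigenvalue reported earlier, while $b$, $q$, and $\rho$ are chosen arbitrarily. It is not the TWMR design, whose gains are listed in Table 1.

```python
import math


def scalar_lqr(a, b, q, rho):
    """LQR for dx/dt = a*x + b*u with cost integral of (q*x^2 + rho*u^2).

    The scalar algebraic Riccati equation 2*a*p - (b*p)**2 / rho + q = 0
    has one positive root p; the optimal gain is k = b*p / rho with u = -k*x.
    """
    p = rho * (a + math.sqrt(a * a + b * b * q / rho)) / (b * b)
    k = b * p / rho
    return k, p


a, b, q, rho = 5.24, 1.0, 1.0, 0.1   # assumed illustrative values
k, p = scalar_lqr(a, b, q, rho)

closed_loop_pole = a - b * k                          # pole of dx/dt = (a - b*k) x
riccati_residual = 2 * a * p - (b * p) ** 2 / rho + q  # should be ~0

print(f"k = {k:.4f}, closed-loop pole = {closed_loop_pole:.4f}")
```

In the scalar case the closed-loop pole equals $-\sqrt{a^2 + b^2 q/\rho}$, so the unstable open-loop pole is always moved into the left half-plane; increasing $q/\rho$ moves it further left at the cost of larger control effort.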
From the steady-state solution and the steady-state output, the steady-state error for a reference input $r$ is $e(\infty) = r_{ss} - y_{ss}$. Since $K_{RIS}$ is a scalar, Equation (23) can easily be solved for the value of the feed-forward scaling factor that gives $e_{ss} = 0$. In practice, some state variables may not be available because of the system design, or measuring them may be too expensive. In such cases the states can be estimated with an observer, and the estimation error $\hat{e}$ is defined from the observer dynamics. The primary goal of state feedback control is to stabilize the linearized system so that all closed-loop eigenvalues lie in the left half of the complex plane. The controller and estimator gains for LQRICOSE and ESRISPE are computed using Matlab and are presented in Table 1.
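The steady-state argument for the feed-forward factor, and the observer used when states are not measured, can be written out as below. This is a sketch in standard notation consistent with the symbols $K_s$ and $K_{RIS}$; $C$ is taken here to select the single tracked output, the observer gain symbol $L$ and the definition $\hat{e} = x - \hat{x}$ are assumptions, and the paper's equation numbering is not reproduced.

```latex
\begin{aligned}
u &= -K_s x + K_{RIS}\, r, \\
\dot{x} &= (A - B K_s)\, x + B K_{RIS}\, r, \\
x_{ss} &= -(A - B K_s)^{-1} B K_{RIS}\, r, \qquad y_{ss} = C x_{ss}, \\
e_{ss} &= r - y_{ss} = 0 \;\Longrightarrow\;
K_{RIS} = -\left[ C (A - B K_s)^{-1} B \right]^{-1}, \\[4pt]
\dot{\hat{x}} &= A \hat{x} + B u + L\,(y - C \hat{x}), \\
\hat{e} &= x - \hat{x}, \qquad \dot{\hat{e}} = (A - L C)\, \hat{e}.
\end{aligned}
```

The last line shows why the observer gain is chosen to place the eigenvalues of $A - LC$ in the left half-plane: the estimation error then decays regardless of the control input.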
We constructed a complex Simulink model of the nonlinear system and the linearized system with the LQRICOSE and ESRISPE controllers. In this design, the linearized system can be used as a model reference for the nonlinear system [24]. Fig. 4 shows the results of the LQRICOSE and ESRISPE controllers and depicts the behavior of the linearized and actual nonlinear plant outputs of the TWMR. Table 2 presents the numerical values of the time specifications (rise time $T_r$, settling time $T_s$, and percentage overshoot OS%) and the steady-state error ($e_{ss}$). From the simulation results, it can be seen that LQRICOSE performs better than ESRISPE. In addition, the difference between the actual states and the states estimated by the observer-based controller LQRICOSE is presented in Fig. 5 for the linearized system and in Fig. 6 for the nonlinear system. Comparing the results, it is clear that LQRICOSE gives accurate results with very small errors for both the linearized and nonlinear systems.
Generally, the task of designing a control system is to achieve a set of specifications that define the overall performance of the system in terms of certain measurable quantities. In this section, we discuss the dynamic error characteristics and use several of the most common mathematical functions as performance indices associated with the error of a closed-loop system. The tracking error $e(t)$ is the difference between the actual and the reference trajectories.
The performance indices considered for both the linearized and nonlinear systems are the Integral of Squared Error (ISE), the Integral of Absolute Error (IAE), the Integral of Time-weighted Absolute Error (ITAE), the Integral of Time-weighted Squared Error (ITSE), $J = \int_0^{\infty} t\, e^2(t)\, dt$, and the Mean Squared Error (MSE). The time-weighted index gives very little importance to initial errors as compared with more recent ones. Table 3 compares the IAE, ISE, ITAE, ITSE, and MSE values of LQRICOSE for the actual and estimated states of the linearized and nonlinear systems. It can be seen from the simulation results that the characteristic error values of OELQRIC are smaller than those of the other three methods. This confirms that the OELQRIC is an optimal observer-based controller.
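These indices can be approximated from a sampled error signal. The following sketch uses their standard textbook definitions with a rectangular-rule integral over a finite horizon; the function name `error_indices` and the test signal $e(t) = e^{-t}$ are assumptions made for illustration, not data from Table 3.

```python
import math


def error_indices(t, e):
    """Approximate IAE, ISE, ITAE, ITSE and MSE from samples e[k] taken at times t[k]."""
    dt = t[1] - t[0]                      # uniform sampling assumed
    iae = sum(abs(ek) for ek in e) * dt
    ise = sum(ek * ek for ek in e) * dt
    itae = sum(tk * abs(ek) for tk, ek in zip(t, e)) * dt
    itse = sum(tk * ek * ek for tk, ek in zip(t, e)) * dt
    mse = sum(ek * ek for ek in e) / len(e)
    return {"IAE": iae, "ISE": ise, "ITAE": itae, "ITSE": itse, "MSE": mse}


dt = 0.001
t = [k * dt for k in range(20001)]        # 0 s to 20 s
e = [math.exp(-tk) for tk in t]           # assumed decaying tracking error
indices = error_indices(t, e)

for name, value in indices.items():
    print(f"{name}: {value:.4f}")
```

For this test signal the exact integrals are known (IAE = 1, ISE = 0.5, ITAE = 1, ITSE = 0.25), which makes the sketch easy to check. The time-weighted indices multiply the error by $t$, so an error that persists late in the response is penalized more heavily than the initial transient.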
Although the preceding results are good while the system parameters remain constant, Fig. 7 illustrates the effect of parameter changes with an added noise signal. It is evident from this figure that the chosen LQRICOSE controller is unable to regulate the system adequately. The disadvantage of this controller is that it cannot prevent failure if there is any uncertainty or change in the plant's parameters. This issue is addressed in the next design stage.

Non-linear Identification Using Linear Techniques
The first step in designing controllers for linear or nonlinear systems is to obtain a mathematical model. This task may be difficult to perform; therefore, techniques such as system identification are used to deal with this problem. Four well-known identification algorithms (ARX, ARMAX, OE, and BJ) are compared for identifying the highly nonlinear system. For more details, we refer the reader to [25], [26]. Fig. 8 depicts the linearized system and the nonlinear system with the LQRICOSE feedback controller and the linear identification techniques. The nonlinear model of the TWMR was identified using linear approaches, namely the ARX, ARMAX, Output Error (OE), and Box-Jenkins (BJ) models. The input and output signals of the nonlinear system are passed to the ARX, ARMAX, BJ, and OE algorithms, which produce the estimated models. In the case of the TWMR system, both the position and the angle can be measured.
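A first-order ARX fit illustrates what a linear identification step does with recorded input and output signals. The sketch below is a pure-Python least-squares estimate for the assumed model $y_k = a\,y_{k-1} + b\,u_{k-1}$ on synthetic, noise-free data. It stands in for the Matlab System Identification Toolbox routines used in the paper and does not reproduce their model orders; the function name `fit_arx1` and the "true" values are hypothetical.

```python
import random


def fit_arx1(u, y):
    """Least-squares estimate of (a, b) in y[k] = a*y[k-1] + b*u[k-1]."""
    syy = syu = suu = sy1 = su1 = 0.0
    for k in range(1, len(y)):
        syy += y[k - 1] * y[k - 1]
        syu += y[k - 1] * u[k - 1]
        suu += u[k - 1] * u[k - 1]
        sy1 += y[k - 1] * y[k]
        su1 += u[k - 1] * y[k]
    det = syy * suu - syu * syu           # determinant of the 2x2 normal matrix
    a = (sy1 * suu - su1 * syu) / det
    b = (su1 * syy - sy1 * syu) / det
    return a, b


random.seed(0)
a_true, b_true = 0.9, 0.5                 # assumed "true" plant for the demo
u = [random.uniform(-1.0, 1.0) for _ in range(500)]
y = [0.0]
for k in range(1, 500):
    y.append(a_true * y[k - 1] + b_true * u[k - 1])

a_hat, b_hat = fit_arx1(u, y)
print(f"a_hat = {a_hat:.6f}, b_hat = {b_hat:.6f}")
```

Because the synthetic data are generated by a model of exactly this structure, the parameters are recovered almost exactly. When the data come from a strongly nonlinear plant, no choice of $(a, b)$ reproduces the output well, which is the limitation examined in this section.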
The aim of the study was to compare the available system identification methods in order to find the one most suitable for the type of problem analyzed in this paper. Four system identification methods from the System Identification Toolbox in Matlab were applied to fit models to the simulated data of the linearized and nonlinear systems. Table 4 presents comparative results of the parametric model identification techniques for the linearized and nonlinear systems. The comparison plots in Fig. 9(a) for the position of the linearized system show a good match between the simulated test data and the identified model. The ARMAX model provides the best fit among all model structures for the linearized system, with a fit of 98.64%. Fig. 9(b) presents the estimated position of the nonlinear system.
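A fit percentage of this kind is commonly computed as a normalized root-mean-square measure. The sketch below assumes the definition $100\,(1 - \lVert y - \hat{y} \rVert / \lVert y - \bar{y} \rVert)$, which is the convention used by Matlab's model comparison tools; whether the 98.64% figure was obtained in exactly this way is an assumption, and the sample data are made up.

```python
import math


def fit_percent(y, y_hat):
    """Normalized RMS fit in percent: 100 is a perfect match, 0 is no better than the mean of y."""
    mean_y = sum(y) / len(y)
    err = math.sqrt(sum((yi - hi) ** 2 for yi, hi in zip(y, y_hat)))
    spread = math.sqrt(sum((yi - mean_y) ** 2 for yi in y))
    return 100.0 * (1.0 - err / spread)


measured = [0.0, 1.0, 2.0, 3.0, 4.0]      # made-up measured output
perfect = list(measured)                   # model that reproduces the data exactly
mean_only = [2.0] * 5                      # model that only predicts the mean

print(fit_percent(measured, perfect), fit_percent(measured, mean_only))
```

Under this definition the fit can become negative when a model performs worse than simply predicting the mean, which is one way a poor linear approximation of a nonlinear plant shows up in such a comparison.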
The simulation results show that the linear identification methods give acceptable results for the linearized system. However, because the error of the estimated nonlinear model is quite large, the linear model is not a decent approximation of the nonlinear TWMR system. The artificial neural network methodology is examined in the next section.

Neural Networks in Process Modeling and Control
Control of nonlinear systems is a prominent application field for neural networks (NNs), which have been used to identify and control dynamic systems with great success. A benefit of this control approach is that it relies only on the input-output relationship. When neural networks are employed as controllers, there is usually a two-step procedure: system identification and control design. In the system identification step, an NN model of the plant is constructed. Then, the controller is designed or trained using the developed model [19], [27], [18], [28].

Non-linear Identification Using Neural Networks
There are many ways of identifying a nonlinear model using neural networks. The feedforward (FF) neural network structure is the most popular approach to neural network identification. Both the process and the NN model receive the same input during training. The outputs of the actual system and the neuro model are then compared, and the error signal is used to update the NN weights and biases.
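The training scheme just described, with the same input fed to process and model and the output error driving the weight update, can be sketched with a very small feedforward network. The example below is a pure-Python illustration under assumed settings: one input, one output, five tanh hidden neurons, per-sample gradient descent, and $\sin(u)$ as a stand-in for the process. It is not the network structure or training configuration reported in Table 5.

```python
import math
import random


def train_ff_model(inputs, targets, hidden=5, epochs=300, lr=0.02, seed=1):
    """Train a 1-input, 1-output feedforward net (tanh hidden layer) by per-sample gradient descent."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]   # input -> hidden weights
    b1 = [0.0] * hidden                                    # hidden biases
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]   # hidden -> output weights
    b2 = 0.0                                               # output bias

    def predict(u):
        h = [math.tanh(w1[j] * u + b1[j]) for j in range(hidden)]
        return sum(w2[j] * h[j] for j in range(hidden)) + b2, h

    def mse():
        return sum((predict(u)[0] - y) ** 2 for u, y in zip(inputs, targets)) / len(inputs)

    history = [mse()]                                      # MSE before training
    for _ in range(epochs):
        for u, y in zip(inputs, targets):
            y_hat, h = predict(u)
            err = y_hat - y                                # model output minus process output
            for j in range(hidden):
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)   # error backpropagated to hidden neuron j
                w2[j] -= lr * err * h[j]
                w1[j] -= lr * grad_h * u
                b1[j] -= lr * grad_h
            b2 -= lr * err
        history.append(mse())                              # MSE after each epoch
    return predict, history


inputs = [-2.0 + 0.1 * k for k in range(41)]
targets = [math.sin(u) for u in inputs]                    # assumed stand-in for the process
model, history = train_ff_model(inputs, targets)
print(f"MSE before: {history[0]:.4f}, after: {history[-1]:.4f}")
```

The `history` list plays the role of a training curve: it records how the MSE between the process and the neuro model evolves over the epochs.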
To identify the system, we use FF and CF high-order neural networks defined as follows . . .
where $\varphi_i$ is a bounded approximation error, which can be reduced by increasing the number of adjustable weights, $x_i$ is the state of the $i$-th neuron, $L_i$ is the respective number of high-order connections, $\{i_1, i_2, \ldots, i_{L_i}\}$ is a collection of non-ordered subsets of $\{1, 2, \ldots, n+m\}$, $n$ is the state dimension, $m$ is the number of external inputs, $w_i$ is the respective online adapted weight vector, and $\varphi_i(x_k, u_k)$ is a function of the states and inputs, with $d_{ij}(K)$ being nonnegative integers [29]. Here $u = [u_1, u_2, \ldots, u_m]$ is the input vector to the neural network, $S(\cdot)$ is a function of $\gamma$, and $\gamma$ is any real-valued variable. The ideal weight vector $w_i^*$ is an artificial constant quantity that is estimated by $w_i$, and the estimation error is defined from these two vectors.
More complicated functions can be mimicked by increasing the number of hidden-layer neurons. The networks are trained with 300 epochs and a learning rate of 0.001. Increasing the number of hidden neurons may, but does not necessarily, improve the MSE between the neuro model and the process. In our case, referring to Table 5, the performance improves as the hidden layers increase. The structures of the CF and FF networks are presented in Fig. 10. The MSE decreases as the number of epochs grows. The right number of epochs for training the position and angle states can be established from the training diagram in Fig. 11. The quality of the neural network is assessed by comparing its outputs against the outputs of the process. The process and the neural model both receive the same input, but the neural network has four targets to train instead of just one. FF networks are used to simulate a multi-output system. The step responses of the process outputs are displayed against the neural model outputs in Fig. 12. As a result, the FF networks can accurately mimic the process.
Fig. 11. Training angle and position states

Adaptive Neural Network in Control
In dynamic control systems, adaptive artificial neural networks are considered a separate class of networks, distinguished by their online learning. Adaptability is provided to neural networks through a variety of strategies, including weight adjustment, neuronal property modification, and network structure modification. This part presents adaptive neural control for the TWMR system. An existing controller is required to construct a supervised neural controller [30], [31], [32]. Since the feedback controller (LQRICOSE) has already been designed, it can be used as a reference for the neural network (neuro controller). The neural controller is created in the same way as in the identification procedure, where the target of the neuro controller is to track the output of the original controller [33]. The weights and biases are set when training is completed, and a Simulink model of the neuro controller is generated. Then, we replace the existing LQRICOSE controller with the neural network in the feedback loop. This type of adaptive NN gradually adjusts the weights and biases of the network during training to reduce the error $e(t)$. ADALINE (ADAptive LInear NEuron) networks are similar to the perceptron used in the identification section, but their transfer function is linear instead of hard-limiting. Fig. 13(a) shows the weight values as the training proceeds. The error between the LQRICOSE controller and the adaptive neural controller is shown in Fig. 13(b); as the error decreases, the network weights converge to their final values. We compare the system output obtained with the original controller and with the neural controller in Fig. 14. The difference between the neuro controller and the original controller is approximately $10^{-7}$, indicating that the neuro controller is a close match to the LQRICOSE controller.
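The offline stage, in which a linear neuron learns to reproduce the output of an existing controller, can be sketched with the LMS (Widrow-Hoff) rule. The example below is a minimal pure-Python illustration: the teacher is a plain state-feedback law $u = -Kx$ with hypothetical gains (not the values of Table 1), and the states are random samples rather than closed-loop TWMR trajectories.

```python
import random


def train_adaline(samples, targets, lr=0.05, epochs=200):
    """LMS (Widrow-Hoff) training of a linear neuron: u_hat = w . x + b."""
    n = len(samples[0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, target in zip(samples, targets):
            u_hat = sum(wi * xi for wi, xi in zip(w, x)) + b
            err = target - u_hat                  # teacher output minus neuron output
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b


random.seed(0)
K = [2.0, -1.5, 0.8, 0.3]                         # hypothetical teacher gains
states = [[random.uniform(-1.0, 1.0) for _ in K] for _ in range(200)]
teacher_u = [-sum(k * x for k, x in zip(K, s)) for s in states]

w, b = train_adaline(states, teacher_u)
max_gap = max(abs(wi + ki) for wi, ki in zip(w, K))   # distance of w from -K
print("learned weights:", [round(wi, 4) for wi in w], "bias:", round(b, 6))
```

Because the teacher in this sketch is itself linear, the ADALINE weights converge to $-K$ and the imitation error becomes negligible. The learned weights would then serve as the starting point for online adaptation.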
The previously learned weights are now used as the initial weights for the adaptive neural controller (ADALINE). The input error signal in an ADALINE network is the difference between the desired output and the actual output. The ADALINE receives this error signal and adjusts the weights online, which enhances the network's performance. The neural controller developed above shows that an NN can be trained offline using the LQRICOSE controller as a trainer. The adaptive neural controller can then be placed online, where it continuously updates its weights. One advantage of the adaptive neural controller is its potential to cancel disturbances that arise during operation. When the LQRICOSE is used and a noise signal is added to the setpoint while the parameters of the nonlinear system are varied, the system becomes unstable. We therefore test the response of the system under adaptive neural network control by varying the system's parameters and adding multiple types of disturbances to the setpoint. The adaptive neuro controller cancels the effect of the disturbances for different references, as shown in Fig. 15. Fig. 16 displays snapshots, at discrete points in time, of the performance of the robust output-feedback control using the neural network for the mobile robot. In Fig. 16, the current state is represented by the darkest lines, while the grey lines represent the previous positions. The state $\theta$ returns to zero, a quiescent state, while the position changes to a non-zero finite value. This indicates that the system experiences a resultant shift in the direction of the step input. Accordingly, we can infer that the robust output-feedback control with adaptive NN resolves the nonlinear swing-up problem. It can also be concluded that the robot can accurately track the reference input.
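The benefit of updating the weights online can be seen in a deliberately simplified setting. The sketch below assumes a static-gain plant $y = g\,u$ whose gain drops from 2.0 to 0.5 halfway through the run, and a single-weight linear neuron $u = w\,r$ that starts from a weight suited to the original gain. It compares a fixed weight with an online LMS-style update driven by the error between desired and actual output. The plant, the gains, and the learning rate are all assumptions, and the update rule presumes a positive plant gain; this is not the TWMR simulation.

```python
def run_online(adapt, steps=200, lr=0.2):
    """Track r = 1 on a static-gain plant whose gain changes halfway through the run."""
    w = 0.5                                    # initial weight, as if pre-trained offline for gain 2.0
    errors = []
    for k in range(steps):
        gain = 2.0 if k < steps // 2 else 0.5  # assumed parameter change
        r = 1.0                                # reference
        u = w * r                              # single-weight linear neuron as controller
        y = gain * u                           # plant response
        e = r - y                              # desired output minus actual output
        if adapt:
            w += lr * e * r                    # online LMS-style update (assumes positive plant gain)
        errors.append(abs(e))
    return errors


fixed = run_online(adapt=False)
adaptive = run_online(adapt=True)
print(f"final |e| with fixed weight: {fixed[-1]:.3f}")
print(f"final |e| with online update: {adaptive[-1]:.6f}")
```

With the fixed weight the tracking error stays large after the parameter change, whereas the online update drives it back toward zero. This mirrors, in miniature, the contrast between the fixed-gain controller and the adaptive neuro controller discussed here.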

Conclusion
The stability and tracking performance of the nonlinear TWMR system for reference trajectories have been studied and improved using robust feedback control and an adaptive neural network, taking the control design methods into consideration. Linear identification models such as the ARX, ARMAX, OE, and BJ models were applied to estimate the nonlinear system (TWMR) and were found inadequate for modeling it. Feedforward neural networks were created with various numbers of hidden-layer neurons. These networks accurately represented the nonlinear system, with a very low MSE between the process and the neural model. The LQRICOSE was used to stabilize the nonlinear system, but it failed in the presence of uncertainty: when a disturbance is introduced to the process and the plant's parameters are changed during simulation, the LQRICOSE loses control of the nonlinear system. This problem is handled using an adaptive neural network. The neuro controller is trained offline using the existing LQRICOSE controller to obtain the initial weights and then takes its place to regulate the nonlinear system. The adaptive neuro controller has the benefit of being able to correct for disturbances or other uncertainty that arises during operation.