Zhao, J., Na, J., & Gao, G. (2022). Robust tracking control of
uncertain nonlinear systems with adaptive dynamic programming.
Neurocomputing, 471, 21-30.
To address the robust control problem, an uncertain system is considered, and the uncertainties are assumed to be bounded. The desired trajectories to be tracked are produced by a command generator. The purpose of this paper is to design a robust control such that the system state can track the desired trajectory while the tracking error is minimized.
The augmented system is then constructed by stacking the tracking error and the desired trajectory into a single augmented state, so that the tracking problem becomes a regulation problem for the augmented dynamics. The uncertain term of the augmented system admits a known upper bound, and we first consider the nominal part of the augmented system.
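A standard formulation consistent with this setup is sketched below; the symbols (f, g, d for the dynamics and uncertainty, h_d for the command generator, F, G, D for the augmented system) are assumed notation for illustration, since they are not fixed by the text:

```latex
% Uncertain system and command generator (assumed notation):
\dot{x} = f(x) + g(x)\,u + d(x), \qquad \|d(x)\| \le d_M,
\qquad \dot{x}_d = h_d(x_d).
% Tracking error and augmented state:
e = x - x_d, \qquad z = [\,e^{\top},\, x_d^{\top}\,]^{\top},
\qquad \dot{z} = F(z) + G(z)\,u + D(z), \qquad \|D(z)\| \le \bar{d}(z).
% Nominal part of the augmented system:
\dot{z} = F(z) + G(z)\,u .
```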
To formulate the optimal control of this nominal system, we seek a control that minimizes a cost function with a discount factor. The discount factor is used to guarantee the boundedness of the cost function even when tracking a nonvanishing trajectory.
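A discounted cost of the following standard form is consistent with this description; the weighting matrices Q, R and the discount factor γ are assumed notation:

```latex
% Discounted cost (Q \succeq 0, R \succ 0, discount factor \gamma > 0 assumed):
V(z(t)) = \int_{t}^{\infty} e^{-\gamma(\tau - t)}
          \big[\, z^{\top}(\tau)\, Q\, z(\tau) + u^{\top}(\tau)\, R\, u(\tau) \,\big]\, d\tau .
```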
Hence, the remaining problem to be addressed is to solve the above optimal control problem in an online manner. To accomplish the optimal control design, we take the derivative of the cost function (8) along the augmented system state. Based on the optimal control principle, the Hamiltonian of the nominal system with this cost function can then be given in terms of the derivative of V with respect to the augmented system state.
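With assumed weighting matrices Q, R, discount factor γ, and nominal augmented dynamics F, G as above, the Hamiltonian of such a discounted problem typically reads:

```latex
% Hamiltonian of the nominal augmented system (assumed standard form):
H(z, u, \nabla V) = z^{\top} Q z + u^{\top} R u - \gamma V
                    + (\nabla V)^{\top} \big( F(z) + G(z)\, u \big),
\qquad \nabla V = \partial V / \partial z .
```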
We define the optimal cost function as the minimum of the cost over admissible controls, which leads to the corresponding HJB equation. Solving the stationarity condition of (10) then yields the optimal control action. When the optimal cost function and control satisfy the HJB equation, it can be rewritten in a form showing that, with an appropriately chosen discount factor, the optimal tracking control makes the tracking error asymptotically stable. Note that the desired trajectories are not necessarily asymptotically stable themselves; the discount factor has to be used precisely to guarantee that the cost function is bounded. In this case, we do not require that the trajectory to be tracked is vanishing.
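For a cost with control weight R and input matrix G(z), both assumed notation here, the stationarity condition of the Hamiltonian gives the standard discounted HJB equation and optimal control:

```latex
% HJB equation and resulting optimal control (assumed standard form):
0 = \min_{u} H(z, u, \nabla V^{*}),
\qquad
u^{*} = -\tfrac{1}{2}\, R^{-1} G^{\top}(z)\, \nabla V^{*} .
```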
Critic NN and online adaptive learning
The key idea of the ADP method is to apply a critic NN to estimate the optimal cost function. Since the optimal cost function is continuous, it can be approximated by a critic NN with ideal weights, and its derivative with respect to the augmented state follows from the derivative of the NN basis functions; a standard assumption bounds the ideal weights, the basis functions, and the approximation error. In practice, the ideal NN weights are unknown, so the cost function is approximated with estimated weights, which gives an estimated solution of the HJB equation together with the ideal and the actual (estimated) optimal controls; the HJB equation can then be rewritten in terms of these quantities. To develop an adaptive law for estimating the critic NN weights, the known terms in the HJB equation are collected into a regressor, so that the HJB equation becomes linear in the unknown weights. To this end, filtered regressor matrices are introduced, whose solutions can be calculated online based on the measured augmented system state.
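As a concrete illustration of how such filtered regressor matrices can be computed online, the sketch below Euler-integrates low-pass-filtered versions of an HJB regressor along a measured trajectory. The critic basis Phi, discount gamma, filter gain ell, and the filter equations themselves are illustrative assumptions, not the paper's exact construction:

```python
import numpy as np

# Sketch of filtered regressor matrices for a linear-in-parameters HJB.
# The basis Phi, discount gamma, filter gain ell, and the filter equations
#   Pdot = -ell*P + Xi Xi^T,   Qdot = -ell*Q + Xi*r
# are assumptions chosen for illustration.

def phi(z):
    """Assumed polynomial critic basis Phi(z) for a 2-D augmented state."""
    z1, z2 = z
    return np.array([z1 ** 2, z1 * z2, z2 ** 2])

def grad_phi(z):
    """Jacobian of Phi with respect to z (rows follow the basis above)."""
    z1, z2 = z
    return np.array([[2 * z1, 0.0],
                     [z2, z1],
                     [0.0, 2 * z2]])

def filtered_regressors(z_traj, zdot_traj, r_traj, gamma=0.1, ell=1.0, dt=1e-3):
    """Euler-integrate the filtered regressor matrices along measured samples
    of the augmented state, its derivative, and the cost integrand r."""
    n = phi(z_traj[0]).size
    P = np.zeros((n, n))
    Q = np.zeros(n)
    for z, zdot, r in zip(z_traj, zdot_traj, r_traj):
        xi = grad_phi(z) @ zdot - gamma * phi(z)  # regressor from the HJB
        P += dt * (-ell * P + np.outer(xi, xi))   # Pdot = -ell*P + Xi Xi^T
        Q += dt * (-ell * Q + xi * r)             # Qdot = -ell*Q + Xi*r
    return P, Q
```

An auxiliary vector for the weight update can then be formed from quantities computable online, e.g. `M = P @ W_hat + Q`, which vanishes (up to the NN approximation error) when the weight estimate matches the ideal weights.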
An auxiliary vector is then defined from the filtered regressor matrices and the current weight estimate, and an adaptive law is designed to calculate the estimated weights online. It should be noted that the proposed control scheme can be implemented without any offline learning process. To illustrate the implementation of the proposed control and learning algorithm, the following Algorithm 1 is given:
Algorithm 1 (Adaptive Optimal Control Implementation for Solving Robust Tracking Control)
(Initialization): Define the initial weights and gains;
(Measurement): Measure the system state and construct the regressors;
(Online learning): Calculate the auxiliary vector and update the estimated weights to get the control;
(Control): Apply the derived control to the system.
The closed-loop system is obtained by substituting the derived control into the augmented dynamics. To carry out the stability analysis, a further assumption is imposed, and a Lyapunov function is constructed from the optimal cost function and the weight estimation error. Differentiating the Lyapunov function and bounding its terms one by one, and then choosing the learning gains appropriately, the derivative can be made negative outside a residual set. Specifically, the residual constant is determined by the critic NN approximation error; hence we can conclude that the tracking error and the weight estimation error are uniformly ultimately bounded.
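The Lyapunov analysis described above is commonly carried out with a candidate of the following form (an assumed, typical choice; Γ is the learning gain and the tilde denotes the weight estimation error):

```latex
% Typical Lyapunov candidate (assumed form), with \tilde{W} = W - \hat{W}:
\mathcal{L} = V^{*}(z) + \tfrac{1}{2}\, \tilde{W}^{\top} \Gamma^{-1} \tilde{W},
\qquad \Gamma \succ 0 .
```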