Zhao, B., Jia, L., Xia, H., & Li, Y. (2018). Adaptive dynamic
programming-based stabilization of nonlinear systems with unknown
actuator saturation. Nonlinear Dynamics, 93(4),
2089-2103.
The considered nominal continuous-time nonlinear systems can be
described as
Online nominal optimal control
The objective of this optimal control problem is to find the
stabilizing nominal control that minimizes the infinite-horizon cost
function (2), in which the weighting matrices are positive definite.
If the associated infinite-horizon cost function (2) is continuously
differentiable, the infinitesimal version of (2) is the so-called
nonlinear Lyapunov equation.
In light of the nominal control policy and the cost function, define
the Hamiltonian and the optimal cost function (5).
The optimal cost function (5) can be derived from the solution of the
HJB equation (6), which involves the partial gradient of the optimal
cost function in (5) with respect to the state.
If the solution of (6) exists, the optimal control can be obtained
from it in closed form, and a simple transformation of that expression
gives an equivalent relation.

In the following, the value function is approximated by a critic NN
with a single hidden layer, from which the partial gradient of the
value function with respect to the state follows.
Thus, the Hamiltonian can be expressed in terms of the ideal critic
weight vector.
Because the ideal weight vector is unavailable, the approximate critic
NN uses an estimated weight vector instead, which gives the
corresponding approximate gradient and approximate Hamiltonian.
Define the weight vector approximation error as the difference between
the ideal and the estimated weight vectors.
To adjust the critic NN weight vector, the steepest descent algorithm
is used to minimize the objective function.
Thus, the weight vector approximation error can be updated adaptively,
and the estimated weight vector is updated accordingly.
(Open question: how is this update calculated?)
Therefore, the ideal nominal control policy can be expressed in terms
of the ideal critic weights, and it can be approximated by replacing
them with the estimated weights.

Choose a Lyapunov function candidate and take its time derivative.
Under the key assumptions, the time derivative is negative as long as
the weight vector approximation error lies outside of a compact set.
Therefore, according to Lyapunov's direct method, the approximation
error of the weight vector is UUB.
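
The critic tuning described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the source: a scalar plant dx/dt = a*x + b*u, a utility q*x^2 + r*u^2, a one-weight critic V_hat(x) = w*x^2, and a fixed set of sampled states standing in for persistent excitation. The weight is adjusted by steepest descent on half the squared Hamiltonian residual, with the control held fixed when the gradient is taken.

```python
import math

# Hypothetical scalar example (not from the source): dx/dt = A*x + B*u,
# utility Q*x^2 + R*u^2, critic V_hat(x) = w * x^2 with a single weight w.
A, B, Q, R = -1.0, 1.0, 1.0, 1.0

def hamiltonian_residual(w, x):
    """Return the Hamiltonian residual e and de/dw with the control held fixed."""
    grad_sigma = 2.0 * x                      # d(x^2)/dx, gradient of the basis function
    u = -0.5 / R * B * grad_sigma * w         # approximate control from the critic gradient
    x_dot = A * x + B * u
    e = Q * x * x + R * u * u + w * grad_sigma * x_dot
    return e, grad_sigma * x_dot

def train_critic(w0=0.0, rate=1.0, dt=0.05, steps=4000):
    """Euler-discretized steepest descent on 0.5 * e^2 over a few sampled states."""
    w = w0
    samples = [-1.0, -0.5, 0.5, 1.0]          # assumed excitation; states never decay to zero
    for k in range(steps):
        x = samples[k % len(samples)]
        e, de_dw = hamiltonian_residual(w, x)
        w -= dt * rate * e * de_dw
    return w

w_hat = train_critic()
# Positive root of the scalar Riccati equation 2*A*p - (B*p)^2/R + Q = 0.
p_exact = R * (A + math.sqrt(A * A + B * B * Q / R)) / (B * B)
print(w_hat, p_exact)
```

In this toy setting the residual vanishes exactly when the weight equals the Riccati solution, so the two printed values should agree closely. With a richer basis the same update applies componentwise to the weight vector.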
To handle the unknown actuator saturation, a vector called the
saturation nonlinearity is introduced.
Note that without actuator saturation this vector remains zero, and
the control law coincides with the ideal nominal control law. In the
presence of actuator saturation, however, it is nonzero.
Thus, the saturated nonlinear system can be transformed into a form
that contains an unknown term caused by the saturation.
Here, a backpropagation NN is introduced to approximate this unknown
term, and its weight vector is updated online.
The overall NN approximation error is then defined.
Finally, the overall control law for the nonlinear system is designed.
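
As a small illustration of the saturation nonlinearity described above, the sketch below computes the gap between the saturated and the commanded control. The symmetric bound `u_max`, the sign convention (saturated minus commanded), and the function names are assumptions for illustration only. In the source the saturation is unknown and this term is approximated by a NN rather than computed directly.

```python
def saturate(u, u_max):
    """Clip each control component to the assumed symmetric range [-u_max, u_max]."""
    return [max(-u_max, min(u_max, ui)) for ui in u]

def saturation_nonlinearity(u, u_max):
    """Difference between the saturated and the commanded control (assumed sign convention)."""
    return [si - ui for si, ui in zip(saturate(u, u_max), u)]

inside = saturation_nonlinearity([0.5, -0.2], 1.0)    # no saturation: all zeros
outside = saturation_nonlinearity([2.0, -3.0], 1.0)   # saturated: nonzero gap
print(inside, outside)
```

The first call shows the term staying at zero while the command is within the limits. The second shows it becoming nonzero once the actuator saturates.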
1: Select a set of small positive constants, the maximum number of
iterations, the maximum number of run steps, and the initial values of
the corresponding NN weight vectors. Initialize the iteration and step
counters, and begin with a given nominal control policy.
2: (Policy evaluation) Solve the nonlinear Lyapunov equation for the
current control policy.
3: (Policy improvement) Update the control policy using the cost
function obtained in Step 2.
4: If the stopping criterion is satisfied, stop and obtain the
approximate optimal control; otherwise, increment the iteration
counter. If the maximum number of iterations has not been reached,
return to Step 2; otherwise, go to Step 5.
5: (Feed-forward compensation) Update the weight vector of the NN and
obtain the approximate unknown term.
6: (Overall control policy) Update the overall control policy.
7: If the maximum run step has not been reached, return to Step 2;
otherwise, stop running.
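
Steps 2 to 4 above (policy evaluation, policy improvement, stopping test) can be sketched on a case where the Lyapunov equation has a closed-form solution. The scalar plant dx/dt = a*x + b*u, the utility q*x^2 + r*u^2, the linear policy u = -k*x, and the function name are all assumptions for illustration. The feed-forward compensation of Steps 5 to 7 is not covered.

```python
import math

# Hypothetical scalar plant dx/dt = a*x + b*u with utility q*x^2 + r*u^2.
def policy_iteration(a, b, q, r, k0, eps=1e-12, max_iter=50):
    """Iterate on the linear policy u = -k*x; return (p, k, iterations used)."""
    if a - b * k0 >= 0.0:
        raise ValueError("the initial policy must be stabilizing")
    k, p_prev = k0, None
    for i in range(1, max_iter + 1):
        # Policy evaluation: solve 0 = q + r*k^2 + 2*p*(a - b*k) for p,
        # where V(x) = p*x^2 is the cost of the current policy.
        p = (q + r * k * k) / (2.0 * (b * k - a))
        # Policy improvement: u = -(1/2) * (1/r) * b * dV/dx, that is k = b*p/r.
        k = b * p / r
        # Stopping test on the change of the cost between iterations.
        if p_prev is not None and abs(p - p_prev) <= eps:
            break
        p_prev = p
    return p, k, i

p, k, iters = policy_iteration(a=-1.0, b=1.0, q=1.0, r=1.0, k0=0.0)
p_riccati = math.sqrt(2.0) - 1.0   # Riccati solution for these particular values
print(p, p_riccati, iters)
```

For a general nonlinear system the evaluation step has no closed form, which is where the critic NN comes in as an approximate solver.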
Stability analysis
Under the assumption made for the analysis, choose a Lyapunov function
candidate and take its time derivative.
The second term of the derivative is bounded first, using a known
inequality.
Since the function involved is locally Lipschitz, there exists a
positive constant that bounds it.
With the supposed bounds in place, we can conclude that the time
derivative of the Lyapunov function is negative when the state lies
outside of a compact set, provided that certain conditions hold.
This implies that all the signals of the closed-loop nonlinear
system with unknown actuator saturation can be guaranteed to be UUB.
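
For reference, the notion of UUB (uniformly ultimately bounded) used in the conclusion above can be stated as follows. This is the standard textbook definition and a standard sufficient condition, written in assumed notation ($x$, $L$, $\alpha_1$, $\alpha_2$, $\mu$, $b$ are not taken from the source) and given here as a sketch rather than as the paper's own statement.

```latex
% Definition (uniform ultimate boundedness), standard form:
% there exist constants b > 0 and c > 0, independent of t_0, such that
% for every a in (0, c) there is T = T(a, b) >= 0, independent of t_0, with
\|x(t_0)\| \le a \;\Longrightarrow\; \|x(t)\| \le b \quad \forall\, t \ge t_0 + T .

% Sufficient condition via a Lyapunov function L(x):
% if, for class-K_infinity functions alpha_1, alpha_2 and some mu > 0,
\alpha_1(\|x\|) \le L(x) \le \alpha_2(\|x\|), \qquad
\dot{L}(x) < 0 \quad \text{whenever } \|x\| \ge \mu ,
% then the solutions are UUB with ultimate bound
b = \alpha_1^{-1}\!\big(\alpha_2(\mu)\big) .
```

The compact set in the analysis plays the role of the ball of radius $\mu$: the Lyapunov function decreases outside it, so trajectories enter and remain in a bounded neighborhood of it.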