Kingravi, H. A., Chowdhary, G., Vela, P. A., & Johnson, E. N.
(2012). Reproducing kernel Hilbert space approach for the online update
of radial bases in neuro-adaptive control. IEEE Transactions on
Neural Networks and Learning Systems, 23(7),
1130-1141.
1 Introduction
The main contribution of this paper is an adaptive control technique based on RKHS theory. The key requirement is that the tracking error between the reference model and the system, and the adaptation error between the adaptive element and the actual uncertainty, be simultaneously driven to a bounded set.
2 Preliminaries
A. RKHS
Let $\mathcal{H}$ be a Hilbert space of functions defined on a domain $\mathcal{D}$. Given any element $f \in \mathcal{H}$ and any point $x \in \mathcal{D}$, there exists some unique element $F_x$ (called a point evaluation functional) such that $F_x(f) = f(x)$. $\mathcal{H}$ is called an RKHS if every such linear functional is continuous (something which is not true in an arbitrary Hilbert space). It is possible to construct these spaces using tools from operator theory.
A Mercer kernel on $\mathcal{D}$ is any continuous, symmetric, positive-semidefinite function of the form $k: \mathcal{D} \times \mathcal{D} \to \mathbb{R}$. Associated to $k$ is a linear operator $T_k$ which can be defined as $(T_k f)(x) = \int_{\mathcal{D}} k(x, s) f(s)\, ds$, where $f$ is a function in $L^2(\mathcal{D})$, the space of square-integrable functions on the domain $\mathcal{D}$.
Theorem 1: Suppose $k$ is a continuous, symmetric, positive-semidefinite kernel. Then there is an orthonormal basis $\{e_i\}$ of $L^2(\mathcal{D})$ consisting of eigenfunctions of $T_k$ such that the corresponding sequence of eigenvalues $\{\lambda_i\}$ is nonnegative. Furthermore, the eigenfunctions corresponding to nonzero eigenvalues are continuous on $\mathcal{D}$, and $k$ has the representation $k(x, y) = \sum_{i=1}^{\infty} \lambda_i e_i(x) e_i(y)$, where the convergence is absolute and uniform.
The importance of these spaces from a machine learning perspective is based on the fact that a kernel function meeting the above conditions implies the existence of some Hilbert space of functions $\mathcal{H}$ and a mapping $\psi: \mathcal{D} \to \mathcal{H}$ such that $k(x, y) = \langle \psi(x), \psi(y) \rangle_{\mathcal{H}}$. The Gaussian function used in RBF networks is given by $k(x, y) = \exp(-\|x - y\|^2 / 2\mu^2)$, where $\mu$ is the bandwidth. Given a point $x \in \mathcal{D}$, the mapping $\psi(x) = k(x, \cdot)$ can be thought of as a vector in the Hilbert space $\mathcal{H}$. In $\mathcal{H}$, we have the reproducing relation $\langle \psi(x), \psi(y) \rangle_{\mathcal{H}} = k(x, y)$.

Fixing a dataset of centers $C = \{c_1, \dots, c_l\}$, where $c_i \in \mathcal{D}$, let $\mathcal{H}_C$ be the linear subspace generated by $\{\psi(c_1), \dots, \psi(c_l)\}$ in $\mathcal{H}$. Let $W = \sum_{i=1}^{l} w_i \psi(c_i) \in \mathcal{H}_C$, and let $f(x) = \langle W, \psi(x) \rangle_{\mathcal{H}} = \sum_{i=1}^{l} w_i k(c_i, x)$. Then $f(x)$ is the output of a standard RBF network.

Finally, given the above dataset $C$, one can form an $l \times l$ kernel matrix $K(C)$ with entries $K_{ij} = k(c_i, c_j)$. If the function $k$ is a positive-definite Mercer kernel, and if all the points in $C$ are distinct, the kernel matrix is nonsingular.
B. PE Signals
Definition 1: A bounded vector signal $\Phi(t)$ is exciting over an interval $[t, t+T]$, $T > 0$, if there exists $\gamma > 0$ such that
$\int_t^{t+T} \Phi(\tau) \Phi^T(\tau)\, d\tau \geq \gamma I.$
Definition 2: A bounded vector signal $\Phi(t)$ is persistently exciting if for all $t$ there exist $T > 0$ and $\gamma > 0$ such that the above excitation condition holds. The strength of the signal depends on the value of $\gamma$.
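Definition 2 can be checked numerically for simple signals: approximate the excitation integral with a Riemann sum and look at its smallest eigenvalue (a Python sketch; the helper name is ours):

```python
import numpy as np

def excitation_level(phi, t0, T, n=4000):
    """Approximate gamma = lambda_min of int_{t0}^{t0+T} phi(t) phi(t)^T dt.
    A strictly positive value means phi is exciting over [t0, t0 + T]."""
    ts = np.linspace(t0, t0 + T, n)
    dt = ts[1] - ts[0]
    M = sum(np.outer(phi(t), phi(t)) for t in ts) * dt
    return float(np.linalg.eigvalsh(M).min())

# [sin t, cos t] excites every direction of R^2 ...
gamma_pe = excitation_level(lambda t: np.array([np.sin(t), np.cos(t)]), 0.0, 2 * np.pi)
# ... while a constant vector signal only ever excites one direction.
gamma_const = excitation_level(lambda t: np.array([1.0, 1.0]), 0.0, 2 * np.pi)
```

Here gamma_pe comes out near pi, while gamma_const is numerically zero, reflecting that the constant signal's excitation matrix is rank one.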
3 MRAC and CL
Consider the uncertain multivariable nonlinear dynamical system
$\dot{x} = f(x, \delta),$
where $x$ is the state and $\delta$ is the control input. Since the exact model $f$ is usually not available or not invertible, an approximate inversion model $\hat{f}(x, \delta)$ is introduced, which is used to determine the control input
$\delta = \hat{f}^{-1}(x, \nu),$
where $\nu$ is the pseudo control input that represents the desired model output and is expected to be approximately achieved by $\dot{x}$. Hence, the pseudo control input is the output of the approximate inversion model, $\nu = \hat{f}(x, \delta)$. This approximation results in a model error of the form
$\dot{x} = \nu + \Delta(x, \delta), \quad \Delta(x, \delta) = f(x, \delta) - \hat{f}(x, \delta).$
A reference model is chosen to characterize the desired response of the system,
$\dot{x}_{rm} = f_{rm}(x_{rm}, r),$
where $r$ is a bounded reference signal. Consider a tracking control law consisting of a linear feedback part $\nu_{pd} = K e$, a linear feedforward part $\nu_{rm} = \dot{x}_{rm}$, and an adaptive part $\nu_{ad}(x)$ in the following form:
$\nu = \nu_{rm} + \nu_{pd} - \nu_{ad}.$
Define the tracking error as $e = x_{rm} - x$; then, letting $A = -K$ with $K$ chosen so that $A$ is Hurwitz, the tracking error dynamics are found to be
$\dot{e} = A e + [\nu_{ad}(x) - \Delta(x)].$
Since $\delta$ is determined from $x$ through the inversion model, the uncertainty can be viewed as a function $\Delta(x)$. Generally, two cases for characterizing the uncertainty are considered. In structured uncertainty, the form of the mapping $\Delta$ is known, whereas in unstructured uncertainty, it is unknown. We focus on the latter in this paper.
Assume that it is only known that the uncertainty $\Delta(x)$ is continuous and defined over a compact domain $\mathcal{D}$. Let $\sigma(x) = [\sigma_1(x), \dots, \sigma_l(x)]^T$ be a vector of known RBFs. Then for each $i$, the RBFs are given as
$\sigma_i(x) = \exp\!\left(-\frac{\|x - c_i\|^2}{2\mu_i^2}\right),$
where $c_i$ is the center and $\mu_i$ the width of the $i$th RBF. Appealing to the universal approximation property of RBF NNs, for any $\epsilon^* > 0$ there exist an ideal weight matrix $W^*$ and a number of RBFs $l$ such that
$\Delta(x) = W^{*T} \sigma(x) + \bar{\epsilon}(x), \quad \|\bar{\epsilon}(x)\| \leq \epsilon^* \text{ over } \mathcal{D}.$
To make use of the universal approximation property, an RBF NN can be used to approximate the unstructured uncertainty. In this case, the adaptive element takes the form
$\nu_{ad}(x) = W^T \sigma(x).$
A commonly used update law, which will be referred to here as the baseline adaptive law, is given as
$\dot{W} = -\Gamma_W \sigma(x) e^T P,$
where $\Gamma_W > 0$ is a learning-rate matrix and $P$ solves the Lyapunov equation $A^T P + P A = -Q$ for some $Q > 0$. If specifically selected recorded data are used concurrently with instantaneous measurements, then the weights approach and stay bounded in a compact neighborhood of the ideal weights, subject to a sufficient condition on the linear independence of the recorded data; PE is not needed.
Theorem 2: For the $j$th recorded data point, let $\epsilon_j = W^T \sigma(x_j) - \Delta(x_j)$, with $\Delta(x_j)$ for a stored data point calculated as $\Delta(x_j) = \dot{x}_j - \nu_j$. Also, let $p$ be the number of recorded data points $\sigma(x_j)$ in the matrix
$Z = [\sigma(x_1), \dots, \sigma(x_p)].$
If rank$(Z) = l$, then the following weight update law
$\dot{W} = -\Gamma_W \sigma(x) e^T P - \Gamma_W \sum_{j=1}^{p} \sigma(x_j) \epsilon_j^T$
renders the tracking error $e$ and the RBF NN weight errors $\tilde{W} = W - W^*$ uniformly ultimately bounded.
The matrix $Z$ will be referred to as the history stack.
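A discrete-time sketch of the concurrent learning update in Theorem 2, in Python for illustration (the gain gamma, the Euler step, and the toy basis are our assumptions, not the paper's):

```python
import numpy as np

def cl_weight_derivative(W, sigma_x, ePT, history, gamma=1.0):
    """W_dot = -gamma * sigma(x) (e^T P) - gamma * sum_j sigma(x_j) eps_j^T,
    where eps_j = W^T sigma(x_j) - Delta(x_j) over the history stack."""
    dW = -gamma * np.outer(sigma_x, ePT)            # instantaneous (baseline) term
    for sigma_j, delta_j in history:
        eps_j = W.T @ sigma_j - delta_j             # adaptation error on stored point
        dW -= gamma * np.outer(sigma_j, eps_j)      # concurrent (recorded-data) term
    return dW

# Toy check: with zero tracking error, the recorded-data term alone drives
# W toward the ideal weights once the stored sigma(x_j) span R^l.
W_star = np.array([[1.0], [-2.0], [0.5]])           # hypothetical ideal weights
basis = [np.eye(3)[:, j] for j in range(3)]         # rank condition: spans R^3
history = [(s, W_star.T @ s) for s in basis]        # Delta(x_j) = W*^T sigma(x_j)
W = np.zeros((3, 1))
for _ in range(500):
    W = W + 0.05 * cl_weight_derivative(W, np.zeros(3), np.zeros(1), history)
```

Because the stacked regressors have full rank, W converges to W_star without any persistent excitation of the instantaneous signal.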
4 Kernel Linear Independence and the Budgeted Kernel Restructuring Algorithm
A. PE Signals and the RKHS
Let $C = \{c_1, \dots, c_l\}$ be the set of RBF centers, and recall that $\mathcal{H}_C$ represents the linear subspace generated by $\{\psi(c_1), \dots, \psi(c_l)\}$ in $\mathcal{H}$. Let $z \in \mathbb{R}^l$, $z \neq 0$; then
$z^T \sigma(x) = \sum_{i=1}^{l} z_i \langle \psi(c_i), \psi(x) \rangle_{\mathcal{H}} = \Big\langle \sum_{i=1}^{l} z_i \psi(c_i), \psi(x) \Big\rangle_{\mathcal{H}}.$
A matrix $\int_t^{t+T} \sigma(x(\tau)) \sigma^T(x(\tau))\, d\tau$ is positive definite if and only if $z^T \big(\int_t^{t+T} \sigma(x(\tau)) \sigma^T(x(\tau))\, d\tau\big) z > 0$ for all $z \neq 0$. In the above, this translates to
$\int_t^{t+T} \Big\langle \sum_{i=1}^{l} z_i \psi(c_i), \psi(x(\tau)) \Big\rangle_{\mathcal{H}}^2\, d\tau > 0.$
Theorem 3: Suppose $x(t)$ evolves in the state space $\mathcal{D}$. If there exists some time $\tau^*$ such that the mapping $\psi(x(t))$ for $t \geq \tau^*$ is orthogonal to the linear subspace $\mathcal{H}_C$ for all time, then the signal $\sigma(x(t))$ is not PE.
Theorem 4: Suppose $x(t)$ evolves in the state space $\mathcal{D}$. If there exist some state $x^*$ and some time $\tau^*$ such that $x(t) = x^*$ for all $t \geq \tau^*$, then $\sigma(x(t))$ is not PE.
Therefore, PE of $\sigma(x(t))$ follows only if neither of the above conditions is met. This shows that in order to guarantee PE in $\mathcal{H}$, not only does the signal $x(t)$ have to be exciting, but the centers should also be such that the mapping $\psi(x(t))$ is not orthogonal to the linear subspace generated by the centers.
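Theorem 3 has a concrete Gaussian-RBF reading: each component of $\sigma(x)$ is the inner product $\langle \psi(c_i), \psi(x) \rangle$, so a trajectory whose feature vector stays orthogonal to the span of the centers produces no regressor excitation at all, no matter how vigorously $x(t)$ moves. A small numerical illustration (the centers and trajectories are our own toy choices):

```python
import numpy as np

def sigma(x, centers, mu=1.0):
    """RBF regressor: sigma_i(x) = k(c_i, x) = <psi(c_i), psi(x)>."""
    return np.array([np.exp(-np.linalg.norm(x - c) ** 2 / (2 * mu ** 2))
                     for c in centers])

centers = [np.array([0.0]), np.array([1.0])]
ts = np.linspace(0.0, 10.0, 50)
# Oscillates far from every center: psi(x(t)) is numerically orthogonal to
# H_C, so sigma(x(t)) ~ 0 and cannot be PE (Theorem 3).
far = [sigma(np.array([20.0 + np.sin(t)]), centers) for t in ts]
# The same oscillation near the centers produces a healthy regressor.
near = [sigma(np.array([np.sin(t)]), centers) for t in ts]
```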
B. Linear Independence
This section outlines a strategy to select RBF centers
online. The algorithm presented here ensures that the
RBF centers cover the current domain of operation, while keeping at
least some centers reflecting previous domains of operation.
At any point in time, our algorithm maintains a "dictionary" of centers $C = \{c_1, \dots, c_m\}$, where $m$ is the current size of the dictionary and $m_{\max}$ is the upper limit on the number of points (the budget). To test whether a new center $x$ should be inserted into the dictionary, we check whether it can or cannot be approximated in $\mathcal{H}$ by the current set of centers. This test is performed using
$\delta = \Big\| \sum_{j=1}^{m} a_j \psi(c_j) - \psi(x) \Big\|_{\mathcal{H}}^2,$
where the $a_j$ denote the coefficients of the linear combination. The coefficients can be determined by minimizing $\delta$, which yields the optimal coefficient vector $a = K^{-1} k_x$, where $K$ is the kernel matrix for the dictionary dataset $C$ and $k_x = [k(c_1, x), \dots, k(c_m, x)]^T$ is the kernel vector. Then, we have
$\delta = k(x, x) - k_x^T a.$
Algorithm 1 Kernel linear independence test
Input: new point $x$; dictionary $C$ of size $m$; budget $m_{\max}$; tolerance $\eta$.
Compute $a = K^{-1} k_x$
Compute $\delta$ as $\delta = k(x, x) - k_x^T a$
if $\delta > \eta$ then
if $m < m_{\max}$ then
Update dictionary by storing the new point $x$, and recalculating the $\delta_j$'s for each of the points
else
Update dictionary by deleting the point with the minimal $\delta_j$, and then recalculating the $\delta_j$'s for each of the points
end if
end if
Note that, due to the nature of the linear independence test, every time a state is encountered that cannot be approximated within tolerance by the current dictionary $C$, it is added to the dictionary. Further, since the dictionary is designed to keep the most varied basis possible, this is the best one can do on a budget using only instantaneous online data. Therefore, in the final algorithm, all the centers can be initialized to 0, and the linear independence test can be run periodically to check whether or not to add a new center to the RBF network.
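Algorithm 1 is compact enough to state directly in code. The sketch below (Python, with a Gaussian kernel; the function names are ours) implements the linear independence test and the two dictionary-update branches:

```python
import numpy as np

def gaussian_kernel(x, y, mu=1.0):
    return np.exp(-np.linalg.norm(x - y) ** 2 / (2 * mu ** 2))

def independence_measure(x, dictionary, mu=1.0):
    """delta = k(x, x) - k_x^T K^{-1} k_x: squared distance in H between
    psi(x) and its best approximation in the span of the dictionary."""
    K = np.array([[gaussian_kernel(ci, cj, mu) for cj in dictionary]
                  for ci in dictionary])
    k_x = np.array([gaussian_kernel(c, x, mu) for c in dictionary])
    a = np.linalg.solve(K, k_x)
    return float(gaussian_kernel(x, x, mu) - k_x @ a)

def kli_update(x, dictionary, budget, tol, mu=1.0):
    """Algorithm 1: insert x only if the dictionary cannot approximate it;
    on a full budget, first delete the most redundant center (minimal delta_j)."""
    if independence_measure(x, dictionary, mu) <= tol:
        return dictionary                     # x is well approximated; no change
    if len(dictionary) >= budget:
        deltas = [independence_measure(c, dictionary[:j] + dictionary[j + 1:], mu)
                  for j, c in enumerate(dictionary)]
        drop = int(np.argmin(deltas))
        dictionary = dictionary[:drop] + dictionary[drop + 1:]
    return dictionary + [x]
```

A point close to an existing center leaves the dictionary untouched; a genuinely novel point is inserted, evicting the most redundant center when the budget is full.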
5 BKR-CL Algorithm
A. Motivation
The BKR algorithm selects centers in order to ensure that excitation present in the state signal does not vanish when mapped through the RBFs. In this section, we present a CL adaptive law that guarantees the weights approach and remain bounded in a compact domain around their ideal weights by concurrently utilizing online-selected recorded data together with instantaneous data for adaptation. The CL adaptive law is wedded with BKR to create BKR-CL.
Algorithm 2 Singular value maximizing algorithm for recording data points
Require: current history stack $Z$ with $p$ stored columns $\sigma(x_j)$; limit $\bar{p}$; threshold $\epsilon$.
if $\frac{\|\sigma(x_t) - \sigma_p\|^2}{\|\sigma(x_t)\|} \geq \epsilon$ or a new center was added by Algorithm 1 without replacing an old center then
store $\sigma(x_t)$ as a new column of $Z$
else if a new center was added by Algorithm 1 by replacing an old center then
if the old center was found in $Z$ then
overwrite the old center in the history stack with the new center; set $j$ equal to the location of the data point replaced in the history stack
end if
end if
Recalculate the stored $\sigma(x_j)$ with the current centers
if $p \geq \bar{p}$ then
$T \leftarrow Z$; $S_{old} \leftarrow \min \operatorname{svd}(Z)$
for $j = 1$ to $\bar{p}$ do
$Z(:, j) \leftarrow \sigma(x_t)$; $S(j) \leftarrow \min \operatorname{svd}(Z)$; $Z \leftarrow T$
end for
find $\max_j S(j)$ and let $k$ denote the corresponding column index
if $\max_j S(j) > S_{old}$ then
$Z(:, k) \leftarrow \sigma(x_t)$
else
$Z \leftarrow T$
end if
end if
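The singular-value-maximizing replacement step at the heart of Algorithm 2 can be sketched as follows (Python; the matrix layout with columns = stored $\sigma(x_j)$ and the function name are our assumptions):

```python
import numpy as np

def svd_maximizing_replace(Z, sigma_new):
    """Try sigma_new in each column of the full history stack Z and keep
    whichever configuration has the largest minimum singular value; if no
    replacement beats the current stack, leave Z unchanged."""
    s_old = np.linalg.svd(Z, compute_uv=False).min()
    best_s, best_j = s_old, -1
    for j in range(Z.shape[1]):
        Zj = Z.copy()
        Zj[:, j] = sigma_new                 # candidate stack with column j swapped
        s_j = np.linalg.svd(Zj, compute_uv=False).min()
        if s_j > best_s:
            best_s, best_j = s_j, j
    if best_j >= 0:
        Z = Z.copy()
        Z[:, best_j] = sigma_new
    return Z

# Two nearly parallel regressors; an orthogonal newcomer should displace one.
Z = np.array([[1.0, 1.0], [0.0, 0.01], [0.0, 0.0]])
Z_better = svd_maximizing_replace(Z, np.array([0.0, 0.0, 1.0]))
Z_same = svd_maximizing_replace(Z, np.array([1.0, 0.0, 0.0]))  # redundant point
```

Maximizing the minimum singular value keeps the stored regressors as linearly independent as possible, which is what the rank condition in Theorems 2 and 5 requires.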
B. Stability Analysis
Over each interval between switches in the NN approximation, the tracking error dynamics are given by the following differential equation:
$\dot{e} = A e + [\nu_{ad}^i(x) - \Delta(x)],$
where $\nu_{ad}^i$ denotes the adaptive element built on the $i$th set of centers. The NN approximation error for the $i$th system can be rewritten as
$\nu_{ad}^i(x) - \Delta(x) = \tilde{W}^T \sigma^i(x) - \bar{\epsilon}^i(x),$
where $\tilde{W} = W - W_i^*$ and $W_i^*$ denotes the ideal weights for the $i$th set of centers. The adaptive law is the concurrent learning update of Theorem 2, evaluated with the current centers.
Theorem 5: Consider the system and the control law $\nu = \nu_{rm} + \nu_{pd} - \nu_{ad}$, where the operating domain $\mathcal{D}$ is compact, and the case of unstructured uncertainty. For the $j$th recorded data point, let $\epsilon_j = W^T \sigma(x_j) - \Delta(x_j)$, and let $p$ be the number of recorded data points $\sigma(x_j)$ in the matrix $Z = [\sigma(x_1), \dots, \sigma(x_p)]$, such that rank$(Z) = l$ at $t = 0$. Assume that the RBF centers are updated using Algorithm 1 and the history stack is updated using Algorithm 2. Then, the weight update law of Theorem 2 ensures that the tracking error $e$ and the RBF NN weight errors $\tilde{W}$ are bounded.
Sketch of the argument: with $\tilde{W} = W - W_i^*$, we have $\dot{\tilde{W}} = \dot{W}$ over each interval. Therefore, over each interval, the weight dynamics are given by the following switching system:
$\dot{\tilde{W}} = -\Gamma_W \sigma^i(x) e^T P - \Gamma_W \sum_{j=1}^{p} \sigma^i(x_j) \epsilon_j^T.$
Consider the family of positive definite functions
$V_i(e, \tilde{W}) = \frac{1}{2} e^T P e + \frac{1}{2} \operatorname{tr}\big(\tilde{W}^T \Gamma_W^{-1} \tilde{W}\big).$
With $\Omega = \sum_{j=1}^{p} \sigma^i(x_j) \sigma^i(x_j)^T$, which is positive definite by the rank condition, differentiating along the trajectories gives
$\dot{V}_i \leq -\tfrac{1}{2} \lambda_{\min}(Q) \|e\|^2 - \lambda_{\min}(\Omega) \|\tilde{W}\|^2 + c_1 \|e\| + c_2 \|\tilde{W}\|,$
where $c_1, c_2 > 0$ collect the bounded representation-error terms. Hence, if $\|e\|$ and $\|\tilde{W}\|$ are large enough, $\dot{V}_i < 0$; hence, a sufficiently large sublevel set of $V_i$ is positively invariant for the $i$th system.
Since states are recorded online as the system evolves, it is expected that the rank condition on $Z$ will be met within the first few time steps, even when one begins with no a priori recorded data points in the history stack.
6 Application
In this section, we use BKR-CL control to track a sequence of roll commands in the presence of simulated wing rock dynamics. Let $\theta$ denote the roll attitude of an aircraft, $p$ denote the roll rate, and $\delta_a$ denote the aileron control input. Then a model for wing rock dynamics is
$\dot{\theta} = p,$
$\dot{p} = \delta_a + \Delta(x),$
where, matching the regressor used in the simulation code below,
$\Delta(x) = W_0^* + W_1^* \theta + W_2^* p + W_3^* |\theta| p + W_4^* |p| p + W_5^* \theta^3.$
The chosen inversion model has the form $\hat{f} = \delta_a$, so that the control input is simply $\delta_a = \nu$.
% simulation model
function [x,x_rm,xDot,deltaErr,v_crm] = wingrock_correct(x,x_rm,v_h,delta,dt,controlDT,Wstar,xref,omegan_rm,zeta_rm)
% input : x         --- previous step system states, size = 2 x 1
%         x_rm      --- previous step reference system states, size = 2 x 1
%         v_h       --- control saturation; we don't consider control
%                       saturation in this sim, so v_h = 0
%         delta     --- approximate inversion, v = delta
%         dt        --- time step
%         controlDT --- control period = time step
%         Wstar     --- ideal weights
%         xref      --- reference signal
%         omegan_rm --- natural frequency of reference model
%         zeta_rm   --- damping ratio of reference model
% output: x         --- current step system states
%         x_rm      --- current step reference system states
%         xDot      --- current step state derivatives
%         deltaErr  --- actual system uncertainty
%         v_crm     --- reference model output

% Reference model update
xp = stat_refmodel(x_rm,v_h,xref,omegan_rm,zeta_rm);
x_rm = x_rm + xp*controlDT;
v_crm = omegan_rm^2*(xref - x_rm(1)) - 2*zeta_rm*omegan_rm*x_rm(2);

% Propagate state dynamics
clear xp
xp = state(x,delta,Wstar);
xDot = xp;
x = x + dt*xp;
deltaErr = Wstar'*[1; x(1); x(2); abs(x(1))*x(2); abs(x(2))*x(2); x(1)^3];
% End main function
% State model
function [xDot] = state(x,delta,Wstar)
x1Dot = x(2);
deltaErr = Wstar'*[1; x(1); x(2); abs(x(1))*x(2); abs(x(2))*x(2); x(1)^3];
x2Dot = delta + deltaErr;   % delta = v
xDot = [x1Dot; x2Dot];

% Reference model state dynamics
function [x_dot_rm] = stat_refmodel(x_rm,v_h,xref,omegan_rm,zeta_rm)
x1Dot_rm = x_rm(2);
v_crm = omegan_rm^2*(xref - x_rm(1)) - 2*zeta_rm*omegan_rm*x_rm(2);
x2Dot_rm = v_crm - v_h;
x_dot_rm = [x1Dot_rm; x2Dot_rm];