Kingravi, H. A., Chowdhary, G., Vela, P. A., & Johnson, E. N.
(2012). Reproducing kernel Hilbert space approach for the online update
of radial bases in neuro-adaptive control. IEEE Transactions on
Neural Networks and Learning Systems, 23(7),
1130-1141.
1 Introduction
The main contribution of this paper is an adaptive control technique based on RKHS theory. The key requirement is that the tracking error between the reference model and the system, and the adaptation error between the adaptive element and the actual uncertainty, be simultaneously driven to a bounded set.
2 Preliminaries
A. RKHS
Let $\mathcal{H}$ be a Hilbert space of functions defined on a domain $\mathcal{D}$. Given any element $f \in \mathcal{H}$ and any point $x \in \mathcal{D}$, there exists some unique element $F_x$ (called a point evaluation functional) such that $F_x(f) = f(x)$. $\mathcal{H}$ is called an RKHS if every such linear functional is continuous (something which is not true in an arbitrary Hilbert space). It is possible to construct these spaces using tools from operator theory.
A Mercer kernel on $\mathcal{D}$ is any continuous, symmetric, positive-semidefinite function of the form $k: \mathcal{D} \times \mathcal{D} \to \mathbb{R}$. Associated to $k$ is a linear operator $T_k$ which can be defined as $(T_k f)(x) = \int_{\mathcal{D}} k(x, s) f(s)\, ds$, where $f$ is a function in $L^2(\mathcal{D})$, the space of square-integrable functions on the domain $\mathcal{D}$.
Theorem 1: Suppose $k$ is a continuous, symmetric, positive-semidefinite kernel. Then there is an orthonormal basis $\{e_i\}$ of $L^2(\mathcal{D})$ consisting of eigenfunctions of $T_k$ such that the corresponding sequence of eigenvalues $\{\lambda_i\}$ is nonnegative. Furthermore, the eigenfunctions corresponding to nonzero eigenvalues are continuous on $\mathcal{D}$, and $k$ has the representation $k(x, y) = \sum_{i=1}^{\infty} \lambda_i e_i(x) e_i(y)$, where the convergence is absolute and uniform.
The importance of these spaces from a machine learning perspective is based on the fact that a kernel function meeting the above conditions implies the existence of some Hilbert space of functions $\mathcal{H}$ and a mapping $\psi: \mathcal{D} \to \mathcal{H}$ such that $k(x, y) = \langle \psi(x), \psi(y) \rangle_{\mathcal{H}}$. The Gaussian function used in RBF networks is given by $k(x, y) = \exp(-\|x - y\|^2 / 2\mu^2)$, where $\mu$ is the bandwidth. Given a point $x \in \mathcal{D}$, the mapping $\psi(x) = k(x, \cdot)$ can be thought of as a vector in the Hilbert space $\mathcal{H}$. In $\mathcal{H}$, we have the reproducing relation $\langle \psi(x), \psi(y) \rangle_{\mathcal{H}} = k(x, y)$.

Fixing a dataset of centers $C = \{c_1, \dots, c_l\}$, where $c_i \in \mathcal{D}$, let $\mathcal{H}_C$ be the linear subspace generated by $\{\psi(c_1), \dots, \psi(c_l)\}$ in $\mathcal{H}$. Let $W = \sum_{i=1}^{l} w_i \psi(c_i) \in \mathcal{H}_C$, and let $f(x) = \langle W, \psi(x) \rangle_{\mathcal{H}} = \sum_{i=1}^{l} w_i k(c_i, x)$. Then $f(x)$ is the output of a standard RBF network.

Finally, given the above dataset $C$, one can form an $l \times l$ kernel matrix $K(C)$ with entries $K_{ij} = k(c_i, c_j)$. If the function $k$ is a positive-definite Mercer kernel, and if all the points in $C$ are distinct, the kernel matrix is nonsingular.
B. PE Signals
Definition 1: A bounded vector signal $\Phi(t)$ is exciting over an interval $[t, t+T]$, $T > 0$, if there exists $\gamma > 0$ such that
$\int_t^{t+T} \Phi(\tau) \Phi^T(\tau)\, d\tau \geq \gamma I.$
Definition 2: A bounded vector signal $\Phi(t)$ is persistently exciting if for all $t$ there exist $T > 0$ and $\gamma > 0$ such that the above excitation condition holds. The strength of the signal depends on the value of $\gamma$.
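Definition 2 can be checked numerically for simple signals: approximate the excitation integral with a Riemann sum and look at its smallest eigenvalue (a Python sketch; the helper name is ours):

```python
import numpy as np

def excitation_level(phi, t0, T, n=4000):
    """Approximate gamma = lambda_min of int_{t0}^{t0+T} phi(t) phi(t)^T dt.
    A strictly positive value means phi is exciting over [t0, t0 + T]."""
    ts = np.linspace(t0, t0 + T, n)
    dt = ts[1] - ts[0]
    M = sum(np.outer(phi(t), phi(t)) for t in ts) * dt
    return float(np.linalg.eigvalsh(M).min())

# [sin t, cos t] excites every direction of R^2 ...
gamma_pe = excitation_level(lambda t: np.array([np.sin(t), np.cos(t)]), 0.0, 2 * np.pi)
# ... while a constant vector signal only ever excites one direction.
gamma_const = excitation_level(lambda t: np.array([1.0, 1.0]), 0.0, 2 * np.pi)
```

Here gamma_pe comes out near pi, while gamma_const is numerically zero, reflecting that the constant signal's excitation matrix is rank one.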
3 MRAC and CL
Consider the uncertain multivariable nonlinear dynamical system
$\dot{x} = f(x, \delta),$
where $x$ is the state and $\delta$ is the control input. Since the exact model $f$ is usually not available or not invertible, an approximate inversion model $\hat{f}(x, \delta)$ is introduced, which is used to determine the control input
$\delta = \hat{f}^{-1}(x, \nu),$
where $\nu$ is the pseudo control input that represents the desired model output and is expected to be approximately achieved by $\dot{x}$. Hence, the pseudo control input is the output of the approximate inversion model, $\nu = \hat{f}(x, \delta)$. This approximation results in a model error of the form
$\dot{x} = \nu + \Delta(x, \delta), \quad \Delta(x, \delta) = f(x, \delta) - \hat{f}(x, \delta).$
A reference model is chosen to characterize the desired response of the system,
$\dot{x}_{rm} = f_{rm}(x_{rm}, r),$
where $r$ is a bounded reference signal. Consider a tracking control law consisting of a linear feedback part $\nu_{pd} = K e$, a linear feedforward part $\nu_{rm} = \dot{x}_{rm}$, and an adaptive part $\nu_{ad}(x)$ in the following form:
$\nu = \nu_{rm} + \nu_{pd} - \nu_{ad}.$
Define the tracking error as $e = x_{rm} - x$; then, letting $A = -K$ with $K$ chosen so that $A$ is Hurwitz, the tracking error dynamics are found to be
$\dot{e} = A e + [\nu_{ad}(x) - \Delta(x)].$
Since $\delta$ is determined from $x$ through the inversion model, the uncertainty can be viewed as a function $\Delta(x)$. Generally, two cases for characterizing the uncertainty are considered. In structured uncertainty, the form of the mapping $\Delta$ is known, whereas in unstructured uncertainty, it is unknown. We focus on the latter in this paper.
Assume that it is only known that the uncertainty $\Delta(x)$ is continuous and defined over a compact domain $\mathcal{D}$. Let $\sigma(x) = [\sigma_1(x), \dots, \sigma_l(x)]^T$ be a vector of known RBFs. Then for each $i$, the RBFs are given as
$\sigma_i(x) = \exp\!\left(-\frac{\|x - c_i\|^2}{2\mu_i^2}\right),$
where $c_i$ is the center and $\mu_i$ the width of the $i$th RBF. Appealing to the universal approximation property of RBF NNs, for any $\epsilon^* > 0$ there exist an ideal weight matrix $W^*$ and a number of RBFs $l$ such that
$\Delta(x) = W^{*T} \sigma(x) + \bar{\epsilon}(x), \quad \|\bar{\epsilon}(x)\| \leq \epsilon^* \text{ over } \mathcal{D}.$
To make use of the universal approximation property, an RBF NN can be used to approximate the unstructured uncertainty. In this case, the adaptive element takes the form
$\nu_{ad}(x) = W^T \sigma(x).$
A commonly used update law, which will be referred to here as the baseline adaptive law, is given as
$\dot{W} = -\Gamma_W \sigma(x) e^T P,$
where $\Gamma_W > 0$ is a learning-rate matrix and $P$ solves the Lyapunov equation $A^T P + P A = -Q$ for some $Q > 0$. If specifically selected recorded data are used concurrently with instantaneous measurements, then the weights approach and stay bounded in a compact neighborhood of the ideal weights, subject to a sufficient condition on the linear independence of the recorded data; PE is not needed.
Theorem 2: For the $j$th recorded data point, let $\epsilon_j = W^T \sigma(x_j) - \Delta(x_j)$, with $\Delta(x_j)$ for a stored data point calculated as $\Delta(x_j) = \dot{x}_j - \nu_j$. Also, let $p$ be the number of recorded data points $\sigma(x_j)$ in the matrix
$Z = [\sigma(x_1), \dots, \sigma(x_p)].$
If rank$(Z) = l$, then the following weight update law
$\dot{W} = -\Gamma_W \sigma(x) e^T P - \Gamma_W \sum_{j=1}^{p} \sigma(x_j) \epsilon_j^T$
renders the tracking error $e$ and the RBF NN weight errors $\tilde{W} = W - W^*$ uniformly ultimately bounded.
The matrix $Z$ will be referred to as the history stack.
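A discrete-time sketch of the concurrent learning update in Theorem 2, in Python for illustration (the gain gamma, the Euler step, and the toy basis are our assumptions, not the paper's):

```python
import numpy as np

def cl_weight_derivative(W, sigma_x, ePT, history, gamma=1.0):
    """W_dot = -gamma * sigma(x) (e^T P) - gamma * sum_j sigma(x_j) eps_j^T,
    where eps_j = W^T sigma(x_j) - Delta(x_j) over the history stack."""
    dW = -gamma * np.outer(sigma_x, ePT)            # instantaneous (baseline) term
    for sigma_j, delta_j in history:
        eps_j = W.T @ sigma_j - delta_j             # adaptation error on stored point
        dW -= gamma * np.outer(sigma_j, eps_j)      # concurrent (recorded-data) term
    return dW

# Toy check: with zero tracking error, the recorded-data term alone drives
# W toward the ideal weights once the stored sigma(x_j) span R^l.
W_star = np.array([[1.0], [-2.0], [0.5]])           # hypothetical ideal weights
basis = [np.eye(3)[:, j] for j in range(3)]         # rank condition: spans R^3
history = [(s, W_star.T @ s) for s in basis]        # Delta(x_j) = W*^T sigma(x_j)
W = np.zeros((3, 1))
for _ in range(500):
    W = W + 0.05 * cl_weight_derivative(W, np.zeros(3), np.zeros(1), history)
```

Because the stacked regressors have full rank, W converges to W_star without any persistent excitation of the instantaneous signal.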
4 Kernel Linear Independence and the Budgeted Kernel Restructuring Algorithm
A. PE Signals and the RKHS
Let $C = \{c_1, \dots, c_l\}$ be the set of RBF centers, and recall that $\mathcal{H}_C$ represents the linear subspace generated by $\{\psi(c_1), \dots, \psi(c_l)\}$ in $\mathcal{H}$. Let $z \in \mathbb{R}^l$, $z \neq 0$; then
$z^T \sigma(x) = \sum_{i=1}^{l} z_i \langle \psi(c_i), \psi(x) \rangle_{\mathcal{H}} = \Big\langle \sum_{i=1}^{l} z_i \psi(c_i), \psi(x) \Big\rangle_{\mathcal{H}}.$
A matrix $\int_t^{t+T} \sigma(x(\tau)) \sigma^T(x(\tau))\, d\tau$ is positive definite if and only if $z^T \big(\int_t^{t+T} \sigma(x(\tau)) \sigma^T(x(\tau))\, d\tau\big) z > 0$ for all $z \neq 0$. In the above, this translates to
$\int_t^{t+T} \Big\langle \sum_{i=1}^{l} z_i \psi(c_i), \psi(x(\tau)) \Big\rangle_{\mathcal{H}}^2\, d\tau > 0.$
Theorem 3: Suppose $x(t)$ evolves in the state space $\mathcal{D}$. If there exists some time $\tau^*$ such that the mapping $\psi(x(t))$ for $t \geq \tau^*$ is orthogonal to the linear subspace $\mathcal{H}_C$ for all time, then the signal $\sigma(x(t))$ is not PE.
Theorem 4: Suppose $x(t)$ evolves in the state space $\mathcal{D}$. If there exist some state $x^*$ and some time $\tau^*$ such that $x(t) = x^*$ for all $t \geq \tau^*$, then $\sigma(x(t))$ is not PE.
Therefore, PE of $\sigma(x(t))$ follows only if neither of the above conditions is met. This shows that in order to guarantee PE in $\mathcal{H}$, not only does the signal $x(t)$ have to be exciting, but the centers should also be such that the mapping $\psi(x(t))$ is not orthogonal to the linear subspace generated by the centers.
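Theorem 3 has a concrete Gaussian-RBF reading: each component of $\sigma(x)$ is the inner product $\langle \psi(c_i), \psi(x) \rangle$, so a trajectory whose feature vector stays orthogonal to the span of the centers produces no regressor excitation at all, no matter how vigorously $x(t)$ moves. A small numerical illustration (the centers and trajectories are our own toy choices):

```python
import numpy as np

def sigma(x, centers, mu=1.0):
    """RBF regressor: sigma_i(x) = k(c_i, x) = <psi(c_i), psi(x)>."""
    return np.array([np.exp(-np.linalg.norm(x - c) ** 2 / (2 * mu ** 2))
                     for c in centers])

centers = [np.array([0.0]), np.array([1.0])]
ts = np.linspace(0.0, 10.0, 50)
# Oscillates far from every center: psi(x(t)) is numerically orthogonal to
# H_C, so sigma(x(t)) ~ 0 and cannot be PE (Theorem 3).
far = [sigma(np.array([20.0 + np.sin(t)]), centers) for t in ts]
# The same oscillation near the centers produces a healthy regressor.
near = [sigma(np.array([np.sin(t)]), centers) for t in ts]
```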
B. Linear Independence
This section outlines a strategy to select RBF centers
online. The algorithm presented here ensures that the
RBF centers cover the current domain of operation, while keeping at
least some centers reflecting previous domains of operation.
At any point in time, our algorithm maintains a "dictionary" of centers $C = \{c_1, \dots, c_m\}$, where $m$ is the current size of the dictionary and $m_{\max}$ is the upper limit on the number of points (the budget). To test whether a new center $x$ should be inserted into the dictionary, we check whether it can or cannot be approximated in $\mathcal{H}$ by the current set of centers. This test is performed using
$\delta = \Big\| \sum_{j=1}^{m} a_j \psi(c_j) - \psi(x) \Big\|_{\mathcal{H}}^2,$
where the $a_j$ denote the coefficients of the linear combination. The coefficients can be determined by minimizing $\delta$, which yields the optimal coefficient vector $a = K^{-1} k_x$, where $K$ is the kernel matrix for the dictionary dataset $C$ and $k_x = [k(c_1, x), \dots, k(c_m, x)]^T$ is the kernel vector. Then, we have
$\delta = k(x, x) - k_x^T a.$
Algorithm 1 Kernel linear independence test
Input: new point $x$; dictionary $C$ of size $m$; budget $m_{\max}$; tolerance $\eta$.
Compute $a = K^{-1} k_x$
Compute $\delta$ as $\delta = k(x, x) - k_x^T a$
if $\delta > \eta$ then
if $m < m_{\max}$ then
Update dictionary by storing the new point $x$, and recalculating the $\delta_j$'s for each of the points
else
Update dictionary by deleting the point with the minimal $\delta_j$, and then recalculating the $\delta_j$'s for each of the points
end if
end if
Note that, due to the nature of the linear independence test, every time a state is encountered that cannot be approximated within tolerance by the current dictionary $C$, it is added to the dictionary. Further, since the dictionary is designed to keep the most varied basis possible, this is the best one can do on a budget using only instantaneous online data. Therefore, in the final algorithm, all the centers can be initialized to 0, and the linear independence test can be run periodically to check whether or not to add a new center to the RBF network.
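Algorithm 1 is compact enough to state directly in code. The sketch below (Python, with a Gaussian kernel; the function names are ours) implements the linear independence test and the two dictionary-update branches:

```python
import numpy as np

def gaussian_kernel(x, y, mu=1.0):
    return np.exp(-np.linalg.norm(x - y) ** 2 / (2 * mu ** 2))

def independence_measure(x, dictionary, mu=1.0):
    """delta = k(x, x) - k_x^T K^{-1} k_x: squared distance in H between
    psi(x) and its best approximation in the span of the dictionary."""
    K = np.array([[gaussian_kernel(ci, cj, mu) for cj in dictionary]
                  for ci in dictionary])
    k_x = np.array([gaussian_kernel(c, x, mu) for c in dictionary])
    a = np.linalg.solve(K, k_x)
    return float(gaussian_kernel(x, x, mu) - k_x @ a)

def kli_update(x, dictionary, budget, tol, mu=1.0):
    """Algorithm 1: insert x only if the dictionary cannot approximate it;
    on a full budget, first delete the most redundant center (minimal delta_j)."""
    if independence_measure(x, dictionary, mu) <= tol:
        return dictionary                     # x is well approximated; no change
    if len(dictionary) >= budget:
        deltas = [independence_measure(c, dictionary[:j] + dictionary[j + 1:], mu)
                  for j, c in enumerate(dictionary)]
        drop = int(np.argmin(deltas))
        dictionary = dictionary[:drop] + dictionary[drop + 1:]
    return dictionary + [x]
```

A point close to an existing center leaves the dictionary untouched; a genuinely novel point is inserted, evicting the most redundant center when the budget is full.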
5 BKR-CL Algorithm
A. Motivation
The BKR algorithm selects centers in order to ensure that excitation present in the state signal does not vanish when mapped through the RBFs. In this section, we present a CL adaptive law that guarantees the weights approach and remain bounded in a compact domain around their ideal weights by concurrently utilizing online-selected recorded data together with instantaneous data for adaptation. The CL adaptive law is wedded with BKR to create BKR-CL.
Algorithm 2 Singular value maximizing algorithm for recording data points
Require: current history stack $Z$ with $p$ stored columns $\sigma(x_j)$; limit $\bar{p}$; threshold $\epsilon$.
if $\frac{\|\sigma(x_t) - \sigma_p\|^2}{\|\sigma(x_t)\|} \geq \epsilon$ or a new center was added by Algorithm 1 without replacing an old center then
store $\sigma(x_t)$ as a new column of $Z$
else if a new center was added by Algorithm 1 by replacing an old center then
if the old center was found in $Z$ then
overwrite the old center in the history stack with the new center; set $j$ equal to the location of the data point replaced in the history stack
end if
end if
Recalculate the stored $\sigma(x_j)$ with the current centers
if $p \geq \bar{p}$ then
$T \leftarrow Z$; $S_{old} \leftarrow \min \operatorname{svd}(Z)$
for $j = 1$ to $\bar{p}$ do
$Z(:, j) \leftarrow \sigma(x_t)$; $S(j) \leftarrow \min \operatorname{svd}(Z)$; $Z \leftarrow T$
end for
find $\max_j S(j)$ and let $k$ denote the corresponding column index
if $\max_j S(j) > S_{old}$ then
$Z(:, k) \leftarrow \sigma(x_t)$
else
$Z \leftarrow T$
end if
end if
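The singular-value-maximizing replacement step at the heart of Algorithm 2 can be sketched as follows (Python; the matrix layout with columns = stored $\sigma(x_j)$ and the function name are our assumptions):

```python
import numpy as np

def svd_maximizing_replace(Z, sigma_new):
    """Try sigma_new in each column of the full history stack Z and keep
    whichever configuration has the largest minimum singular value; if no
    replacement beats the current stack, leave Z unchanged."""
    s_old = np.linalg.svd(Z, compute_uv=False).min()
    best_s, best_j = s_old, -1
    for j in range(Z.shape[1]):
        Zj = Z.copy()
        Zj[:, j] = sigma_new                 # candidate stack with column j swapped
        s_j = np.linalg.svd(Zj, compute_uv=False).min()
        if s_j > best_s:
            best_s, best_j = s_j, j
    if best_j >= 0:
        Z = Z.copy()
        Z[:, best_j] = sigma_new
    return Z

# Two nearly parallel regressors; an orthogonal newcomer should displace one.
Z = np.array([[1.0, 1.0], [0.0, 0.01], [0.0, 0.0]])
Z_better = svd_maximizing_replace(Z, np.array([0.0, 0.0, 1.0]))
Z_same = svd_maximizing_replace(Z, np.array([1.0, 0.0, 0.0]))  # redundant point
```

Maximizing the minimum singular value keeps the stored regressors as linearly independent as possible, which is what the rank condition in Theorems 2 and 5 requires.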
B. Stability Analysis
Over each interval between switches in the NN approximation, the tracking error dynamics are given by the following differential equation:
$\dot{e} = A e + [\nu_{ad}^i(x) - \Delta(x)],$
where $\nu_{ad}^i$ denotes the adaptive element built on the $i$th set of centers. The NN approximation error for the $i$th system can be rewritten as
$\nu_{ad}^i(x) - \Delta(x) = \tilde{W}^T \sigma^i(x) - \bar{\epsilon}^i(x),$
where $\tilde{W} = W - W_i^*$ and $W_i^*$ denotes the ideal weights for the $i$th set of centers. The adaptive law is the concurrent learning update of Theorem 2, evaluated with the current centers.
Theorem 5: Consider the system and the control law $\nu = \nu_{rm} + \nu_{pd} - \nu_{ad}$, where the operating domain $\mathcal{D}$ is compact, and the case of unstructured uncertainty. For the $j$th recorded data point, let $\epsilon_j = W^T \sigma(x_j) - \Delta(x_j)$, and let $p$ be the number of recorded data points $\sigma(x_j)$ in the matrix $Z = [\sigma(x_1), \dots, \sigma(x_p)]$, such that rank$(Z) = l$ at $t = 0$. Assume that the RBF centers are updated using Algorithm 1 and the history stack is updated using Algorithm 2. Then, the weight update law of Theorem 2 ensures that the tracking error $e$ and the RBF NN weight errors $\tilde{W}$ are bounded.
Sketch of the argument: with $\tilde{W} = W - W_i^*$, we have $\dot{\tilde{W}} = \dot{W}$ over each interval. Therefore, over each interval, the weight dynamics are given by the following switching system:
$\dot{\tilde{W}} = -\Gamma_W \sigma^i(x) e^T P - \Gamma_W \sum_{j=1}^{p} \sigma^i(x_j) \epsilon_j^T.$
Consider the family of positive definite functions
$V_i(e, \tilde{W}) = \frac{1}{2} e^T P e + \frac{1}{2} \operatorname{tr}\big(\tilde{W}^T \Gamma_W^{-1} \tilde{W}\big).$
With $\Omega = \sum_{j=1}^{p} \sigma^i(x_j) \sigma^i(x_j)^T$, which is positive definite by the rank condition, differentiating along the trajectories gives
$\dot{V}_i \leq -\tfrac{1}{2} \lambda_{\min}(Q) \|e\|^2 - \lambda_{\min}(\Omega) \|\tilde{W}\|^2 + c_1 \|e\| + c_2 \|\tilde{W}\|,$
where $c_1, c_2 > 0$ collect the bounded representation-error terms. Hence, if $\|e\|$ and $\|\tilde{W}\|$ are large enough, $\dot{V}_i < 0$; hence, a sufficiently large sublevel set of $V_i$ is positively invariant for the $i$th system.
Since states are recorded online as the system evolves, it is expected that the rank condition on $Z$ will be met within the first few time steps, even when one begins with no a priori recorded data points in the history stack.
6 Application
In this section, we use BKR-CL control to track a sequence of roll commands in the presence of simulated wing rock dynamics. Let $\theta$ denote the roll attitude of an aircraft, $p$ denote the roll rate, and $\delta_a$ denote the aileron control input. Then a model for wing rock dynamics is
$\dot{\theta} = p,$
$\dot{p} = \delta_a + \Delta(x),$
where, matching the regressor used in the simulation code below,
$\Delta(x) = W_0^* + W_1^* \theta + W_2^* p + W_3^* |\theta| p + W_4^* |p| p + W_5^* \theta^3.$
The chosen inversion model has the form $\hat{f} = \delta_a$, so that the control input is simply $\delta_a = \nu$.
% simulation model
function [x,x_rm,xDot,deltaErr,v_crm] = wingrock_correct(x,x_rm,v_h,delta,dt,controlDT,Wstar,xref,omegan_rm,zeta_rm)
% input : x         --- previous step system states, size = 2 x 1
%         x_rm      --- previous step reference system states, size = 2 x 1
%         v_h       --- control saturation; we don't consider control
%                       saturation in this sim, so v_h = 0
%         delta     --- approximate inversion, v = delta
%         dt        --- time step
%         controlDT --- control period = time step
%         Wstar     --- ideal weights
%         xref      --- reference signal
%         omegan_rm --- natural frequency of reference model
%         zeta_rm   --- damping ratio of reference model
% output: x         --- current step system states
%         x_rm      --- current step reference system states
%         xDot      --- current step state derivatives
%         deltaErr  --- actual system uncertainty
%         v_crm     --- reference model output

% Reference model update
xp = stat_refmodel(x_rm,v_h,xref,omegan_rm,zeta_rm);
x_rm = x_rm + xp*controlDT;
v_crm = omegan_rm^2*(xref - x_rm(1)) - 2*zeta_rm*omegan_rm*x_rm(2);

% Propagate state dynamics
clear xp
xp = state(x,delta,Wstar);
xDot = xp;
x = x + dt*xp;
deltaErr = Wstar'*[1; x(1); x(2); abs(x(1))*x(2); abs(x(2))*x(2); x(1)^3];
% End main function
% State model
function [xDot] = state(x,delta,Wstar)
x1Dot = x(2);
deltaErr = Wstar'*[1; x(1); x(2); abs(x(1))*x(2); abs(x(2))*x(2); x(1)^3];
x2Dot = delta + deltaErr;   % delta = v
xDot = [x1Dot; x2Dot];

% Reference model state dynamics
function [x_dot_rm] = stat_refmodel(x_rm,v_h,xref,omegan_rm,zeta_rm)
x1Dot_rm = x_rm(2);
v_crm = omegan_rm^2*(xref - x_rm(1)) - 2*zeta_rm*omegan_rm*x_rm(2);
x2Dot_rm = v_crm - v_h;
x_dot_rm = [x1Dot_rm; x2Dot_rm];