LAN Tianyi, LIN Hui, and LI Bingqiang
School of Automation, Northwestern Polytechnical University, Xi’an 710129, China
Abstract: To accelerate the convergence of iterative learning control (ILC), a correction algorithm based on kernel-based auto-associative memory is proposed for linear systems, taking the P-type learning algorithm as an example. The learning mechanism of human associative memory is introduced into traditional ILC: in each iterative learning process, the control value at subsequent time points is pre-corrected by association using the information at the current time. The proposed algorithm significantly improves the learning efficiency of the whole system. Rigorous analysis shows that the newly designed ILC scheme guarantees uniform convergence of the state tracking error. Numerical simulations illustrate the effectiveness of the proposed associative control scheme and the validity of the conclusion.
Keywords: iterative learning control (ILC), associative learning, convergence speed, tracking convergence.
Iterative learning control (ILC) [1] has many attractive characteristics: (i) it is a very simple learning control algorithm, (ii) it applies to unknown linear or nonlinear periodic systems, and (iii) the control performance improves as the number of trials increases. Theoretical results have been fruitful over the past 40 years, covering the convergence of learning algorithms [2,3], initial state problems [4,5], robustness problems [6], new learning algorithms [7,8], and so on. Recently, ILC has aroused great interest among control engineers and has received increasing attention in the control field [9–14].
Although a series of theoretical results have been obtained, ILC has not yet been widely used. To make ILC more applicable in industry, many experts and scholars have improved it from different perspectives, which fall into two categories: model-free methods and model-based approaches. Model-free improvements include adaptive ILC [14–16], PID-type control [17,18], accelerated learning algorithms [19,20], and high-order ILC [9,21]. These methods retain the advantages of traditional ILC without relying on model parameters and have good robustness, but the controller parameters must be tuned by experience, and because the controller uses only the output error, it is difficult to obtain the optimal control effect. Model-based approaches mainly include feedback learning algorithms [22,23] and predictive control [24]. In [22], an ILC method based on optimal feedback and feedforward was proposed; the inverse of the system model was used in the iterative learning controller design to obtain better control performance, but the robustness of the controller was reduced by the introduction of model parameters. In the predictive control method [24], the algorithm updates the control input by learning not only from past trials but also from predicted future trials, using knowledge of the system model. Because the algorithm includes information from the predicted future trials, the resulting ILC controller shows potentially substantial benefits in terms of convergence speed.
The mechanism of human learning indicates that humans have a strong ability to learn, which is inseparable from the associative function of the human brain [25]. Association makes it possible to obtain the complete information model of an object from partial information. Inferring the whole from partial information makes it possible to draw analogies, which effectively accelerates the learning process. Following the mechanism of human association, some new learning schemes were proposed that combine a kernel-based auto-associative method with ILC to improve the learning speed of the traditional algorithm [26–28]. These algorithms have the ability of association in each iterative learning process, which greatly accelerates convergence.
This paper is organized as follows. Section 2 describes the kernel-based auto-associative ILC algorithm in general. The mathematical proof of the convergence of the proposed algorithm is given in Section 3. The proposed ILC law is then extended to linear time-varying systems in Section 4. In Section 5, two simulation examples compare the proposed algorithm with the traditional algorithm to verify its validity and correctness. Conclusions are drawn in Section 6.
In this section, the ILC algorithm is formulated in an abstract setting. As an example (and for simplicity), the formulation is presented for discrete-time linear time-invariant and linear time-varying systems; however, the abstract problem setting applies to more general linear systems in Hilbert spaces. This includes many situations of interest, such as nonlinear time-varying systems and nonlinear differential models [29–31].
Consider a class of linear time-invariant (LTI) discrete-time systems
$$x_k(j+1) = A x_k(j) + B u_k(j), \quad y_k(j) = C x_k(j), \tag{1}$$
where $N$ is the total number of samples; $j \in S_{N-1}$ is the sampling step; $S_N = \{0, 1, \ldots, N\}$ is the discrete time interval; the subscript $k$ is the iteration (trial) number; $x_k(j) \in \mathbb{R}^n$, $u_k(j) \in \mathbb{R}^r$, and $y_k(j) \in \mathbb{R}^m$ are the state, input, and output of the system, respectively; and $A$, $B$, and $C$ are constant matrices of appropriate dimensions such that $CB$ has full column rank.
Assumption 1 The initial state is equal to the ideal initial state, that is, $x_k(0) = x_d(0)$.
Remark 1 The identical initialization condition is a standard assumption in ILC design to ensure perfect tracking performance. Note that without perfect initial conditions, perfect tracking can never be achieved. More discussion of various initial conditions in the learning context can be found in [32].
Assumption 2 For any given desired trajectory $y_d(j+1)$, there exist a desired state $x_d(j)$ and a desired control signal $u_d(j)$ such that
$$x_d(j+1) = A x_d(j) + B u_d(j), \quad y_d(j+1) = C x_d(j+1), \tag{2}$$
for all $j \in S_{N-1}$.
Remark 2 Since $u_d(j)$ exists uniquely, the uniform convergence of the control profile $u_k(j)$ to $u_d(j)$ implies that the state and output tracking errors vanish. It is reasonable to assume that the control task is feasible.
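The feasibility requirement in Assumption 2 can be illustrated with a minimal sketch for a scalar plant $x(j+1) = a\,x(j) + b\,u(j)$, $y(j) = c\,x(j)$. The coefficients `a`, `b`, `c` and the sinusoidal reference below are hypothetical and are not taken from the paper; the sketch only assumes $cb \neq 0$, the scalar counterpart of the full-column-rank condition on $CB$.

```python
import math

def desired_input(a, b, c, yd):
    """Solve x_d(j+1) = a*x_d(j) + b*u_d(j), y_d = c*x_d for u_d (scalar plant, c*b != 0)."""
    xd = [v / c for v in yd]                       # desired state from desired output
    return [(xd[j + 1] - a * xd[j]) / b for j in range(len(yd) - 1)]

def simulate(a, b, c, u, x0):
    """Run the plant once from x0 and return the outputs y(0), ..., y(N)."""
    x, y = x0, [c * x0]
    for uj in u:
        x = a * x + b * uj
        y.append(c * x)
    return y

# Hypothetical plant and reference, chosen only for illustration.
N = 20
yd = [math.sin(2 * math.pi * j / N) for j in range(N + 1)]
ud = desired_input(0.5, 1.0, 1.0, yd)
y = simulate(0.5, 1.0, 1.0, ud, x0=yd[0] / 1.0)    # Assumption 1: x(0) = x_d(0)
gap = max(abs(p - q) for p, q in zip(yd, y))
print(gap)
```

Applying the computed `ud` from the ideal initial state reproduces `yd` up to rounding, which is what makes the tracking task feasible in this toy setting.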
We present a new ILC algorithm that can greatly improve the learning speed of linear systems. The associative function is generated by the product of the predefined kernel function and the control correction value.
Define the kernel associative function
where $i$ is the current time and $k$ is the iteration number; $Co_k(i)$ is the kernel associative function; $d(i,j)$ is the kernel function; and $\Delta u_k(i)$ is the correction value (in this paper, the correction value is the error $e_k(i)$). The product $d(i,j)\Delta u_k(i)$ denotes the correction that $\Delta u_k(i)$ contributes at point $j$.
where $K_E$ is a weight coefficient and $K_L$ is an associative gain matrix.
The kernel associative function $Co_k(i)$ is shown in Fig. 1. The abscissa represents the time axis; the learning interval $[0, T]$ is divided into $N$ segments of length $\Delta t = T/N$, with nodes $0, \Delta t, 2\Delta t, \ldots, N\Delta t$. For simplicity, the nodes are written as $0, 1, 2, \ldots, N$.
Fig.1 Kernel associative learning function
If a learning error occurs in the $k$th trial, the learning algorithm produces a correction value $\Delta u_k(j)$ (in traditional ILC, the correction $\Delta u_k(j)$ is the error $e_k(j)$ or the error difference $e_k(j) - e_k(j-1)$) and modifies the value at the corresponding time point $j$ for the next input, namely, $u_{k+1}(j) = u_k(j) + \Delta u_k(j)$ (the traditional P-type or D-type ILC algorithms). The modified control input $u_{k+1}(j)$ is then used as the control input during the $(k+1)$th trial.
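The single-point update described above can be sketched for a scalar plant as follows. This is a minimal illustration under stated assumptions, not the paper's simulation: the plant coefficients, the reference, and the gain are hypothetical, and the sketch uses the shifted error $e_k(j+1)$ as the P-type correction because the input at time $j$ first affects the output at time $j+1$ in a plant of the form $x(j+1) = a\,x(j) + b\,u(j)$, $y(j) = c\,x(j)$.

```python
import math

def simulate(a, b, c, u, x0=0.0):
    """One trial of x(j+1) = a*x(j) + b*u(j), y(j) = c*x(j); returns y(0..N)."""
    x, y = x0, [c * x0]
    for uj in u:
        x = a * x + b * uj
        y.append(c * x)
    return y

def p_type_ilc(a, b, c, yd, kp, trials):
    """Traditional P-type ILC (assumed form): u_{k+1}(j) = u_k(j) + kp * e_k(j+1)."""
    n = len(yd) - 1
    u = [0.0] * n                                   # zero initial input
    max_errors = []
    for _ in range(trials):
        y = simulate(a, b, c, u)
        e = [yd[j] - y[j] for j in range(n + 1)]    # tracking error of this trial
        max_errors.append(max(abs(v) for v in e))
        # Each point j is corrected only by its own error sample.
        u = [u[j] + kp * e[j + 1] for j in range(n)]
    return u, max_errors

# Hypothetical plant and reference (not taken from the paper).
N = 20
yd = [math.sin(2 * math.pi * j / N) for j in range(N + 1)]
_, errs = p_type_ilc(a=0.5, b=1.0, c=1.0, yd=yd, kp=0.4, trials=60)
print(errs[0], errs[-1])
```

With these values $|1 - k_P c b| = 0.6 < 1$, and the maximal tracking error shrinks from trial to trial; each control sample is still corrected in isolation, which is the limitation the associative scheme targets.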
From the above analysis, we can see that traditional ILC makes only a single-point amendment to the current control point and has no effect on other time points. Fig. 2 shows the structure of the P-type ILC associative algorithm.
Fig.2 Structure of P-type ILC associative algorithm
At the time point $i$ of the $k$th trial, the error and the control input are computed by the traditional ILC. For $j > i$, the control values at points $i+1, i+2, \ldots, N$ are pre-corrected according to (3). These pre-correction values are superimposed on the corresponding points immediately, without waiting for the $(k+1)$th trial.
The ILC algorithm design problem can be stated as finding a control updating law
which consists of a feed-forward ILC part and an associative controller; $K_P$ is a proportional learning gain matrix. The output tracking error is defined as $e_k(j) = y_d(j) - y_k(j)$.
A simple example is illustrated in Fig. 3 to show the modification value of each point in the iteration domain. In the $k$th trial, the learning process runs from point 0 to point $N$.
When the process reaches point $i$, the error at this point is $e_k(i)$, and the points $j > i$, namely $j = i+1, i+2, \ldots, N$, need to be pre-corrected. Fig. 3 is the predictive correction diagram for point $i$, and Table 1 lists the corresponding predictive correction values.
Fig.3 The pre-associative diagram
Table 1 All of the predictive corrected values
From the above description of associative learning, the control input at each point in the $k$th learning process can be easily obtained. For example, when $j = 3$, the associative algorithm at this point is expressed as follows:
Remark 3 The “association” iterative learning algorithm proposed in this paper is based on a given kernel function (a monotonically decreasing function) and uses the error $e_k(j)$ at the current time point to fine-tune the control at subsequent time points that have not yet occurred, so as to achieve rapid convergence.
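The in-trial pre-correction can be sketched as follows for a scalar plant. Several choices here are assumptions rather than the paper's definitions: the kernel is taken to be the exponentially decaying function $d(i,j) = K_L e^{-K_E (j-i)}$ (one possible monotonically decreasing kernel), the pre-corrected input is carried over to the next trial, the P-type part uses $e_k(j+1)$, and the plant and reference are hypothetical.

```python
import math

def kernel(i, j, ke=1.0, kl=0.1):
    """Assumed kernel d(i, j) = kl * exp(-ke * (j - i)) for j > i (not the paper's formula)."""
    return kl * math.exp(-ke * (j - i))

def associative_trial(a, b, c, u, yd, ke, kl):
    """One trial in which the error at time i pre-corrects the inputs at times j > i."""
    n = len(u)
    u = list(u)                       # working copy, corrected while the trial runs
    x, e = 0.0, []
    for i in range(n + 1):
        e_i = yd[i] - c * x           # error observed at the current time i
        e.append(e_i)
        for j in range(i + 1, n):     # subsequent inputs that are not yet applied
            u[j] += kernel(i, j, ke, kl) * e_i
        if i < n:
            x = a * x + b * u[i]      # apply the (already pre-corrected) input
    return u, e

def associative_p_type_ilc(a, b, c, yd, kp, ke, kl, trials):
    n = len(yd) - 1
    u = [0.0] * n
    max_errors = []
    for _ in range(trials):
        u_applied, e = associative_trial(a, b, c, u, yd, ke, kl)
        max_errors.append(max(abs(v) for v in e))
        # P-type correction between trials, on top of the applied input.
        u = [u_applied[j] + kp * e[j + 1] for j in range(n)]
    return u, max_errors

# Hypothetical plant and reference (not taken from the paper).
N = 20
yd = [math.sin(2 * math.pi * j / N) for j in range(N + 1)]
_, errs_assoc = associative_p_type_ilc(0.5, 1.0, 1.0, yd, kp=0.4, ke=1.0, kl=0.1, trials=80)
_, errs_p = associative_p_type_ilc(0.5, 1.0, 1.0, yd, kp=0.4, ke=1.0, kl=0.0, trials=80)
print(errs_assoc[:5], errs_p[:5])
```

Setting `kl=0.0` switches the pre-correction off and leaves the plain P-type update, so the same function can be used to compare the two error histories. How much the association helps in this sketch depends on the assumed kernel, the gains, and the plant.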
To prove the convergence of the proposed associative learning algorithm, the following important lemmas are given.
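Lemma 1 [33] For a matrix $M \in \mathbb{R}^{n \times n}$ and any $\varepsilon > 0$, there exists a matrix norm $\|\cdot\|$ induced by a vector norm on $\mathbb{R}^n$ such that $\|M\| \le \rho(M) + \varepsilon$, where $\rho(M) = \max_i |\lambda_i|$ is the spectral radius and $\lambda_i$ are the eigenvalues of $M$.
Lemma 2 If the matrices $M \in \mathbb{R}^{n \times n}$ and $N \in \mathbb{R}^{n \times n}$ are both upper-triangular and each has equal principal diagonal elements, then the spectral radius of $MN$ is less than or equal to the product of the spectral radii of $M$ and $N$, that is, $\rho(MN) \le \rho(M)\rho(N)$.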
Proof Set matrices M and N to be
and
It follows from the properties of eigenvalues that the eigenvalue of $M$ is $a$ and the eigenvalue of $N$ is $b$.
Accordingly, the eigenvalue of $MN$ is $ab$. By the definition of the spectral radius, this gives $\rho(MN) \le \rho(M)\rho(N)$.
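The argument of Lemma 2 can be checked numerically on a small case. The two $3 \times 3$ matrices below are hypothetical; the sketch only relies on the fact that the eigenvalues of a triangular matrix are its diagonal entries.

```python
def matmul(m, n):
    """Plain product of two square matrices given as lists of rows."""
    size = len(m)
    return [[sum(m[r][t] * n[t][c] for t in range(size)) for c in range(size)]
            for r in range(size)]

# Hypothetical upper-triangular matrices with constant diagonals a and b.
a, b = 0.5, -0.8
M = [[a, 2.0, -1.0],
     [0.0, a, 3.0],
     [0.0, 0.0, a]]
N = [[b, 1.0, 4.0],
     [0.0, b, -2.0],
     [0.0, 0.0, b]]

P = matmul(M, N)
diag = [P[r][r] for r in range(3)]                   # eigenvalues of the triangular product
lower = [P[r][c] for r in range(3) for c in range(r)]
rho_product = max(abs(d) for d in diag)              # spectral radius of MN
print(diag, rho_product, abs(a) * abs(b))
```

The product stays upper-triangular with every diagonal entry equal to $ab$, so in this case $\rho(MN) = |ab| = \rho(M)\rho(N)$ and the bound of Lemma 2 holds with equality.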
Theorem 1 Assume that the LTI system (1) satisfies Assumptions 1 and 2 and that the ILC algorithm with kernel association (5) is adopted. If $\rho(I - K_P CB) < 1$, then the output trajectory under the learning algorithm (5) converges uniformly to the desired trajectory; that is, $y_k(j) \to y_d(j)$ as $k \to \infty$, $j \in S_N$.
Proof In this paper,let d(i,j)=prove the convergence of associative algorithm,j ∈SN.
Define
Thus, using (5) and (10) gives
and
According to the LTI system (1),
Using (2) and (14), we have
According to Assumption 1, we can obtain
For the case $j = 0$, substituting (16) into (12) yields
where $I$ is the identity matrix.
For the case $j = 1$, similar to (17), we can obtain
For the case $j = 2$, similar to the above,
Summarizing (19), we can obtain
For the case $j = 3$, similar to (19),
Rearranging (21), we can obtain
Proceeding in the same fashion, the following equation can be obtained for $j = N - 1$:
Summarizing (23), we can obtain
Define
According to (17), (18), (20), (22) and (24), (12) can be further rewritten in the following composite form:
where $\Delta U_k = U_d - U_k$.
According to (28), we can obtain
From (29) and Lemma 2, we have
Since $\rho(I - K_P CB) < 1$, according to Lemma 1, then
This implies that
According to (16) and (32), we have
Therefore, $y_k(j) \to y_d(j)$ as $k \to \infty$, $j \in S_N$.
Remark 4 Although the proposed associative iterative learning algorithm is similar in form to the traditional high-order discrete learning algorithm, its learning process is completely different. Traditional high-order ILC is an algebraic superposition of the control information of the previous two or more trials at the corresponding time. The new iterative learning algorithm pre-corrects the subsequent time points that have not yet occurred with the error value at the current time in the same trial. The two are essentially different.
Remark 5 When $K_L = 0$, the iterative learning law with associative properties (5) reduces to the traditional P-type ILC, whose convergence condition is $\rho(I - K_P CB) < 1$. This shows that extending the traditional P-type ILC to an associative learning algorithm is theoretically well founded.
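For the special case $K_L = 0$, the origin of this condition can be made explicit with a short derivation. It is a sketch under assumptions: the plant is taken in the standard form $x_k(j+1) = A x_k(j) + B u_k(j)$, $y_k(j) = C x_k(j)$, the P-type law is taken as $u_{k+1}(j) = u_k(j) + K_P e_k(j+1)$, and $\Delta u_k(j) = u_d(j) - u_k(j)$.

```latex
% Assumed: x_k(0) = x_d(0) (Assumption 1), so the state error starts at zero.
\begin{align*}
e_k(j+1) &= \sum_{i=0}^{j} C A^{\,j-i} B \,\Delta u_k(i), \\
\Delta u_{k+1}(j) &= \Delta u_k(j) - K_P\, e_k(j+1) \\
                  &= (I - K_P C B)\,\Delta u_k(j)
                     - K_P \sum_{i=0}^{j-1} C A^{\,j-i} B \,\Delta u_k(i).
\end{align*}
```

Stacking these relations for $j = 0, 1, \ldots, N-1$ gives $\Delta U_{k+1} = H\,\Delta U_k$ with $H$ block lower-triangular and every diagonal block equal to $I - K_P CB$. The eigenvalues of $H$ are those of its diagonal blocks, so $\rho(H) = \rho(I - K_P CB)$, and the contributions from earlier time points sit strictly below the diagonal without affecting the spectral radius.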
Remark 6 The traditional P-type ILC is a feed-forward learning control. As long as the dynamics of the controlled object and the control objective remain the same, the system is sure to converge as the number of trials $k$ increases. The associative P-type iterative learning law (5) is a generalization of the traditional P-type law. It is equivalent to adding real-time feedback and correction information, which can effectively compensate for the lack of prior knowledge in the conventional P-type ILC and improve the convergence of the traditional P-type algorithm.
In this section, the proposed ILC scheme is extended to linear time-varying systems
$$x_k(j+1) = A(j) x_k(j) + B(j) u_k(j), \quad y_k(j) = C(j) x_k(j), \tag{34}$$
where $A(j)$, $B(j)$, and $C(j)$ are time-varying matrices of appropriate dimensions and $C(j+1)B(j)$ has full column rank. The result is summarized in the following corollary.
Corollary 1 For the discrete-time linear time-varying system (34), suppose that the ILC algorithm with kernel association (5) is adopted and the learning gain matrix $K_P$ is chosen such that $\rho(I - K_P(j)C(j+1)B(j)) < 1$. Then the output trajectory under the learning algorithm (5) converges uniformly to the desired trajectory; that is, $y_k(j) \to y_d(j)$ as $k \to \infty$, $j \in S_N$.
Proof The proof can be performed similarly as Theorem 1.
Considering the time-varying system (34), we have
Since a similar relationship also holds in the $j$th trial, it follows that
Replacing (14) and (15) in the proof of Theorem 1 with (35) and (36), and given that $\rho(I - K_P(j)C(j+1)B(j)) < 1$, it can be concluded that the control error and then the tracking error vanish as $k \to \infty$, so that $y_k(j)$ tends to $y_d(j)$.
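The pointwise condition of Corollary 1 can be checked on a scalar time-varying plant with a short sketch. Everything numeric here is hypothetical (the coefficient functions `a`, `b`, `c`, the gain `kp`, and the reference), and the sketch runs only the P-type part of the law, that is, the case $K_L = 0$, with the assumed update $u_{k+1}(j) = u_k(j) + K_P(j) e_k(j+1)$.

```python
import math

# Hypothetical time-varying scalar coefficients (not the paper's Case 2 system).
def a(j): return 0.3 * math.cos(0.2 * j)
def b(j): return 1.0 + 0.2 * math.sin(0.3 * j)
def c(j): return 1.0
def kp(j): return 0.4

def condition_values(n):
    """Scalar form of rho(I - K_P(j) C(j+1) B(j)) for j = 0, ..., n-1."""
    return [abs(1.0 - kp(j) * c(j + 1) * b(j)) for j in range(n)]

def tv_p_type_ilc(yd, trials):
    """P-type ILC on x(j+1) = a(j)x(j) + b(j)u(j), y(j) = c(j)x(j), x(0) = 0."""
    n = len(yd) - 1
    u = [0.0] * n
    max_errors = []
    for _ in range(trials):
        x, y = 0.0, [c(0) * 0.0]
        for j in range(n):
            x = a(j) * x + b(j) * u[j]
            y.append(c(j + 1) * x)
        e = [yd[j] - y[j] for j in range(n + 1)]
        max_errors.append(max(abs(v) for v in e))
        u = [u[j] + kp(j) * e[j + 1] for j in range(n)]
    return u, max_errors

N = 20
yd = [math.sin(2 * math.pi * j / N) for j in range(N + 1)]
cond = condition_values(N)
_, errs = tv_p_type_ilc(yd, trials=100)
print(max(cond), errs[-1])
```

For these coefficients the condition value stays between 0.52 and 0.68 at every sampling step, and the maximal tracking error decays over the trials.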
Remark 7 Theorem 1 and Corollary 1 both rely on the same identical initialization condition (Assumption 1).
To illustrate the effectiveness of the proposed learning algorithm, we introduce the following examples and compare the performance of the proposed algorithm with that of the traditional P-type ILC.
Case 1 Time-invariant system [34]: consider the system
where $x_k(0) = [0, 0, 0]^{\mathrm{T}}$, the sample point $j \in S_{50}$ covers the time interval of each trial, and the desired trajectory is set to
The initial control input $u_0(j)$ is generated randomly by the rand function. The learning gain $K_P = 0.4$ is selected (the simulation example is a single-input single-output system, so the gain matrix is a real number). For the new learning algorithm, the exponential correction parameters are $K_E = 1$ and $K_L = 0.1$, and the convergence condition is satisfied since $\rho(I - K_P CB) = 0.6 < 1$.
Fig. 4(a) and Fig. 4(b) show the tracking of the desired output at the 10th and the 20th trials by the traditional algorithm and the new learning algorithm. As can be seen from Fig. 4, the system output tracks the desired output over the whole interval as the number of trials increases. The proposed association algorithm needs only 10 trials, while the traditional algorithm needs 20 trials.
Fig.4 Tracking profile of the system
To facilitate the comparative analysis, the new algorithm and the traditional P-type iterative learning algorithm are studied through numerical simulation under the same simulation environment. The iterative maximal tracking errors of the two algorithms are shown in Fig. 5, that is,
Fig.5 Comparison of tracking errors in Case 1
It can be seen from Fig. 5 that the modified P-type control law converges to within the error band (0.0083) after 10 iterations, whereas the traditional control law requires 24 iterations to reach the same error band.
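The trial counts quoted above can be read off an error history with a small helper. The sketch assumes that the maximal tracking error of a trial is $\max_j |y_d(j) - y_k(j)|$ and that "converging to the error band" means that the error stays within the band from that trial onward; the function names and the error sequence in the usage lines are hypothetical and are not the paper's data.

```python
def max_tracking_error(yd, y):
    """Assumed definition: largest absolute output error within one trial."""
    return max(abs(d - v) for d, v in zip(yd, y))

def first_trial_within_band(max_errors, band):
    """Index of the first trial from which every later error stays within the band."""
    for k in range(len(max_errors)):
        if all(e <= band for e in max_errors[k:]):
            return k
    return None  # the band is never reached within the recorded trials

# Synthetic error history, for illustration only.
history = [1.0, 0.5, 0.2, 0.05, 0.02, 0.004, 0.001]
print(first_trial_within_band(history, band=0.0083))
print(max_tracking_error([0.0, 1.0, 2.0], [0.0, 0.5, 2.5]))
```

Requiring the error to remain inside the band, rather than to enter it once, avoids counting a trial whose error dips below the threshold and then rises again.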
Case 2 Time-varying system [35]: to show the effectiveness of the new P-type ILC algorithm, a time-varying system is given as follows:
The initial state is set as $x_k(0) = [0, 0, 0]^{\mathrm{T}}$, and the desired trajectory is
where $j \in S_{50}$. Without loss of generality, the input of the initial iteration is set to zero, namely, $u_0(j) = 0$, $j \in S_{50}$. The learning gain $K_P = 0.4$ is selected, and the exponential correction parameters of the new learning algorithm are $K_E = 1$ and $K_L = 0.1$, for which the convergence condition of Corollary 1 is satisfied.
The maximal tracking error is presented in Fig. 6: the modified P-type control algorithm converges to within the error band (0.0183) after 55 trials, whereas the traditional control law requires 126 trials to reach the same error band.
Fig.6 Comparison of tracking errors in Case 2
These results show that the proposed control algorithm learns considerably faster than the traditional method: it reaches the same error band in fewer than half as many trials in both cases (10 versus 24 in Case 1, and 55 versus 126 in Case 2).
Remark 8 The convergence condition of the proposed algorithm is ultimately independent of the kernel function. We regard this as an advantage of the algorithm, since it imposes no additional constraints. For most conventional linear discrete systems, there is no need to change the convergence condition, and the stability of the algorithm is guaranteed. An appropriate association algorithm greatly speeds up learning convergence, which shows that the kernel-based auto-associative iterative learning idea is correct and feasible.
We design a P-type ILC strategy with kernel-based auto-associative memory for a linear discrete system. During each iterative learning process, the control value at the current time is corrected. Meanwhile, the control value at subsequent times is pre-corrected by association, which speeds up the learning process. The simulation results illustrate the convergence speed of the association algorithm, which depends on the proportional learning gain, the learning interval, and the exponential factor. How to choose the proportional learning gain and the exponential factor remains a topic for further investigation.
Journal of Systems Engineering and Electronics, 2020, Issue 2