M.D.S.ALIYU
Department of Electrical Engineering, King Faisal University, Al-Ahsa, 31982, Saudi Arabia
Hamilton-Jacobi theory has remained the cornerstone of modern optimal control theory and advanced mechanics [1–5]. Unfortunately, a serious setback in the practical application of nonlinear optimal control theory is the difficulty of solving the Hamilton-Jacobi-Bellman-Isaacs partial differential equations (HJBIEs) [6–11], [4–12], [5–15] and [16–19]. There are no closed-form solutions for them, and no proven, established systematic numerical approaches for solving them. Moreover, the concept of viscosity solutions [2, 20–23], originally developed by Lions [22], is useful only for proving and deriving theoretical results.
Nevertheless, various attempts have been made to find computationally sound methods for solving the HJBIEs, and there is a vast literature on the subject. An excellent review of past approaches can be found in [13, 14]. In Lukes [16], Glad [15], Isidori [24], and Huang [25], Taylor series-based approximation approaches are presented, while in [13, 14, 18, 26] Galerkin and other basis-function expansions are used. More recently, in [27, 28] policy iterations are used to derive iterative solutions in closed form. This method is also similar in spirit to the ones presented in [13, 14]. However, the validity of the method has only been demonstrated with scalar systems. A similar recursive approach is utilized in [19] to compute stabilizing solutions, starting from a stabilizing solution to the corresponding Riccati equation for the linearization of the system.
In addition, attempts to find exact and analytical approaches for solving the HJBIEs have also been made in [6–9, 17]. These approaches attempt to convert the HJBIEs to algebraic equations, the solution to which can yield the gradient of the desired scalar function. In fact, these were some of the first attempts to derive closed-form solutions to the HJBIEs. However, the success of the approaches in [7, 8] is significantly undermined by the difficulty of solving the resulting discriminant equations. Alternatively, in [17] an attempt is made to find the algebraic gradient from the maximal involutive ideal that contains the Hamiltonian function of the corresponding Hamiltonian system.
On the other hand, in [11, 25] neural-network or basis functions and Taylor series approximations, respectively, are utilized to obtain recursive solutions to the discrete-time problem. These methods share a lot of spirit with the one originally developed in [26], and are so far some of the most tangible approaches to the discrete-time problem. Moreover, numerical methods using finite elements and finite differences are also available [21, 23].
The problems with most of the methods so far presented are twofold: i) they are computationally expensive, requiring the solution to a system of N nonlinear equations for N basis functions; ii) they do not approximate the scalar function directly, but instead approximate its gradient. These two problems seriously limit the applicability of the methods.
Thus, in this paper, we present yet another iterative approach to the solution of the HJBIEs. We apply fixed-point iterations [29, 30] in Banach spaces to successively approximate the scalar value-function directly, as opposed to its gradient, and we establish convergence of the approaches under fairly mild assumptions. The approaches are computationally efficient and can easily be automated using symbolic algebra packages such as MAPLE, MATHEMATICA, and MATLAB. It is hoped that the results presented in this paper will represent the first attempt at establishing systematic, computationally efficient and concrete approaches for solving the HJBIEs which are also solidly founded on well-known methods of fixed-point theory in mathematical analysis.
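As a minimal illustration of the method of successive approximation mentioned above, the following Python sketch iterates a scalar contraction until successive iterates agree. The function name `fixed_point`, the tolerance, and the choice of the map cos are assumptions made purely for illustration; the sketch is not the iteration developed in this paper, which acts on functions rather than on numbers.

```python
import math

def fixed_point(T, v0, tol=1e-12, max_iter=1000):
    """Successive approximation v_{k+1} = T(v_k) until |v_{k+1} - v_k| < tol."""
    v = v0
    for k in range(1, max_iter + 1):
        v_next = T(v)
        if abs(v_next - v) < tol:
            return v_next, k
        v = v_next
    raise RuntimeError("no convergence within max_iter iterations")

# cos maps [0, 1] into itself and |sin x| <= sin 1 < 1 there,
# so it is a contraction on [0, 1] and the iteration converges.
v_star, iterations = fixed_point(math.cos, 1.0)
```

Because a contraction only guarantees a linear rate, this toy iteration needs several dozen steps; the quadratic rate claimed for the formulas in this paper would have to come from the specific structure of those maps.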
The rest of the paper is organized as follows. In Section 2, we begin with preliminaries and the problem definition. Then, in Section 3, we develop the iterative methods for the HJBIEs in deterministic nonlinear optimal control. Convergence results for the methods are discussed and some examples are presented. These methods are then extended to the stochastic Hamilton-Jacobi-Bellman equation (SHJBE) arising in stochastic nonlinear optimal control in Section 4. Again, convergence results for the method are discussed. Finally, conclusions and suggestions for future work are presented in Section 5. Hereafter the notation will be standard except where otherwise stated.
We consider the time-invariant or stationary HJBIEs associated with the infinite-horizon optimal control of the following smooth affine nonlinear state-space system Σ defined over a subset X ⊂ R^n in coordinates (x1, ..., xn):
where x = (x1, ..., xn)^T ∈ X is the state vector; w ∈ W ⊂ R^s is the disturbance into the system, which belongs to the set W of admissible disturbances; u ∈ U is the control input, which belongs to the set U ⊂ R^p of admissible controls; and z ∈ R^r is an objective or error function. The maps are f: X → R^n, g1: X → R^{n×s}, g2: X → R^{n×p}, and h: X → R^m. We also assume that for u ∈ U and any x(t0) ∈ X, there exist smooth solutions to the system Σ [31]. In addition, x0 = 0 is an equilibrium point of the system such that for w = 0, u = 0, f(x0) = 0.
Our aim in this paper is to find iteratively approximate solutions of the HJBIE (2) associated with the optimal control of system (1) in a region Ω ⊂ X. We consider the Banach space of bounded real continuous functions from Ω to R with the supremum norm, BC((Ω, R), sup|·|), which for brevity we shall simply denote by BC(Ω). However, we shall focus particular attention on a subset of this set containing functions that are also smooth, i.e., V(Ω) := C^∞ ∩ BC(Ω). Moreover, if we assume Ω to be compact, then it is sufficient to consider V(Ω) := C^∞(Ω), since every continuous function on a compact metric space is bounded [29].
In the sequel, we construct smooth maps of the form … such that … has a fixed point in V(Ω). We also show that, starting from any element V0 ∈ V(Ω), the method of successive approximation can be applied to find the fixed point V* for each map; moreover, convergence to this fixed point is shown to be quadratic.
Our aim in this section is to develop iterative or successive approximation methods for solving the Hamilton-Jacobi-Bellman-Isaacs equation (HJBIE) arising in optimal control problems for affine nonlinear systems using classical vector analysis tools. Accordingly, let A: R^3 → R^3 be a vector field and φ: R^3 → R a scalar function. Then, our approach is based on the following identity:
where we have used the notation … . At this point, it is highly tempting to apply a gradient-based method such as steepest descent or Newton's method [10, 29], or one of their variants, to invert the maps (4), (5). Unfortunately, … and therefore these methods fail. Nevertheless, by naive calculations from (5) and (6), we can define respectively the following inverse maps … by
In this section, we develop the successive approximation method for the solution of the HJBIE based on the formulae (6), (7) and with the exact expression of the gradient DV(x) = ∇V(x), while in the next section, we develop the result with an approximate gradient. Accordingly, let us first consider the map (6), leading to the following iterative formula
Assumption 1 For the nonlinear system Σ (1), the following assumptions hold:
b) ∃ 0 < κ1, κ2, κ3 < ∞ (real constants) such that
Proposition 1 Consider the HJBE (2) and let Assumption 1 be satisfied by the system. Suppose, in addition, that the solution V* to the HJBIE (2) is such that
Then, starting with an approximation V0 ∈ V(Ω), the approximation error at every iteration of the formula (8) remains pointwise bounded for all x ∈ Ωr := {x : ‖x − x0‖ < r} ⊂ Ω (r small).
Proof Using (8), it is easy to show that
Now note that,
Therefore, using (12) in (11), we have
We seek smooth successive approximations Vk, k = 1, ..., to the solution V* of (2) in the neighborhood Ωr. Thus, the difference ∇Vk(x) − ∇V*(x) can be estimated as
If V0(x) is smooth, then the iterative formula (8) generates smooth (except possibly at isolated points) successive approximations Vk to the solution V* of (2). Thus, for ‖x − x0‖ < r, ∃ ε1 > 0, ε2 > 0 such that
where ε = ε1 + ε2. The last term in (15) can be estimated from a first-order Taylor approximation of the difference Vk − V* around x0, as
for all x in the neighborhood Ωr. Therefore, by the triangle inequality
Consequently, using (16) in (15), we have
Finally, using (17) in (13), we get
This shows that the iteration error is bounded; for if we start with k = 0, we see that the error |V1(x) − V*(x)| is pointwise bounded by |V0(x) − V*(x)|. Similarly, the error |V2(x) − V*(x)| is pointwise bounded by |V1(x) − V*(x)|, and so on. Note also that the above result holds for … = r + ε, ε small, and thus for … □
The next result summarizes the main convergence results of the method.
Theorem 1 Consider the HJBE (2) and the problem of finding the scalar function V: R^n → R that solves it. Suppose all the assumptions of Proposition 1 hold, and in addition, suppose γ2 < 1 in Ωr. Then, the iterative formula (8) starting with a smooth approximation V0 ∈ V(Ω) converges uniformly and quadratically to a smooth solution
Proof From the proof of Proposition 1, applying (19) inductively for k, k − 1, ..., 1, 0, we have
Taking the limit as k → ∞ in the above inequality (20), and since γ2 < 1, all terms in |V0(x) − V*(x)| go to 0 and we have …, a constant. This implies uniform convergence of the approximations Vk to the solution V*, albeit differing from it by a constant. Moreover, application of the boundary conditions V*(x0) = Vk(x0) = 0 at each iteration guarantees that Υ = 0. □
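To make the distinction between linear and quadratic convergence concrete, the following Python sketch generates two model error sequences, e_{k+1} = γ e_k and e_{k+1} = γ e_k^2. The recurrences, the constant γ = 0.5, the starting error, and the helper name `error_sequence` are illustrative assumptions only; they are not the error bounds (18)–(20) derived in this paper.

```python
def error_sequence(e0, gamma, order, steps):
    """Return [e_0, ..., e_steps] with e_{k+1} = gamma * e_k**order."""
    errors = [e0]
    for _ in range(steps):
        errors.append(gamma * errors[-1] ** order)
    return errors

# Linear rate: the error is halved at every step.
linear = error_sequence(0.5, 0.5, 1, 5)
# Quadratic rate: the error is squared (and halved) at every step.
quadratic = error_sequence(0.5, 0.5, 2, 5)
```

After five steps the linear model has only reduced the error by a factor of 32, whereas the quadratic model is already many orders of magnitude smaller, which is why a quadratic rate is the more desirable property for an iterative scheme.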
Remark 1 We notice also from inequality (18) that, if we let r → ∞, then the approximation error is linearly convergent, i.e., with
We consider an example at this point.
Example 1 Consider the following system and the example:
The resulting HJBIE for the H2 problem is
where
Example 2 We consider the linear system
It then follows that, if P* ∈ P is a solution to (26), then
Corollary 1 Consider the Riccati equation (26), and suppose there exists a symmetric solution P* to it. In addition, suppose for the system Σl, |Tr(A)| > 0, ‖Q‖ < ∞, and γ̄2 < 1. Then, starting with an initial approximation P0 ∈ P, the iterative formula (27) converges quadratically to a solution P* ∈ P of the Riccati equation (26).
Remark 3 The above recursive formula (27) and algorithm are similar in spirit to the ones proposed in [34–36].
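For comparison, the classical iteration of [34] can be sketched in a few lines for a scalar problem. The sketch below assumes the standard scalar linear-quadratic Riccati equation 2ap − (b²/r)p² + q = 0 and a stabilizing initial gain; the function name `kleinman_scalar` and the numerical values are hypothetical, and this is the iteration of [34], not the formula (27) of this paper.

```python
import math

def kleinman_scalar(a, b, q, r, k0, iterations=10):
    """Kleinman-type iteration for the scalar Riccati equation
    2*a*p - (b**2 / r) * p**2 + q = 0, started from a stabilizing gain k0."""
    if a - b * k0 >= 0:
        raise ValueError("k0 must be stabilizing: a - b*k0 < 0")
    k = k0
    for _ in range(iterations):
        # Scalar Lyapunov equation 2*(a - b*k)*p + q + r*k**2 = 0 for the current gain.
        p = (q + r * k ** 2) / (2.0 * (b * k - a))
        # Gain update k = b*p / r.
        k = b * p / r
    return p

# Assumed data: a = b = q = r = 1, for which the positive solution is 1 + sqrt(2).
p = kleinman_scalar(a=1.0, b=1.0, q=1.0, r=1.0, k0=2.0)
residual = 2.0 * 1.0 * p - p ** 2 + 1.0
```

Each pass solves a linear (Lyapunov) equation rather than the quadratic one, which is the feature these recursive Riccati algorithms have in common.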
Next, we consider the inverse map (7), and prove as well the convergence results for the iterative formula:
We begin similarly, with the following theorem.
Theorem 2 Consider the HJBIE (2) and suppose all the hypotheses of Theorem 1 hold for the system. Further, let the formula (29) be applied to obtain an approximate solution, and suppose there exists a real number η0 < ∞ such that for all iterations, the following inequality is satisfied,
Then, starting with an approximation V0 ∈ V(Ω), the approximation error |Vk+1(x) − V*(x)| at every iteration of the formula (29) remains pointwise bounded, and converges uniformly and quadratically to a constant as k → ∞ for all x ∈ Ωr.
Proof From (29) we have
Further rearrangement of the above equation leads to
Define now
and using this in (31), we have
which is exactly the same as (10) with ∇·f(x) replaced by … . Thus, by the hypothesis (30),
Consequently, invoking the results of Proposition 1, we have
which is exactly as (19), and so the rest of the proof follows from Theorem 1. □
Remark 4 The result of Theorem 2 clearly establishes the equivalence of the two iterative formulas (8), (29) and the resulting algorithms. However, the former (8) is computationally more tractable.
Remark 5 It is not possible to check condition (30) a priori. However, the condition is not stringent in the sense that it is rarely violated.
In this section, we explore the application of the iterative algorithm for the HJBIE with approximate gradient.
We focus particularly on the formula (8). Recall that the Fréchet derivative [29] generalizes the gradient of an ordinary scalar function on R^n to Banach spaces. For the function V at x, it can be implicitly defined as
for all vectors ν ∈ Ωr with origin at x. Equivalently, DV(x) can be defined as
Consequently, using ∇V(x) for DV(x), Vk(x) for V(x + ν), and Vk−1(x) for V(x) in (34), while restricting ν to unit vectors, we can replace (8) with the following formula:
Accordingly, we have the following result for the convergence of the algorithm.
Proof From (35) we have
Therefore,
Using results from the proof of Proposition 1, the second term in (36) can be estimated as
Consequently, inequality (36) reduces to
for some constant Ψ. Finally, observe that … . Therefore, … . Hence the result follows. □
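The defining property of the derivative recalled in this section, namely that V(x + ν) − V(x) is approximated by DV(x)·ν with a remainder that vanishes faster than ‖ν‖, can be checked numerically. The Python sketch below does this for a hypothetical test function V(x1, x2) = x1² + 3 x1 x2; the function, the base point, and the direction are assumptions chosen only for illustration and do not come from the examples of this paper.

```python
import math

def V(x):
    """Hypothetical smooth test function V(x1, x2) = x1^2 + 3 x1 x2."""
    return x[0] ** 2 + 3.0 * x[0] * x[1]

def grad_V(x):
    """Exact gradient of the test function."""
    return (2.0 * x[0] + 3.0 * x[1], 3.0 * x[0])

def remainder_ratio(x, direction, step):
    """|V(x + nu) - V(x) - grad V(x) . nu| / ||nu|| with nu = step * direction."""
    nu = (step * direction[0], step * direction[1])
    shifted = (x[0] + nu[0], x[1] + nu[1])
    g = grad_V(x)
    linear_part = g[0] * nu[0] + g[1] * nu[1]
    return abs(V(shifted) - V(x) - linear_part) / math.hypot(nu[0], nu[1])

x = (1.0, 2.0)
unit = (1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0))
# Ratios for step sizes 1e-1, 1e-2, 1e-3, 1e-4 along a fixed unit direction.
ratios = [remainder_ratio(x, unit, 10.0 ** (-k)) for k in range(1, 5)]
```

For this quadratic test function the ratio shrinks in proportion to the step size, which is the behaviour a difference quotient must show if it is to stand in for the gradient.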
In this section, we extend and modify the results of the previous section to the HJBE of stochastic optimal control [8, 15, 21, 23]. In particular, we consider the HJBE associated with the optimal control of the Itô stochastic differential system
The time-invariant SHJBE associated with the control problem
is given by
for some smooth C^2-function …: Ω ⊂ X → R, and where …, while E is the mathematical expectation operator. Suppose also that the nonlinear system (39) satisfies the equivalent of Assumption 2.
Assumption 2 For the nonlinear system Σs (39), the following assumptions hold:
Accordingly, define the iterative formula for (40) as
Theorem 4 Consider the SHJBE (40) and the problem of finding the scalar function …: R^n → R that solves it. Suppose Assumption 2 holds for the system, and in addition, … for the system. Then, the iterative formula (41) starting with a smooth approximation … converges uniformly and quadratically to a smooth solution … of (40).
Proof Using similar arguments as in Proposition 1, it is easy to show that
Thus,
for some δ > 0. Therefore, substituting (43) in (42) yields
Remark 7 It is sufficient to check that … and … are bounded at each iteration to guarantee that … and … remain bounded respectively, and subsequently to guarantee the convergence of the algorithm.
In this paper, we have presented new iterative approaches for solving the HJBIEs arising in the optimal deterministic and stochastic control of affine nonlinear systems. Fixed-point iterations in Banach spaces are applied to successively approximate the scalar value-function directly, and convergence results for the approaches have been established under fairly mild conditions. Some examples have also been worked out to demonstrate the effectiveness of the approaches. The approaches presented can also easily be automated using symbolic algebra packages.
It is hoped that the results presented will represent an attempt at establishing systematic, efficient computational procedures for solving the HJBIEs. Nevertheless, it should be noted that the results presented are preliminary and not exhaustive, since iterative maps are never unique. We do not yet claim in any way that the solutions computed will even be stabilizing. Therefore, it is expected that improvements, refinements, and more experimentation with the basic algorithms presented will be needed before a satisfactory computational procedure is established. Moreover, it would be worthwhile to see if iterative maps similar in spirit to the bisection and secant methods [39], which converge under much milder and more general assumptions, can be developed.
[1] R. Abraham, J. E. Marsden. Foundations of Mechanics. Reading, MA: Addison-Wesley, 1978.
[2] M. Bardi, I. Capuzzo-Dolcetta. Optimal Control and Viscosity Solutions of Hamilton-Jacobi-Bellman Equations. Systems & Control: Foundations & Applications, Boston: Birkhäuser, 1997. DOI https://doi.org/10.1007/978-0-8176-4755-1.
[3] V. Barbu, G. Da Prato. Hamilton-Jacobi Equations in Hilbert Space. London: Pitman Advanced Publishing Program, 1983.
[4] S. H. Benton. The Hamilton-Jacobi Equation: A Global Approach. New York: Academic Press, 1977.
[5] D. Kirk. Optimal Control Theory. Englewood Cliffs: Prentice Hall, 1972.
[6] M. D. S. Aliyu. Nonlinear H∞ Control, Hamiltonian Systems and Hamilton-Jacobi Equations. Boca Raton: CRC Press, 2011.
[7] M. D. S. Aliyu. An approach for solving the Hamilton-Jacobi-Isaacs equation (HJIE) in nonlinear H-infinity control. Automatica, 2003, 39(5): 877–884.
[8] M. D. S. Aliyu. A transformation approach for solving the Hamilton-Jacobi-Bellman equations in H2 deterministic and stochastic optimal control of affine nonlinear systems. Automatica, 2003, 39: 1243–1249.
[9] M. D. S. Aliyu, L. Smolinsky. A parametrization approach for solving the Hamilton-Jacobi equation and application to the A2 Toda lattice. Nonlinear Dynamics and Systems Theory, 2005, 5(4): 323–344.
[10] M. D. S. Aliyu. Adaptive solution of Hamilton-Jacobi-Isaac equation and practical H-infinity stabilization of nonlinear systems. Proceedings of the IEEE International Conference on Control Applications, Anchorage: IEEE, 2000: 343–348.
[11] A. Al-Tamimi, F. L. Lewis, M. Abu-Khalaf. Discrete-time nonlinear HJB solution using approximate dynamic programming: convergence proof. IEEE Transactions on Systems, Man, and Cybernetics – Part B: Cybernetics, 2008, 38(4): 943–949.
[12] T. Basar, P. Bernhard. H∞ Optimal Control and Related Minimax Design. New York: Birkhäuser, 1991.
[13] R. W. Beard, G. N. Saridis, J. T. Wen. Galerkin approximations of the generalized HJB equation. Automatica, 1997, 33(12): 2159–2177.
[14] R. W. Beard, G. N. Saridis, J. T. Wen. Successive Galerkin approximation algorithms for nonlinear optimal and robust control. International Journal of Control, 1998, 71(5): 717–743.
[15] S. T. Glad. Robustness of nonlinear state-feedback – a survey. Automatica, 1987, 23(4): 425–435.
[16] D. L. Lukes. Optimal regulation of nonlinear dynamical systems. SIAM Journal on Control, 1969, 7(1): 75–100.
[17] T. Ohtsuka. Solutions to the Hamilton-Jacobi equation with algebraic gradients. IEEE Transactions on Automatic Control, 2011, 56(8): 1874–1885.
[18] P. Tsiotras, M. Corless, M. Rotea. An L2 disturbance attenuation solution to the nonlinear benchmark problem. International Journal of Robust and Nonlinear Control, 1998, 8(4/5): 311–330.
[19] Y. Feng, B. D. O. Anderson, M. Rotkowitz. A game theoretic algorithm to compute local stabilizing solutions to HJBI equations in nonlinear H∞ control. Automatica, 2009, 45(4): 881–888.
[20] L. C. Evans. Partial Differential Equations. Providence: AMS, 1998.
[21] W. H. Fleming, M. Soner. Controlled Markov Processes and Viscosity Solutions. 2nd ed. London: Springer, 2006.
[22] P. L. Lions. Generalized Solutions of Hamilton-Jacobi Equations. Research Notes in Mathematics. London: Pitman Advanced Publishing Program, 1982.
[23] J. Yong, X. Zhou. Stochastic Controls, Hamiltonian Systems and HJB Equations. New York: Springer, 1999.
[24] A. Isidori, W. Lin. Global L2-gain design for a class of nonlinear systems. Systems & Control Letters, 1998, 34(5): 245–252.
[25] J. Huang. An algorithm to solve the discrete HJI equation arising in the L2-gain optimization problem. International Journal of Control, 1999, 72(1): 49–57.
[26] H. Guillard, S. Monaco, D. N. Cyrot. Approximate solutions to nonlinear discrete-time H∞ control. IEEE Transactions on Automatic Control, 1995, 40(12): 2143–2148.
[27] M. Abu-Khalaf, F. L. Lewis, J. Huang. Policy iterations on the Hamilton-Jacobi-Isaacs equation for H∞ state-feedback control with input saturation. IEEE Transactions on Automatic Control, 2006, 51(12): 1989–1993.
[28] M. Abu-Khalaf, F. L. Lewis. Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach. Automatica, 2005, 41(5): 779–791.
[29] E. Zeidler. Nonlinear Functional Analysis and Its Applications: Fixed Point Theorems. Heidelberg: Springer, 1985.
[30] J. M. Ortega, W. C. Rheinboldt. Iterative Solution of Nonlinear Equations in Several Variables. London: Academic Press, 1970.
[31] H. K. Khalil. Nonlinear Systems. New York: Macmillan Publishers, 1992.
[32] D. Cox, J. Little, D. O'Shea. Ideals, Varieties and Algorithms: An Introduction to Computational Algebraic Geometry and Commutative Algebra. 3rd ed. New York: Springer, 2007.
[33] S. Bittanti, A. J. Laub, J. C. Willems. The Riccati Equation. Berlin: Springer, 1991.
[34] D. L. Kleinman. On an iterative technique for Riccati equation computations. IEEE Transactions on Automatic Control, 1968, 13(1): 114–115.
[35] G. G. L. Meyer, H. J. Payne. An iterative method of solution of the algebraic Riccati equation. IEEE Transactions on Automatic Control, 1972, 17(6): 550–551.
[36] K. Vit. Iterative solution of the Riccati equation. IEEE Transactions on Automatic Control, 1972, 17(2): 258–259.
[37] J. M. Saniuk, I. B. Rhodes. A matrix inequality associated with bounds on solutions of algebraic Riccati and Lyapunov equations. IEEE Transactions on Automatic Control, 1987, 32(8): 739–740.
[38] F. Zhang, Q. Zhang. Eigenvalue inequalities for matrix products. IEEE Transactions on Automatic Control, 2006, 51(9): 1506–1509.
[39] W. Cheney, D. Kincaid. Numerical Mathematics and Computing. 7th ed. Belmont, CA: Brooks/Cole, 2012.
Control Theory and Technology, 2018, Issue 1