

        Subspace Minimization Conjugate Gradient Method Based on Cubic Regularization Model for Unconstrained Optimization


        Ting Zhao and Hongwei Liu

        (School of Mathematics and Statistics, Xidian University, Xi'an 710126, China)

        Abstract: Many methods have been put forward to solve unconstrained optimization problems, among which the conjugate gradient (CG) method is very important. With the increasing emergence of large-scale problems, subspace techniques have become particularly important and are widely used in the field of optimization. In this study, a new CG method is put forward which combines the subspace technique with a cubic regularization model. Besides, a special scaled norm in the cubic regularization model is analyzed. Under certain conditions, some significant properties of the search direction are given and the convergence of the algorithm is established. Numerical comparisons show that, for the 145 test functions in the CUTEr library, the proposed method is better than two classical CG methods and two new subspace conjugate gradient methods.

        Keywords: cubic regularization model; conjugate gradient method; subspace technique; unconstrained optimization

        0 Introduction

        The general unconstrained optimization problem is considered in this study. Its form is as follows:

        min f(x), x ∈ Rn

        (1)

        where f: Rn → R is the objective function. The CG method is one of the most important methods for solving the large-scale unconstrained optimization problem (1). Let αk be a stepsize; then the iterates satisfy

        xk+1=xk+αkdk

        (2)

        where dk is the search direction. In CG methods it has the general form

        d0 = -g0,  dk+1 = -gk+1 + βk dk,  k ≥ 0

        (3)

        where βk is a scalar CG parameter; different choices of βk, usually expressed through the Euclidean norm ‖.‖ and yk = gk+1 - gk, define the various CG methods. In recent years, many other effective CG methods have been proposed from different points of view[6-8].
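        For illustration, the following Python sketch implements the basic iteration of Eqs.(2)-(3) with a simple backtracking line search. It is only a generic example: the Hestenes-Stiefel formula is used as one possible choice of the CG parameter βk and is not claimed to be the specific formula of Eq.(3), and the objective and gradient functions are user-supplied.

import numpy as np

def nonlinear_cg(f, grad, x0, eps=1e-6, max_iter=10000):
    # Generic nonlinear CG sketch for Eqs.(2)-(3); the Hestenes-Stiefel beta is
    # used purely as an illustrative choice of the CG parameter.
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                    # initial direction: steepest descent
    for _ in range(max_iter):
        if np.linalg.norm(g, np.inf) <= eps:
            break
        alpha, gTd = 1.0, g @ d
        # simple Armijo backtracking (a Wolfe line search would be used in practice)
        while alpha > 1e-16 and f(x + alpha * d) > f(x) + 1e-4 * alpha * gTd:
            alpha *= 0.5
        x_new = x + alpha * d                 # Eq.(2): x_{k+1} = x_k + alpha_k * d_k
        g_new = grad(x_new)
        y = g_new - g                         # y_k = g_{k+1} - g_k
        dTy = d @ y
        beta = (g_new @ y) / dTy if abs(dTy) > 1e-12 else 0.0
        d = -g_new + beta * d                 # Eq.(3) with CG parameter beta_k
        if g_new @ d >= 0:                    # safeguard: restart if not a descent direction
            d = -g_new
        x, g = x_new, g_new
    return x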

        With the increasing scale of optimization problems, the subspace technique has been favored by more and more researchers. Subspace procedures are a class of highly effective numerical methods for large-scale optimization, since they do not require solving a large-scale subproblem at each iteration[9]. In 1995, Yuan and Stoer[10] first proposed the subspace minimization CG (SMCG) algorithm, in which the search direction is calculated by minimizing a quadratic approximation model over Ωk+1 = span{sk, gk+1}, i.e.:

        d=vsk+μgk+1

        (4)

        where sk = xk+1 - xk, and μ, v are parameters. They considered the following problem:

        min gk+1^T d + (1/2) d^T Bk+1 d,  d ∈ Ωk+1

        (5)

        where Bk+1 is an approximation of the Hessian matrix. The SMCG method is an extension of the classical CG method and is an efficient optimization method which has attracted the attention of many researchers. Because a three-dimensional subspace contains more information about the iterates than a two-dimensional one, inspired by SMCG, Li et al.[11] extended SMCG to the three-dimensional (Ωk+1 = span{gk+1, sk, sk-1}) subspace minimization conjugate gradient method (SMCG_NLS), and numerical comparisons showed that the proposed method works effectively.
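        As an illustration of the subspace step in Eqs.(4)-(5), the following Python sketch minimizes the quadratic model over the two-dimensional subspace by solving a 2×2 linear system. It assumes an explicit symmetric positive definite approximation B is available; practical SMCG implementations avoid forming B and instead estimate the required curvature quantities, so this is only a schematic example.

import numpy as np

def smcg_quadratic_direction(g_new, s_prev, B):
    # Minimize g^T d + 0.5 d^T B d over d = v*s_prev + mu*g_new (cf. Eqs.(4)-(5)).
    P = np.column_stack([s_prev, g_new])      # basis of the two-dimensional subspace
    A = P.T @ B @ P                           # reduced 2x2 Hessian approximation
    b = P.T @ g_new                           # reduced gradient
    v, mu = np.linalg.solve(A, -b)            # coefficients minimizing the model
    return v * s_prev + mu * g_new            # search direction of Eq.(4)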

        Generally, a quadratic model can approximate f(x) well in a small neighborhood of the minimizer, so iterative procedures are usually based on this model. However, when the objective function is highly nonlinear, a quadratic model may not approximate it very well when the iterates are far from the minimizer. To address this drawback, cubic regularization algorithms for unconstrained optimization have received increasing attention[12-13]. The idea is to combine a local quadratic approximation of the objective function with a cubic regularization term and then globally minimize it at each iteration. Cubic regularization was first introduced by Griewank[14] and was later considered by many authors, with global convergence and complexity analysis[15]. In fact, the cubic regularization algorithm needs to carry out multiple exploratory solution steps in a subspace at each iteration, which makes the solution of the subproblem more complex and requires considerable computation and storage. Solving this subproblem is the most important and most expensive part of the algorithm. Therefore, how to construct an approximate model closer to the objective function and how to solve the subproblem of the algorithm efficiently are the core challenges.

        In this study, a new three-term SMCG method inspired by SMCG_NLS is proposed based on a cubic regularization model with a special scaled norm (compared with standard cubic regularization, where the l2-norm is used to define the regularization term of the subproblem, the dominant computational cost of the resulting algorithm is mainly the cost of successful iterations, as the unsuccessful ones are inexpensive). Generally, when the objective function is close to a quadratic function, a quadratic model performs better, so the quadratic model is chosen in that case. Therefore, this study considers the dynamic selection of the appropriate approximate model, i.e., a quadratic model or a cubic regularization model, proposes a new algorithm, and analyzes its convergence. Finally, numerical experiments verify the effectiveness of the algorithm.

        1 Form and Solution of the Cubic Regularized Subproblem

        In this section, the form of the cubic regularized subproblem is briefly introduced by using a special scaled norm and the solution of the problem is provided.

        The ordinary form of the cubic regularized subproblem is given by

        min m(x) = c^T x + (1/2) x^T H x + (σ/3)‖x‖^3,  x ∈ Rn

        (6)

        where c ∈ Rn, σ > 0, and H ∈ Rn×n is a symmetric positive definite matrix.

        From Theorem 3.1 of Ref.[15], it is known that the point x* is a global minimizer of Eq.(6) if and only if

        (H+σ‖x*‖I)x*=-c

        (7)

        where I is the identity matrix. If H + σ‖x*‖I is positive definite, then x* is unique.

        Subsequently, another form of the cubic regularized subproblem with a special scaled norm is given:

        min m(x) = c^T x + (1/2) x^T H x + (σ/3)‖x‖_H^3,  x ∈ Rn

        (8)

        where the scaled norm is defined by ‖x‖_H = (x^T H x)^(1/2). Since H is symmetric positive definite, the change of variables

        a = H^(1/2) x,  β = H^(-1/2) c

        (9)

        transforms Eq.(8) into

        min ψ(a) = β^T a + (1/2)‖a‖^2 + (σ/3)‖a‖^3,  a ∈ Rn

        (10)

        Applying the optimality condition Eq.(7) to Eq.(10), with H replaced by the identity and z = ‖a*‖, gives

        (1+σz)Ia = -β

        The last expression can be equivalently written as

        ai = -βi/(1+σz),  i = 1, ..., n

        (11)

        where ai and βi are the components of vectors a and β, respectively. Taking Euclidean norms in Eq.(11) shows that z satisfies σz^2 + z = ‖β‖, whose unique nonnegative root is z* = (-1 + (1+4σ‖β‖)^(1/2))/(2σ) when σ > 0 (and z* = ‖β‖ when σ = 0).

        Based on the above derivation and analysis, the following corollary is obtained:

        Corollary 1.1 Let H be symmetric positive definite and σ ≥ 0. Then problem Eq.(8) has the unique global minimizer x* = -(1+σz*)^(-1) H^(-1) c, where z* is the nonnegative root of σz^2 + z = ‖H^(-1/2) c‖.
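        The closed form derived above can be implemented in a few lines. The following Python sketch is based on the reconstruction of Corollary 1.1 given here (the change of variables a = H^(1/2)x and the root of σz^2 + z = ‖β‖); it should be read as an illustration rather than the authors' code.

import numpy as np

def cubic_subproblem_scaled_norm(c, H, sigma):
    # Minimizer of  c^T x + 0.5 x^T H x + (sigma/3)*||x||_H^3,  ||x||_H = sqrt(x^T H x),
    # for symmetric positive definite H, via the change of variables a = H^{1/2} x.
    Hinv_c = np.linalg.solve(H, c)            # H^{-1} c
    beta_norm = np.sqrt(max(c @ Hinv_c, 0.0)) # ||beta|| = ||H^{-1/2} c||
    if sigma > 0:
        # unique nonnegative root of sigma*z^2 + z - ||beta|| = 0
        z = (-1.0 + np.sqrt(1.0 + 4.0 * sigma * beta_norm)) / (2.0 * sigma)
    else:
        z = beta_norm                         # sigma = 0 reduces to the Newton step
    return -Hinv_c / (1.0 + sigma * z)        # x* = -(1 + sigma*z*)^{-1} H^{-1} c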

        2 Search Direction and Wolfe Line Search

        2.1 Deduction of the New Search Direction

        In this study, the following was defined at each iteration:

        According to Ref.[16], tk is a quantity that measures how close the objective function is to a quadratic function on the line segment between the current iterate and the previous iterate. If the condition

        tk ≤ ω1 or (tk ≤ ω2 and tk-1 ≤ ω2)

        (12)

        holds, where 0 < ω1 ≤ ω2, the objective function is regarded as very close to a quadratic on the line segment between xk-1 and xk, and a quadratic approximation model is used. Otherwise, the cubic regularization model may be more suitable, because cubic regularization methods improve the worst-case iteration and gradient-evaluation complexity[17]. Thus dk is obtained by minimizing either a cubic regularization model or a quadratic model in the three-dimensional subspace.

        The following regularization model of f(x) was considered:

        min mk+1(d) = gk+1^T d + (1/2) d^T Hk+1 d + (σk+1/3)‖d‖_{Hk+1}^3,  d ∈ Ωk+1

        (13)

        where σk+1 is a dynamic nonnegative regularization parameter, ‖.‖_{Hk+1} is the scaled norm defined in Section 1, Hk+1 is a symmetric positive definite matrix satisfying the secant equation Hk+1 sk = yk, and Ωk+1 = span{gk+1, sk, sk-1}.

        It can be seen that mk+1(d) has the following property: if σk+1 = 0, Eq.(13) reduces to a quadratic approximation model. Therefore σk+1 = 0 is set when condition (12) holds; otherwise, σk+1 is obtained by an interpolation function. Obviously, the dimension of Ωk+1 may be 3, 2, or 1, and the search direction is derived in the following three situations.

        Situation 1: dim(Ωk+1) = 3. In this situation, the search direction dk+1 may be given by

        dk+1=μgk+1+νsk+τsk-1

        (14)

        where μ, ν, and τ are coefficients to be determined.

        By substituting Eq.(14) into Eq.(13), the following cubic regularized subproblem about μ, ν, and τ is obtained:

        min φ(u) = ρk+1^T u + (1/2) u^T Bk+1 u + (σk+1/3)(u^T Bk+1 u)^(3/2),  u = (μ, ν, τ)^T ∈ R3

        (15)

        where ρk+1 = (‖gk+1‖^2, gk+1^T sk, gk+1^T sk-1)^T and Bk+1 is the 3×3 matrix representing Hk+1 on the basis {gk+1, sk, sk-1},

        (16)

        where the entries of Bk+1 (using Hk+1 sk = yk and suitable estimates of the remaining curvature terms) can be seen in Ref.[11].

        From Corollary 1.1, the unique solution to Eq.(15) can be obtained:

        (μ, ν, τ)^T = -(1+σk+1 z*)^(-1) Bk+1^(-1) ρk+1

        (17)

        where z* is the nonnegative root of σk+1 z^2 + z = ‖Bk+1^(-1/2) ρk+1‖.

        So dk+1 can be calculated by Eq.(17) and Eq.(14) under the following conditions:

        (18)

        (19)

        and

        (20)

        Remark 1: Bk+1 in Eq.(15) is different from Bk+1 in Eq.(5). In model Eq.(13), Hk+1 is used to denote the approximation of the Hessian matrix.
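        To illustrate how the three-dimensional direction is assembled, the following sketch feeds the reduced gradient and a reduced 3×3 matrix into the closed-form solver cubic_subproblem_scaled_norm from the sketch in Section 1. The matrix B3 stands for Bk+1 of Eq.(16); its actual entries are built as in Ref.[11], so it is simply taken as an input here, and the sketch is illustrative only.

import numpy as np

def smcg_cr_direction_3d(g_new, s_k, s_prev, B3, sigma):
    # Eq.(15) is a three-variable instance of the scaled-norm cubic subproblem,
    # so (mu, nu, tau) follow from the closed-form solution (cf. Eq.(17)).
    c = np.array([g_new @ g_new, g_new @ s_k, g_new @ s_prev])   # reduced gradient
    mu, nu, tau = cubic_subproblem_scaled_norm(c, B3, sigma)
    return mu * g_new + nu * s_k + tau * s_prev                  # Eq.(14)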

        Situation 2: dim(Ωk+1) = 2. If only condition (19) holds, Ωk+1 = span{gk+1, sk} is considered, and dk+1 may be described by

        dk+1=μgk+1+vsk

        (21)

        where μ and v are parameters to be determined. Substituting Eq.(21) into Eq.(13), the corresponding problem Eq.(15) reduces to

        (22)

        From Corollary 1.1, the unique solution to Eq.(22) is obtained:

        (23)

        For convex quadratic functions, the search direction generated by Eq.(23) is collinear with the HS direction when the exact line search is used. So, under certain conditions, the HS direction can be regarded as a particular case of Eq.(23). The advantage of the HS method lies in its finite-termination property, which may make this choice of direction accelerate the convergence of the proposed algorithm. So when the conditions

        (24)

        Situation 3: dim(Ωk+1) = 1. In this situation, if none of conditions (18), (19), and (20) holds, the negative gradient direction is taken, namely,

        dk+1=-gk+1

        (25)

        2.2 Choice of the Nonmonotonic Line Search

        Obviously, it is very important to choose the suitable line search for an optimization method.In this part, a modified nonmonotone Wolfe line search was developed.

        The line search is an important element in the overall efficiency of most optimization methods. This study focuses on the ZH line search proposed by Zhang and Hager[19]; on this basis, some improvements are made to obtain a more suitable stepsize and better convergence results. The nonmonotone line search of Zhang and Hager is as follows:

        f(xk + αk dk) ≤ Ck + δ αk gk^T dk

        (26)

        g(xk + αk dk)^T dk ≥ σ gk^T dk

        (27)

        where 0 < δ < σ < 1, Q0 = 1, C0 = f0, and Qk+1 and Ck+1 are updated by

        Qk+1 = ηk Qk + 1,  Ck+1 = (ηk Qk Ck + fk+1)/Qk+1

        (28)

        where ηk ∈ [0, 1].

        Our improvements are as follows:

        C1=min{C0,f1+1.0},Q1=2.0

        (29)

        When k ≥ 1, Ck+1 and Qk+1 are updated by Eq.(28), where ηk is taken as

        (30)

        where mod(k, l) is defined as the remainder of k modulo l, l = max(20, n), and η = 0.7 when Ck - fk > 0.999|Ck|; otherwise η = 0.999. Such a choice of ηk can be used to dynamically control the nonmonotonicity of the line search[20].
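        The update of the reference value can be summarized in a short Python sketch. It follows the Zhang-Hager form of Eq.(28) and the modified initialization of Eq.(29); the rule of Eq.(30) for ηk is not reproduced here, so ηk is taken as an input.

def initial_refs(f0, f1):
    # Modified initialization of Eq.(29): C1 = min{C0, f1 + 1.0}, Q1 = 2.0 (with C0 = f0).
    return min(f0, f1 + 1.0), 2.0

def nonmonotone_refs_update(C_k, Q_k, f_new, eta_k):
    # Zhang-Hager style update (cf. Eq.(28)); eta_k in [0, 1] controls how nonmonotone
    # the line search is (eta_k = 0 recovers the monotone Armijo reference C = f).
    Q_new = eta_k * Q_k + 1.0
    C_new = (eta_k * Q_k * C_k + f_new) / Q_new
    return C_new, Q_new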

        3 Algorithm

        A new CG method is introduced in this section, which combines subspace technology and a cubic regularization model.

        Algorithm 1 SMCG method with cubic regularization (SMCG_CR)

        Step 1 Given an initial point x0 ∈ Rn and ε > 0, set d0 = -g0, C0 = f0, Q0 = 1, and k := 0.

        Step 2 If ‖gk‖∞ ≤ ε, stop.

        Step 3 Compute a stepsize αk > 0 satisfying Eq.(26) and Eq.(27). Let xk+1 = xk + αk dk. If ‖gk+1‖∞ ≤ ε, stop.

        Step 4 Compute the search direction dk+1.

        If conditions (18), (19), and (20) hold, go to 4.1; if only condition (19) holds, go to 4.2; otherwise, go to 4.3.

        4.1 If condition (12) holds, let σk+1 = 0; otherwise determine σk+1 by the interpolation function. Compute dk+1 by Eq.(14) and Eq.(17), then go to Step 5.

        4.2 If condition (12) holds, let σk+1 = 0; otherwise determine σk+1 by the interpolation function. Compute dk+1 by Eq.(21) and Eq.(23), then go to Step 5.

        4.3 Compute dk+1 = -gk+1 by Eq.(25), then go to Step 5.

        Step 5 Update Qk+1 and Ck+1 by Eq.(28) and Eq.(29), with ηk given by Eq.(30).

        Step 6 Let k := k + 1 and go to Step 2.
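        For readers who want to see how the pieces fit together, the following toy Python driver mirrors the structure of Algorithm 1. It is deliberately simplified and is not the authors' SMCG_CR: the subspace is limited to span{gk+1, sk}, Bk+1 is replaced by a Barzilai-Borwein scalar times the identity, σk+1 is a fixed small constant, conditions (12) and (18)-(20) are omitted, and the nonmonotone Wolfe search of Eqs.(26)-(27) is reduced to Armijo backtracking on the reference value Ck.

import numpy as np

def toy_smcg_cr(f, grad, x0, eps=1e-6, max_iter=5000):
    # Simplified driver mirroring Steps 1-6 of Algorithm 1 (illustrative only).
    delta, eta, sigma_reg = 1e-4, 0.85, 1e-2
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                    # Step 1: d0 = -g0
    C, Q = f(x), 1.0                          # nonmonotone references: C0 = f0, Q0 = 1
    for _ in range(max_iter):
        if np.linalg.norm(g, np.inf) <= eps:  # Step 2
            break
        alpha, gTd = 1.0, g @ d               # Step 3 (simplified line search)
        while alpha > 1e-16 and f(x + alpha * d) > C + delta * alpha * gTd:
            alpha *= 0.5
        x_new = x + alpha * d
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        sTy = s @ y
        if sTy > 1e-12:                       # Step 4: two-dimensional subspace direction
            P = np.column_stack([g_new, s])
            c = P.T @ g_new                   # reduced gradient
            A = (sTy / (s @ s)) * (P.T @ P)   # crude stand-in for the reduced Hessian
            try:
                Ainv_c = np.linalg.solve(A, c)
                bnorm = np.sqrt(max(c @ Ainv_c, 0.0))
                z = (-1.0 + np.sqrt(1.0 + 4.0 * sigma_reg * bnorm)) / (2.0 * sigma_reg)
                d = P @ (-Ainv_c / (1.0 + sigma_reg * z))
            except np.linalg.LinAlgError:
                d = -g_new
            if g_new @ d >= 0:                # safeguard: keep a descent direction
                d = -g_new
        else:
            d = -g_new                        # Step 4.3: steepest descent
        Q_new = eta * Q + 1.0                 # Step 5: reference update, cf. Eq.(28)
        C = (eta * Q * C + f(x_new)) / Q_new
        Q = Q_new
        x, g = x_new, g_new                   # Step 6
    return x

# usage on the two-dimensional Rosenbrock function (hypothetical test, not from the paper)
rosen = lambda x: 100.0 * (x[1] - x[0] ** 2) ** 2 + (1.0 - x[0]) ** 2
rosen_grad = lambda x: np.array([-400.0 * x[0] * (x[1] - x[0] ** 2) - 2.0 * (1.0 - x[0]),
                                 200.0 * (x[1] - x[0] ** 2)])
print(toy_smcg_cr(rosen, rosen_grad, np.array([-1.2, 1.0])))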

        4 Convergence Analysis

        The global convergence of SMCG_CR is established in this section. First, some theoretical properties of the directions dk+1 are analyzed. It is assumed that ‖gk‖ ≠ 0 for each k; otherwise, a stationary point has already been found for some k. In addition, the objective function f is supposed to satisfy the following assumptions. Define κ as a neighborhood of the level set Θ(x0) = {x ∈ Rn: f(x) ≤ f(x0)}, where x0 is the initial point.

        Assumption 1 f is continuously differentiable and bounded from below in κ.

        Assumption 2 The gradient g is Lipschitz continuous in κ; namely, there exists a constant L > 0 such that

        ‖g(x)-g(y)‖ ≤ L‖x-y‖, ∀x, y ∈ κ

        Lemma 4.1 Assume the search direction dk+1 is generated by SMCG_CR. Then there exists a constant c1 > 0 such that

        gk+1^T dk+1 ≤ -c1‖gk+1‖^2

        Proof: Denote T = (1+σk+1 z*)^(-1); then 0.5 ≤ T ≤ 1 holds, since 0 ≤ σk+1 z* ≤ 1. The rest of the proof is similar to that of Lemma 3 in Ref.[11].

        Lemma 4.2 Assume the search direction dk+1 is generated by SMCG_CR. Then the following inequality holds:

        ‖dk+1‖≤c2‖gk+1‖

        wherec2>0.

        Proof: The proof is similar to that of Lemma 4 in Ref.[11].

        Lemma 4.3 Assume Assumption 1 holds. Then fk ≤ Ck holds for each k when the iterative sequence {xk} is generated by SMCG_CR.

        Proof: By Eq.(26) and the descent direction dk+1, fk+1 ≤ Ck. Since Ck+1 in Eq.(28) is a convex combination of Ck and fk+1, it follows that fk+1 ≤ Ck+1, and the conclusion holds for each k by induction.

        Lemma 4.4 Assume Assumption 2 is satisfied and the iterative sequence {xk} is generated by SMCG_CR. Then the stepsize satisfies

        αk ≥ ((1-σ)/L)(-gk^T dk)/‖dk‖^2

        Proof: By Assumption 2 and Eq.(27), the following is obtained:

        which completes the proof.

        Theorem 4.5 Assume Assumptions 1 and 2 are satisfied. If the iterative sequence {xk} is generated by SMCG_CR, then

        lim (k→∞) ‖gk‖ = 0

        Proof: By Lemma 4.4, Eq.(26), Lemma 4.1, and Lemma 4.2, it follows

        (31)

        According to Eq.(30), an upper bound of Qk+1 in Eq.(28) can be derived. Let ⌊·⌋ denote the floor function; then, for k ≥ 1, Qk+1 may be written as

        Then the following is obtained:

        Denote M = 1 + (l+1)/(1-η); this yields Qk+1 ≤ M.

        Combining Eq.(28) and Eq.(31), the following is obtained:

        According to Eq.(29), it can be inferred that C1 ≤ C0, which implies that Ck is monotonically decreasing. Due to Assumption 1 and Lemma 4.3, Ck is bounded from below. Then

        5 Numerical Results

        In order to verify the effectiveness of the proposed method on the 145 test functions in the CUTEr library[21], some experiments were carried out to compare the performance of SMCG_CR with that of CG_DESCENT(5.3)[5], CGOPT[22], SMCG_BB[23], and SMCG_Conic[24]. Among them, the first two are classical conjugate gradient methods, and the last two are new subspace conjugate gradient methods. The dimensions and names of these 145 functions are the same as those in the numerical results of Ref.[25]. The code of CGOPT can be downloaded at http://coa.amss.ac.cn/wordpress/?page_id=21, while the codes of CG_DESCENT(5.3) and SMCG_BB can be downloaded at http://users.clas.ufl.edu/hager/papers/Software and http://web.xidian.edu.cn/xdliuhongwei/paper.html, respectively.

        The following parameters in SMCG_CR are used:

        and

        For the update of the parameter σk+1, the interpolation method[26] was used. By imposing the interpolation condition

        The following equation is obtained:

        In order to ensure σk+1 ≥ 0, let

        Both CG_DESCENT(5.3) and CGOPT use their default parameter values. All tested algorithms are terminated if the number of iterations exceeds 200000 or ‖gk‖∞ ≤ 10^-6 is satisfied.

        The performance profiles introduced by Dolan and Moré[27] were used to show the performance of the tested algorithms. Two groups of numerical experiments are presented. All tests were run on Ubuntu 10.04 LTS in a VMware Workstation 10.0 virtual machine installed on Windows 7. SMCG_CR was compared with CGOPT and CG_DESCENT(5.3) in the first series of numerical tests. SMCG_CR successfully solved 142 functions, 8 more than CGOPT, while CG_DESCENT(5.3) solved 144 functions. After removing the functions for which the termination criterion ‖gk‖∞ ≤ 10^-6 was not reached, 133 functions remained. Figs.1-4 describe the performance of each algorithm on these 133 functions.
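        For completeness, a minimal Python sketch of the Dolan-Moré performance profiles used in Figs.1-8 is given below; the data in the usage example are hypothetical and only show the expected input format (one row per problem, one column per solver, np.inf for failures).

import numpy as np

def performance_profile(costs, taus):
    # Dolan-More performance profile: costs is an (n_problems x n_solvers) array of a
    # performance measure (iterations, function/gradient evaluations, or CPU time),
    # with np.inf marking failures.  Returns rho_s(tau), the fraction of problems on
    # which solver s is within a factor tau of the best solver, for each tau in taus.
    best = np.min(costs, axis=1, keepdims=True)      # best cost on each problem
    ratios = costs / best                            # performance ratios r_{p,s}
    return np.array([[np.mean(ratios[:, s] <= tau) for s in range(costs.shape[1])]
                     for tau in taus])

# hypothetical data: four problems, three solvers
T = np.array([[10.0, 12.0,   30.0],
              [ 5.0, np.inf,  6.0],
              [ 8.0,  7.0,    7.5],
              [20.0, 18.0,   25.0]])
print(performance_profile(T, taus=[1.0, 2.0, 4.0]))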

        Regarding the number of iterations in Fig.1, it is worth noting that SMCG_CR is more efficient than CGOPT and CG_DESCENT(5.3): it solves about 54% of the test functions with the least number of iterations, which is 17% more than CG_DESCENT(5.3) and 31% more than CGOPT. As shown in Fig.2, SMCG_CR also outperforms CGOPT and CG_DESCENT(5.3) in the number of function evaluations.

        Fig. 1 Performance profile based on the number of iterations

        Fig. 2 Performance profile based on the number of function evaluations

        Fig. 3 Performance profile based on the number of gradient evaluations

        Fig.3 displays the performance profile with respect to the number of gradient evaluations. It is easy to observe that SMCG_CR performs best: it solves about 57% of the test functions with the least number of gradient evaluations, which is 28% more than CG_DESCENT(5.3) and 37% more than CGOPT. Fig.4 displays the performance profile with respect to CPU time. It can be observed that SMCG_CR is the fastest for about 66% of the test functions, which is 58% more than CG_DESCENT(5.3) and 31% more than CGOPT. Figs.1-4 indicate that SMCG_CR outperforms CGOPT and CG_DESCENT(5.3) on the 145 test functions in the CUTEr library.

        Fig. 4 Performance profile based on CPU time

        Fig. 5 Performance profile based on the number of iterations

        Fig. 6 Performance profile based on the number of function evaluations

        Fig. 7 Performance profile based on the number of gradient evaluations

        Fig. 8 Performance profile based on CPU time

        In the second series of numerical tests, SMCG_CR was compared with SMCG_BB and SMCG_Conic. SMCG_CR successfully solved 142 problems, while SMCG_BB and SMCG_Conic solved 140 and 138 problems, respectively. As shown in Figs.5-8, SMCG_CR is significantly better than SMCG_BB and SMCG_Conic on the 145 test functions in the CUTEr library.

        6 Conclusions

        In this paper, a new CG method was presented, which combines the subspace technique and a special cubic regularization model. In this method, the search direction satisfies the sufficient descent condition. The global convergence of SMCG_CR was established under mild conditions. The numerical experiments showed that SMCG_CR is very promising.
