Research Article  Open Access
Shengwei Yao, Yuping Wu, Jielan Yang, Jieqiong Xu, "A Three-Term Gradient Descent Method with Subspace Techniques", Mathematical Problems in Engineering, vol. 2021, Article ID 8867309, 7 pages, 2021. https://doi.org/10.1155/2021/8867309
A Three-Term Gradient Descent Method with Subspace Techniques
Abstract
We propose a three-term gradient descent method that is well suited to unconstrained optimization problems. The search direction of the method is generated in a specific subspace. Specifically, a quadratic approximation model is applied in the process of generating the search direction. In order to reduce the amount of calculation and make the best use of existing information, the subspace is made up of the gradients at the current and prior iteration points and the previous search direction. By using this subspace-based optimization technique, the global convergence result is established under the Wolfe line search. The results of numerical experiments show that the new method is effective and robust.
1. Introduction
Gradient descent and conjugate gradient (CG) methods have profound significance for dealing with unconstrained optimization problems

$$\min_{x \in \mathbb{R}^n} f(x), \tag{1}$$

where $f : \mathbb{R}^n \to \mathbb{R}$ is a continuously differentiable function. They are widely used because of their low storage requirements and strong convergence properties. Starting from an initial iteration point $x_0$, in each iteration, the method produces an approximate solution sequence of (1) by

$$x_{k+1} = x_k + \alpha_k d_k, \tag{2}$$

in which $x_k$ is the current iteration point, $\alpha_k$ is the step length, and $d_k$ is the search direction, which usually has the following form (for typical CG methods):

$$d_k = \begin{cases} -g_k, & k = 0, \\ -g_k + \beta_k d_{k-1}, & k \ge 1, \end{cases} \tag{3}$$

where $g_k = \nabla f(x_k)$ is the gradient of $f$ at $x_k$ and $\beta_k$ is the CG parameter. The step size $\alpha_k$ is usually determined by performing a specific line search, among which the Wolfe line search [1, 2] is one of the most commonly used:

$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^T d_k \tag{4}$$

and

$$g(x_k + \alpha_k d_k)^T d_k \ge \sigma g_k^T d_k, \tag{5}$$

where $0 < \delta < \sigma < 1$. Sometimes, the strong Wolfe line search, given by (4) and

$$|g(x_k + \alpha_k d_k)^T d_k| \le -\sigma g_k^T d_k, \tag{6}$$

is also widely used in CG methods for the establishment of convergence results. Different choices of $\beta_k$ form different versions of the CG method. The research results on conjugate gradient algorithms are very rich, including PRP, HS, LS, and DY [3–10]. For a detailed survey of conjugate gradient methods, refer to [11].
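To make the scheme concrete, here is a minimal sketch (our own illustration, not the authors' code) of the basic CG iteration with a PRP-type parameter and a crude halving search that checks the Wolfe conditions (4) and (5); the quadratic test function and all names are illustrative:

```python
# Minimal CG sketch on f(x, y) = x^2 + 2y^2 (illustrative test function).

def f(x):
    return x[0] ** 2 + 2.0 * x[1] ** 2

def grad(x):
    return [2.0 * x[0], 4.0 * x[1]]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def wolfe_ok(x, d, alpha, delta=1e-4, sigma=0.9):
    """Check the Wolfe conditions (4) and (5) at step length alpha."""
    g = grad(x)
    x_new = [xi + alpha * di for xi, di in zip(x, d)]
    armijo = f(x_new) <= f(x) + delta * alpha * dot(g, d)   # condition (4)
    curvature = dot(grad(x_new), d) >= sigma * dot(g, d)    # condition (5)
    return armijo and curvature

def cg_minimize(x0, tol=1e-8, max_iter=200):
    x, g = x0, grad(x0)
    d = [-gi for gi in g]                        # d_0 = -g_0
    for _ in range(max_iter):
        if dot(g, g) ** 0.5 < tol:
            break
        alpha = 1.0
        while not wolfe_ok(x, d, alpha) and alpha > 1e-12:
            alpha *= 0.5                         # crude backtracking
        x = [xi + alpha * di for xi, di in zip(x, d)]
        g_new = grad(x)
        # PRP parameter: beta_k = g_k^T (g_k - g_{k-1}) / ||g_{k-1}||^2
        beta = dot(g_new, [a - b for a, b in zip(g_new, g)]) / dot(g, g)
        d = [-gi + beta * di for gi, di in zip(g_new, d)]
        if dot(g_new, d) >= 0:                   # safeguard: restart with
            d = [-gi for gi in g_new]            # steepest descent if d is
        g = g_new                                # not a descent direction
    return x

x_star = cg_minimize([3.0, -2.0])                # converges to (0, 0)
```

In practice, the halving loop would be replaced by a proper bracketing line search; the restart safeguard is included because PRP directions are not guaranteed to be descent directions under an inexact line search.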
Recently, subspace techniques have attracted more and more researchers' attention. Various subspace techniques are applied to construct different approaches for various optimization problems. For a detailed description of subspace techniques, refer to [12–14]. In [15], Yuan and Stoer came up with the SMCG method by using a subspace technique in the conjugate gradient method. Specifically, a two-dimensional subspace is used in [15] to determine the search direction, that is,

$$d_k = \mu_k g_k + \nu_k s_{k-1},$$

where $\mu_k$ and $\nu_k$ are parameters and $s_{k-1} = x_k - x_{k-1}$. In [16, 17], Li et al. obtained the values of $\mu_k$ and $\nu_k$ by minimizing a dynamically selected conic approximation model and a tensor model, respectively. In order to solve nonlinear optimization problems, Conn et al. [18] proposed a class of iterated-subspace minimization methods. Similar to Yuan's idea [15], Yang et al. [19] proposed a new CG method in the same subspace $\operatorname{span}\{g_k, s_{k-1}\}$.
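For intuition, the following sketch (our illustration; it is not the closed-form SMCG formula from [15]) shows how minimizing a quadratic model over a two-dimensional subspace spanned by the gradient and the previous step reduces to a 2x2 linear system:

```python
import numpy as np

def subspace_direction_2d(g, s, B):
    """Minimize q(d) = g^T d + 0.5 d^T B d over d = mu*g + nu*s.

    Setting the partial derivatives in (mu, nu) to zero gives
        [g^T B g   g^T B s] [mu]   [-g^T g]
        [s^T B g   s^T B s] [nu] = [-s^T g].
    """
    Bg, Bs = B @ g, B @ s
    M = np.array([[g @ Bg, g @ Bs],
                  [s @ Bg, s @ Bs]])
    rhs = -np.array([g @ g, s @ g])
    mu, nu = np.linalg.solve(M, rhs)
    return mu * g + nu * s

# Sanity check: with B = I, the unconstrained minimizer -g already lies
# in the subspace, so the subspace minimizer must be -g itself.
g = np.array([1.0, 2.0, -1.0])
s = np.array([0.5, -1.0, 3.0])
d = subspace_direction_2d(g, s, np.eye(3))
```

The 2x2 system is the Gram system of the basis vectors under the model matrix $B$; any positive definite $B$ and any two linearly independent basis vectors make it nonsingular.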
For better convergence and robust numerical performance, some authors have paid attention to conjugate gradient methods with different forms. Zhang et al. [20] presented a three-term conjugate gradient method that possesses the sufficient descent property without line search; they further presented two variants and established global convergence results for general functions under the standard Wolfe line search. In [21], Narushima et al. proposed a specific three-term CG method, drawing on the idea of the multistep quasi-Newton method. Deng and Wan [22] put forward a three-term CG algorithm in which the direction is formed from three terms. For some other three-term conjugate gradient methods, refer to [23–27].
The outline of the article is as follows. In Section 2, we use the technique of subspace minimization to derive the search direction on the subspace spanned by $g_k$, $g_{k-1}$, and $d_{k-1}$. We elaborate the proposed algorithm in Section 3. Section 4 provides the convergence analysis of the given algorithm under some suitable conditions. In Section 5, numerical experiments comparing the method with other methods are presented.
2. Derivation of the Search Direction
Different from the abovementioned subspace $\operatorname{span}\{g_k, s_{k-1}\}$, in this paper, the following subspace

$$\Omega_k = \operatorname{span}\{g_k, g_{k-1}, d_{k-1}\} \tag{8}$$

is used to construct the search direction. In order to simplify the calculation process, at each iteration point $x_k$, the following quadratic model is used to approximate the function $f$:

$$\min_{d \in \Omega_k} \; q_k(d) = g_k^T d + \frac{1}{2} d^T B_k d, \tag{9}$$

where $B_k$ can be viewed as an approximation of the Hessian matrix. Moreover, $B_k$ is supposed to be positive definite and to satisfy the quasi-Newton equation $B_k s_{k-1} = y_{k-1}$, where $s_{k-1} = x_k - x_{k-1}$ and $y_{k-1} = g_k - g_{k-1}$. Model (9) has two advantages: it is not only a good approximation of $f$ [14] but also convenient to minimize over the subspace $\Omega_k$. Since $g_k$, $g_{k-1}$, and $d_{k-1}$ are usually not collinear, we discuss the cases of dimension 3 and dimension 2.

Case I. dim($\Omega_k$) = 3. In this case, the direction has the form

$$d_k = u_k g_k + v_k g_{k-1} + w_k d_{k-1}, \tag{10}$$

where $u_k$, $v_k$, and $w_k$ are parameters to be calculated. Substituting (10) into the minimizing problem (9), we get a minimizing problem (11) in the three parameters, whose solutions are characterized by the linear system (12). To solve system (12), two quantities involving $B_k$ need to be estimated; inspired by BBCG [28], we adopt the estimates (13). Next, we choose the BFGS [29–32] update initialized with the identity matrix to compute $B_k$, as in (14). This choice not only retains useful information but also gives numerical results that perform better than a scaling matrix. Substituting (13) and (14) into the linear system (12) yields system (15). Combining (13), the determinant of system (15) is calculated as (16), and the parameters are then obtained as (17).

Case II. dim($\Omega_k$) = 2. By observing numerical experiments, we find that one two-dimensional choice of the subspace produces better numerical results, and we choose it in this case. The direction is expressed as (18). Substituting (18) into (9), we get the problem (19), whose solution can be expressed as (20). Therefore, the direction is given by (21).

If the line search is exact, namely, $g_k^T d_{k-1} = 0$, then, from (17) and (21), we get (22), which means that the obtained method reduces to the HS method (Hestenes and Stiefel) [4].
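As a simplified illustration of Case I (not the authors' formulas: the paper builds $B_k$ from a BFGS update and derives closed-form parameter values, whereas this sketch substitutes a scalar BB-type curvature estimate $B_k \approx (y_{k-1}^T y_{k-1} / s_{k-1}^T y_{k-1}) I$, which is an assumption of ours), the three coefficients can be obtained by solving a 3x3 system over $\Omega_k$:

```python
import numpy as np

def tcgs_direction_sketch(g, g_prev, d_prev, s, y):
    """Minimize g^T d + 0.5*tau*||d||^2 over d in span{g, g_prev, d_prev},
    with the scalar curvature estimate tau = y^T y / s^T y standing in
    for the model matrix B_k (an assumption of this sketch)."""
    tau = (y @ y) / (s @ y)                    # requires s^T y > 0
    V = np.column_stack([g, g_prev, d_prev])   # columns span Omega_k
    # Stationarity of the model in the coefficient vector c gives
    #   tau * (V^T V) c = -V^T g.
    coeffs = np.linalg.solve(tau * (V.T @ V), -(V.T @ g))
    return V @ coeffs

# Sanity check: with an orthonormal basis the subspace contains g, so
# the minimizer of the scalar model is -g / tau.
g, g_prev, d_prev = np.eye(3)
s, y = np.array([2.0, 1.0, 0.0]), np.array([1.0, -1.0, 0.0])
d = tcgs_direction_sketch(g, g_prev, d_prev, s, y)
```

When the three basis vectors are (nearly) collinear, the Gram matrix $V^T V$ becomes singular, which is why the degenerate two-dimensional case has to be treated separately.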
3. The Proposed TCGS Algorithm
This section elaborates the three-term CG method with subspace techniques (TCGS), in which the acceleration scheme of [33] is used (Algorithm 1).

4. Convergence Analysis
This section studies the convergence properties of the TCGS algorithm. Some necessary assumptions on the objective function are given as follows.
Assumption 1. The level set $\mathcal{L}_0 = \{x \in \mathbb{R}^n : f(x) \le f(x_0)\}$ is bounded.
Assumption 2. Suppose that $f$ is continuously differentiable and its gradient is Lipschitz continuous with constant $L > 0$; that is,

$$\|g(x) - g(y)\| \le L \|x - y\|, \quad \forall x, y \in \mathbb{R}^n. \tag{23}$$

Based on the above assumptions, it can be shown that there exists a constant $\bar{\gamma} > 0$ such that

$$\|g(x)\| \le \bar{\gamma}, \quad \forall x \in \mathcal{L}_0. \tag{24}$$

It is well known that, for an optimization algorithm, the properties of the search direction are very important for convergence. Firstly, we study some properties of the search direction generated by the TCGS algorithm. The following lemmas show that the direction is a descent direction and that the Dai–Liao conjugacy condition is satisfied.
Lemma 1. Suppose that $g_k \ne 0$. Then, $d_k$ generated by the TCGS algorithm is a descent direction.
Proof. According to (9), we can get $q_k(d_k) \le q_k(0) = 0$. Since $g_k \ne 0$, $g_k \in \Omega_k$, and $d_k$ is generated by the TCGS algorithm, $d_k \ne 0$. Furthermore, $g_k^T d_k \le -\frac{1}{2} d_k^T B_k d_k < 0$; that is, $d_k$ is a descent direction.
Lemma 2. The search direction $d_k$ satisfies the Dai–Liao conjugacy condition with some scalar $t \ge 0$, namely, $d_k^T y_{k-1} = -t g_k^T s_{k-1}$.
Lemma 3. Suppose that $d_k$ is generated by the proposed TCGS algorithm and the step size $\alpha_k$ satisfies conditions (4) and (5); then

$$\alpha_k \ge \frac{(\sigma - 1)\, g_k^T d_k}{L \|d_k\|^2}.$$
Proof. Based on condition (5) and assumption (23), we get

$$(\sigma - 1)\, g_k^T d_k \le (g_{k+1} - g_k)^T d_k \le L \alpha_k \|d_k\|^2.$$

Since $g_k^T d_k < 0$, we prove it.
Lemma 4. Assume that Assumptions 1 and 2 hold. Consider the TCGS algorithm, in which the Wolfe line search is used to compute the step size $\alpha_k$. Then, the Zoutendijk condition [34] holds:

$$\sum_{k = 0}^{\infty} \frac{(g_k^T d_k)^2}{\|d_k\|^2} < +\infty. \tag{28}$$
Proof. Combining condition (4) and Lemma 3, we have

$$f(x_{k+1}) \le f(x_k) - \frac{\delta (1 - \sigma)}{L} \cdot \frac{(g_k^T d_k)^2}{\|d_k\|^2}. \tag{29}$$

Condition (29) and Assumption 1 deduce (28) directly.
Lemma 5. Suppose that the objective function $f$ satisfies Assumptions 1 and 2, and consider the TCGS algorithm in which the strong Wolfe line search (4) and (6) is used to compute the step size $\alpha_k$. If

$$\sum_{k = 0}^{\infty} \frac{1}{\|d_k\|^2} = +\infty \tag{30}$$

holds, then

$$\liminf_{k \to \infty} \|g_k\| = 0. \tag{31}$$
Theorem 1. Under Assumptions 1 and 2, consider the sequence $\{x_k\}$ generated by the TCGS algorithm, where the step size satisfies conditions (4) and (6), and let $f$ be uniformly convex; i.e., there exists a constant $\mu > 0$ such that

$$(g(x) - g(y))^T (x - y) \ge \mu \|x - y\|^2, \quad \forall x, y \in \mathcal{L}_0. \tag{32}$$
Then, (31) holds.
Proof. From (23) and (32), we get, respectively,

$$\|y_{k-1}\| \le L \|s_{k-1}\| \quad \text{and} \quad s_{k-1}^T y_{k-1} \ge \mu \|s_{k-1}\|^2. \tag{33}$$

On the basis of (33), combined with the Cauchy inequality, it follows that

$$\mu \|s_{k-1}\|^2 \le s_{k-1}^T y_{k-1} \le \|s_{k-1}\| \|y_{k-1}\|, \quad \text{i.e.,} \quad \mu \|s_{k-1}\| \le \|y_{k-1}\|.$$

Furthermore, applying the triangle inequality and (24), we obtain the bounds (35) and (36).

Next, we analyze the convergence in two cases.

Case I. dim($\Omega_k$) = 3. The parameters are computed by (16) and (17); using (24), (33), (35), and (36), we get the bound (37). Similarly, we obtain (38). Therefore, $\|d_k\|$ is bounded above, as in (39).

Case II. dim($\Omega_k$) = 2. Using (20), (24), and (33), we obtain (40). Therefore, $\|d_k\|$ is bounded above, as in (41).

In both cases, the boundedness of $\|d_k\|$ implies that condition (30) holds, so from Lemma 5, (31) holds.
5. Numerical Results
This section aims to observe the performance of the TCGS algorithm through numerical experiments and to verify its effectiveness in dealing with unconstrained problems. For experimental purposes, we compare the numerical performance of TCGS with the PRP and MTHREECG [22] methods, in which MTHREECG has a structure similar to TCGS, and PRP is a classic and effective CG method.
In [22], Deng and Wan presented a similar three-term conjugate gradient method (MTHREECG).
We chose a total of 75 test functions, all of which come from [35]. The dimension of each function ranges from 1000 to 10000 in increments of 1000 for the numerical experiments. The code is written in Fortran and is available at https://camo.ici.ro/neculai/THREECG/threecg.for [36]. The default parameter values of the algorithm are consistent with [36].
The performance profile introduced by Dolan and Moré [37] is one of the most widely used tools for evaluating the performance of different methods. In this paper, we use it to investigate the numerical performance of the proposed TCGS algorithm.
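The Dolan–Moré profile can be computed as follows; this is our generic sketch, and the cost matrix below contains made-up illustrative numbers, not the paper's experimental data:

```python
import numpy as np

def performance_profile(T, taus):
    """T[p, s] = cost (e.g., iterations) of solver s on problem p, with
    np.inf marking a failure.  Returns rho[s, i], the fraction of problems
    that solver s solves within a factor taus[i] of the best solver."""
    best = T.min(axis=1, keepdims=True)      # best cost on each problem
    ratios = T / best                        # performance ratios r_{p,s}
    return np.array([[np.mean(ratios[:, s] <= tau) for tau in taus]
                     for s in range(T.shape[1])])

# Illustrative data: 4 problems, 2 solvers; solver 0 fails on problem 3.
T = np.array([[10.0, 12.0],
              [20.0, 18.0],
              [15.0, 30.0],
              [np.inf, 25.0]])
rho = performance_profile(T, taus=[1.0, 1.5, 2.0])
```

Plotting each row of `rho` against $\tau$ gives curves of the kind shown in Figures 1 and 2; the value at $\tau = 1$ is the fraction of problems on which a solver is (tied for) fastest, and the height as $\tau$ grows measures robustness.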
The performance profile for the number of iterations can be seen in Figure 1, which shows that the TCGS algorithm solves the largest proportion of the test problems with the fewest iterations. As the factor $\tau$ increases, the TCGS method outperforms both the MTHREECG and PRP methods. Figure 2 presents the performance profile with respect to the number of function evaluations. The result is similar to that in Figure 1: TCGS again outperforms both the MTHREECG and PRP methods. From the numerical experiments, it can be seen that the proposed TCGS method is efficient and robust in dealing with a set of unconstrained test problems.
6. Conclusion
By using the idea of subspace minimization, we have proposed a new type of subspace gradient descent method in this paper. In our method, the subspace is spanned by $g_k$, $g_{k-1}$, and $d_{k-1}$, and the quadratic approximation of the objective function is minimized over it to obtain the search direction. Therefore, the direction is a linear combination of these three vectors, whose coefficients are computed by distinguishing two cases. In addition, we prove the descent property of the direction and show that the Dai–Liao conjugacy condition is satisfied. Under the Wolfe line search, the convergence of the proposed TCGS algorithm is established. The numerical results show that the performance of the TCGS algorithm on a set of unconstrained problems is competitive.
Data Availability
The data used to support the findings of this study are included within the article.
Conflicts of Interest
The authors declare that there are no conflicts of interest.
Acknowledgments
This research was supported by the Guangxi Natural Science Foundation (nos. 2018GXNSFAA281340 and 2020GXNSFAA159014) and Program for Innovative Team of Guangxi University of Finance and Economics.
References
[1] P. Wolfe, “Convergence conditions for ascent methods,” SIAM Review, vol. 11, no. 2, pp. 226–235, 1969.
[2] P. Wolfe, “Convergence conditions for ascent methods. II: some corrections,” SIAM Review, vol. 13, no. 2, pp. 185–188, 1971.
[3] B. T. Polyak, “The conjugate gradient method in extremal problems,” USSR Computational Mathematics and Mathematical Physics, vol. 9, no. 4, pp. 94–112, 1969.
[4] M. R. Hestenes and E. Stiefel, Methods of Conjugate Gradients for Solving Linear Systems, NBS, Washington, DC, USA, 1952.
[5] Y. Liu and C. Storey, “Efficient generalized conjugate gradient algorithms, part 1: theory,” Journal of Optimization Theory and Applications, vol. 69, no. 1, pp. 129–137, 1991.
[6] Y. H. Dai and Y. Yuan, “A nonlinear conjugate gradient method with a strong global convergence property,” SIAM Journal on Optimization, vol. 10, no. 1, pp. 177–182, 1999.
[7] Z. Wei, S. Yao, and L. Liu, “The convergence properties of some new conjugate gradient methods,” Applied Mathematics and Computation, vol. 183, no. 2, pp. 1341–1350, 2006.
[8] G. Yuan, Z. Meng, and Y. Li, “A modified Hestenes and Stiefel conjugate gradient algorithm for large-scale nonsmooth minimizations and nonlinear equations,” Journal of Optimization Theory and Applications, vol. 168, no. 1, pp. 129–152, 2016.
[9] G. Yuan, Z. Wei, and X. Lu, “Global convergence of BFGS and PRP methods under a modified weak Wolfe–Powell line search,” Applied Mathematical Modelling, vol. 47, pp. 811–825, 2017.
[10] G. Yuan, Z. Wei, and Y. Yang, “The global convergence of the Polak–Ribière–Polyak conjugate gradient algorithm under inexact line search for nonconvex functions,” Journal of Computational and Applied Mathematics, vol. 362, pp. 262–275, 2019.
[11] W. W. Hager and H. Zhang, “A survey of nonlinear conjugate gradient methods,” Pacific Journal of Optimization, vol. 2, no. 1, pp. 35–58, 2006.
[12] Y.-X. Yuan, “Subspace techniques for nonlinear optimization,” Series in Contemporary Applied Mathematics, Chinese Academy of Sciences, Beijing, China, 2007.
[13] Y.-X. Yuan, “Subspace methods for large scale nonlinear equations and nonlinear least squares,” Optimization and Engineering, vol. 10, no. 2, pp. 207–218, 2009.
[14] Y. Yuan, “A review on subspace methods for nonlinear optimization,” in Proceedings of the International Congress of Mathematicians, pp. 807–827, Seoul, South Korea, August 2014.
[15] Y.-X. Yuan and J. Stoer, “A subspace study on conjugate gradient algorithms,” ZAMM - Journal of Applied Mathematics and Mechanics / Zeitschrift für Angewandte Mathematik und Mechanik, vol. 75, no. 1, pp. 69–77, 1995.
[16] Y. Li, Z. Liu, and H. Liu, “A subspace minimization conjugate gradient method based on conic model for unconstrained optimization,” Computational and Applied Mathematics, vol. 38, no. 1, p. 16, 2019.
[17] T. Wang, Z. Liu, and H. Liu, “A new subspace minimization conjugate gradient method based on tensor model for unconstrained optimization,” International Journal of Computer Mathematics, vol. 96, no. 10, pp. 1924–1942, 2019.
[18] A. Conn, N. Gould, A. Sartenaer, and P. Toint, “On iterated-subspace minimization methods for nonlinear optimization,” in Proceedings of the Linear and Nonlinear Conjugate Gradient-Related Methods, Seattle, WA, USA, January 1996.
[19] Y. Yang, Y. Chen, and Y. Lu, “A subspace conjugate gradient algorithm for large-scale unconstrained optimization,” Numerical Algorithms, vol. 76, no. 3, pp. 813–828, 2017.
[20] L. Zhang, W. Zhou, and D. Li, “Some descent three-term conjugate gradient methods and their global convergence,” Optimization Methods and Software, vol. 22, no. 4, pp. 697–711, 2007.
[21] Y. Narushima, H. Yabe, and J. A. Ford, “A three-term conjugate gradient method with sufficient descent property for unconstrained optimization,” SIAM Journal on Optimization, vol. 21, no. 1, pp. 212–230, 2011.
[22] S. Deng and Z. Wan, “A three-term conjugate gradient algorithm for large-scale unconstrained optimization problems,” Applied Numerical Mathematics, vol. 92, pp. 70–81, 2015.
[23] N. Andrei, “A new three-term conjugate gradient algorithm for unconstrained optimization,” Numerical Algorithms, vol. 68, no. 2, pp. 305–321, 2015.
[24] S. Yao and L. Ning, “An adaptive three-term conjugate gradient method based on self-scaling memoryless BFGS matrix,” Journal of Computational and Applied Mathematics, vol. 332, pp. 72–85, 2018.
[25] H. Kobayashi, Y. Narushima, and H. Yabe, “Descent three-term conjugate gradient methods based on secant conditions for unconstrained optimization,” Optimization Methods and Software, vol. 32, no. 6, pp. 1313–1329, 2017.
[26] X. Y. Wang, S. J. Li, and X. P. Kou, “A self-adaptive three-term conjugate gradient method for monotone nonlinear equations with convex constraints,” Calcolo, vol. 53, no. 2, pp. 133–145, 2016.
[27] P. Gao, C. He, and Y. Liu, “An adaptive family of projection methods for constrained monotone nonlinear equations with applications,” Applied Mathematics and Computation, vol. 359, pp. 1–16, 2019.
[28] Y. Dai and C. Kou, “A Barzilai–Borwein conjugate gradient method,” Science China Mathematics, vol. 59, no. 8, pp. 1511–1524, 2016.
[29] C. G. Broyden, “The convergence of a class of double-rank minimization algorithms 1. General considerations,” IMA Journal of Applied Mathematics, vol. 6, no. 1, pp. 76–90, 1970.
[30] R. Fletcher, “A new approach to variable metric algorithms,” The Computer Journal, vol. 13, no. 3, pp. 317–322, 1970.
[31] D. Goldfarb, “A family of variable-metric methods derived by variational means,” Mathematics of Computation, vol. 24, no. 109, p. 23, 1970.
[32] D. F. Shanno, “Conditioning of quasi-Newton methods for function minimization,” Mathematics of Computation, vol. 24, no. 111, p. 647, 1970.
[33] N. Andrei, “Acceleration of conjugate gradient algorithms for unconstrained optimization,” Applied Mathematics and Computation, vol. 213, no. 2, pp. 361–369, 2009.
[34] G. Zoutendijk, “Nonlinear programming, computational methods,” in Integer and Nonlinear Programming, pp. 37–86, 1970.
[35] N. Andrei, “An unconstrained optimization test functions collection,” Advanced Modeling and Optimization, vol. 10, no. 1, pp. 147–161, 2008.
[36] N. Andrei, “A simple three-term conjugate gradient algorithm for unconstrained optimization,” Journal of Computational and Applied Mathematics, vol. 241, pp. 19–29, 2013.
[37] E. D. Dolan and J. J. Moré, “Benchmarking optimization software with performance profiles,” Mathematical Programming, vol. 91, no. 2, pp. 201–213, 2002.
Copyright
Copyright © 2021 Shengwei Yao et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.