
Finding a Minimizer: Gradient, Newton's Algorithm, Conjugate Gradient, Rank-One, and DFP Algorithms


Presentation Transcript


  1. Finding a Minimizer: Gradient, Newton's Algorithm, Conjugate Gradient, Rank-One, and DFP Algorithms. Intelligent Information System Lab., Dept. of Computer and Information Science, Korea Univ. Sohn Jong-Soo, mis026@korea.ac.kr

  2. Index • Algorithms • Gradient, Newton's method, Conjugate direction, Quasi-Newton methods • Problems • 10.7, 10.8 • Problem solving • Solving the problems with each algorithm • Performance analysis • Conclusion

  3. Algorithms • Gradient method • Conjugate direction method • Conjugate gradient method • Newton's method • Quasi-Newton methods • Rank-one algorithm • DFP algorithm

  4. Gradient • Gradient method • f(x(0) – α∇f(x(0))) < f(x(0)) for sufficiently small α > 0 • x(k+1) = x(k) – αk∇f(x(k)) • Method of steepest descent • αk = arg minα≥0 f(x(k) – α∇f(x(k)))
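
A minimal Python sketch of the steepest-descent iteration for a quadratic f(x) = ½xTQx – bTx, where the exact line search has the closed form αk = gTg / gTQg. The Q and b below are the reconstructed Exercise 10.7 data used later in the deck (an assumption, since the original equation image is lost):

```python
import numpy as np

def steepest_descent(Q, b, x0, tol=1e-3, max_iter=1000):
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        g = Q @ x - b                      # gradient of the quadratic
        if np.linalg.norm(g) < tol:        # stop when the gradient is small
            break
        alpha = (g @ g) / (g @ (Q @ g))    # exact line search along -g
        x = x - alpha * g
    return x, k

Q = np.array([[5.0, 2.0], [2.0, 1.0]])     # assumed Exercise 10.7 data
b = np.array([3.0, 1.0])
print(steepest_descent(Q, b, [1.0, 1.0]))  # approaches [1, -1], f* = -1
```

From x(0) = [1, 1] the first step size is 20/116 ≈ 0.17241, matching the value reported on slide 10.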

  5. Conjugate gradient • 1. Set k := 0; select the initial point x(0) • 2. g(0) = ∇f(x(0)); if g(0) = 0, stop, else set d(0) = -g(0) • 3. αk = - ( g(k)Td(k) / d(k)TQd(k) ) • 4. x(k+1) = x(k) + αkd(k) • 5. g(k+1) = ∇f(x(k+1)); if g(k+1) = 0, stop • 6. βk = ( g(k+1)TQd(k) / d(k)TQd(k) ) • 7. d(k+1) = -g(k+1) + βkd(k) • 8. Set k := k + 1; go to step 3
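
The same steps as executable Python for the quadratic case, where ∇f(x) = Qx – b; a sketch under the same assumed Q and b:

```python
import numpy as np

def conjugate_gradient(Q, b, x0, tol=1e-3):
    x = np.asarray(x0, dtype=float)
    g = Q @ x - b                          # step 2: g(0) = grad f(x(0))
    d = -g
    for _ in range(len(b)):                # at most n steps on an n-D quadratic
        if np.linalg.norm(g) < tol:
            break
        Qd = Q @ d
        alpha = -(g @ d) / (d @ Qd)        # step 3
        x = x + alpha * d                  # step 4
        g = Q @ x - b                      # step 5
        beta = (g @ Qd) / (d @ Qd)         # step 6
        d = -g + beta * d                  # step 7
    return x

Q = np.array([[5.0, 2.0], [2.0, 1.0]])
b = np.array([3.0, 1.0])
print(conjugate_gradient(Q, b, [1.0, 1.0]))  # [1, -1] in two stages, as on slide 13
```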

  6. Newton’s method • f(x) ≈ f(x(k)) + (x – x(k))Tg(k) + ½(x – x(k))TF(x(k))(x – x(k)) ≜ q(x) • 0 = ∇q(x) = g(k) + F(x(k))(x – x(k)) • x(k+1) = x(k) – F(x(k))-1g(k)
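
A sketch of the Newton iteration for a general twice-differentiable f; grad and hess stand for ∇f and the Hessian F, and the linear system F(x(k))p = g(k) is solved directly instead of forming the inverse:

```python
import numpy as np

def newton(grad, hess, x0, tol=1e-3, max_iter=100):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        x = x - np.linalg.solve(hess(x), g)   # x(k+1) = x(k) - F(x(k))^-1 g(k)
    return x

# On a quadratic the Hessian is constant, so Newton's method reaches the
# minimizer in a single step from any starting point (assumed 10.7 data):
Q = np.array([[5.0, 2.0], [2.0, 1.0]])
b = np.array([3.0, 1.0])
print(newton(lambda x: Q @ x - b, lambda x: Q, [5.0, 1.0]))  # [1, -1]
```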

  7. Rank-one algorithm • 1. Set k := 0; select x(0) and a real symmetric positive definite H0 • 2. If g(k) = 0, stop; else d(k) = -Hkg(k) • 3. Compute • αk = arg min f(x(k) + αd(k)) • x(k+1) = x(k) + αkd(k) • 4. Compute • ∆x(k) = αkd(k) • ∆g(k) = g(k+1) - g(k) • Hk+1 = Hk + ( (∆x(k) - Hk∆g(k))(∆x(k) - Hk∆g(k))T / ∆g(k)T(∆x(k) - Hk∆g(k)) ) • 5. Set k := k + 1; go to step 2
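
A sketch of the rank-one algorithm above; the step-3 line search is delegated to scipy.optimize.minimize_scalar, and the small-denominator guard is an added safeguard. Both are implementation choices, not part of the algorithm itself:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def rank_one(f, grad, x0, tol=1e-3, max_iter=100):
    x = np.asarray(x0, dtype=float)
    H = np.eye(len(x))                       # H0: symmetric positive definite
    g = grad(x)
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:          # step 2 stopping rule
            break
        d = -H @ g
        alpha = minimize_scalar(lambda a: f(x + a * d)).x  # step 3 line search
        dx = alpha * d                       # Delta x(k)
        x = x + dx
        g_new = grad(x)
        dg = g_new - g                       # Delta g(k)
        u = dx - H @ dg
        denom = dg @ u
        if abs(denom) > 1e-12:               # guard against a tiny denominator
            H = H + np.outer(u, u) / denom   # rank-one update of H
        g = g_new
    return x
```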

  8. DFP algorithm • 1. Set k := 0; select x(0) and a real symmetric positive definite H0 • 2. If g(k) = 0, stop; else d(k) = -Hkg(k) • 3. Compute • αk = arg min f(x(k) + αd(k)) • x(k+1) = x(k) + αkd(k) • 4. Compute • ∆x(k) = αkd(k) • ∆g(k) = g(k+1) – g(k) • Hk+1 = Hk + ( ∆x(k)∆x(k)T / ∆x(k)T∆g(k) ) – ( [Hk∆g(k)][Hk∆g(k)]T / ∆g(k)THk∆g(k) ) • 5. Set k := k + 1; go to step 2
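
DFP differs from the rank-one algorithm only in the step-4 update of Hk; that update alone, as a sketch:

```python
import numpy as np

def dfp_update(H, dx, dg):
    """One DFP update of H from dx = Delta x(k) and dg = Delta g(k)."""
    Hdg = H @ dg
    return (H + np.outer(dx, dx) / (dx @ dg)       # + dx dx^T / dx^T dg
              - np.outer(Hdg, Hdg) / (dg @ Hdg))   # - (H dg)(H dg)^T / dg^T H dg
```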

  9. Problems • Exercise 10.7 • Let f(x), x = [x1, x2]T ∈ R2, be given by f(x) = ½xTQx – xTb with Q = [[5, 2], [2, 1]] and b = [3, 1]T (the equation image did not survive the transcript; this quadratic is consistent with the gradients and step sizes reported on the following slides) • Exercise 10.8 • Rosenbrock’s function f(x) = 100(x2 – x1²)² + (1 – x1)²
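
Both exercises as Python callables, to pair with the algorithm sketches above. The Exercise 10.7 quadratic is the reconstruction noted on this slide (an assumption); Rosenbrock's function and its gradient are standard:

```python
import numpy as np

Q_107 = np.array([[5.0, 2.0], [2.0, 1.0]])   # reconstructed Hessian (assumption)
b_107 = np.array([3.0, 1.0])

def f_107(x):                                # Exercise 10.7: minimum -1 at [1, -1]
    return 0.5 * x @ Q_107 @ x - b_107 @ x

def grad_107(x):
    return Q_107 @ x - b_107

def f_rosenbrock(x):                         # Exercise 10.8: minimum 0 at [1, 1]
    return 100.0 * (x[1] - x[0] ** 2) ** 2 + (1.0 - x[0]) ** 2

def grad_rosenbrock(x):
    return np.array([
        -400.0 * x[0] * (x[1] - x[0] ** 2) - 2.0 * (1.0 - x[0]),
        200.0 * (x[1] - x[0] ** 2),
    ])
```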

  10. Problem solving – 10.7 • Gradient method • Starting point: [1, 1] • Tolerance: 0.001
  ...
  step(4): X : 0.9049 -0.7717 g : -0.0014 -0.0011 f(x) : -0.9948 alpha: 5.00000
  step(5): X : 1.0000 -0.9620 g : 0.0347 0.0767 f(x) : -0.9993 alpha: 0.17241
  step(6): X : 0.9869 -0.9685 g : -0.0000 -0.0000 f(x) : -0.9999 alpha: 5.00000
  step(7): X : 1.0000 -0.9948 g : 0.0007 0.0015 f(x) : -1.0000 alpha: 0.17241
  step(8): X : 0.9982 -0.9957 g : -0.0000 -0.0000 f(x) : -1.0000 alpha: 5.00000
  step(9): X : 1.0000 -0.9993 g : 0.0000 0.0000 f(x) : -1.0000 alpha: 0.17241

  11. Problem solving – 10.7 • Gradient method • Starting point: [5, 1] • Tolerance: 0.001
  step(1): X : 5.0000 1.0000 alpha: 0.17157
  step(2): X : 0.8822 -0.7157 g : -0.0023 -0.0018 f(x) : -0.9919 alpha: 5.82759
  step(3): X : 1.0006 -0.9997 g : 0.0001 0.0001 f(x) : -1.0000 alpha: 0.17157

  12. Problem solving – 10.7 • Gradient method • Starting point: [-1, 4] • Tolerance: 0.001
  ...
  step(48): X : 0.9882 -0.9764 g : 0.0006 0.0013 f(x) : -0.9999 alpha: 0.20000
  step(49): X : 0.9906 -0.9764 g : -0.0000 0.0000 f(x) : -0.9999 alpha: 1.00000
  step(50): X : 0.9906 -0.9811 g : 0.0004 0.0008 f(x) : -1.0000 alpha: 0.20000
  step(51): X : 0.9924 -0.9811 g : 0.0000 0.0000 f(x) : -1.0000 alpha: 1.00000
  step(52): X : 0.9924 -0.9849 g : 0.0002 0.0005 f(x) : -1.0000 alpha: 0.20000
  step(53): X : 0.9940 -0.9849 g : 0.0000 0.0000 f(x) : -1.0000 alpha: 1.00000
  step(54): X : 0.9940 -0.9879 g : 0.0001 0.0003 f(x) : -1.0000 alpha: 0.20000

  13. Problem solving – 10.7 • Conjugate gradient method • Starting point: [1, 1] • Tolerance: 0.001
  stage(1): g = [ -0.137931 0.275862 ] , x = [ 0.310345 0.655172 ]
  stage(2): g = [ -0.000000 -0.000000 ] , x = [ 1.000000 -1.000000 ]

  14. Problem solving – 10.7 • Conjugate gradient method • Starting point: [5, 1] • Tolerance: 0.001
  stage(1): g = [ -0.020305 0.048731 ] , x = [ 0.882234 -0.715736 ]

  15. Problem solving – 10.7 • Conjugate gradient method • Starting point: [-1, 4] • Tolerance: 0.001
  stage(1): g = [ -2.000000 0.000000 ] , x = [ -1.000000 3.000000 ]
  stage(2): g = [ 0.000000 0.000000 ] , x = [ 1.000000 -1.000000 ]

  16. Problem solving – 10.7 • Newton’s method • Starting point: [1, 1] • Tolerance: 0.001
  step(1): X : 1.0000 -1.0000 g : 0.0000 4.0000 f(x) : -1.0000 alpha: 1.00000
  step(2): X : 1.0000 -1.0000 g : 0.0000 0.0000 f(x) : -1.0000 alpha: 1.00000

  17. Problem solving – 10.7 • Newton’s method • Starting point: [5, 1] • Tolerance: 0.001
  step(1): X : 1.0000 -1.0000 g : -0.0000 116.0000 f(x) : -1.0000 alpha: 1.00000
  step(2): X : 1.0000 -1.0000 g : 0.0000 0.0000 f(x) : -1.0000 alpha: 1.00000

  18. Problem solving – 10.7 • Conjugate gradient method • Starting point: [300, 300] • Tolerance: 0.0000001
  stage(1): g = [ -10.739497 25.050862 ] , x = [ -59.841222 145.733306 ]
  stage(2): g = [ 0.000000 0.000000 ] , x = [ 1.000000 -1.000000 ]

  19. Problem solving – 10.7 • Newton’s method • Starting point: [-1, 4] • Tolerance: 0.001
  step(1): X : 1.0000 -1.0000 g : 0.0000 5.0000 f(x) : -1.0000 alpha: 1.00000
  step(2): X : 1.0000 -1.0000 g : 0.0000 -0.0000 f(x) : -1.0000 alpha: 1.00000

  20. Problem solving – 10.7 • Rank-one • Starting point: [1, 1] • Tolerance: 0.001
  step(1): X : 1.0000 1.0000 g : 4.0000 2.0000 alpha: 0.17241
  step(2): X : 0.3103 0.6552 g : -0.1379 0.2759 alpha: 5.83333
  step(3): X : 1.0000 -1.0000 g : 0.0000 0.0000 alpha: 5.83333

  21. Problem solving – 10.7 • Rank-one • Starting point: [5, 1] • Tolerance: 0.001
  step(1): X : 5.0000 1.0000 g : 24.0000 10.0000 alpha: 0.17157
  step(2): X : 0.8822 -0.7157 g : -0.0203 0.0487 alpha: 5.82843
  step(3): X : 1.0000 -1.0000 g : 0.0000 0.0000 alpha: 5.82843

  22. Problem solving – 10.7 • DFP • Starting point: [1, 1] • Tolerance: 0.001
  Stage(1): g = [ 4.000000 2.000000 ] alpha = 0.172414 x = [ 1.000000 1.000000 ]
  Stage(2): g = [ -0.137931 0.275862 ] alpha = 5.827586 x = [ 0.310345 0.655172 ]
  Stage(3): g = [ 0.000000 0.000000 ] alpha = 5.827586 x = [ 1.000000 -1.000000 ]

  23. Problem solving – 10.7 • DFP • Starting point: [5, 1] • Tolerance: 0.001
  Stage(1): g = [ 24.000000 10.000000 ] alpha = 0.171574 x = [ 5.000000 1.000000 ]
  Stage(2): g = [ -0.020305 0.048731 ] alpha = 5.828426 x = [ 0.882234 -0.715736 ]
  Stage(3): g = [ 0.000000 0.000000 ] alpha = 5.828426 x = [ 1.000000 -1.000000 ]

  24. Problem solving – 10.7 • DFP • Starting point: [-1, 4] • Tolerance: 0.001
  Stage(1): g = [ 0.000000 1.000000 ] alpha = 1.000000 x = [ -1.000000 4.000000 ]
  Stage(2): g = [ -2.000000 0.000000 ] alpha = 5.000000 x = [ -1.000000 3.000000 ]
  Stage(3): g = [ -0.000000 -0.000000 ] alpha = 5.000000 x = [ 1.000000 -1.000000 ]

  25. Performance analysis of Algorithms • Exercise 10.7

  26. Problems • Exercise 10.7 • Let f(x), x = [x1, x2]T ∈ R2, be given by f(x) = ½xTQx – xTb with Q = [[5, 2], [2, 1]] and b = [3, 1]T (reconstructed; see slide 9) • Exercise 10.8 • Rosenbrock’s function f(x) = 100(x2 – x1²)² + (1 – x1)²

  27. Problem solving – 10.8 • Conjugate gradient • Starting point: [-2, 2] • Tolerance: 0.001
  stage(1): g = [ -332.501269 -101.418013 ] , x = [ -1.613492 2.096266 ]
  stage(2): g = [ -142.448702 -37.455614 ] , x = [ -1.826114 3.147416 ]
  stage(3): g = [ -41.353455 -10.765649 ] , x = [ -1.672388 2.743052 ]
  stage(4): g = [ -33.996781 -8.468644 ] , x = [ -1.689618 2.812465 ]
  stage(5): g = [ -36.948543 -11.218869 ] , x = [ -1.430106 1.989107 ]
  stage(6): g = [ -21.953885 -5.840852 ] , x = [ -1.458436 2.097830 ]
  ...
  stage(32): g = [ 0.547441 -0.390674 ] , x = [ 0.915901 0.836922 ]
  stage(33): g = [ 0.413761 -0.245702 ] , x = [ 0.968836 0.937414 ]
  stage(34): g = [ 0.330438 -0.203535 ] , x = [ 0.968164 0.936324 ]
  stage(35): g = [ 0.240281 -0.126425 ] , x = [ 0.994421 0.988241 ]
  stage(36): g = [ 0.066990 -0.040594 ] , x = [ 0.993178 0.986200 ]
  stage(37): g = [ 0.006614 -0.003654 ] , x = [ 0.999654 0.999290 ]
  stage(38): g = [ 0.003991 -0.002363 ] , x = [ 0.999633 0.999255 ]
  stage(39): g = [ 0.000042 -0.000022 ] , x = [ 0.999999 0.999998 ]

  28. Problem solving – 10.8 • Conjugate gradient • Starting point: [-2, 2] • Tolerance: 0.0000001
  stage(1): g = [ -332.501269 -101.418013 ] , x = [ -1.613492 2.096266 ]
  stage(2): g = [ -142.448702 -37.455614 ] , x = [ -1.826114 3.147416 ]
  stage(3): g = [ -41.353455 -10.765649 ] , x = [ -1.672388 2.743052 ]
  stage(4): g = [ -33.996781 -8.468644 ] , x = [ -1.689618 2.812465 ]
  stage(5): g = [ -36.948543 -11.218869 ] , x = [ -1.430106 1.989107 ]
  stage(6): g = [ -21.953885 -5.840852 ] , x = [ -1.458436 2.097830 ]
  stage(7): g = [ -33.435609 -12.506830 ] , x = [ -1.163693 1.291647 ]
  ...
  stage(33): g = [ 0.413761 -0.245702 ] , x = [ 0.968836 0.937414 ]
  stage(34): g = [ 0.330438 -0.203535 ] , x = [ 0.968164 0.936324 ]
  stage(35): g = [ 0.240281 -0.126425 ] , x = [ 0.994421 0.988241 ]
  stage(36): g = [ 0.066990 -0.040594 ] , x = [ 0.993178 0.986200 ]
  stage(37): g = [ 0.006614 -0.003654 ] , x = [ 0.999654 0.999290 ]
  stage(38): g = [ 0.003991 -0.002363 ] , x = [ 0.999633 0.999255 ]
  stage(39): g = [ 0.000042 -0.000022 ] , x = [ 0.999999 0.999998 ]
  stage(40): g = [ 0.000013 -0.000008 ] , x = [ 0.999999 0.999998 ]
  stage(41): g = [ 0.000000 -0.000000 ] , x = [ 1.000000 1.000000 ]

  29. Problem solving – 10.8 • Conjugate gradient • Starting point: [2, 2] • Tolerance: 0.001
  stage(1): g = [ 330.655526 -102.025414 ] , x = [ 1.614434 2.096271 ]
  stage(2): g = [ 135.762583 -36.574527 ] , x = [ 1.833191 3.177717 ]
  stage(3): g = [ 42.049249 -11.979660 ] , x = [ 1.696857 2.819425 ]
  stage(4): g = [ 19.978221 -5.261054 ] , x = [ 1.755153 3.054258 ]
  stage(5): g = [ 4.384471 -0.924629 ] , x = [ 1.658623 2.746409 ]
  stage(6): g = [ 11.937054 -3.274119 ] , x = [ 1.630401 2.641836 ]
  stage(7): g = [ 23.733754 -7.660550 ] , x = [ 1.485688 2.168967 ]
  stage(8): g = [ 0.557551 0.135458 ] , x = [ 1.479136 2.188521 ]
  stage(9): g = [ 106.318780 -54.952470 ] , x = [ 0.967954 0.662172 ]
  stage(10): g = [ 82.367678 -35.628666 ] , x = [ 1.151662 1.148182 ]
  stage(11): g = [ 21.540766 -10.429636 ] , x = [ 1.029813 1.008366 ]
  stage(12): g = [ 8.471463 -3.771332 ] , x = [ 1.097331 1.185279 ]
  stage(13): g = [ 2.442372 -1.111322 ] , x = [ 1.052036 1.101222 ]
  stage(14): g = [ 1.522402 -0.657932 ] , x = [ 1.062288 1.125166 ]
  stage(15): g = [ 0.320738 -0.128530 ] , x = [ 1.028213 1.056579 ]
  stage(16): g = [ 0.536080 -0.236203 ] , x = [ 1.025754 1.050991 ]
  stage(17): g = [ 0.261000 -0.125680 ] , x = [ 1.004282 1.007954 ]
  stage(18): g = [ 0.089963 -0.039049 ] , x = [ 1.005710 1.011257 ]
  stage(19): g = [ 0.004208 -0.001811 ] , x = [ 1.000292 1.000576 ]
  stage(20): g = [ 0.004793 -0.002109 ] , x = [ 1.000287 1.000563 ]
  stage(21): g = [ 0.000036 -0.000017 ] , x = [ 1.000001 1.000001 ]

  30. Problem solving – 10.8 • Conjugate gradient • Starting point: [300, 300] • Tolerance: 0.001
  stage(1): g = [ 3186655073.835467 -7957781.328504 ] , x = [ 200.222559 300.166296 ]
  stage(2): g = [ 1387227021.306565 -2660822.303905 ] , x = [ 260.676277 54648.009682 ]
  stage(3): g = [ 416041624.719386 -923574.649366 ] , x = [ 225.234190 46112.567133 ]
  stage(4): g = [ 157179285.355639 -319357.541938 ] , x = [ 246.085930 58961.497052 ]
  ...
  stage(196): g = [ 0.089707 -0.054861 ] , x = [ 0.990513 0.980842 ]
  stage(197): g = [ 0.012199 -0.006749 ] , x = [ 0.999355 0.998676 ]
  stage(198): g = [ 0.007436 -0.004404 ] , x = [ 0.999317 0.998612 ]
  stage(199): g = [ 0.000145 -0.000076 ] , x = [ 0.999997 0.999993 ]
  stage(200): g = [ 0.000044 -0.000026 ] , x = [ 0.999996 0.999992 ]
  stage(201): g = [ 0.000000 -0.000000 ] , x = [ 1.000000 1.000000 ]
  stage(202): g = [ 0.000000 -0.000000 ] , x = [ 1.000000 1.000000 ]
  stage(203): g = [ 0.000000 0.000000 ] , x = [ 1.000000 1.000000 ]

  31. Problem solving – 10.8 • Newton’s method • Starting point: [-2, 2] • Tolerance: 0.001
  stage(1) g = [ -0.067163 799.776243 ] x = [ -1.992534 3.966104 ] alpha = 0.997985
  stage(2) g = [ 2934.178339 23509.879768 ] x = [ -1.757191 3.028832 ] alpha = 0.142922
  stage(3) g = [ -0.777980 2.240276 ] x = [ -1.447038 2.023481 ] alpha = 1.437352
  stage(4) g = [ -0.834947 -0.446648 ] x = [ -1.048088 1.042159 ] alpha = 2.459842
  stage(5) g = [ -0.631373 0.036704 ] x = [ -0.719034 0.463415 ] alpha = 1.970720
  ...
  stage(31) g = [ 0.055759 0.897840 ] x = [ 1.234722 1.510596 ] alpha = 0.815693
  stage(32) g = [ -0.026218 -0.013331 ] x = [ 1.061162 1.121055 ] alpha = 2.801268
  stage(33) g = [ -0.002457 0.000322 ] x = [ 1.008215 1.017365 ] alpha = 1.733026
  stage(34) g = [ 0.000055 0.000552 ] x = [ 0.999582 0.999205 ] alpha = 0.868278
  stage(35) g = [ 0.000000 0.000001 ] x = [ 1.000003 1.000005 ] alpha = 0.998378
  stage(36) g = [ 0.000050 -0.000022 ] x = [ 1.000003 1.000005 ] alpha = 0.998378

  32. Problem solving – 10.8 • Newton’s method • Starting point: [-2, 2] • Tolerance: 0.0000001
  stage(1) g = [ -0.067163 799.776243 ] x = [ -1.992534 3.966104 ] alpha = 0.997985
  stage(2) g = [ 2934.178339 23509.879768 ] x = [ -1.757191 3.028832 ] alpha = 0.142922
  stage(3) g = [ -0.777980 2.240276 ] x = [ -1.447038 2.023481 ] alpha = 1.437352
  stage(4) g = [ -0.834947 -0.446648 ] x = [ -1.048088 1.042159 ] alpha = 2.459842
  stage(5) g = [ -0.631373 0.036704 ] x = [ -0.719034 0.463415 ] alpha = 1.970720
  stage(6) g = [ -0.506735 -0.207360 ] x = [ -0.380849 0.100645 ] alpha = 2.305497
  ...
  stage(32) g = [ -0.026218 -0.013331 ] x = [ 1.061162 1.121055 ] alpha = 2.801268
  stage(33) g = [ -0.002457 0.000322 ] x = [ 1.008215 1.017365 ] alpha = 1.733026
  stage(34) g = [ 0.000055 0.000552 ] x = [ 0.999582 0.999205 ] alpha = 0.868278
  stage(35) g = [ 0.000000 0.000001 ] x = [ 1.000003 1.000005 ] alpha = 0.998378
  stage(36) g = [ -0.000000 0.000000 ] x = [ 1.000000 1.000000 ] alpha = 0.999314
  stage(37) g = [ 0.000000 -0.000000 ] x = [ 1.000000 1.000000 ] alpha = 0.999314

  33. Problem solving – 10.8 • Quasi-Newton DFP method • Starting point: [-1, 1] • Tolerance: 0.001

  34. Problem solving – 10.8 • Quasi-Newton DFP method • Starting point: [-5, 5] • Tolerance: 0.001
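
The DFP numbers for these two runs did not survive in the transcript. A self-contained sketch that reruns the experiment on Rosenbrock's function from both starting points above; its output is illustrative, not the deck's original data:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def f(x):  # Rosenbrock's function (Exercise 10.8)
    return 100.0 * (x[1] - x[0] ** 2) ** 2 + (1.0 - x[0]) ** 2

def grad(x):
    return np.array([-400.0 * x[0] * (x[1] - x[0] ** 2) - 2.0 * (1.0 - x[0]),
                     200.0 * (x[1] - x[0] ** 2)])

def dfp(x0, tol=1e-3, max_iter=200):
    x, H = np.array(x0, dtype=float), np.eye(2)   # H0 = identity
    g = grad(x)
    for k in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        d = -H @ g
        alpha = minimize_scalar(lambda a: f(x + a * d)).x  # line search
        dx = alpha * d
        x = x + dx
        g_new = grad(x)
        dg = g_new - g
        Hdg = H @ dg
        H = (H + np.outer(dx, dx) / (dx @ dg)       # DFP update of H
               - np.outer(Hdg, Hdg) / (dg @ Hdg))
        g = g_new
    return x, k

for x0 in ([-1.0, 1.0], [-5.0, 5.0]):              # the two starting points above
    print(x0, "->", dfp(x0))                       # both should approach [1, 1]
```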

  35. Performance analysis of Algorithms • Exercise 10.8

  36. Conclusion • Ex 10.7 • A well-behaved quadratic function • Easy to solve • All algorithms perform well • Ex 10.8 • Rosenbrock’s function • Hard to solve • Poorly approximated by a quadratic model • Poor performance for every algorithm
