
Gradient Descent with Numpy



Presentation Transcript


  1. Gradient Descent with Numpy COMP 4332 Tutorial 3 Feb 25

  2. Outline • Main idea of Gradient Descent • 1 dimension example • 2 dimension example • Useful tools in Scipy • Simple plotting

  3. Main idea of Gradient Descent • Goal: optimize a function • To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient (or of the approximate gradient) of the function at the current point. • Result: A local minimum
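  Concretely, each step replaces the current point x by x - step * f'(x), where f' is the gradient (the derivative in one dimension). A minimal sketch of that loop, assuming a gradient function f_prime and illustrative names x0, step, and n_iters (not taken from the tutorial code):

      def gradient_descent_sketch(f_prime, x0, step, n_iters):
          # Generic gradient descent: repeatedly move against the gradient.
          x = x0
          for _ in xrange(n_iters):
              x = x - step * f_prime(x)
          return x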

  4. 1 dimension example • f(x) = x^2 + 2x + 3 • Local minimum is at (-1, 2)

  5. 1 dimension example • Randomly assign an initial value to x • Find the gradient f'(x) at that point and move against it • Repeat this process • Let the length of a step be 0.2 • … after many iterations x approaches the local minimum
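  For example, with f(x) = x^2 + 2x + 3 and a starting value x = 3 (as in the code on the next slide), the gradient is f'(3) = 2*3 + 2 = 8, so the first update gives x = 3 - 0.2*8 = 1.4, which is exactly iteration 0 of the printed result.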

  6. 1 dimension example • Python code

      """ 1 dimension example to optimize
      Created on Feb 1, 2012
      @author: kxmo
      """
      import numpy

      def f(x):
          y = x*x + 2*x + 3
          return y

      def diff(x):
          y = 2*x + 2
          return y

      def gradient_descent():
          print "gradient_descent \n"
          x = 3
          step = 0.2
          for iter in xrange(100):
              dfx = diff(x)
              x = x - dfx*step
              y = f(x)
              print iter,
              print x, y

      if __name__ == '__main__':
          print "Begin"
          gradient_descent()
          print "End"

  7. 1 dimension example • Python code result (iteration, x, f(x))

      0   1.4              7.76
      1   0.44             4.0736
      2  -0.136            2.746496
      3  -0.4816           2.26873856
      4  -0.68896          2.0967458816
      5  -0.813376         2.03482851738
      6  -0.8880256        2.01253826626
      7  -0.93281536       2.00451377585
      8  -0.959689216      2.00162495931
      9  -0.9758135296     2.00058498535
      10 -0.98548811776    2.00021059473
      11 -0.991292870656   2.0000758141
      12 -0.994775722394   2.00002729308
      13 -0.996865433436   2.00000982551
      …
      48 -0.999999999946   2.0
      49 -0.999999999968   2.0
      50 -0.999999999981   2.0
      51 -0.999999999988   2.0
      52 -0.999999999993   2.0
      53 -0.999999999996   2.0
      54 -0.999999999997   2.0
      55 -0.999999999998   2.0
      56 -0.999999999999   2.0
      57 -0.999999999999   2.0
      58 -1.0              2.0
      59 -1.0              2.0
      60 -1.0              2.0

  8. 2 dimension example • Rosenbrock function: f(x1, x2) = (1 - x1)^2 + 100*(x2 - x1^2)^2 • Minimum at (1, 1)
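  As a quick check, at (1, 1) both terms vanish: (1 - 1)^2 = 0 and 100*(1 - 1^2)^2 = 0, so f(1, 1) = 0, and since both terms are squares this is the smallest value the function can take.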

  9. 2 dimension example • Input: a vector x = (x1, x2) • Gradient: df/dx1 = -2*(1 - x1) - 400*(x2 - x1^2)*x1 and df/dx2 = 200*(x2 - x1^2)
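  Since an analytic gradient like this is easy to get wrong, a small finite-difference check can be compared against diff(x) from the code two slides below; this sketch is not part of the tutorial, and numerical_grad and eps are illustrative names:

      from numpy import zeros_like

      def numerical_grad(f, x, eps=1e-6):
          # Central finite differences, one coordinate at a time.
          g = zeros_like(x, dtype=float)
          for i in xrange(len(x)):
              e = zeros_like(x, dtype=float)
              e[i] = eps
              g[i] = (f(x + e) - f(x - e)) / (2 * eps)
          return g

      # Hypothetical usage: print diff(x0), numerical_grad(f, x0)
      # for some test point x0 such as array([1, -1.5]); the two should agree closely.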

  10. 2 dimension example • Let the starting point be x = (1, -1.5) and the step length be 0.002 (as in the code on the next slide) • … repeat the update again and again • The step size has to be chosen very carefully (a simple safeguard is sketched below)
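  One common safeguard, not part of the tutorial code, is to shrink the step whenever a move would increase the objective; a rough sketch, with gradient_descent_safeguarded, shrink, and the shrink limit as illustrative choices:

      def gradient_descent_safeguarded(f, diff, x0, step=0.002, shrink=0.5, n_iters=10000):
          # Like the plain loop, but shrink the step if the objective would go up.
          x = x0
          for _ in xrange(n_iters):
              d = diff(x)
              for _ in xrange(50):               # allow at most 50 shrinks per iteration
                  if f(x - step * d) <= f(x):
                      break
                  step = step * shrink
              x = x - step * d
          return x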

  11. 2 dimension example • Python code

      """ 2 dimension example to optimize Rosenbrock function
      Created on Feb 2, 2012
      @author: kxmo
      """
      from numpy import array   # needed for array() below

      def f(x):
          y = (1-x[0])**2 + 100*((x[1]-x[0]**2)**2)
          return y

      def diff(x):
          ## diff on x[0] and x[1]
          x1 = -2*(1-x[0]) - 400*(x[1]-x[0]**2)*x[0]
          x2 = 200*(x[1]-x[0]**2)
          y = array([x1, x2])
          return y

      def gradient_descent():
          print "gradient_descent \n"
          x = array([1, -1.5])
          step = 0.002
          for iter in xrange(10000):
              dfx = diff(x)
              x = x - dfx*step   # x -= dfx*step
              y = f(x)
              print iter,
              print x, y

      if __name__ == '__main__':
          print "Begin"
          gradient_descent()
          print "End"

  12. 2 dimension example • Python code result (iteration, x, f(x))

      0    [-1.  -0.5]                229.0
      1    [ 0.208  0.1  ]            0.9491613696
      2    [ 0.22060887  0.0773056 ]  0.689460178665
      3    [ 0.22878055  0.06585067]  0.613031790076
      4    [ 0.23433811  0.06044662]  0.589298719311
      5    [ 0.2384379   0.05823371]  0.580167571748
      6    [ 0.24174759  0.05768128]  0.575004572635
      7    [ 0.2446335   0.05798553]  0.570924522072
      8    [ 0.24729094  0.05872954]  0.567158149611
      9    [ 0.24982238  0.05969885]  0.563502163823
      10   [ 0.252281    0.0607838 ]  0.559902757103
      …
      6571 [ 0.99850149  0.99699922]  2.24914102097e-06
      6572 [ 0.99850269  0.99700162]  2.24554097151e-06
      6573 [ 0.99850389  0.99700402]  2.2419466913e-06
      6574 [ 0.99850508  0.99700642]  2.23835817108e-06
      6575 [ 0.99850628  0.99700881]  2.2347754016e-06
      …

  13. Useful tools in Scipy • We can use an L-BFGS solver in Scipy • The inputs of the L-BFGS solver are • Objective function: f • Gradient of f • Initial guess x0 • x, f, d = fmin_l_bfgs_b(f, x0, fprime=diff) • It returns the minimizer x, the minimum value f, and an information dictionary d

  14. Using L-BFGS solver in scipy

      from numpy import *
      from scipy.optimize import *

      def f(x):
          y = (1-x[0])**2 + 100*((x[1]-x[0]**2)**2)
          return y

      def diff(x):
          ## diff on x[0] and x[1]
          x1 = -2*(1-x[0]) - 400*(x[1]-x[0]**2)*x[0]
          x2 = 200*(x[1]-x[0]**2)
          y = array([x1, x2])
          return y

      if __name__ == '__main__':
          print "Begin"
          ## gradient_descent(f, diff)
          x0 = array([1, -1.5])
          x, f, d = fmin_l_bfgs_b(f, x0, fprime=diff, iprint=1)
          print "The result is", x
          print "smallest value is", f
          print "End"

  15. Result

      >>>
      Begin
      The result is [ 1.00000001  1.00000002]
      smallest value is 1.05243324104e-16
      End
      >>>

  16. Simple Plotting

      import numpy
      import pylab

      # Build a vector of 10000 normal deviates with variance 0.5^2 and mean 2
      mu, sigma = 2, 0.5
      v = numpy.random.normal(mu, sigma, 10000)

      # Plot a normalized histogram with 50 bins (matplotlib version, plots directly)
      pylab.hist(v, bins=50, normed=1)
      pylab.show()

      # Compute the histogram with numpy and then plot it (NumPy itself does not plot)
      (n, bins) = numpy.histogram(v, bins=50, normed=True)
      pylab.plot(.5*(bins[1:]+bins[:-1]), n)
      pylab.show()

  17. Result
