1 / 48

Supervised learning

Supervised learning. Early learning algorithms First order gradient methods Second order gradient methods. Early learning algorithms. Designed for single layer neural networks Generally more limited in their applicability Some of them are Perceptron learning LMS or Widrow- Hoff learning

cais
Download Presentation

Supervised learning

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Supervised learning Early learning algorithms First order gradient methods Second order gradient methods

  2. Early learning algorithms • Designed for single layer neural networks • Generally more limited in their applicability • Some of them are • Perceptron learning • LMS or Widrow- Hoff learning • Grossberg learning

  3. Perceptron learning • Randomly initialize all the networks weights. • Apply inputs and find outputs ( feedforward). • compute the errors. • Update each weight as • Repeat steps 2 to 4 until the errors reach the satisfactory level.

  4. Performance Optimization Gradient based methods

  5. Basic Optimization Algorithm

  6. Steepest Descent (first order Taylor expansion)

  7. Example

  8. Plot

  9. LMS or Widrow- Hoff learning • First introduce ADALINE (ADAptive LInear NEuron) Network

  10. LMS or Widrow- Hoff learningor Delta Rule • ADALINE network same basic structure as the perceptron network

  11. Approximate Steepest Descent

  12. Approximate Gradient Calculation

  13. LMS AlgorithmThis algorithm inspire from steepest descent algorithm

  14. Multiple-Neuron Case

  15. Difference between perceptron learning and LMS learning • DERIVATIVE • Linear activation function has derivative but • sign (bipolar, unipolar) has not derivative

  16. Grossberg learning (associated learning) • Sometimes known as instar and outstar training • Updating rule: • Where could be the desired input values (instar training, example: clustering) or the desired output values (outstar) depending on network structure. • Grossberg network (use Hagan to more details)

  17. First order gradient method Back propagation

  18. Multilayer Perceptron R – S1 – S2 – S3 Network

  19. Example

  20. Elementary Decision Boundaries

  21. Elementary Decision Boundaries

  22. Total Network

  23. Function Approximation Example

  24. Nominal Response

  25. Parameter Variations

  26. Multilayer Network

  27. Performance Index

  28. Chain Rule

  29. Gradient Calculation

  30. Steepest Descent

  31. Jacobian Matrix

  32. Backpropagation (Sensitivities)

  33. Initialization (Last Layer)

  34. Summary

  35. Summary • Back-propagation training algorithm • Backprop adjusts the weights of the NN in order to minimize the network total mean squared error. Network activation Forward Step Error propagation Backward Step

  36. Example: Function Approximation

  37. Network

  38. Initial Conditions

  39. Forward Propagation

  40. Transfer Function Derivatives

  41. Backpropagation

  42. Weight Update

  43. Choice of Architecture

  44. Choice of Network Architecture

  45. ConvergenceGlobal minium (left) local minimum (rigth)

  46. Generalization

  47. Disadvantage of BP algorithm • Slow convergence speed • Sensitivity to initial conditions • Trapped in local minima • Instability if learning rate is too large • Note: despite above disadvantages, it is popularly used in control community. There are numerous extensions to improve BP algorithm.

  48. Improved BP algorithms(first ordergradient method) • BP with momentum • Delta- bar- delta • Decoupled momentum • RProp • Adaptive BP • Trinary BP • BP with adaptive gain • Extended BP

More Related