  1. Non-Bayes classifiers. Linear discriminants, neural networks.

  2. Discriminant functions (1) Bayes classification rule: decide $\omega_1$ if $P(\omega_1|x) > P(\omega_2|x)$, otherwise decide $\omega_2$. Instead we might try to find a function $g(x)$ such that we decide $\omega_1$ if $g(x) > 0$ and $\omega_2$ if $g(x) < 0$. $g(x)$ is called a discriminant function; $g(x) = 0$ is the decision surface.

  3. Discriminant functions (2) Linear discriminant function: $g(x) = w^T x + w_0$. The decision surface $g(x) = 0$ is a hyperplane. (Figures: two two-class examples, each with class 1 and class 2 separated by a linear decision surface.)

  4. Linear discriminant – perceptron cost function Replace $x$ with the extended vector $x' = [x^T, 1]^T$ and $w$ with $w' = [w^T, w_0]^T$. Thus now the decision function is $g(x) = w'^T x'$ and the decision surface is $w'^T x' = 0$. Perceptron cost function: $J(w') = \sum_{x' \in Y} \delta_{x'} \, w'^T x'$, where $Y$ is the set of training samples misclassified by $w'$, and $\delta_{x'} = -1$ if $x \in \omega_1$, $\delta_{x'} = +1$ if $x \in \omega_2$. (The primes are dropped below.)

  5. Linear discriminant – perceptron cost function Perceptron cost function: $J(w) = \sum_{x \in Y} \delta_x \, w^T x$. The value of $J(w)$ is proportional to the sum of the distances of all misclassified samples to the decision surface. If the discriminant function separates the classes perfectly, then $J(w) = 0$; otherwise $J(w) > 0$, and we want to minimize it. $J(w)$ is continuous and piecewise linear, so we might try to use a gradient descent algorithm. (Figure: samples of class 1 and class 2 around the decision surface, some misclassified.)
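
To make the cost concrete, here is a minimal sketch in Python/NumPy; the function name perceptron_cost and the label encoding ($+1$ for class 1, $-1$ for class 2, so that $\delta_x$ equals minus the label) are my own choices, not from the slides:

```python
import numpy as np

def perceptron_cost(w, X, labels):
    """X: (n, d) extended feature vectors; labels: +1 for class 1, -1 for class 2."""
    scores = X @ w                  # w^T x for every sample
    wrong = labels * scores <= 0    # misclassified set Y (boundary included)
    # delta_x = -label, so each term -label * w^T x is non-negative on Y
    return np.sum(-labels[wrong] * scores[wrong])
```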

  6. Linear discriminant – perceptron algorithm Gradient descent: $w(t+1) = w(t) - \rho_t \frac{\partial J(w)}{\partial w}$. At points where $J(w)$ is differentiable, $\frac{\partial J(w)}{\partial w} = \sum_{x \in Y} \delta_x x$. Thus $w(t+1) = w(t) - \rho_t \sum_{x \in Y} \delta_x x$. The perceptron algorithm converges when the classes are linearly separable, under some conditions on $\rho_t$.
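
Under the same assumptions, a sketch of the resulting training loop follows; a constant $\rho$ stands in for the sequence $\rho_t$, and train_perceptron is a hypothetical name of mine:

```python
import numpy as np

def train_perceptron(X, labels, rho=0.1, max_iter=1000):
    """Gradient descent on the perceptron cost; X holds extended vectors [x, 1]."""
    w = np.zeros(X.shape[1])
    for _ in range(max_iter):
        wrong = labels * (X @ w) <= 0
        if not wrong.any():           # J(w) = 0: the classes are separated
            break
        # gradient at differentiable points: sum of delta_x * x over Y
        grad = np.sum(-labels[wrong, None] * X[wrong], axis=0)
        w = w - rho * grad
    return w
```

For raw feature vectors, the bias is folded in as on slide 4, e.g. train_perceptron(np.hstack([X0, np.ones((len(X0), 1))]), y).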

  7. Sum of error squares estimation Let $y(x)$ denote the desired output function: $y(x) = 1$ for one class and $y(x) = -1$ for the other. We want to find a discriminant function $w^T x$ whose output is similar to $y(x)$. Use the sum of error squares as the similarity criterion: $J(w) = \sum_i \left( y(x_i) - w^T x_i \right)^2$.

  8. Sum of error squares estimation Minimize the mean square error: $J(w) = E\left[ (y - w^T x)^2 \right]$, $\frac{\partial J(w)}{\partial w} = -2\, E\left[ x (y - w^T x) \right] = 0$. Thus $\hat{w} = \left( E[x x^T] \right)^{-1} E[x y]$.
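
Replacing the expectations with sample averages turns $\hat{w} = (E[x x^T])^{-1} E[x y]$ into an ordinary least-squares problem; a small sketch (the name lse_discriminant is mine, and np.linalg.lstsq is used instead of forming the inverse explicitly):

```python
import numpy as np

def lse_discriminant(X, y):
    """X: (n, d) extended feature vectors; y: desired outputs, +1 / -1."""
    # Solves min_w ||X w - y||^2, the sample version of w = (E[x x^T])^{-1} E[x y]
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w
```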

  9. Neurons

  10. Artificial neuron (Figure: inputs $x_1, \dots, x_n$ weighted, summed, and passed through a threshold function $f$.) The figure represents an artificial neuron calculating: $y = f\left( \sum_{i=1}^{n} w_i x_i + w_0 \right)$.

  11. Artificial neuron Threshold functions $f$: step function $f(x) = 1$ if $x \geq 0$, $f(x) = 0$ otherwise; logistic function $f(x) = \frac{1}{1 + e^{-ax}}$. (Figure: graphs of both functions, each rising from 0 to 1.)
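
For illustration, the two threshold functions and the neuron of the previous slide might be written as follows (the slope parameter a and all function names are my own additions):

```python
import numpy as np

def step(x):
    """Step function: 1 for x >= 0, else 0."""
    return np.where(x >= 0, 1.0, 0.0)

def logistic(x, a=1.0):
    """Logistic function 1 / (1 + exp(-a x)); a controls the slope."""
    return 1.0 / (1.0 + np.exp(-a * x))

def neuron(x, w, w0, f=logistic):
    """The artificial neuron of slide 10: y = f(w^T x + w0)."""
    return f(np.dot(w, x) + w0)
```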

  12. Combining artificial neurons Multilayer perceptron with 3 layers.

  13. Discriminating ability of multilayer perceptron Since a 3-layer perceptron can approximate any smooth function, it can approximate $g(x) = P(\omega_1|x) - P(\omega_2|x)$, the optimal discriminant function of two classes.

  14. Training of multilayer perceptron (Figure: neurons with threshold function $f$ in layer $r-1$ feeding the neurons of layer $r$.)

  15. Training and cost function Desired network output: $y(i)$ for training sample $x(i)$. Trained network output: $\hat{y}(i)$. Cost function for one training sample: $E(i) = \frac{1}{2} \sum_k \left( \hat{y}_k(i) - y_k(i) \right)^2$. Total cost function: $J = \sum_i E(i)$. Goal of the training: find the values of the weights $w$ which minimize the cost function $J$.

  16. Gradient descent Denote: $w_j^r$ – the weight vector of neuron $j$ in layer $r$, and $\mu$ – the learning rate (step size). Gradient descent: $w_j^r(\text{new}) = w_j^r(\text{old}) - \mu \frac{\partial J}{\partial w_j^r}$. Since $J = \sum_i E(i)$, we might want to update the weights after processing each training sample separately: $w_j^r(\text{new}) = w_j^r(\text{old}) - \mu \frac{\partial E(i)}{\partial w_j^r}$.

  17. Gradient descent Chain rule for differentiating composite functions: $\frac{\partial E(i)}{\partial w_j^r} = \frac{\partial E(i)}{\partial v_j^r(i)} \frac{\partial v_j^r(i)}{\partial w_j^r}$, where $v_j^r(i) = (w_j^r)^T y^{r-1}(i)$ is the net input of neuron $j$ in layer $r$ and $y^{r-1}(i)$ is the output of layer $r-1$. Denote: $\delta_j^r(i) = \frac{\partial E(i)}{\partial v_j^r(i)}$, so that $\frac{\partial E(i)}{\partial w_j^r} = \delta_j^r(i) \, y^{r-1}(i)$.

  18. Backpropagation If $r = L$, then $\delta_j^L(i) = \left( \hat{y}_j(i) - y_j(i) \right) f'\left( v_j^L(i) \right)$. If $r < L$, then $\delta_j^r(i) = \left( \sum_k \delta_k^{r+1}(i) \, w_{kj}^{r+1} \right) f'\left( v_j^r(i) \right)$.

  19. Backpropagation algorithm • Initialization: initialize all weights with random values. • Forward computations: for each training vector $x(i)$ compute all $v_j^r(i)$ and $y_j^r(i)$. • Backward computations: for each $i$, $j$ and $r = L, L-1, \dots, 2$ compute $\delta_j^r(i)$. • Update weights: $w_j^r(\text{new}) = w_j^r(\text{old}) - \mu \sum_i \delta_j^r(i) \, y^{r-1}(i)$.
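
Putting the four steps together: below is a compact sketch for a network with one hidden layer ($L = 2$), the logistic $f$ (so $f'(v) = f(v)(1 - f(v))$), and per-sample updates as on slide 16. All names (train_mlp, W1, W2), the initialization scale, and the default parameters are assumptions of mine, not from the slides:

```python
import numpy as np

def f(v):                      # logistic threshold function
    return 1.0 / (1.0 + np.exp(-v))

def f_prime(v):                # f'(v) = f(v) * (1 - f(v))
    s = f(v)
    return s * (1.0 - s)

def train_mlp(X, Y, hidden=5, mu=0.5, epochs=2000, seed=0):
    """One hidden layer; biases are folded in by appending 1 to each layer input."""
    rng = np.random.default_rng(seed)
    # Initialization: all weights random
    W1 = rng.normal(scale=0.5, size=(hidden, X.shape[1] + 1))
    W2 = rng.normal(scale=0.5, size=(Y.shape[1], hidden + 1))
    for _ in range(epochs):
        for x, y in zip(X, Y):
            # Forward computations: all v_j^r and y_j^r
            x1 = np.append(x, 1.0)
            v1 = W1 @ x1
            y1 = np.append(f(v1), 1.0)
            v2 = W2 @ y1
            y2 = f(v2)
            # Backward computations: delta^L, then delta^r for r < L
            d2 = (y2 - y) * f_prime(v2)
            d1 = (W2[:, :-1].T @ d2) * f_prime(v1)
            # Update weights after each sample, as on slide 16
            W2 -= mu * np.outer(d2, y1)
            W1 -= mu * np.outer(d1, x1)
    return W1, W2
```

As a quick check, training on the four XOR patterns (X of shape (4, 2), Y of shape (4, 1)) should drive the network outputs toward the XOR targets, which no linear discriminant can achieve.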

  20. MLP issues • What is the best network configuration? • How to choose a proper learning parameter $\mu$? • When should training be stopped? • Should another threshold function $f$ or cost function $J$ be chosen?
