
Kernel Methods and SVMs


Presentation Transcript


  1. Kernel Methods and SVMs

  2. Predictive Modeling
  Goal: learn a mapping y = f(x; θ). Need:
    1. A model structure
    2. A score function
    3. An optimization strategy
  Categorical y ∈ {c1, …, cm}: classification. Real-valued y: regression.
  Note: we usually assume the classes {c1, …, cm} are mutually exclusive and exhaustive.

  3. Simple Two-Class Perceptron
  Initialize the weight vector w ← 0.
  Repeat one or more times (indexed by k):
    For each training data point xi:
      If yi ⟨w, xi⟩ ≤ 0 (a mistake), update w ← w + η yi xi
  Each update is a step of "gradient descent" on the misclassified point.
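As a concrete sketch, here is the primal perceptron in Python/NumPy, assuming labels yi ∈ {−1, +1} and a fixed learning rate η; the function name and defaults are illustrative, not from the slides:

```python
import numpy as np

def perceptron(X, y, n_epochs=10, eta=1.0):
    """Primal perceptron: X is (n, d), y has entries in {-1, +1}."""
    n, d = X.shape
    w = np.zeros(d)            # initialize weight vector
    for k in range(n_epochs):  # repeat one or more times (indexed by k)
        for i in range(n):     # for each training data point x_i
            if y[i] * (w @ X[i]) <= 0:    # mistake: on wrong side of boundary
                w += eta * y[i] * X[i]    # gradient-descent-style update
    return w
```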

  4. Perceptron Dual Form
  Notice that w ends up as a linear combination of the yj xj:
    w = Σj αj yj xj
  Thus f(x) = ⟨w, x⟩ = Σj αj yj ⟨xj, x⟩.
  Each αj is +ve, and bigger for "harder" examples (those that triggered more updates).
  This leads to a dual form of the learning algorithm:

  5. Perceptron Dual Form
  Initialize α ← 0.
  Repeat until no more mistakes:
    For each training data point xi:
      If yi Σj αj yj ⟨xj, xi⟩ ≤ 0 (a mistake), set αi ← αi + 1
  Note: the training data only enter the algorithm via the inner products ⟨xj, xi⟩.
  This is generally true for linear models (e.g. linear regression, ridge regression).
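A matching sketch of the dual form; note the data enter only through the Gram matrix of inner products, so it can be precomputed (again, names and the epoch cap are illustrative):

```python
import numpy as np

def dual_perceptron(X, y, max_epochs=100):
    """Dual perceptron: learns one alpha_i per training point, y in {-1, +1}."""
    n = X.shape[0]
    G = X @ X.T                    # Gram matrix of inner products <x_j, x_i>
    alpha = np.zeros(n)
    for _ in range(max_epochs):    # repeat until no more mistakes
        mistakes = 0
        for i in range(n):
            # the prediction uses only inner products, never x_i directly
            if y[i] * np.sum(alpha * y * G[:, i]) <= 0:
                alpha[i] += 1      # harder examples accumulate larger alpha_i
                mistakes += 1
        if mistakes == 0:
            break
    return alpha
```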

  6. Learning in Feature Space
  We have already seen the idea of changing the representation of the predictors:
    x = (x1, …, xd) ↦ Φ(x) = (φ1(x), …, φN(x))
  F = {Φ(x) : x ∈ X} is called the feature space.

  7. Linear Feature Space Models
  Now consider models of the form:
    f(x) = ⟨w, Φ(x)⟩
  or equivalently, in dual form:
    f(x) = Σi αi yi ⟨Φ(xi), Φ(x)⟩
  A kernel is a function K such that for all x, z ∈ X,
    K(x, z) = ⟨Φ(x), Φ(z)⟩
  where Φ is a mapping from X to an inner product feature space F.
  To fit and apply the model we just need to know K, not Φ!
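To make "just need to know K, not Φ" concrete, here is a small self-contained check (an example of my choosing, not from the slides) that the polynomial kernel K(x, z) = ⟨x, z⟩² matches an explicit quadratic feature map in two dimensions:

```python
import numpy as np

def phi(x):
    """Explicit quadratic feature map for x = (x1, x2)."""
    x1, x2 = x
    return np.array([x1 * x1, x2 * x2, np.sqrt(2) * x1 * x2])

def K(x, z):
    """Polynomial kernel: computed without ever forming phi."""
    return (x @ z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])
assert np.isclose(K(x, z), phi(x) @ phi(z))  # same value, no explicit Phi needed
```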

  8. Making Kernels
  What properties must K satisfy to be a kernel?
    1. Symmetry: K(x, z) = K(z, x)
    2. Cauchy–Schwarz: K(x, z)² ≤ K(x, x) K(z, z)
  These are necessary but not sufficient; other conditions are also needed.
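One way to probe the "other conditions" numerically is to test whether sampled Gram matrices are symmetric and positive semi-definite, anticipating Mercer's theorem on the next slide; a rough sketch (the tolerance is an arbitrary choice):

```python
import numpy as np

def looks_like_kernel(K, X, tol=1e-10):
    """Heuristic check: Gram matrix is symmetric with nonnegative eigenvalues."""
    G = np.array([[K(a, b) for b in X] for a in X])
    symmetric = np.allclose(G, G.T)
    psd = np.all(np.linalg.eigvalsh(G) >= -tol)  # eigvalsh: symmetric eigensolver
    return symmetric and psd

X = [np.random.randn(3) for _ in range(20)]
rbf = lambda a, b: np.exp(-np.sum((a - b) ** 2))   # Gaussian (RBF) kernel
print(looks_like_kernel(rbf, X))  # True: the RBF kernel is positive semi-definite
```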

  9. Mercer's Theorem
  Mercer's theorem gives necessary and sufficient conditions for a continuous
  symmetric function K to admit the representation
    K(x, z) = Σj λj φj(x) φj(z),  λj ≥ 0
  namely that K be positive semi-definite ("Mercer kernels").
  Such a kernel defines a set of functions HK, elements of which have an expansion as
    f(x) = Σj cj φj(x)
  So, some kernels correspond to an infinite number of transformed predictor variables.
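As a worked illustration of the infinite-dimensional case (a standard expansion, not from the slides), the exponential kernel on ℝ has a Taylor-series feature map with infinitely many components:

```latex
K(x, z) = e^{xz} = \sum_{k=0}^{\infty} \frac{(xz)^k}{k!}
        = \sum_{k=0}^{\infty}
          \underbrace{\frac{x^k}{\sqrt{k!}}}_{\varphi_k(x)}
          \underbrace{\frac{z^k}{\sqrt{k!}}}_{\varphi_k(z)},
\qquad \Phi(x) = \bigl(\varphi_0(x), \varphi_1(x), \varphi_2(x), \dots\bigr)
```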

  10. Reproducing Kernel Hilbert Space
  Define an inner product in this function space: for f = Σj cj φj and g = Σj dj φj,
    ⟨f, g⟩HK = Σj cj dj / λj
  Note then that ⟨f, K(·, x)⟩HK = f(x).
  This is the reproducing property of HK.
  Also note, for a Mercer kernel the norm is finite:
    ‖f‖²HK = Σj cj² / λj < ∞
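Writing the reproducing property out against the Mercer expansion makes the verification a one-liner (a standard RKHS computation, sketched here): since K(·, x) = Σj λj φj(x) φj has coefficients λj φj(x),

```latex
\bigl\langle f, K(\cdot, x) \bigr\rangle_{H_K}
  = \sum_j \frac{c_j \,\bigl(\lambda_j \varphi_j(x)\bigr)}{\lambda_j}
  = \sum_j c_j \varphi_j(x)
  = f(x),
\qquad \text{for } f = \sum_j c_j \varphi_j .
```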

  11. Regularization and RKHS
  A general class of regularization problems has the form:
    min over f ∈ H of  Σi L(yi, f(xi)) + λ J(f)
  where L is some loss function (e.g. squared loss) and J(f) penalizes complex f.
  Suppose f lives in a RKHS with J(f) = ‖f‖²HK, and let
    f(x) = Σi αi K(x, xi)
  (the representer theorem guarantees the minimizer has this finite form).
  Then we need only solve this "easy" finite-dimensional problem:
    min over α of  Σi L(yi, (Kα)i) + λ αᵀKα
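For squared loss this finite-dimensional problem has a closed form (next slide); a sketch of the resulting kernel ridge regression in NumPy, with the RBF kernel and the value of λ as illustrative choices:

```python
import numpy as np

def kernel_ridge_fit(X, y, kernel, lam=1.0):
    """Solve min_a ||y - K a||^2 + lam * a^T K a  =>  a = (K + lam I)^{-1} y."""
    K = np.array([[kernel(a, b) for b in X] for a in X])
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def kernel_ridge_predict(X_train, alpha, kernel, x_new):
    """Representer theorem: f(x) = sum_i alpha_i K(x, x_i)."""
    return sum(a * kernel(x_new, xi) for a, xi in zip(alpha, X_train))

rbf = lambda a, b: np.exp(-np.sum((a - b) ** 2))
X = [np.array([t]) for t in np.linspace(0, 3, 30)]
y = np.sin([x[0] for x in X])
alpha = kernel_ridge_fit(X, y, rbf, lam=0.1)
print(kernel_ridge_predict(X, alpha, rbf, np.array([1.5])))  # approx sin(1.5)
```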

  12. RKHS Examples
  For regression with squared error loss, we have
    min over α of  ‖y − Kα‖² + λ αᵀKα
  so that:
    α̂ = (K + λI)⁻¹ y
  This generalizes smoothing splines… Choosing a radial kernel (for two-dimensional x,
    K(x, z) = ‖x − z‖² log‖x − z‖ )
  leads to the thin-plate spline models.

  13. Support Vector Machine
  Two-class classifier with the form:
    f(x) = sign( g(x) ),  g(x) = Σi αi yi K(x, xi) + b
  with parameters chosen to minimize the penalized hinge loss:
    Σi [1 − yi g(xi)]+ + λ ‖g‖²HK
  Many of the fitted αi's are usually zero; the xi's corresponding to the non-zero
  αi's are the support vectors.
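A short sketch of fitting such a classifier with scikit-learn's SVC (the library, kernel, and C value are my choices, not prescribed by the slides), reading off the support vectors and their non-zero α's:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(+1, 1, (50, 2))])
y = np.array([-1] * 50 + [+1] * 50)

clf = SVC(kernel="rbf", C=1.0)   # RBF kernel, soft margin
clf.fit(X, y)

# Only the support vectors have non-zero alpha_i; all other alphas are zero.
print("support vectors:", clf.support_vectors_.shape[0], "of", len(X))
print("y_i * alpha_i:", clf.dual_coef_)  # signed dual coefficients
```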
