CS 9633 Machine Learning: Feature Selection
References:
• W. M. Weiss and C. A. Kulikowski, Computer Systems that Learn, 1991, Morgan Kaufmann.
• R. Kohavi and G. John, Wrappers for Feature Subset Selection, 1998, Artificial Intelligence.
Computer Science Department CS 9633 KDD
Feature Selection
• Some features are more informative than others
• Some learning methods, such as k-nearest neighbor (k-NN), use all of the features and are quite sensitive to correlated attributes and to noisy (uninformative) attributes
• Some learning algorithms include feature selection as part of their functionality
Feature Selection as Search
Search strategies
• Starting point
  • Forward selection
  • Backward elimination
• Organization of search
  • Exhaustive
  • Greedy
  • Stepwise selection or elimination (add or remove features at each point)
• Strategy for selecting among alternative subsets
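To make the difference between exhaustive and greedy organization concrete, the sketch below counts how many candidate subsets each would score. The function names and the four feature labels are hypothetical, chosen only for illustration, and the greedy count assumes plain forward selection that adds one feature per step until all are included.

```python
from itertools import combinations


def all_subsets(features):
    """Yield every non-empty subset of features (the exhaustive search space)."""
    for size in range(1, len(features) + 1):
        for subset in combinations(features, size):
            yield subset


def greedy_forward_evaluations(n):
    """Upper bound on subsets scored by greedy forward selection over n features.

    Step 1 scores n single features, step 2 scores n - 1 extensions, and so on.
    """
    return n * (n + 1) // 2


features = ["f1", "f2", "f3", "f4"]  # hypothetical feature names
exhaustive_count = sum(1 for _ in all_subsets(features))
greedy_count = greedy_forward_evaluations(len(features))
print(exhaustive_count)  # 15, which is 2**4 - 1
print(greedy_count)      # 10
```

The exhaustive count grows as 2^n - 1, while the greedy bound grows only quadratically, which is why exhaustive search is practical only for small feature sets.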
Strategy for Selection
• Feature filter
  • Features are selected independent of the learning algorithm
• Wrapper approach
  • Generate set of candidate features
  • Run induction algorithm
  • Measure performance with the learning algorithm to evaluate feature set
    • Accuracy with training set
    • Cross validation
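The wrapper steps above can be sketched as a function that scores one candidate feature set by running the learner and estimating its accuracy. This is only an illustration under stated assumptions: a 1-nearest-neighbor classifier stands in for the induction algorithm, leave-one-out stands in for cross validation, and the toy data (feature 0 informative, feature 1 noise) is invented. All names are hypothetical.

```python
def nearest_neighbor_predict(train_X, train_y, x, subset):
    """Return the label of the training row closest to x, using only `subset` columns."""
    best_dist, best_label = None, None
    for row, label in zip(train_X, train_y):
        dist = sum((row[i] - x[i]) ** 2 for i in subset)
        if best_dist is None or dist < best_dist:
            best_dist, best_label = dist, label
    return best_label


def loo_accuracy(X, y, subset):
    """Leave-one-out accuracy of 1-NN restricted to the feature indices in `subset`."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_neighbor_predict(train_X, train_y, X[i], subset) == y[i]:
            correct += 1
    return correct / len(X)


# Invented toy data: column 0 separates the classes, column 1 is noise.
X = [[0.0, 5.0], [0.1, 0.0], [0.2, 9.0], [1.0, 0.2], [1.1, 5.1], [1.2, 9.1]]
y = [0, 0, 0, 1, 1, 1]

for subset in [(0,), (1,), (0, 1)]:
    print(subset, loo_accuracy(X, y, subset))
```

On this toy data the subset `(0,)` scores 1.0 while `(1,)` and `(0, 1)` both score 0.0, because the large-scale noise column dominates the distance. A wrapper search would call a scoring function like this once per candidate feature set.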
Filter Approach
[Diagram] Input features → Feature subset selection → Induction algorithm
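Wrapper Approach
[Diagram] The training set feeds a feature selection search. Each candidate feature set is passed to feature evaluation, where the induction algorithm is run and performance estimation returns a score to the search. The chosen feature set and the training set then go to the induction algorithm, which produces a hypothesis. Final evaluation of the hypothesis on the test set gives the estimated accuracy.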
Filter versus Wrapper
• The wrapper approach searches for the features that work best with a specific learning algorithm
• The wrapper approach generally has a much higher computational cost, because the learning algorithm is run for every candidate feature set
Criterion for Halting Search
• Filter approach
  • Rank features by usefulness score and determine a breakpoint
  • Stop when each combination of values for the selected attributes maps onto a single class value
• Wrapper approach
  • Stop adding or removing features when performance does not increase
  • Keep revising as long as performance does not degrade
  • Keep adding or removing features until the other end of the search space is reached, then pick the best subset seen
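The first wrapper criterion (stop when performance does not increase) can be sketched as a greedy forward selection loop. The `score` argument is a hypothetical callback that returns the estimated performance of a feature set, and the accuracy table below is invented purely to exercise the loop.

```python
def forward_selection(features, score):
    """Greedy forward selection that halts when no single addition improves the score."""
    selected, best_score = [], float("-inf")
    remaining = list(features)
    while remaining:
        # Score every one-feature extension of the current subset.
        candidates = [(score(frozenset(selected + [f])), f) for f in remaining]
        top_score, top_feature = max(candidates)
        if top_score <= best_score:
            break  # halting criterion: performance did not increase
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected, best_score


# Invented accuracies for each subset of three hypothetical features.
toy_accuracy = {
    frozenset("a"): 0.70, frozenset("b"): 0.60, frozenset("c"): 0.55,
    frozenset("ab"): 0.80, frozenset("ac"): 0.70, frozenset("bc"): 0.65,
    frozenset("abc"): 0.80,
}

result = forward_selection(["a", "b", "c"], lambda subset: toy_accuracy[subset])
print(result)  # (['a', 'b'], 0.8)
```

Changing the comparison from `<=` to `<` would give the second criterion, which keeps revising as long as performance does not degrade.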
Measures for filter methods
• Pearson correlation coefficient (two-class problems)
• Residual sum of squares
• Adjusted R-square
• Minimum mean residual
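As an illustration of the first measure, the sketch below ranks features by the absolute Pearson correlation between each feature column and a 0/1 class label. The function names and the toy data are assumptions made for this example; where to place the breakpoint in the resulting ranking is left to the user.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / sqrt(var_x * var_y)


def rank_features(X, y):
    """Return (|correlation|, feature index) pairs, most useful feature first."""
    n_features = len(X[0])
    scores = [(abs(pearson([row[j] for row in X], y)), j) for j in range(n_features)]
    return sorted(scores, reverse=True)


# Invented toy data: column 0 tracks the class, column 1 is noise.
X = [[0.0, 5.0], [0.1, 0.0], [0.2, 9.0], [1.0, 0.2], [1.1, 5.1], [1.2, 9.1]]
y = [0, 0, 0, 1, 1, 1]

ranked = rank_features(X, y)
for score, index in ranked:
    print(index, round(score, 3))
```

Here feature 0 ranks first with a correlation close to 1, and feature 1 scores close to 0. Note that the ranking is computed without running any learning algorithm, which is what makes this a filter.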
Search strategies for wrapper approach
• Forward greedy
• Backward greedy
• Beam search
• Branch and bound
• Genetic algorithm
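A minimal sketch of forward beam search follows, assuming a hypothetical `score` callback that returns the estimated performance of a feature set. With `beam_width=1` it behaves like forward greedy search; a wider beam keeps several partial subsets alive at each level. The accuracy table is invented so that the two settings give different answers.

```python
def beam_search(features, score, beam_width=2):
    """Forward beam search over feature subsets; returns the best subset seen and its score."""
    beam = [frozenset()]
    best_subset, best_score = frozenset(), float("-inf")
    for _ in range(len(features)):
        # Extend every subset in the beam by one feature it does not yet contain.
        candidates = {subset | {f} for subset in beam for f in features if f not in subset}
        if not candidates:
            break
        ranked = sorted(candidates, key=score, reverse=True)
        beam = ranked[:beam_width]  # keep only the top-scoring subsets
        if score(beam[0]) > best_score:
            best_subset, best_score = beam[0], score(beam[0])
    return best_subset, best_score


# Invented accuracies: 'b' and 'c' are weak alone but strong together.
toy_accuracy = {
    frozenset("a"): 0.70, frozenset("b"): 0.60, frozenset("c"): 0.58,
    frozenset("ab"): 0.72, frozenset("ac"): 0.71, frozenset("bc"): 0.90,
    frozenset("abc"): 0.85,
}


def toy_score(subset):
    """Look up the invented accuracy for a subset."""
    return toy_accuracy[subset]


wide = beam_search(["a", "b", "c"], toy_score, beam_width=2)
narrow = beam_search(["a", "b", "c"], toy_score, beam_width=1)
print(sorted(wide[0]), wide[1])      # ['b', 'c'] 0.9
print(sorted(narrow[0]), narrow[1])  # ['a', 'b', 'c'] 0.85
```

The greedy run commits to feature `a` at the first level and never sees the `{b, c}` pair, while the wider beam keeps `b` as a second option and finds it.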