
GECCO 2013 Industrial Competition


Presentation Transcript


  1. GECCO 2013 Industrial Competition Computer Engineering Lab, School of Electrical and IT Engineering Rommel Vergara

  2. Introduction • Machine Learning Algorithm used: Kernel Recursive Least Squares (KRLS) • Used an open-source C++ library, dlib (http://dlib.net), which provides a full implementation of the KRLS algorithm. • Challenges: • Aperiodic and missing data samples • Selection of a feature set to describe the Temperature and Humidity values.
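As a rough illustration of how dlib's KRLS interface is driven, here is a minimal sketch based on the library's own radial-basis-kernel example. The kernel choice, kernel width and tolerance below are placeholders; the slides do not state the parameters used in the competition entry.

```cpp
#include <dlib/svm.h>
#include <cmath>
#include <iostream>

int main()
{
    // One-dimensional samples for illustration only; the competition entry
    // used a much larger feature vector (145 values for temperature).
    typedef dlib::matrix<double, 1, 1> sample_type;
    typedef dlib::radial_basis_kernel<sample_type> kernel_type;

    // Kernel width 0.1 and tolerance 0.001 are placeholder values,
    // not the parameters used in the competition entry.
    dlib::krls<kernel_type> model(kernel_type(0.1), 0.001);

    // Train online, one (sample, target) pair at a time, on a toy function.
    for (double x = -10; x <= 10; x += 0.5)
    {
        sample_type s;
        s(0) = x;
        model.train(s, std::sin(x));
    }

    // Predict at a new point.
    sample_type s;
    s(0) = 2.5;
    std::cout << "prediction at 2.5: " << model(s) << std::endl;
}
```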

  3. Data Preprocessing • All data sets contained missing values and were aperiodic in nature. • SOLUTION: • Missing Data: Missing values were linearly interpolated between the adjacent available data points. • Aperiodic Nature: Data was standardized onto 10-minute intervals (as required by the competition's submission format).
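A minimal sketch of that preprocessing step, assuming the raw readings are available as sorted (Unix-timestamp, value) pairs with no duplicate timestamps. The helper name, the start-of-grid handling and the interpolation edge cases are illustrative, not taken from the competition code.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Resample irregular (timestamp, value) readings onto a fixed 10-minute grid,
// linearly interpolating between the adjacent available data points.
// Assumes 'raw' is sorted by timestamp with strictly increasing timestamps;
// alignment of the grid to clock boundaries (e.g. hh:00, hh:10) is omitted.
std::vector<std::pair<long, double>> resample_10min(
    const std::vector<std::pair<long, double>>& raw)
{
    const long step = 600;  // 10 minutes in seconds
    std::vector<std::pair<long, double>> out;
    if (raw.size() < 2)
        return out;

    std::size_t i = 0;
    for (long t = raw.front().first; t <= raw.back().first; t += step)
    {
        // Advance to the segment [raw[i], raw[i+1]] that contains t.
        while (i + 1 < raw.size() && raw[i + 1].first < t)
            ++i;

        const auto& a = raw[i];
        const auto& b = raw[i + 1];
        const double w = double(t - a.first) / double(b.first - a.first);
        out.emplace_back(t, a.second + w * (b.second - a.second));
    }
    return out;
}
```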

  4. Data Preprocessing • The data set was also narrowed to contain only the target weekdays: Tuesday, Wednesday and Thursday. • This made it possible to capture more accurately the minor, weekday-specific changes that occur on these particular days.

  5. Data Preprocessing • [Figure: time-series plot of the preprocessed data, with the target weekdays labelled Tue, Tue, Wed, Thu]

  6. Feature Set Selection • The ‘feature set’ is the selection of inputs that contribute to and explain the output. • It is important to choose a feature set that describes the outputs being predicted. • There are many ways to represent the feature set for a given output. • Each feature set is represented as a column vector and fed into the KRLS algorithm.

  7. Temperature Feature Set • The temperature feature set chosen contained 145 values: • The current weather value • 144 temperature values: a 10-minute rolling window covering the previous and current weekday, lagged by 1 week. • REASONING: • Removes noise present on the other weekdays • Allows KRLS to focus on the specific weekdays judged in the competition.

  8. Temperature Feature Set • Example: • To predict the following data point: • 20/02/2013 00:00 (Wednesday) • The following temperature values were used: • 12/02/2013 00:00 (Tuesday) to • 12/02/2013 23:50 (Tuesday) • To predict the next data point: • 20/02/2013 00:10 (Wednesday) • The following temperature values were used (10-minute rolling window): • 12/02/2013 00:10 (Tuesday) to • 13/02/2013 00:00 (Wednesday)
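Putting slides 7 and 8 together, here is a sketch of how such a 145-value feature vector might be assembled: the window of 144 values starts exactly 8 days before the point being predicted (the previous weekday, lagged by one week) and slides forward one 10-minute step per prediction. The function name, index arithmetic and omitted bounds checking are illustrative; the original competition code is not shown in the slides.

```cpp
#include <dlib/svm.h>
#include <cstddef>
#include <vector>

// 145-value temperature feature vector: current weather value plus a
// 144-point (one day at 10-minute resolution) rolling window of temperatures.
typedef dlib::matrix<double, 145, 1> temp_sample_type;

temp_sample_type make_temperature_features(
    const std::vector<double>& temps_10min,  // preprocessed 10-minute series
    double current_weather,                  // current weather value
    std::size_t target_idx)                  // index of the point to predict
{
    const std::size_t steps_per_day = 24 * 6;  // 144 ten-minute steps per day
    // Window starts 8 days before the target; caller must ensure
    // target_idx >= 8 * steps_per_day (bounds checking omitted here).
    const std::size_t window_start = target_idx - 8 * steps_per_day;

    temp_sample_type x;
    x(0) = current_weather;

    // e.g. predicting 20/02 00:00 uses 12/02 00:00 .. 12/02 23:50;
    // predicting 20/02 00:10 slides the window forward by one step.
    for (std::size_t k = 0; k < 144; ++k)
        x(k + 1) = temps_10min[window_start + k];

    return x;
}
```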

  9. Temperature Results

  10. Humidity Feature Set • The humidity feature set chosen contained 2 values: • The current weather value • The predicted KRLS temperature value • This proved ineffective, yielding an RMSE of 0.12 in the competition. • CHALLENGE: The humidity data set was aperiodic and exhibited discrete-like behaviour. • IMPROVEMENT: Given a more continuous data set, I would have used the same technique as for the temperature feature set: a rolling window of previous- and current-weekday humidity values lagged by 1 week.
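For completeness, a sketch of how the 2-value humidity feature vector described above could be assembled by chaining the trained temperature model into the humidity inputs. The function and type names are hypothetical, and the radial-basis kernel is an assumption carried over from the earlier sketches; the slides do not state the kernel actually used.

```cpp
#include <dlib/svm.h>

// Illustrative types matching the earlier sketches.
typedef dlib::matrix<double, 145, 1> temp_sample_type;
typedef dlib::radial_basis_kernel<temp_sample_type> temp_kernel_type;
typedef dlib::matrix<double, 2, 1> humidity_sample_type;

// Builds the 2-value humidity feature vector from slide 10: the current
// weather value plus the temperature predicted by the trained KRLS
// temperature model.
humidity_sample_type make_humidity_features(
    const dlib::krls<temp_kernel_type>& temp_model,  // trained temperature model
    const temp_sample_type& temp_features,           // 145-value feature vector
    double current_weather)
{
    humidity_sample_type h;
    h(0) = current_weather;
    h(1) = temp_model(temp_features);  // chained temperature prediction
    return h;
}
```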

  11. Humidity Results

  12. References • “The Kernel Recursive Least-Squares Algorithm” (2003), Yaakov Engel, Shie Mannor, Ron Meir. • dlib library: http://dlib.net • Contact: • Rommel Vergara (University of Sydney) • rommel_vergara@hotmail.com
