Task 1 of PP Interpretation

Task 1 of PP Interpretation • 1.1 Further applications of boosting: this talk • 1.2 Publication on boosting: paper by Oliver Marchand submitted, but not yet published

Thunderstorm Prediction with Boosting: Verification and Implementation of a New Base Classifier • André Walser (MeteoSwiss) • Martin Kohli (ETH Zürich, semester thesis)

Overview • Boosting algorithm • Impact of learn data • Verification results • Mapping to probability forecast • New base classifier: decision tree

Supervised Learning [Diagram: a learner derives rules from historic data; a classifier applies them to new data and outputs yes/no]

Learn data • Data: COSMO-7 assimilation cycle for 79 SYNOP stations in Switzerland; at least one year, every hour; e.g. SI, CAPE, W, date, time • Label: a thunderstorm "yes" if an appropriate ww-code was reported in the SYNOP or at least 3 lightning strokes were registered within 13.5 km of the station
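The labelling rule on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides say "an appropriate ww-code" without listing the codes, so the set `THUNDERSTORM_WW` is a guess, and the function names and the input layout (station and strokes as latitude/longitude pairs) are hypothetical.

```python
import math

# Assumption: ww codes counted as thunderstorm reports. The slides only say
# "an appropriate ww-code"; this set is illustrative, not taken from the talk.
THUNDERSTORM_WW = {17, 29} | set(range(91, 100))
RADIUS_KM = 13.5   # radius around the station, from the slide
MIN_STROKES = 3    # minimum number of lightning strokes, from the slide


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres (haversine formula)."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def label_thunderstorm(station, ww_code, strokes):
    """Return True ("yes") for one station and one hour.

    station: (lat, lon); ww_code: int or None if no SYNOP was reported;
    strokes: list of (lat, lon) lightning positions registered in that hour.
    """
    if ww_code in THUNDERSTORM_WW:
        return True
    nearby = sum(
        1 for lat, lon in strokes
        if distance_km(station[0], station[1], lat, lon) <= RADIUS_KM
    )
    return nearby >= MIN_STROKES
```

With a missing SYNOP report (`ww_code=None`), the label then depends on the lightning count alone.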

AdaBoost Algorithm • Input: weighted learn samples; number of base classifiers M • Iteration: 1. determine base classifier G; 2. calculate error and weights w; 3. adapt the weights of misclassified samples

Output of the learn process • M base classifiers • Threshold classifier:

AdaBoost Algorithm • Input: weighted learn samples; number of base classifiers M • Iteration: 1. determine base classifier G; 2. calculate error and weights w; 3. adapt the weights of misclassified samples • Classifier:
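The three iteration steps can be illustrated with a minimal sketch of textbook discrete AdaBoost using single-feature threshold classifiers. It is not the implementation from the talk: the function names are hypothetical, the labels are assumed to be coded as -1/+1, the weights start out uniform, and the final `certainty` (weighted share of "yes" votes, in [0, 1]) is only an assumed form of the classifier output, since the formula did not survive in this transcript.

```python
import math


def fit_stump(X, y, w):
    """Step 1: pick the threshold classifier with the lowest weighted error.

    X: list of feature lists; y: labels in {-1, +1}; w: weights summing to 1.
    Returns (error, feature index, threshold, sign).
    """
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                pred = [sign if row[j] > thr else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best


def stump_predict(stump, row):
    _, j, thr, sign = stump
    return sign if row[j] > thr else -sign


def adaboost(X, y, M):
    """Train M threshold classifiers; returns a list of (alpha, stump)."""
    n = len(X)
    w = [1.0 / n] * n  # assumption: uniform initial sample weights
    ensemble = []
    for _ in range(M):
        stump = fit_stump(X, y, w)
        # Step 2: weighted error and classifier weight (clamped to avoid log(0)).
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Step 3: raise the weights of misclassified samples, then renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def certainty(ensemble, row):
    """Assumed output: alpha-weighted share of 'yes' votes, between 0 and 1."""
    total = sum(alpha for alpha, _ in ensemble)
    if total == 0:
        return 0.5
    yes = sum(alpha for alpha, s in ensemble if stump_predict(s, row) == 1)
    return yes / total
```

The exhaustive threshold search is quadratic in the number of samples per feature, which is fine for a demonstration but would need sorting-based updates for a year of hourly station data.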

Output of the classifier: C_TSTORM [Figure: C_TSTORM at 17, 18 and 19 UTC; two panels are marked "Biased!"]

Reason: Inappropriate learn data… • SYNOP messages contain events and non-events, but are only available every 3 hours (most messages for 6, 12, 18 UTC). • Lightning data only contains events

New learn data sets • B – biased: SYNOP messages; only events from lightning data • F – full: SYNOP messages; all missing values are considered non-events • AL1 – at least 1: SYNOP messages; when the lightning data show at least 1 event, all non-missing values are considered non-events

Without bias… [Figure: classifier output at 17, 18 and 19 UTC]

Verification • POD and FAR for different C_TSTORM values between 0.3 and 0.6; FAR = false alarms / number of alarms • Learn data: model: COSMO-7 assimilation cycle, Jun 06 – May 07; obs: B / AL1 / F • Verification data: model: COSMO-7 forecasts, July 06 and May/June 07; obs: F
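As an illustration of the two scores, the sketch below computes POD and FAR for one C_TSTORM threshold. FAR follows the definition on the slide (false alarms divided by the number of alarms); POD is taken in its usual sense of hits divided by the number of observed events, which the slide does not spell out. The function name and the choice of `>=` for raising an alarm are assumptions.

```python
def pod_far(certainties, observed, threshold):
    """Return (POD, FAR) for alarms raised where certainty >= threshold.

    certainties: classifier outputs (e.g. C_TSTORM) per station and hour;
    observed: booleans, True where a thunderstorm was observed.
    """
    alarms = [c >= threshold for c in certainties]
    hits = sum(1 for a, o in zip(alarms, observed) if a and o)
    misses = sum(1 for a, o in zip(alarms, observed) if not a and o)
    false_alarms = sum(1 for a, o in zip(alarms, observed) if a and not o)

    n_events = hits + misses
    n_alarms = hits + false_alarms
    pod = hits / n_events if n_events else float("nan")
    far = false_alarms / n_alarms if n_alarms else float("nan")
    return pod, far
```

Sweeping `threshold` from 0.3 to 0.6 and collecting the resulting pairs would give the kind of POD/FAR curve the slide refers to.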

Verification: earlier results • Results reported last year for 2005: POD = 72%, FAR = 34% • Unfortunately not realistic: the verification was done with obs data B

[Figure: verification for July 2006 (~7% events), with the random forecast marked]

[Figure: verification for 18 May – 24 June 2007]

Comparison with another system • DWD expert system, period April 2006 – September 2006: POD = 0.346, FAR = 0.740

Mapping to a probability forecast • Polynomial fit in a reliability diagram: PC_TSTORM

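Mapping to a probability forecast
\[
\mathrm{PC\_TSTORM} =
\begin{cases}
0 & \text{if } x \le 0.4,\\
a x^{2} + b x + c & \text{if } 0.4 < x < 0.6,\\
a \cdot 0.6^{2} + b \cdot 0.6 + c & \text{if } x \ge 0.6.
\end{cases}
\]
Limited resolution: the system predicts probabilities only between 0 and ~40%.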
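The piecewise mapping can be written as a small function. This is a sketch, assuming that x is the classifier output C_TSTORM; the function name is hypothetical, and the coefficients a, b, c are not given in the slides, so they are passed in as parameters.

```python
def probability_from_certainty(x, a, b, c, lower=0.4, upper=0.6):
    """Map a classifier output x to a probability with the piecewise quadratic.

    Below `lower` the probability is 0; above `upper` the quadratic is held
    constant at its value at `upper`. a, b, c come from the fit in the
    reliability diagram and are not stated in the slides.
    """
    if x <= lower:
        return 0.0
    x_eff = min(x, upper)  # saturate at the upper bound
    return a * x_eff ** 2 + b * x_eff + c
```

For example, the purely illustrative coefficients a = 5, b = -3, c = 0.4 give 0 at x = 0.4 and 0.4 at x = 0.6, which mimics the stated range of 0 to ~40%; they are not the fitted values from the talk.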

New Base Classifier: Decision Tree [Diagram: threshold classifier 1 with the two outcomes 1 and 0]

New Base Classifier: Decision Tree [Diagram: threshold classifier 1 at the root with branches "class 1" and "class 0"; the branches lead to threshold classifiers 2 and 3, each with the outcomes 0 and 1]
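The tree in the diagram can be sketched as three threshold classifiers arranged in two levels. The class names, the feature names and the thresholds below are hypothetical, and which of the two second-level classifiers hangs on which branch is an assumption, since the diagram itself is not recoverable from the transcript.

```python
from dataclasses import dataclass


@dataclass
class ThresholdClassifier:
    """Returns 1 if the feature lies on the chosen side of the threshold."""
    feature: str
    threshold: float
    above: bool = True  # True: predict 1 when value > threshold

    def __call__(self, sample):
        return 1 if (sample[self.feature] > self.threshold) == self.above else 0


@dataclass
class DepthTwoTree:
    """Root threshold classifier that routes to one of two further ones."""
    root: ThresholdClassifier
    if_one: ThresholdClassifier   # used when the root outputs class 1
    if_zero: ThresholdClassifier  # used when the root outputs class 0

    def __call__(self, sample):
        branch = self.if_one if self.root(sample) == 1 else self.if_zero
        return branch(sample)


# Hypothetical predictors and thresholds, for illustration only.
tree = DepthTwoTree(
    root=ThresholdClassifier("CAPE", 500.0),
    if_one=ThresholdClassifier("W", 0.1),
    if_zero=ThresholdClassifier("SI", 0.0, above=False),
)
```

Compared with a single threshold classifier, such a tree lets the decision on one predictor depend on the value of another, which is presumably the motivation for testing it as a base classifier.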

Decision Tree: Example

Conclusions & Outlook • Boosting: is a simple, efficient and effective machine learning method for model post-processing; is completely general; can employ a number of redundant indicators; computes a certainty of the classification, which is mapped to a probability forecast • First verification results are promising; extended verification is required • Benefit of decision trees?
