Phonotactic using SVM for LRE2009

Tomáš Mikolov, 2009

Task - numbers
• TRAIN set: 9810 utterances
• DEV set (30s condition): 13 331 utterances
• 23 classes (languages)

Features
• Feature vectors from the HU phoneme recognizer
• 35 000 trigrams
• 80 000 four-grams
• The large number of features results in huge data files: 80 000 * 13 331 ~= 1 billion numbers (a few GB on disk)
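The size estimate above can be checked with a few lines of Python. This is only a sketch: the slides do not state the file format, so dense storage at 4 bytes per value (single-precision floats) is an assumption.

```python
# Back-of-the-envelope size of the dense four-gram feature matrix.
NUM_FEATURES = 80_000      # four-gram features (from the slides)
NUM_UTTERANCES = 13_331    # DEV set, 30s condition (from the slides)
BYTES_PER_VALUE = 4        # assumption: dense float32 storage


def dense_matrix_size(num_features, num_utterances, bytes_per_value):
    """Return (number of values, size in GiB) for a dense feature matrix."""
    num_values = num_features * num_utterances
    size_gib = num_values * bytes_per_value / 2**30
    return num_values, size_gib


num_values, size_gib = dense_matrix_size(NUM_FEATURES, NUM_UTTERANCES, BYTES_PER_VALUE)
print(f"{num_values:,} values, about {size_gib:.1f} GiB")
```

Under that assumption the count comes out slightly above one billion values, which is consistent with "a few GB on disk".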

SVMTorch vs SVMLib
• Both work almost the same
• In my work, SVMTorch was used
• When using all features, the training and testing phases are very slow (half a day on 10 machines)

PCA for dimensionality reduction
• Idea: many features tend to co-occur
• Experiments: reducing the features to ~500 dimensions works almost as well as using the original features
• The speed-up of the training phase depends on the original number of features: a reduction from 80 000 to 500 may result in a speed-up of ~1000x or more
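The reduction step can be sketched as below. The slides do not say how the PCA was computed, so this is a minimal illustration using NumPy's SVD on toy-sized random data, not the author's implementation; the function names `fit_pca` and `apply_pca` and all sizes are hypothetical. The projection is estimated on one data set and can then be applied unchanged to any other set with the same features.

```python
import numpy as np


def fit_pca(train, num_components):
    """Estimate the mean and top principal directions from training data.

    train: array of shape (num_utterances, num_features).
    Returns (mean, components), with components of shape
    (num_components, num_features).
    """
    mean = train.mean(axis=0)
    centered = train - mean
    # Rows of vt are the principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:num_components]


def apply_pca(data, mean, components):
    """Project data onto the principal directions."""
    return (data - mean) @ components.T


# Toy data (assumption): 200 "utterances" with 50 correlated "n-gram"
# features generated from only 5 underlying factors plus a little noise.
rng = np.random.default_rng(0)
factors = rng.normal(size=(200, 5))
mixing = rng.normal(size=(5, 50))
features = factors @ mixing + 0.01 * rng.normal(size=(200, 50))

mean, components = fit_pca(features, num_components=5)
reduced = apply_pca(features, mean, components)
print(features.shape, "->", reduced.shape)
```

Because the toy features co-occur strongly, five dimensions retain almost all of the information. This is the same effect the slide relies on when going from tens of thousands of n-gram features to ~500 dimensions.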


Tuning the C parameter of the SVM
• When tuning C for each language, Cavg* goes from 2.62 to 2.44 (with fourgram features)
• Olda: with trigrams, from 2.47 to 2.33
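Per-language tuning of C can be sketched as an independent grid search for each language. The sketch assumes one one-vs-rest SVM per language, which the slides imply but do not state. The callback `dev_cost`, the grid values and the toy cost function are hypothetical stand-ins for real SVM training and development-set scoring.

```python
def tune_c_per_language(languages, c_grid, dev_cost):
    """Pick, for each language, the C value with the lowest dev-set cost.

    dev_cost(language, c) is a hypothetical callback that trains the
    one-vs-rest SVM for `language` with parameter `c` and returns its
    cost on the development set (lower is better).
    """
    best = {}
    for language in languages:
        costs = {c: dev_cost(language, c) for c in c_grid}
        best[language] = min(costs, key=costs.get)
    return best


# Toy stand-in for real training and scoring: each language has a
# different (made-up) optimal C, and the cost grows with the distance from it.
TOY_OPTIMUM = {"lang_a": 0.1, "lang_b": 1.0, "lang_c": 10.0}


def toy_dev_cost(language, c):
    return abs(c - TOY_OPTIMUM[language])


best_c = tune_c_per_language(TOY_OPTIMUM, [0.01, 0.1, 1.0, 10.0, 100.0], toy_dev_cost)
print(best_c)
```

Since each language's SVM is tuned independently, the search costs one training run per (language, C) pair and parallelizes trivially across machines.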

Improvements of accuracy
• Linear interpolation of trigram + fourgram scores: 2.33 (trigram) + 2.44 (fourgram) => 2.22
• Using multiple recognizers:
  • HU fourgram (80 000 => 500): 2.44
  • RU trigram (110 000 => 500): 1.96
  • HU+RU features (1000): 1.64
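Linear interpolation here means a weighted sum of the scores that two systems assign to the same utterance. The figures 2.33, 2.44 and 2.22 on the slide are the resulting costs, not the scores being added. A minimal sketch follows; the interpolation weight and the example scores are made up, since the slides give neither.

```python
def interpolate_scores(trigram_scores, fourgram_scores, weight=0.5):
    """Linearly interpolate two per-language score lists for one utterance.

    weight is the share given to the trigram system; 0.5 is an
    illustrative default, not a value from the slides.
    """
    if len(trigram_scores) != len(fourgram_scores):
        raise ValueError("both systems must score the same languages")
    return [
        weight * s3 + (1.0 - weight) * s4
        for s3, s4 in zip(trigram_scores, fourgram_scores)
    ]


# Made-up scores of one utterance for three languages.
trigram = [1.0, -0.5, 0.25]
fourgram = [0.5, 0.5, -0.25]
fused = interpolate_scores(trigram, fourgram, weight=0.5)
print(fused)
```

In practice the weight would be chosen on the development set, in the same way as the C parameter.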

Current work
• Using features from EN recognizer
• Scoring eval set

Additional results
• Using 5 SVMs (RU3, RU4, HU3, HU4, EN3) for the 30s condition:
  • DEV set C* grand average: 1.06 (~0.65 primary system)
  • EVAL set Cavg: 2.42 (~2.29 primary system)
• For the 10s and 3s conditions, results are not as good, probably because no unigram/bigram features are used (not tested yet)

Conclusion
• PCA-based feature extraction provides a great speed-up (seems to work much better than feature selection)
• Most further improvement can come from different phoneme recognizers (different features)
• Tuning C separately for each language provides ~8% relative improvement
