
Topic Learning in Text & Conversational Speech

Constantinos Boulis


Presentation Transcript


  1. Topic Learning in Text & Conversational Speech Constantinos Boulis

  2. Introduction • Definition of Topic Learning • Supervised : Learn a mapping of data to topics • Unsupervised : Discover new topics • Applications of Topic Learning • Crucial step for information access • Google News, call-center automation • Challenges of Topic Learning • A learning problem of very high dimensionality

  3. An Example
     B.16: especially, you know, smaller areas,
     A.17: Uh-huh.
     B.18: smaller towns.
     A.19: Uh-huh. Yeah. Probably the hardest thing in, in my family, uh, my grandmother, she had to be put in a nursing home and, um, she had used the walker for, for quite some time, probably about six to nine months. And, um, she had a fall and, uh, finally, uh, she had Parkinson's disease,
     B.20: Oh.
     A.21: and it got so much that she could not take care of her house.
     B.22: Right.
     A.23: Then she lived in an apartment and, uh, that was even harder --
     B.24: Uh-huh.

  4. Impact • Interdisciplinary research on Natural Language Processing, Data Mining and Speech Recognition • Core technology can leverage fields such as Bioinformatics • All these technologies come together on the 311 line (TIME, Feb. 7th 2005)

  5. Dimensions of Topic Learning [Figure: "Past work" and "My work" placed along two dimensions labeled "Less Supervision" and "More Structured Input", each running from Less to More]

  6. Dissertation Contributions • General Topic Learning Contributions (applicable to text, speech, gene expression etc) • Combining Multiple Clustering Partitions (*) • Feature Construction (*) • Topic Learning in Conversational Speech • Speech-to-text errors • Role of disfluencies • Separating content & style • Role of prominence (*)

  7. Combining Multiple Clustering Partitions • Classifier combination has been studied extensively, but there is not much work on combining clustering systems • Fundamental problem: Missing correspondence between the clusters of different systems, e.g. two systems labeling the same 12 items: {1,2,2,1,3,1,2,3,3,3,2,1} and {3,1,1,3,2,3,1,2,1,2,3,2} • Contribution : New algorithms that estimate the correspondence of clusters, then combine them using linear programming techniques and singular value decomposition
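The correspondence problem on this slide can be illustrated with a small sketch. It is not the linear-programming or SVD method from the dissertation; it simply tries every relabeling of the second system and keeps the one that agrees most often with the first. The function name `align_labels` is hypothetical, and the sketch assumes both systems use the same number of clusters.

```python
from itertools import permutations

def align_labels(reference, other):
    """Find the relabeling of `other` that best matches `reference`.

    Brute force over all label permutations (an illustrative stand-in,
    not the dissertation's algorithm). Assumes both partitions have the
    same number of clusters, which keeps the search small and total.
    Returns (mapping from other's labels to reference's labels, agreement count).
    """
    ref_labels = sorted(set(reference))
    other_labels = sorted(set(other))
    best_mapping, best_agreement = None, -1
    for perm in permutations(ref_labels, len(other_labels)):
        mapping = dict(zip(other_labels, perm))
        agreement = sum(mapping[o] == r for r, o in zip(reference, other))
        if agreement > best_agreement:
            best_mapping, best_agreement = mapping, agreement
    return best_mapping, best_agreement

# The two label vectors shown on the slide.
system_a = [1, 2, 2, 1, 3, 1, 2, 3, 3, 3, 2, 1]
system_b = [3, 1, 1, 3, 2, 3, 1, 2, 1, 2, 3, 2]

mapping, agreement = align_labels(system_a, system_b)
relabeled_b = [mapping[label] for label in system_b]
print(mapping, agreement, relabeled_b)
```

On the slide's example the best relabeling maps cluster 3 of the second system to cluster 1 of the first, 1 to 2, and 2 to 3, after which the two systems agree on 9 of the 12 items. Brute force is only practical for a handful of clusters, which is one reason more scalable formulations are needed.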

  8. Feature Construction • A lot of work on supervised topic learning methods but not much on constructing feature spaces • Bag-of-words representation is too coarse but hard to improve • Contribution : Add only those word pairs that contribute sufficient new information beyond their constituent words, i.e. the whole is much more than the sum of its parts • “second hand” >> “second” + “hand” • “big brother” >> “big” + “brother”
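One way to make "the whole is more than the sum of its parts" concrete is to compare how much a word pair tells us about the topic with how much its best single word does. The sketch below uses mutual information between a binary feature and the topic label as that measure. This is a simplified stand-in chosen for illustration, not the redundancy-compensated criterion from the paper; the toy documents, labels, and function names are all made up.

```python
import math
from collections import Counter

def mutual_information(feature_present, labels):
    """Mutual information (in bits) between a binary feature and topic labels."""
    n = len(labels)
    joint = Counter(zip(feature_present, labels))
    feature_counts = Counter(feature_present)
    label_counts = Counter(labels)
    mi = 0.0
    for (f, y), count in joint.items():
        p_joint = count / n
        p_indep = (feature_counts[f] / n) * (label_counts[y] / n)
        mi += p_joint * math.log2(p_joint / p_indep)
    return mi

def has_word(doc, word):
    return word in doc.split()

def has_bigram(doc, first, second):
    tokens = doc.split()
    return any(a == first and b == second for a, b in zip(tokens, tokens[1:]))

def bigram_gain(docs, labels, first, second):
    """Information of the bigram minus that of its more informative word."""
    mi_pair = mutual_information([has_bigram(d, first, second) for d in docs], labels)
    mi_first = mutual_information([has_word(d, first) for d in docs], labels)
    mi_second = mutual_information([has_word(d, second) for d in docs], labels)
    return mi_pair - max(mi_first, mi_second)

# Toy corpus (hypothetical): "second hand" marks the shopping topic,
# while "second" and "hand" on their own also occur in sports documents.
docs = [
    "bought a second hand car",
    "second hand books are cheap",
    "the second goal came late",
    "he hurt his hand in the second half",
]
labels = ["shopping", "shopping", "sports", "sports"]

print(bigram_gain(docs, labels, "second", "hand"))
```

A pair would be added to the feature space only when this gain exceeds some threshold; in the toy corpus the gain is clearly positive because the bigram separates the two topics while neither word does so alone.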

  9. Role of Prominence • Speech is a richer medium than text; it is not only what we say but also how we say it. • Prominence is the emphasis we put on words • Contribution : The first study to show that prominence can be combined with lexical saliency measures to yield improved feature subsets for topic learning

  10. Summary • Topic learning is a key step for information access (retrieval, extraction) • Key contribution : Advancing language processing for spoken documents • Unique elements of this work: Combining speech, language and data mining technology

  11. Journal Publications Resulting from PhD • Deng, L., Wang, Y., Wang, K., Acero, A., Hon, H.-W., Droppo, J., Boulis, C., Mahajan, M., and Huang, X.D, February-March 2004, “Speech and Language Processing for Multimodal Human-Computer Interaction”, Journal of VLSI Signal Processing Systems, 36(2-3):161-187. • Boulis, C., Ostendorf, M., Riskin, E., Otterson, S. November 2002. “Graceful Degradation of Speech Recognition Performance Over Packet-Erasure Networks”, IEEE Transactions on Speech and Audio Processing, 10(8):580-590.  • Deng, L., Wang, K., Acero, A., Hon, H.-W., Droppo, J., Boulis, C., Wang, Y.-Y., Jakoby, D., Mahajan, M., Chelba C., and Huang, X.D. November 2002. “Distributed Speech Processing in MiPad's Multimodal User Interface”, IEEE Transactions on Speech and Audio Processing, 10(8):605-619.

  12. Conference Publications Resulting from PhD • Boulis, C., Kahn, J., Ostendorf, M., July 2005. “The Role of Disfluencies in Topic Classification of Natural Human-Human Conversations”, Proc. of the Workshop on Spoken Language Understanding, in press. • Boulis, C., Ostendorf, M., June 2005. “A Quantitative Analysis of Lexical Differences Between Genders in Telephone Conversations”, Proc. of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL), in press. • Boulis, C., Ostendorf, M. April 2005. “Text Classification by Augmenting the Bag-of-Words Representation with Redundancy-Compensated Bigrams”, Proc. of the International Workshop on Feature Selection in Data Mining, pp 9-16. • Boulis, C., Ostendorf, M. September 2004. “Combining Multiple Clustering Systems”. Proc. of the 8th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD), LNAI 3202, pp. 63-74. • Boulis, C. May 2004. “Speaker Recognition with Mixtures of Gaussians with Sparse Regression Matrices”, Proc. of the Student Research Workshop of Human Language Technology/North American Chapter of the Association for Computational Linguistics (HLT/NAACL), companion volume, pp. 55-60. • Riskin, E., Boulis, C., Otterson, S., Ostendorf, M. September 2001. “Graceful Degradation of Speech Recognition Performance Over Lossy Packet Networks”. Proc. of the 7th European Conference on Speech Communication and Technology (Eurospeech 2001), pp. 2715-2719.  

  13. Future Publications & Awards Resulting from PhD Manuscripts under review • Boulis, C., Ostendorf, M., “Combining Multiple Clustering Partitions”, Journal of Machine Learning. • Boulis, C., Ostendorf, M., “Using Symbolic Prominence to Help Design Feature Subsets for Topic Classification and Clustering of Natural Human-Human Conversations”, Interspeech-05. Manuscripts under preparation • Boulis, C., Ostendorf, M., “Unsupervised Estimation of Word Confusability and its Use in Topic Classification of Human-Human Conversations” Awards • Best Student Paper Award, PKDD 2004. 581 total submissions, 17% acceptance rate

  14. Backup Slides The following slides are not used in the main presentation

  15. Speech-to-Text Errors • Output of STT systems contains errors (~20%) • Some words have higher error rates than others • Contribution : Design an algorithm that adaptively clusters confusable words, modifying the vocabulary provided for topic learning tasks • Yielded a 25% relative gain in classification performance

  16. Role of Disfluencies • Disfluencies are very common in conversational speech • “That’s all you need you only need one boxcar” (repetition) • “So it’ll take um so you want to do what” (repair) • Contribution : Demonstrate that removing disfluencies does not affect topic classification performance with the bag-of-words model, but does affect more complex representations
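As a rough illustration of what "removing disfluencies" can mean before features are extracted, the sketch below drops filled pauses and collapses immediately repeated words. It is a deliberately naive filter, not the procedure used in the study: the filled-pause list and the function name are assumptions, and repairs such as the second example on the slide are not handled.

```python
import re

# Assumed list of filled pauses; a real system would use annotated disfluencies.
FILLED_PAUSES = {"uh", "um"}

def strip_simple_disfluencies(utterance):
    """Lowercase, tokenize, drop filled pauses, collapse adjacent repeats."""
    tokens = re.findall(r"[a-z']+(?:-[a-z']+)*", utterance.lower())
    cleaned = []
    for token in tokens:
        if token in FILLED_PAUSES:
            continue  # filled pause such as "uh" or "um"
        if cleaned and cleaned[-1] == token:
            continue  # immediate word repetition such as "in, in"
        cleaned.append(token)
    return cleaned

# Fragment of turn A.19 from the Example slide.
print(strip_simple_disfluencies(
    "Probably the hardest thing in, in my family, uh, my grandmother"
))
```

For a bag-of-words model this kind of filtering mostly removes a few frequent tokens, while representations built on word sequences (for example, word pairs) see different neighbors after the cleanup, which is consistent with the slide's observation that the effect depends on the representation.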

  17. Separating Content & Style • When two people talk they bring their idiosyncrasies into the discussion. Are there idiosyncrasies at the gender level? • Can this affect topic classification? • Contribution : The first quantitative study to show that there are lexical differences between genders in telephone conversations • Almost 100% accuracy in detecting the gender of a speaker based on what he/she said • The gender of the speaker on one side can influence lexical patterns on the other side
