LING/C SC 581: Advanced Computational Linguistics
Lecture 28
April 18th
2019 HLT Lecture Series talk
• Teaching demo slides on course website
Speaker: Mans Hulden, University of Colorado Boulder
Time: 12-1 pm, Wednesday, April 24
Location: Chavez 308
Title: Black-box Linguistics
Abstract: Neural networks have in a short time brought about previously unimaginable advances in computational linguistics and natural language processing. The main criticism against them from a linguistic point of view is that neural models - while fine for language engineering tasks - are thought of as black boxes, and that their parameter opacity prevents us from discovering new facts about the nature of language itself, or about specific languages. In this talk I will examine that assumption and argue that there are ways to uncover new facts about language, even with a black-box learner. I will discuss specific experiments with neural models that reveal new information about the organization of sound systems in human languages, give us insight into the limits of complexity of word formation, give us models of why and when irregular morphology - surely an inefficiency in a communication system - can persist over long periods of time, and reveal what the boundaries of pattern learning are.
A look forwards? GPT-2 https://www.wired.com/story/ai-text-generator-too-dangerous-to-make-public/
A look forwards? GPT-2 https://openai.com/blog/better-language-models/
A look forwards
• Language Models are Unsupervised Multitask Learners
• Alec Radford*, Jeffrey Wu*, Rewon Child, David Luan, Dario Amodei**, Ilya Sutskever**
• *, ** Equal contribution. OpenAI, San Francisco, California, United States. Correspondence to: Alec Radford <alec@openai.com>.
• Abstract: Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on task-specific datasets. We demonstrate that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText. When conditioned on a document plus questions, the answers generated by the language model reach 55 F1 on the CoQA dataset - matching or exceeding the performance of 3 out of 4 baseline systems without using the 127,000+ training examples. The capacity of the language model is essential to the success of zero-shot task transfer and increasing it improves performance in a log-linear fashion across tasks. Our largest model, GPT-2, is a 1.5B parameter Transformer that achieves state of the art results on 7 out of 8 tested language modeling datasets in a zero-shot setting but still underfits WebText. Samples from the model reflect these improvements and contain coherent paragraphs of text. These findings suggest a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.
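The abstract's key idea is zero-shot task transfer: rather than fine-tuning on a labeled dataset, the language model is simply conditioned on a prompt (for example, a passage plus a question) and its continuation is read off as the answer. As an informal illustration, here is a minimal sketch of that setup using the small publicly released GPT-2 checkpoint; it assumes the Hugging Face transformers library and PyTorch, neither of which appears in the slides, and the prompt text is made up for the example.

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load the small publicly released GPT-2 checkpoint
# (the full 1.5B-parameter model from the paper was held back at release time).
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

# Condition on a document plus a question, in the spirit of the paper's
# zero-shot CoQA setup: no task-specific training, just a prompt.
prompt = (
    "The University of Arizona is located in Tucson, Arizona.\n"
    "Q: Where is the University of Arizona located?\n"
    "A:"
)
input_ids = tokenizer.encode(prompt, return_tensors="pt")

with torch.no_grad():
    output_ids = model.generate(
        input_ids,
        max_length=input_ids.shape[1] + 20,   # allow up to 20 new tokens
        do_sample=True,                       # sample instead of greedy decoding
        top_k=40,                             # truncated (top-k) sampling
        temperature=0.7,
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 defines no pad token
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

Because decoding is sampled, the continuation varies from run to run; the point is only that a plain language model, prompted this way, will often produce a plausible answer with no supervised QA training at all.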