Compiling an oral corpus of child language (G.S.C.C) Gavriilidou Zoe (Democritus University of Thrace) Elina Chadjipapa (FLEXSEM, Autonomous University of Barcelona) Anna Giannakopoulou (FLEXSEM, Autonomous University of Barcelona)
PLAN • Purposes • Definition of a corpus • Characteristics • Application fields • Construction • Description • Research based on G.S.C.C. • Perspectives
Purpose • CONSTRUCTION OF A REPRESENTATIVE ORAL CORPUS OF PRE-SCHOOL-AGE CHILDREN • ITS USE IN FURTHER APPLICATIONS AND RESEARCH
WHAT A CORPUS IS • A corpus is a collection of pieces of language text in electronic form, selected according to external criteria to represent, as far as possible, a language or language variety as a source of data for linguistic research (Sinclair, J., 2004, "Intuition and annotation - the discussion continues"). • The concept of carrying out research on written or spoken texts is not restricted to corpus linguistics (Tony McEnery and Andrew Wilson, "Corpus Linguistics").
Corpora can be either (Borodal, 2002): • Tagged: all words have been marked in some way (e.g. evaluation test) • Untagged: not processed at all (spontaneous)
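The tagged/untagged distinction can be illustrated with a minimal Python sketch. The utterance, the tag set (DET, NOUN, VERB) and the helper `tag` are all invented for illustration; they are assumptions, not part of the G.S.C.C. or of its annotation scheme.

```python
# Untagged: a raw transcribed utterance (invented example, not G.S.C.C. data).
untagged = "the dog sleeps"

# Hypothetical mini-lexicon mapping word forms to part-of-speech tags.
LEXICON = {"the": "DET", "dog": "NOUN", "sleeps": "VERB"}

def tag(utterance, lexicon):
    """Return (word, tag) pairs; words missing from the lexicon get 'UNK'."""
    return [(word, lexicon.get(word, "UNK")) for word in utterance.split()]

# Tagged: every word now carries a label.
tagged = tag(untagged, LEXICON)
print(tagged)
```

A real tagger would replace the lexicon lookup, but the resulting data shape (each word paired with a label) is what separates a tagged corpus from an untagged one.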
CHARACTERISTICS OF A CORPUS • Representative sample of a language • Quantitative and qualitative analysis of the sample • Direct and quick enrichment of the sample • Electronic availability
APPLICATION FIELDS • Teaching • Literature • Lexicography • Linguistics • Sociolinguistics • Psycholinguistics • Computational Linguistics
CONSTRUCTING THE GREEK SPEAKING CHILDREN CORPUS (G.S.C.C)
BASED ON THE CONSTRUCTION RULES • Size of the Sample • Authenticity of the Corpus • Range of the Sample
SIZE OF THE G.S.C.C • 151,380 words • Approximately 45 hours of speech • Available at http://utopia.duth.gr/~zgabriil
DESCRIPTION OF THE G.S.C.C • Interviews of 60 children aged 3-6 • 35 females • 25 males • 10 children aged 3-4 • 35 children aged 4-5 • 15 children aged 5-6
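As a quick consistency check on the sample description, the short Python sketch below confirms that the age-group and sex counts both add up to the 60 interviewed children and derives each age group's share. The counts come from the slide; the variable names are illustrative only.

```python
# Counts as reported on the slide.
age_groups = {"3-4": 10, "4-5": 35, "5-6": 15}
sexes = {"female": 35, "male": 25}

total_by_age = sum(age_groups.values())
total_by_sex = sum(sexes.values())

# Percentage of the sample in each age group, rounded to one decimal place.
age_shares = {group: round(100 * n / total_by_age, 1)
              for group, n in age_groups.items()}

print(total_by_age, total_by_sex)
print(age_shares)
```

The 4-5 age group makes up more than half of the sample, which is worth keeping in mind when comparing age groups.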
De-recording and Transcription of the interviews • 35 Greek speaking children (Standard Greek) • 15 Cypriot dialect speaking children • Phonetic Transcription of the Interviews (IPA)
CONTENTS (DATA) OF THE INTERVIEWS • Children with speech problems • Greek-speaking residents (bilingual) • Words and phrases used by children 3-6 years old (vocabulary) • Idioms from many Greek regions
CORPUS COLLECTION REGIONS • Greece: Orestiada, Aleksandroupolis, Kavala, Thessalonica, Edessa, Athens • Cyprus
Applications • Within the framework of linguistic research • Difficulties encountered by children aged 3-6 • Level of the language of communication • Number of words and level of frozen phrases • Basic vocabulary
Research based on G.S.C.C. • “Phonological and Phonetic analysis of cases of G.S.C.C.” • 23 subjects, 15 females and 8 males • SPSS was used for the statistical analysis
AIMS • Frequency of phonological errors • Frequency of distorted phonemes • Comparison between sexes • Language acquisition among age groups • Context (environment) of the distorted phonemes • Accented/unaccented syllables in the word
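The frequency counts listed in the aims can be sketched as simple tabulations. The study itself used SPSS; the Python below is only a hypothetical illustration of the kind of counting involved. The records, field names and the helper `frequency` are invented and are not G.S.C.C. data.

```python
from collections import Counter

# Invented records (NOT G.S.C.C. data): one entry per observed distorted phoneme.
errors = [
    {"sex": "F", "age_group": "3-4", "phoneme": "r", "stressed": True},
    {"sex": "M", "age_group": "3-4", "phoneme": "r", "stressed": False},
    {"sex": "F", "age_group": "4-5", "phoneme": "s", "stressed": False},
    {"sex": "M", "age_group": "5-6", "phoneme": "r", "stressed": True},
]

def frequency(records, key):
    """Count how often each value of `key` occurs across the records."""
    return Counter(record[key] for record in records)

by_phoneme = frequency(errors, "phoneme")      # frequency of distorted phonemes
by_sex = frequency(errors, "sex")              # comparison between sexes
by_age_group = frequency(errors, "age_group")  # comparison among age groups
by_stress = frequency(errors, "stressed")      # accented vs unaccented syllables

print(by_phoneme.most_common())
```

Raw counts like these would normally be normalized by the number of children or utterances in each group before any comparison between groups.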
FUTURE WORK • Enlarge the G.S.C.C • Increase the hours of recorded speech • Add more dialects • Add non-native Greek-speaking children
PERSPECTIVES Use of the G.S.C.C: • In further linguistic analysis, such as syntax, grammar, vocabulary and morphology. • As a tool in Corrective Phonetics for the Greek linguistic system. • In further applications in other fields of Linguistics (Psycholinguistics, Computational Linguistics, Sociolinguistics, etc.)