CORPUS OF LANGUAGE PREDICTABILITY IN BRAZILIAN PORTUGUESE
João Vieira and Elisângela Teixeira, Universidade Federal do Ceará
Predictability is an influential factor in language processing. A number of psycholinguistic studies in recent years have found strong evidence for this (see Kuperberg & Jaeger, 2016, for a comprehensive overview). Recent corpora have been built to study this factor in different languages, such as English (Luke & Christianson, 2018) and German (Kennedy et al., 2013). However, no comparable corpus yet exists for Brazilian Portuguese. Our objective in this research is to assemble a large corpus of linguistic predictability in Brazilian Portuguese, specifically of predictability during reading. We will build: i) a corpus with predictability values for orthography and grammatical categories and ii) a corpus with the costs of language processing in reading. The first corpus will be assembled through a cloze test, and the second through eye tracking. By combining these data, we will also build a more complete corpus that relates the previously acquired predictability parameters to the processing costs. The data will be collected from undergraduate students at the Federal University of Ceará. As an example of how the data can be applied, we will evaluate the reading abilities of students at the beginning and at the end of their major in Letters, aiming to see what level of development they reach over the course of their studies.
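As an illustration of how a cloze test yields a predictability value, the sketch below computes the cloze probability of a word as the proportion of participants who produced it for a given context. The function name, the normalization (trimming and lowercasing), and the sample responses are assumptions made for this example only; they are not data or procedures from this study.

```python
from collections import Counter


def cloze_probability(responses, target):
    """Return the share of responses that match the target word.

    Matching ignores case and surrounding whitespace (an assumption
    of this sketch, not a procedure stated in the abstract).
    """
    if not responses:
        raise ValueError("responses must not be empty")
    normalized = [r.strip().lower() for r in responses]
    counts = Counter(normalized)
    return counts[target.strip().lower()] / len(normalized)


# Hypothetical responses from five participants for one sentence context.
responses = ["casa", "casa", "rua", "Casa", "escola"]
print(cloze_probability(responses, "casa"))  # 3 of 5 responses match
```

The same proportion could in principle be computed over grammatical-category labels instead of word forms, which would give a category-level predictability value.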