Language Documentation and Computational Methods
I focus heavily on language documentation, specifically video-based documentation of natural discourse, including a large amount of conversation. This documentation has been crucial to my work on narrative and linguistic ethnography. In addition to a grammar of Arapaho, I am currently working on an Arapaho lexical database/dictionary project, which is largely usage-based. The lexical database contains over 30,000 entries and is cross-linked to a text database of around 75,000 lines of time-aligned, transcribed, translated and interlinearized natural discourse in Arapaho (with data currently in process that will bring the total to 90,000-100,000 lines). The lexical database remains under development, but can be examined online, including viewing examples of each lexical item as it occurs in the text database.
A large amount of my data has been publicly deposited.
Current work with other faculty and students is focused on creating morphological parsers and automated dependency labeling for the data, and then leveraging these results to do the same for Cheyenne, Ojibwe and Meskwaki, all using computational approaches. The database also provides many opportunities for computational corpus-based studies, both of linguistic topics such as syntax and pragmatics and of socio-cultural features of language use. Students are also using the database to study multi-modal interaction and conversational practices, such as preferred question forms, from a cross-linguistic perspective. A team of undergraduates is also working with a Sierra Miwok database.