Patrick J. Burns
Associate Research Scholar, Digital Projects @ Institute for the Study of the Ancient World / NYU | Formerly Culture, Cognition, and Coevolution Lab (Harvard) & Quantitative Criticism Lab (UT-Austin) | Fordham PhD, Classics | LatinCy developer
The Collocations You Really Need to Know
Abstract for paper presentation given at CAAS2024
Abstract
The systematic study of Latin vocabulary has long focused almost exclusively on individual words, whether in print settings (Lodge 1907) or, more recently, in computational settings (Dee 2002; Rydberg-Cox and Mahoney 2002; Francese 2021). Yet in developing reading skills, it is clear that more attention must be paid to multiword expressions like collocations, that is, words frequently appearing in immediate or close proximity (Nation 2022: 83–84). By leveraging the widespread availability of digitized text and the increased quality of computational text analysis tools, we are in a good position to address the question: Which collocations appear frequently enough, and are uniquely informative enough, to demand specific pedagogical attention? Accordingly, this presentation makes the following contributions: 1. it argues that frequent, informative collocations should be taught as a fundamental class of Latin vocabulary, i.e. not merely as a supplement to single-word vocabulary, but as a standalone complement; and 2. it presents a corpus-driven method for identifying and ranking the collocations of greatest interest in Latin texts. With respect to the first, the preliminary research presented here on the works of Cicero identifies a number of collocative categories, such as common multiword noun phrases (e.g. res publica, ius civile, tribunus plebis), short phrases functioning in an adverbial or other grammatical role (e.g. eo tempore, non solum, sed etiam), set prepositional phrases (e.g. ad modum, in causis), and personal names (e.g. Appius Claudius, Q. Metellus). With respect to the second, such collocations can be identified in an automated manner and at scale, but also dynamically for specific authors, works, collections of texts, and so on, using corpus methods such as pointwise mutual information. To showcase this process, I introduce a web application that generates top-scoring collocations on the fly for works commonly taught in introductory and intermediate Latin classes.
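To make the scoring idea concrete, pointwise mutual information compares how often two words occur together with how often they would co-occur if they were independent: PMI(x, y) = log2( P(x, y) / (P(x) P(y)) ). The following is a minimal sketch of that calculation for adjacent word pairs only, not the method or application described in the abstract. The function name `pmi_bigrams`, the `min_count` threshold, and the tiny `toy_tokens` list are all assumptions made up for illustration; a real pipeline would presumably tokenize and lemmatize a full corpus first and might use a wider window than immediate adjacency.

```python
import math
from collections import Counter


def pmi_bigrams(tokens, min_count=2):
    """Rank adjacent word pairs by pointwise mutual information (PMI).

    Hypothetical helper for illustration: probabilities are simple
    maximum-likelihood estimates from the token list itself.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    scores = {}
    for (w1, w2), count in bigrams.items():
        # Rare pairs get inflated PMI scores, so drop them (assumed threshold).
        if count < min_count:
            continue
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))

    # Highest-scoring pairs first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Invented toy input, not drawn from any real text.
toy_tokens = "res publica non solum res publica sed etiam non solum".split()

for pair, score in pmi_bigrams(toy_tokens):
    print(" ".join(pair), round(score, 3))
```

On this toy input only the pairs "res publica" and "non solum" meet the frequency threshold. The `min_count` filter reflects a common design choice: PMI alone favors rare pairs, so some frequency floor is usually needed when the goal is collocations that are both frequent and informative.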
Works Cited
- Dee, J.H. 2002. “The First Downloadable Word-Frequency Database for Classical and Medieval Latin.” Classical Journal 98(1): 59–67.
- Francese, C. 2021. “Latin Core Vocabulary.” Dickinson College Commentaries. https://dcc.dickinson.edu/vocab/core-vocabulary.
- Lodge, G. 1907. The Vocabulary of High School Latin. New York: Teachers College, Columbia University.
- Nation, I. 2022. Learning Vocabulary in Another Language. 3rd ed. Cambridge: Cambridge University Press.
- Rydberg-Cox, J.A., and Mahoney, A. 2002. “Vocabulary Building in the Perseus Digital Library.” Classical Outlook 79(4): 145–49.