Patrick J. Burns

Associate Research Scholar, Digital Projects @ Institute for the Study of the Ancient World / NYU | Formerly Culture, Cognition, and Coevolution Lab (Harvard) & Quantitative Criticism Lab (UT-Austin) | Fordham PhD, Classics | LatinCy developer

The Collocations You Really Need to Know

Abstract for paper presentation given at CAAS2024

Abstract

The systematic study of Latin vocabulary has long focused almost exclusively on individual words, whether in print settings (Lodge 1907) or, more recently, in computational settings (Dee 2002; Rydberg-Cox and Mahoney 2002; Francese 2021). Yet in developing reading skills, it is clear that there needs to be more focus on multiword expressions like collocations, that is, words frequently appearing in immediate or close proximity (Nation 2022: 83–84). By leveraging the widespread availability of digitized text and the increased quality of computational text analysis tools, we are in a good position to address the question: Which collocations appear with sufficient frequency and are uniquely informative so as to demand specific pedagogical attention? Accordingly, this presentation makes the following contributions: (1) it argues that frequent, informative collocations should be taught as a fundamental class of Latin vocabulary, i.e. not merely as a supplement to single-word vocabulary, but as a standalone complement; and (2) it presents a corpus-driven method for identifying and ranking the collocations of greatest interest in Latin texts. With respect to the first, in the preliminary research presented here on the works of Cicero, a number of collocative categories can be identified, such as common multiword noun phrases (e.g. res publica, ius civile, tribunus plebis), short phrases functioning in an adverbial or other grammatical role (e.g. eo tempore, non solum, sed etiam), set prepositional phrases (e.g. ad modum, in causis), and personal names (e.g. Appius Claudius, Q. Metellus). With respect to the second, such collocations can be identified in an automated manner and at scale, but also dynamically for specific authors, works, collections of texts, and so on, using corpus methods such as pointwise mutual information. To showcase this process, I introduce a web application that generates top-scoring collocations on the fly for works commonly taught in introductory and intermediate Latin classes.
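As a rough illustration of the kind of corpus method named above, the following minimal Python sketch ranks adjacent word pairs by pointwise mutual information, PMI(x, y) = log2(p(x, y) / (p(x) p(y))). It is not the implementation behind the web application. The function name `rank_bigrams_by_pmi`, the `min_count` threshold, and the toy token sequence are all assumptions made for this example; the tokens are an invented string built from phrases cited in the abstract, not text drawn from Cicero.

```python
import math
from collections import Counter


def rank_bigrams_by_pmi(tokens, min_count=2):
    """Score adjacent word pairs by pointwise mutual information (PMI).

    Hypothetical helper for illustration only. Pairs seen fewer than
    `min_count` times are skipped, since PMI tends to overrate rare pairs.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_unigrams = len(tokens)
    n_bigrams = max(len(tokens) - 1, 1)

    scored = []
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue
        p_xy = count / n_bigrams
        p_x = unigrams[w1] / n_unigrams
        p_y = unigrams[w2] / n_unigrams
        scored.append(((w1, w2), math.log2(p_xy / (p_x * p_y))))

    # Highest PMI first; ties are broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored


# Invented toy token sequence (not a real sentence or corpus sample):
# "est" is deliberately frequent so that pairs containing it score lower.
tokens = (
    "res publica est non solum est sed etiam est "
    "res publica est non solum est sed etiam est"
).split()

for pair, score in rank_bigrams_by_pmi(tokens):
    print(" ".join(pair), round(score, 2))
```

In this toy input, pairs whose members occur only together score above pairs that include the frequent word "est", which is the behavior a collocation ranking relies on. A realistic pipeline would presumably also lemmatize the text and allow a window wider than adjacent tokens, since the abstract defines collocations as words in immediate or close proximity.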

Works Cited
