Patrick J. Burns

Research Associate at Harvard Human Evolutionary Biology | Formerly Quantitative Criticism Lab, ISAW Library | Fordham PhD, Classics | CLTK contributor

Distant Reading Alliteration in Latin Literature

2014 Annual Meeting of the American Philological Association
Poster Session
January 4, 2014

Abstract

In this poster, I propose to analyze alliteration systematically across a large body of Latin literature using an algorithmic approach. For my dataset, I will use the Latin texts found in the Perseus Digital Library. The texts will be analyzed using the programming language Python, with each line (and group of adjacent lines) scored and ranked for “alliterative density.” I recently presented preliminary work on this subject at a digital methodologies conference; that paper, however, dealt primarily with methodological and theoretical concerns. Moreover, it treated a much smaller dataset: it was restricted to Latin hexameter poetry from Lucretius to Juvenal, a sample of roughly 500,000 words. The Latin word count in the Perseus Greco-Roman collection as of 4/25/2013 is 10,525,102 words, and, as far as I am aware, no study of Latin alliteration has been conducted on this scale. Accordingly, the study will be what Jockers (2013, 27) has described as a “macroanalysis,” a study aspiring to yield “contextualization on an unprecedented scale.” Earlier studies, such as Wright (1974), Greenberg (1980), and Mayrhofer (1989), inter alia, proposed algorithmic methods for studying alliteration in Latin literature, but in the intervening time, massive improvements have taken place in text-processing solutions and speeds, in the sophistication of visualizations, and in the ability to make data publicly available. It therefore seems worthwhile to continue research on the topic. This poster will present my findings from new studies of alliterative density and other patterns of alliteration across various subsets of Latin literature (e.g. between prose and poetry, among genres, among authors), and will do so using refactored Python code and a refined scoring methodology incorporating feedback from the earlier presentation. This presentation will consist of three parts: 1. a brief overview of the algorithms used for scoring alliteration, 2. visualizations, with representative diachronic and synchronic examples, and 3. interactive testing of the program. The goal of this study is to use quantitative methods to understand better general alliterative patterns and trends in Latin literature while leveraging the “benefits of speed, automation, and scale that computational representations afford” (Ramsay 2011, 8), that is, to perform a “distant reading” of Latin alliteration, and, accordingly, to present meaningful, comparable, and repeatable results on the device using widely available reference texts.
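To make the idea of scoring and ranking lines for “alliterative density” concrete, here is a minimal Python sketch. It is not the scoring algorithm used in this study, which the abstract does not specify. The function names (initials, alliterative_density, rank_passages), the definition of density as the share of words beginning with the most common initial letter, and the folding of u/v and i/j are all assumptions made for illustration. The sketch also assumes plain ASCII text without macrons.

```python
import re
from collections import Counter

# Assumption: treat u/v and i/j as the same letter, as many Latin
# editions do. This normalization is illustrative, not from the poster.
_FOLD = str.maketrans({"v": "u", "j": "i"})


def initials(text):
    """Return the normalized first letter of each word in `text`."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w[0].translate(_FOLD) for w in words]


def alliterative_density(text):
    """Hypothetical score: the share of words that begin with the most
    common initial letter, or 0.0 if no initial letter repeats."""
    letters = initials(text)
    if not letters:
        return 0.0
    top = Counter(letters).most_common(1)[0][1]
    return top / len(letters) if top > 1 else 0.0


def rank_passages(lines, window=1):
    """Score every run of `window` adjacent lines and return
    (score, start_index, passage) tuples, highest density first."""
    scored = []
    for i in range(len(lines) - window + 1):
        passage = " ".join(lines[i:i + window])
        scored.append((alliterative_density(passage), i, passage))
    return sorted(scored, key=lambda t: (-t[0], t[1]))


if __name__ == "__main__":
    sample = [
        "arma virumque cano",
        "O Tite tute Tati tibi tanta tyranne tulisti",
    ]
    for score, index, passage in rank_passages(sample):
        print(f"{score:.3f}  line {index}: {passage}")
```

Calling rank_passages(lines, window=2) scores each pair of adjacent lines instead of single lines. A score of this kind is deliberately crude: it ignores word position, adjacency, and the difference between letters and sounds, all of which a refined methodology would presumably need to address.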

Select Bibliography
