Patrick J. Burns

Research Associate at Harvard Human Evolutionary Biology | Formerly Quantitative Criticism Lab, ISAW Library | Fordham PhD, Classics | CLTK contributor

Cicero’s Hardest Sentence?: Measuring Readability in Latin Literature

Invited talk for Drew University Classics Department
December 1, 2016

Abstract

In the third book of Cicero’s De oratore, as the interlocutors discuss the features of effective eloquence, Crassus explains a number of different ways in which a speaker can illustrate an argument, and he does so at great length in a sentence (De orat. 202-205) which, with modern punctuation, runs 279 words and 1914 characters. Because of the length of this sentence as well as the ratio of characters to words, a version of the Automated Readability Index (ARI), a measure that has been used to determine the “grade level” of English texts, ranks this as Cicero’s hardest sentence. But are these features applicable to Latin? Can we compare the difficulty of sentences, paragraphs, or even entire works based on formal features? In this talk, I will produce readability scores such as the ARI, the Flesch-Kincaid grade level, the Gunning fog index, and the Dale-Chall formula for the works of Cicero to explore whether these standard measures of English readability can help us better understand the relative ease or difficulty of specific Latin texts. I will demonstrate a natural language processing workflow using Python and the Classical Language Toolkit that we can use to generate the statistics necessary for calculating such scores, including word length, sentence length, and relative word frequency. By way of conclusion, I will discuss what these scores can tell us about such pedagogical issues as which works of Latin literature should be assigned to intermediate students, in which order, and in what quantity.
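To make the ARI calculation concrete, here is a minimal sketch in plain Python. It assumes the standard English ARI coefficients (4.71, 0.5, 21.43), which may differ from the version used in the talk. The helper names `ari` and `text_counts` are hypothetical, and the regex-based splitting is a rough stand-in for the tokenizers a fuller workflow would use. It also assumes the 1914-character figure counts the characters of the words themselves.

```python
import re


def ari(characters: int, words: int, sentences: int) -> float:
    """Standard English Automated Readability Index (assumed coefficients)."""
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43


def text_counts(text: str) -> tuple[int, int, int]:
    """Return (characters, words, sentences) using naive regex splitting.

    Sentences are split on runs of '.', '?' or '!'; words are runs of
    letters; characters are the letters inside those words.
    """
    sentences = [s for s in re.split(r"[.?!]+", text) if s.strip()]
    words = re.findall(r"[^\W\d_]+", text)
    characters = sum(len(w) for w in words)
    return characters, len(words), len(sentences)


# Counts reported in the abstract for De orat. 202-205, treated as one sentence.
cicero_score = ari(characters=1914, words=279, sentences=1)

# A short Latin sentence (Caesar, BG 1.1) scored from raw text.
sample_counts = text_counts("Gallia est omnis divisa in partes tres.")
sample_score = ari(*sample_counts)
```

Under these assumptions the Cicero sentence scores roughly 150, far outside the range of school grade levels that the index was designed to report. Almost all of that comes from the words-per-sentence term, which is one reason to ask whether the formula transfers to Latin at all.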

Code

Jupyter notebook with code and data for this talk available here.