Determining a ‘Cultural Literacy Quotient’ for Latin Readability
Abstract for poster DH2024
Abstract
Over two decades ago, Latinist Kenneth Kitchell Jr. proposed a “cultural literacy quotient” (CLQ) for better gauging the relative difficulty of Latin passages for intermediate students. The quotient amounts more or less to a ratio of Latin text to supporting notes, specifically notes meant not to gloss vocabulary or grammar but to supply the “cultural information which informs the text, and without which it might make little or, sometimes, no sense” (Kitchell 2000: 213). While much attention has been paid in recent years to assessing the difficulty of Latin texts, especially for beginning and intermediate students, research in this area has focused primarily on measures of vocabulary distribution (e.g. Keeline & Kirby 2023) and “lexical complexity” (e.g. Gruber-Miller & Mulligan 2022) as determinants of Latin readability. This poster takes up Kitchell’s idea of CLQ as an additional readability measure by taking advantage of recent advances in computational text analysis for Latin; specifically, it leverages improved named entity recognition (NER) for the language through the LatinCy trained pipelines (Burns 2023). At present the LatinCy tagger supports annotations of the names of people (e.g. Catullus), the names of locations (e.g. Roma, “Rome”), and the names of groups of people (e.g. Romani, “the Romans”). By using these NER annotations as a proxy for at least certain types of information that make culturally specific demands on readers, we are now in a position to compare Latin sentences by their density of NER-tagged words.
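As a rough illustration of what comparing sentences by NER density could look like, the sketch below computes the share of tokens in a sentence that carry an entity label and ranks two sentences by that share. The abstract does not specify a formula, so the definition used here (tagged tokens divided by all tokens) is an assumption, as are the `Token` class, the `ner_density` function, and the label names. The entity labels are assigned by hand for the example; in practice they would come from an NER pipeline such as LatinCy.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    """A word with an optional entity label (hypothetical structure)."""
    text: str
    # Illustrative label names for people, locations, and groups of people;
    # None means the token is not part of a named entity.
    ent_label: Optional[str] = None


def ner_density(tokens: List[Token]) -> float:
    """Share of tokens carrying an entity label (assumed definition)."""
    if not tokens:
        return 0.0
    tagged = sum(1 for t in tokens if t.ent_label is not None)
    return tagged / len(tokens)


# Two sentences from Caesar, Bellum Gallicum 1.1, labeled by hand.
sentence_a = [
    Token("Gallia", "LOC"), Token("est"), Token("omnis"), Token("divisa"),
    Token("in"), Token("partes"), Token("tres"),
]
sentence_b = [
    Token("Horum"), Token("omnium"), Token("fortissimi"), Token("sunt"),
    Token("Belgae", "NORP"),
]

density_a = ner_density(sentence_a)  # 1 tagged token out of 7
density_b = ner_density(sentence_b)  # 1 tagged token out of 5

# Rank sentences from most to least entity-dense.
ranked = sorted(
    [("a", density_a), ("b", density_b)],
    key=lambda pair: pair[1],
    reverse=True,
)
```

Under this assumed definition, the shorter sentence ranks as more entity-dense even though both contain a single tagged word. Other normalizations (for example, counting multiword names once, or excluding punctuation) would be equally plausible choices.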
Works Cited
- Beersmans, M., de Graaf, E., Van de Cruys, T., and Fantoli, M. 2023. “Training and Evaluation of Named Entity Recognition Models for Classical Latin.” In Anderson, A., Gordin, S., Li, B., Liu, Y., and Passarotti, M.C. eds. Proceedings of the Ancient Language Processing Workshop. Varna, Bulgaria: 1–12.
- Burns, P.J. 2023. “LatinCy: Synthetic Trained Pipelines for Latin NLP.” https://arxiv.org/abs/2305.04365v1.
- Dale, E., and Chall, J. 1949. “The Concept of Readability.” Elementary English 26: 19–26.
- Erdmann, A., Brown, C., Joseph, B., Janse, M., Ajaka, P., Elsner, M., and de Marneffe, M.-C. 2016. “Challenges and Solutions for Latin Named Entity Recognition.” In Proceedings of the Workshop on Language Technology Resources and Tools for Digital Humanities: 85–93.
- Francese, C. 2021. “Latin Core Vocabulary.” Dickinson College Commentaries. https://dcc.dickinson.edu/vocab/core-vocabulary.
- Gruber-Miller, J., and Mulligan, B. 2022. “Latin Vocabulary Knowledge and the Readability of Latin Texts: A Preliminary Study.” NECJ 49(1): 80–101. doi:10.52284/NECJ.49.1.article.gruber-millerandmulligan.
- Keeline, T., and Kirby, T. 2023. “Latin Vocabulary and Reading Latin: Challenges and Opportunities.” TAPA 153(2): 531–59. doi:10.1353/apa.2023.a913472.
- Kitchell, Jr., K.F. 2000. “Latin III’s Dirty Little Secret: Why Johnny Can’t Read.” NECJ 27(4): 206–26.
- Laurs, T. 2024. “Towards a Readability Formula for Latin.” In Sprugnoli, R., and Passarotti, M. eds. Proceedings of the Third Workshop on Language Technologies for Historical and Ancient Languages (LT4HALA) @ LREC-COLING-2024: 170–75.
- Mulligan, B. 2024. “Bridge: Customizable Vocabulary Lists.” https://bridge.haverford.edu/.
- Schulz, K. 2021. “Natural Language Processing for Teaching Ancient Languages.” In Teaching Classics in the Digital Age. Kiel: Universitätsverlag Kiel: 37–48.