Patrick J. Burns

Research Associate at Harvard Human Evolutionary Biology | Formerly Quantitative Criticism Lab, ISAW Library | Fordham PhD, Classics | CLTK contributor

The Future of Ancient Literacy: Classical Language Toolkit and Google Summer of Code

Article in submission for Classics@


The Classical Language Toolkit (CLTK) is a collection of software and texts for researchers bringing natural language processing to the languages of ancient, classical, and medieval Eurasia and North Africa. This essay chronicles the CLTK’s participation in the 2016 Google Summer of Code (GSoC), a program run by Google to encourage the growth of open source software. Google pays a stipend to student programmers, who in turn contribute code to an approved project between May and August. GSoC accepted the CLTK and allotted it funding slots for two students. The CLTK, having received over 100 student applications, chose Patrick Burns (then a doctoral student in Classics at Fordham University) and Suhaib Khan (an undergraduate at the Netaji Subhas Institute of Technology). Patrick proposed to write a multiple-pass, rules-based lemmatizer for the Latin language. Suhaib proposed to rework the codebase of the Classical Language Archive, a front-end JavaScript application that serves as a reading environment for non-programmers. Kyle Johnson supervised the former project and Luke Hollis the latter. We first offer a brief introduction to the CLTK and then turn to the two summer projects, describing what each of the three is and the motivation behind its creation. We conclude with a statement of how we envision the entire CLTK ecosystem working together to offer readers of Latin and Greek a presentation of texts and supporting materials not available in print editions.
