The Humboldt Chair of Digital Humanities and Department of Computer Science at the University of Leipzig is looking for candidates for two possible collaborating research groups. One will focus on reinventing scholarly communication for Greek and Latin, as a case study for historical languages in general; the other will help the University Library develop methods to manage and visualize billions of words and associated annotations of many kinds. Details of the funding are being finalized, but positions will ideally start in May 2013 with an initial one-year contract that could be extended to a second year, which could include a one-semester residence at a US university.
Candidates must have received their most recent degree after January 4, 2011. Current degree candidates may also be considered. We are building a team that includes varied backgrounds, with team members having expertise in Greek and Latin, in software analysis and development, and in working with metadata models, both those that are relatively well established (TEI XML, Functional Requirements for Bibliographic Records, CIDOC CRM) and those that are just beginning to be exploited (e.g., the full potential of the Europeana Data Model). Project members should be prepared to participate in all forms of intellectual life: research within both the humanities and the information sciences, software development, supervision of student researchers, presentations before specialist and general audiences, writing, and teaching activities.
Interested candidates should send a letter of interest, briefly describing how they could contribute to one of these teams, a CV, and the names of three references to email@example.com.
The work will have several complementary tracks:
- Open Greek and Latin: Classicists need comprehensive, open collections of Greek and Latin that anyone can download, modify, and then republish. The long-term goal of the Open Greek and Latin Project is to represent the full surviving corpus of Greek and Latin sources, including transcriptions from every print source. This will include not only print books but also manuscripts, inscriptions, ostraca, papyri, vases, etc., and will cover the full range of Greek and Latin sources, from the Homeric Epics through post-classical Greek and Latin to the present. In the short run, we focus on providing comprehensive coverage for the c. 100 million words of Greek and Latin that survive through c. 600 CE and opportunistic coverage for the billions of words of surviving post-classical Greek and Latin. The Open Greek and Latin Project integrates the growing body of Greek and Latin available under a Creative Commons license while drawing upon vast collections of scanned editions and new Canadian-Italian research on generating and correcting Greek and Latin. Coverage will include TEI XML transcriptions of editions that are in the public domain and machine actionable RDF equivalents for traditional indices of places where a new edition differs from its most significant predecessor. The whole collection (textual transcriptions, structural metadata, and linguistic and other machine actionable data) will be available in an RDF format developed to interoperate as closely as possible with the Europeana Data Model.
- Decentralized editing and annotation: The Open Greek and Latin corpora represent a foundation and starting point for further work. We need methods to support annotations of every kind, including not only corrections of OCR errors but also new translations (which are a kind of annotation); data driven studies of textual transmission, textual reuse, and the general circulation of ideas across time, space, language, and culture; prosopography; and morphological, syntactic, semantic, and lexical analyses. We need to support a growing range of machine actionable annotations, each of which represents a nano-publication that may be accompanied by expository prose argumentation and/or additional machine actionable annotations. As students of Greek and Latin begin to confront the opportunities and challenges presented by global access to collections measured in billions, rather than millions, of words, we need to be able to manage contributions from student researchers and citizen scholars as well as from faculty and library professionals.
- Transnational systems for Greek, Latin, and other historical languages: Greek and Latin are fundamentally transnational languages — no nation has a unique claim to the intellectual and linguistic heritage of these languages, which together provide a major cultural foundation for what is now the European community. More than 20 organizations representing communities speaking Croatian, Czech, Danish, Dutch, German, English, French, Italian, Lithuanian, Macedonian, Polish, Portuguese, Romanian, Spanish, and Swedish have representatives in the General Assembly of Euroclassica, a European federation of associations of teachers of classical languages and civilisation, while a European Curriculum Framework for Classical Languages (ECFRCL) is already under active development. Language learning based upon interacting with, and then contributing to, richly annotated corpora can provide rapid and plentiful feedback, allowing learners to engage immediately with primary sources while also enabling them to begin making substantive contributions to the field early and often. At the same time, many scholarly arguments include statements that can be represented in a machine actionable format that can be made available to many different language communities. In some cases (as in publications about prosopography or textual criticism) the conclusions of the argument can be represented as machine actionable annotations. Greek and Latin studies provide a particularly interesting space within which to develop methods by which speakers of many languages can collaborate in learning and research because these classical languages are not (unlike English or French) associated with modern hegemonic nations.