Announcing the Perseus Lexical Inventory – an open linked data set.

Many linguistic services and tools depend on lexical information of the kind commonly found in Latin and Greek dictionaries. Most of these applications rely on their own implementations of dictionaries, stem databases, etc., but there is no centralized open-access resource on which these services can draw for supporting data. The Perseus Digital Library is releasing its lexical data as an open linked data set, starting with Latin and to be followed by Greek, in the hope that it may eventually become such a resource. Work on producing this data set has been a collaborative effort and would not have been possible without the guidance of Neel Smith of Holy Cross and Helma Dik of the University of Chicago.

The core of the Perseus Lexical Inventory is a CITE collection of Lexical Entity URIs. Each Lexical Entity identifier has associated properties, including a normalized form of the lexical entity (or lemma) and a short definition. The accompanying linked data set includes links between the Lexical Entity URIs, Morpheus lemmas, and entries in the Lewis and Short lexicons on Perseus, Alpheios and Logeion. A VoID file describing the data set is available at http://data.perseus.org/ds/lexical/void and a SPARQL endpoint for querying the data set is at http://services.perseus.tufts.edu/fuseki/sparql.html. There is also a simple demonstration query form that looks up entries based upon the Latin form at http://perseids.org/tools/lexical/query.html. The Tufts Morphology Service (currently available at http://services.perseids.org/bsp/morphologyservice) also supplies the corresponding Lexical Entity URIs for lemmas returned by Morpheus.
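For readers who would rather query the data set from a script than through the form, the following is a minimal sketch in Python. It rests on several assumptions: the announcement gives only the HTML query page, so the `ENDPOINT` value below is a placeholder that must be replaced with the actual service URL; the endpoint is assumed to accept the standard SPARQL protocol `query` parameter over GET; and because the property names used in the data set are not listed here, the query avoids any particular vocabulary and simply looks for triples whose literal value equals the given lemma. The function names are illustrative only.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: placeholder only. The announcement links to an HTML form, not to
# the query service itself, so substitute the real service URL before use.
ENDPOINT = "http://example.org/fuseki/lexical/query"


def escape_literal(text):
    """Escape characters that would break a double-quoted SPARQL string."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def build_query(lemma, limit=10):
    """Build a vocabulary-agnostic query for triples whose literal equals lemma."""
    return (
        "SELECT ?s ?p ?o WHERE { "
        "?s ?p ?o . "
        f'FILTER(isLiteral(?o) && str(?o) = "{escape_literal(lemma)}") '
        f"}} LIMIT {int(limit)}"
    )


def build_request_url(lemma, endpoint=ENDPOINT, limit=10):
    """Encode the query as a GET request URL (SPARQL protocol 'query' parameter)."""
    return endpoint + "?" + urlencode({"query": build_query(lemma, limit)})


def run_query(lemma, endpoint=ENDPOINT, limit=10):
    """Send the query and return the result bindings (requires network access)."""
    request = Request(
        build_request_url(lemma, endpoint, limit),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urlopen(request) as response:
        data = json.load(response)
    return data["results"]["bindings"]
```

Calling `build_query("amo")` returns the query text without touching the network, which makes it easy to paste into the query page first. An unrestricted `?s ?p ?o` pattern with a filter can be slow on a large store; once the property that carries the lemma is known from the VoID description, it would be better to name that property in the pattern.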

Subsequent updates to the data set will include links to ontologies and other collections of uniquely identifiable entities, including part of speech, lexical tokens or forms, stems, prefixes and suffixes, morphological analyses, metrical data, orthographical variants, and named entities. The lexical entities and tokens will also be linked to their occurrences in dictionaries and other lexica, texts (e.g. those of the Perseus corpus), treebanks, etc. Finally, we expect to link to other established and emerging data sets, including the Pleiades Gazetteer and the SNAP dataset of ancient prosopography.

Our ultimate goal is for the lexical data sets to be completely open, with various channels, including both user interfaces and service-based APIs, through which people and systems can contribute new data and corrections.

In keeping with the approach we have been taking with the release of our data (see the Perseus Catalog’s Roadmap towards Linked Data standards compliance), we are releasing the data knowing that we still have much work to do, and we will make progress towards the larger vision in incremental steps. Our next steps will include the release of a companion Greek Lexical Inventory, followed by the addition of the stem and lexical token data sets and the development of APIs and interfaces for using and contributing to the data.

 

Posted in Uncategorized | Comments Off

Pelagios used in Tufts Classes

A Pelagios, Pleiades, and Perseids workshop took place at a week-long hackathon


On Monday, March 3, students in Marie-Claire Beaulieu’s Medieval Latin class and Maxim Romanov’s Geography of the Classical Islamic World class held a workshop together with the Pelagios team. Leif Isaksen (University of Southampton), Elton Barker (Open University), and Rainer Simon (Austrian Institute of Technology) directed the students in using the Pelagios interface to annotate place names in Latin, English, and Arabic documents. We were also fortunate to have Tom Elliott (New York University Institute for the Study of the Ancient World), the co-managing editor of the Pleiades Gazetteer used by Pelagios, participating in the workshop.



Announcing the Leipzig Open Fragmentary Texts Series (LOFTS)

The Humboldt Chair of Digital Humanities at the University of Leipzig is pleased to announce a new effort within the Open Philology Project: the Leipzig Open Fragmentary Texts Series (LOFTS). In the first phase of LOFTS we invite public discussion as we finalize the goals, technological methods and editorial practices.

The Leipzig Open Fragmentary Texts Series is a new effort to establish open editions of ancient works that survive only through quotations and text re-uses in later texts (i.e., those pieces of information that humanists call “fragments”).

As a first step in this process, the Humboldt Chair announces the Digital Fragmenta Historicorum Graecorum (DFHG) Project, whose goal is to produce a digital edition of the five volumes of Karl Müller’s Fragmenta Historicorum Graecorum (FHG) (1841-1870), the first large collection of fragments of Greek historians ever produced.

For further information, please visit: http://www.dh.uni-leipzig.de/wo/open-philology-project/the-leipzig-open-fragmentary-texts-series-lofts/


Publishing Text for a Digital Age

Update: Submissions now being accepted!

March 27-30, 2014

http://sites.tufts.edu/digitalagetext/2014-workshop/

As a follow-on to “Working with Text in a Digital Age,” an NEH-funded Institute for Advanced Technologies in the Digital Humanities, and in collaboration with the Open Philology Project at the University of Leipzig, Tufts University announces a 2-day workshop on publishing textual data that is available under an open license, that is structured for machine analysis as well as human inspection, and that is in a format that can be preserved over time. The purpose of this workshop is to establish specific guidelines for digital publications that publish and/or annotate textual sources from the human record. Registration for the workshop will be free, but space will be limited. Some support for travel and expenses will be available. We particularly encourage contributions from students and early-career researchers.

 


New courses on Digital Philology at the University of Leipzig

October 2013 – January 2014: Overview of Digital Philology (5 credits)

April – July 2014: Current Topics in Digital Philology (10 credits)

[Please re-circulate]

*Research assistantships are available to students enrolled in these classes*

The Humboldt Chair of Digital Humanities at the University of Leipzig is developing a sequence of English-language courses on digital philology that will begin in the Wintersemester and Sommersemester of the 2013/2014 academic year. The courses may be taken in sequence or individually. We particularly encourage participation by graduate students, not only from Leipzig but from elsewhere in Europe and beyond, who are preparing to begin careers as researchers, teachers or library professionals. A semester or an academic year at Leipzig can help you transform your career and acquire the skills with which you can flourish in an intensively networked, profoundly global intellectual world.

These courses are particularly unusual in that they are offered within a Computer Science department and give students an opportunity to connect more directly with experts in advanced technologies than is often feasible. Germany is also unusual in that Computer Science and the Humanities are both instances of Wissenschaft: we do not face the boundaries between funding for research in the Humanities and in Computer Science that many in the English-speaking world face. If you wish to acquire the full range of skills needed for both teaching and research, these courses in this environment provide you with an excellent space in which to develop.

Note: particularly promising students enrolled in these classes will have an opportunity to work as research assistants, where they can apply the skills that they acquire in their classes. We particularly encourage ambitious students from outside Leipzig to consider this option to help support their stay.

An Overview of Digital Philology (5 credits, Wintersemester) provides students with the programming skills needed to work with text in a digital age. We focus particularly upon the integration of methods from computational and especially corpus linguistics, both of which are fundamental to the study of language and critical to all who wish to develop flourishing careers as teachers and researchers in philology. The course is organized so that students can also take the Leipzig eHumanities Seminar (5 credits). In 2013, the course will focus particularly upon familiarizing students with XML and with the use of associated technologies (e.g., XSLT, XQuery).

While students who have taken the Overview of Digital Philology will be able to build on their knowledge in developing course projects, the Sommersemester course, Current Topics in Digital Philology (10 credits, Sommersemester), is open to anyone with advanced experience in either computer science or philology. Current Topics in Digital Philology provides a framework within which students of language from various backgrounds can develop projects informed by new advances in corpus and computational linguistics and in the digital humanities. In 2014, students will develop skills in the use of Python to work with richly annotated linguistic corpora and then use these skills in course projects.

Contact: teaching@e-humanities.net

