To make the Glossary of Greek Birds accessible to a broad audience, we are producing a digital edition of the text, which will be available both as an online interface and as a downloadable XML dataset.

First, we are producing a full re-transcription of the text. We obtained high-quality OCR results from the Lace project at Mount Allison University, and we have continued to refine that output with the help of the Greek and Latin Demixer Tool produced by Zachary Fletcher of the Perseids Project.

We then annotate the Glossary for our audience, for instance by adding translations of the Greek and Latin words and quotations, as well as hyperlinks to the original texts and to other resources such as ornithological databases.

As we go, we also mark the text up with XML tags so that it can be broken down into its constituent categories and analyzed. For instance, mentions of birds are marked as

<scientific-name>Larus marinus</scientific-name>,
the <english-name>Black-backed Gull</english-name>

so that all the birds mentioned in the Glossary can be extracted by scientific name or by English name. Similarly, the modern authorities and ancient texts cited by Thompson are marked up to facilitate analysis, and the reasoning statements he makes in reaching identification decisions are also marked up, as shown below.

An example of the XML markup we are applying to the text of the Glossary to facilitate data extraction by category.
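
To illustrate how this markup supports extraction by category, here is a minimal sketch in Python that pulls every scientific name and English name out of a tagged fragment. It is not the project's actual tooling: the `<entry>` wrapper element and the `extract_names` helper are assumptions made for the example, and only the `scientific-name` and `english-name` tags come from the markup described above.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment. The <entry> wrapper is an assumption made so the
# sample is a well-formed XML document; the two name tags come from the text.
SAMPLE = """<entry>
  <scientific-name>Larus marinus</scientific-name>,
  the <english-name>Black-backed Gull</english-name>
</entry>"""


def extract_names(xml_text, tag):
    """Return the text content of every element named `tag`, in document order."""
    root = ET.fromstring(xml_text)
    names = []
    for element in root.iter(tag):
        # itertext() also collects text inside any nested child elements.
        text = "".join(element.itertext()).strip()
        if text:
            names.append(text)
    return names


scientific = extract_names(SAMPLE, "scientific-name")
english = extract_names(SAMPLE, "english-name")
print(scientific)
print(english)
```

The same helper would work for any other category tag in the edition, provided the file parses as well-formed XML, which is why a mismatched closing tag has to be corrected before extraction.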