Catalog of Ancient Greek and Latin Primary Sources

Overview

  • Alison Babeu. Building a “FRBR-Inspired” Catalog : The Perseus Digital Library Experience. (PDF)
  • Alison Babeu. A Continuing Plan for the “FRBR-Inspired” Catalog 2.1? (Fall 2012). (PDF)

Current Development Efforts

Preliminary Metrics

Ultimately we want to be able to answer questions like the following, and to produce visualizations driven by the answers:

  1. How many works exist in our universe?
  2. Do we have any idea how many words each work should contain?
  3. For what percentage of these works do we have 1 TEI XML transcription? 2 transcriptions? 3 transcriptions?
  4. For what percentage of these works do we have 1 image book edition? 2 image book editions?
  5. How big are the works in relation to each other?

We have started by gathering some preliminary metrics.

We have at least 3,395 distinct Greek and Latin works identified and cataloged. We have an additional 1,660 distinct works identified which have not yet been linked to catalog data.

Of the 3,395 cataloged works:

972 have TEI XML transcriptions for one edition in Perseus
22 have TEI XML transcriptions for two editions in Perseus
3 have TEI XML transcriptions for three editions in Perseus
541 have TEI XML transcriptions for one translation in Perseus
133 have TEI XML transcriptions for two translations in Perseus
3 have TEI XML transcriptions for three translations in Perseus
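
As a rough illustration of how the edition figures above could be turned into the percentages asked for in question 3, here is a minimal Python sketch. It assumes the three buckets are mutually exclusive (exactly one, exactly two, exactly three transcribed editions), which the figures suggest but do not state; the variable and function names are hypothetical.

```python
# Counts copied from the figures above. Treating the buckets as mutually
# exclusive ("exactly one", "exactly two", ...) is an assumption.
CATALOGED_WORKS = 3395
edition_transcriptions = {1: 972, 2: 22, 3: 3}


def share(count: int, total: int = CATALOGED_WORKS) -> float:
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


# Percentage of cataloged works in each bucket.
edition_shares = {n: share(c) for n, c in edition_transcriptions.items()}

# Works with at least one transcribed edition (valid only if buckets are disjoint).
with_any_edition = sum(edition_transcriptions.values())

for n, pct in edition_shares.items():
    print(f"{n} transcribed edition(s): {pct}% of cataloged works")
print(f"at least one: {with_any_edition} works ({share(with_any_edition)}%)")
```

The same helper could be applied to the translation counts once those buckets are confirmed.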

For 234 of the works we don’t have any links to image books in Google, Hathi or Internet Archive
For 1,006 of the works we have a single link to an image book in one of Google, Hathi or Internet Archive (across all editions and translations)
For 2,102 of the works we have two links to image books from Google, Hathi and/or Internet Archive (across all editions and translations)
For 53 of the works we have three links to image books from Google, Hathi and Internet Archive (across all editions and translations)
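
The four image-book figures above can be checked against the 3,395 cataloged works and converted to percentages. The following is a small sketch of that arithmetic; the variable names are hypothetical, and it assumes each work falls into exactly one of the four buckets.

```python
CATALOGED_WORKS = 3395

# Number of image-book links -> number of works, as listed above.
image_link_counts = {0: 234, 1: 1006, 2: 2102, 3: 53}

# Sanity check: the buckets should account for every cataloged work.
accounted_for = sum(image_link_counts.values())
consistent = accounted_for == CATALOGED_WORKS

# Percentage of cataloged works in each bucket, rounded to two decimals.
image_link_shares = {
    links: round(100 * works / CATALOGED_WORKS, 2)
    for links, works in image_link_counts.items()
}

print(f"buckets sum to {accounted_for} (consistent: {consistent})")
for links, pct in image_link_shares.items():
    print(f"{links} image link(s): {pct}%")
```

With these figures the four buckets do sum to 3,395, so the counts appear to cover the full cataloged set.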

We have approximate word counts for 2,569 of the works, for a total of 32,653,581 words.
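
For a sense of scale, the word-count figures can be reduced to a mean per work and a coverage percentage. This sketch assumes the 2,569 works are a subset of the 3,395 cataloged works, which the text implies but does not state; the names are hypothetical.

```python
CATALOGED_WORKS = 3395      # assumption: word counts refer to cataloged works
works_with_counts = 2569
total_words = 32_653_581

# Mean approximate length of a work that has a word count.
mean_words = total_words / works_with_counts

# Share of cataloged works with a word count, and how many still lack one.
coverage = round(100 * works_with_counts / CATALOGED_WORKS, 2)
missing = CATALOGED_WORKS - works_with_counts

print(f"mean words per counted work: {mean_words:,.0f}")
print(f"coverage: {coverage}% ({missing} works without a count)")
```

A mean is only a crude summary here, since work lengths are likely to vary widely; a median or a histogram would be more informative once per-work counts are available.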

Additional desired and pending metrics:

  • percentage of works for which we do not have any editions cataloged
  • percentage of works for which we do not have any translations cataloged
  • percentage of works for which we do not have any summaries, commentaries, etc. cataloged
  • minimum, maximum, and average number of editions cataloged per work
  • minimum, maximum, and average number of translations cataloged per work
  • minimum, maximum, and average number of summaries, commentaries, etc. cataloged per work
  • percentage of cataloged editions for which we have image links but no XML transcriptions, and vice-versa
  • percentage of cataloged translations for which we have image links but no XML transcriptions, and vice-versa
  • estimations of word counts where we don’t have them
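
Several of the pending metrics above reduce to the same summary over a per-work count. The sketch below shows one way this might be computed in Python; the record structure, the work identifiers, and the counts are all invented for illustration and do not reflect the actual catalog data or schema.

```python
from statistics import mean

# Hypothetical records: work identifier -> number of cataloged editions.
# The identifiers and counts are invented for illustration only.
editions_per_work = {
    "work-a": 3,
    "work-b": 0,
    "work-c": 1,
    "work-d": 2,
}


def summarize(counts: dict) -> dict:
    """Return min, max, average, and the percentage of works with a zero count."""
    values = list(counts.values())
    without_any = sum(1 for v in values if v == 0)
    return {
        "min": min(values),
        "max": max(values),
        "avg": mean(values),
        "pct_without_any": 100 * without_any / len(values),
    }


summary = summarize(editions_per_work)
print(summary)
```

The same function could be reused for translations or for summaries and commentaries by passing a different mapping.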

 
