
Data migration success
Posted on February 27, 2014 by Erin Faulder | Categories: news

Here at the DCA we manage a lot of material. We have images, folders of documents, groundbreaking shovels and other three-dimensional objects, digital files, books, and A/V material. In order to manage the material effectively, we need a system that helps us identify, describe, and locate it all.

In 2010 we undertook a project to replace our old collection management system with a new tool that would help us do our job better. This tool is an open-source, web-based application called CIDER (the source code can be found on GitHub). However, creating the application was only the first step in replacing our old system. We also had to migrate all of the data that existed in the old system to CIDER.

The migration process was no small feat. We had over 600 collections to migrate. Some collections had only a handful of records. Others had thousands of records. We had to standardize, clean up, check the accuracy of, and transform all of the existing data before we could import it into CIDER. Once it was in CIDER, we had a complex QA process to ensure that every piece of information was migrated accurately and completely. Each record was touched at least four times before it was considered complete. It took some excellent coordination, motivation, and commitment from those who helped the process go forward and from the staff who had to work in an environment where our data was in two different places at once!

We are happy to announce that the migration and QA process is now complete and we wanted to share some statistics from the process:

  • There were nearly 250,000 records migrated as part of this process.
  • The first collection was frozen in the old system, to prevent changes to the data, on December 20, 2011.
  • The QA process was finished on the final collection, 26 months later, on February 24, 2014.
  • A quarter million records in 616 collections averages to about 406 records per collection.
  • On average, we were able to migrate approximately 2200 records per week.
  • Three people worked part-time on the first step of migrating data from one system to another.
  • Six people worked part-time on the final QA steps.
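For anyone who wants to check our math, here is a minimal sketch of how the averages above can be reproduced. It assumes a round 250,000 records (the real total was slightly lower) and uses the freeze and QA completion dates as the start and end of the process; the function and variable names are ours, made up for this illustration, and have nothing to do with CIDER itself.

```python
from datetime import date

# Figures from the list above. RECORDS is an assumed round number,
# since the post only says "nearly 250,000".
RECORDS = 250_000
COLLECTIONS = 616
FIRST_FREEZE = date(2011, 12, 20)  # first collection frozen in the old system
QA_FINISHED = date(2014, 2, 24)    # QA finished on the final collection


def migration_averages(records, collections, start, end):
    """Return (days elapsed, records per collection, records per week)."""
    days = (end - start).days
    weeks = days / 7
    return days, records / collections, records / weeks


days, per_collection, per_week = migration_averages(
    RECORDS, COLLECTIONS, FIRST_FREEZE, QA_FINISHED
)
print(f"{days} days elapsed")
print(f"about {per_collection:.0f} records per collection")
print(f"about {per_week:.0f} records per week")
```

The elapsed time works out to 797 days, or a little under 114 weeks, which is where the figure of roughly 2,200 records per week comes from.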


Making Sense of Our Stuff
Posted on June 9, 2011 by Aaron Rubinstein | Categories: features, news

Here at DCA, like most archives, we have tons of stuff. In fact, we have 7,000 linear feet of stuff on our shelves (that’s well over a mile of documents, books, and photographs!) and 100,000 digital objects in our digital library. And it grows daily! Keeping track of all this stuff is one of the central challenges of being an archivist, and we rely on a variety of systems to help us do this.

E. Whimpers. Coster-girl, 1851. From Edwin Bolles's London Labour London Poor, Volume 1

At DCA, we rely on our collection management system, as it’s called, to do more than just help us find things. Our collection management system reflects how we think about our collections; it reflects the intellectual organization of the materials we collect. This intellectual map also translates into how we provide access to the rich content we house and how we put that material in context, filling it with historical meaning. As we begin to grapple with more complicated collections — collections that reflect both physical and digital content as well as the increasingly “online” nature of Tufts — it’s become clear that our old collection management system is no longer able to keep up with our evolving conceptual model.

It’s with this problem in mind that we’ve decided to design our own collection management system, which we’re calling CIDER. CIDER takes a new approach to modeling archival collections and will provide us with tons of flexibility in how we map, document, and present information about our collections. Oh, and it will also make it easier to find stuff, too. Though CIDER is a tool for us at DCA, it will directly benefit those who use our material: it allows us to improve almost every aspect of our workflows, helping us get more stuff available to researchers faster, and it offers nearly unlimited potential for bringing new material to light in exciting ways.

When the first stable version of CIDER is complete, we intend to make the source code available using the AGPL license so anyone can download and install it for free, as well as modify the software and adapt it to their own needs and way of thinking about their collections.

We’ll save more details about CIDER and how it’s different from other collection management systems for future posts, so stay tuned!