Earlier this year I blogged about why I think Perseus, and the digital humanities in general, needs infrastructure. In that post I discussed one strategy we’ve been following at Perseus – that of participating in the efforts of the Research Data Alliance (RDA).
I was recently fortunate to be able to attend the Triangle Scholarly Communications Institute, an Andrew W. Mellon Foundation-funded workshop where the theme was “Validating and valuing digital scholarship.” While we did not spend much time talking specifically about “infrastructure”, it was implicit in all the solutions we discussed, from specific data models for representing scholarly assertions as a graph, to taxonomies for crediting work, to approaches for assessing quality. Cameron Neylon, a researcher at Curtin University and one of the institute participants, blogged some thoughts upon returning home, including the following:
“Each time I look at the question of infrastructures I feel the need to go a layer deeper, that the real solution lies underneath the problems of the layer I’m looking at. At some level this is true, but it’s also an illusion. The answers lie in finding the right level of abstraction and model building (which might be in biology, chemistry, physics or literature depending on the problem). Principles and governance systems are one form of abstraction that might help but it’s not the whole answer. It seems like if we could re-frame the way we think about these problems, and find new abstractions, new places to stand and see the issues we might be able to break through at least some of those that seem intractable today.” (Cameron Neylon, “Abundance Thinking”)
I think this neatly sums up what I hope to gain from participation in the multidisciplinary community of RDA. It’s easy, especially when time and resources are constrained, to get locked into thinking that our problems are unique and that we need to design custom solutions, but when we examine the problem from other perspectives, the abstractions begin to rise to the surface.
One of the challenges I took on when I agreed to serve as a liaison between RDA and the Alliance of Digital Humanities Organizations (ADHO) was to engage humanities researchers in designing solutions for data sharing infrastructure. Collaboration is not easy, even when you’re all working on the same team. We have been struggling with this even in our own small group of developers at Perseus and our sister community in Leipzig, the Open Philology Project. In a recent developers’ meeting we talked about how hard it is to find the time to consider another developer’s solution before going off and designing your own.
But the rewards might be great. Imagine if adding support to your project for acquiring Persistent Identifiers for your data objects, and for communicating the types of data those identifiers represent, gave you for free the ability to interoperate, with the same code, with data objects from thousands of other projects. Or if implementing support for building and managing collections of objects meant that your data could interoperate seamlessly with the collections of other projects. These are not abstract use cases. The New York Times just reported on the millions of dollars museums worldwide are spending to digitize their collections. What if we had a standard approach to managing digital objects in collections that allowed us to easily write software that could build new virtual museums from these collections? And are the requirements for such a solution very different from the needs of the Perseids project to manage collections of diverse types of annotations?
A disillusioned colleague said to me not long ago that nothing we build is going to save the world. While this might be true, I think that we all work too hard to keep reinventing the wheel. It’s going to take us a little longer to build a solution for managing our collections of annotations if we do it in the context of an RDA working group, but to me the benefit of having the expertise and perspective of colleagues from other communities and disciplines, such as researchers from DKRZ (climate science), NoMaD (materials science) and the PECE project (ethnography), is worth the effort. Even if we don’t save the collections of the world, at the very least we’re certain to end up with a solution that is a little better than one we would have built on our own. I urge everyone to be part of the conversation and the solutions. Feel free to start by commenting on the RDA Research Data Collections Working Group case statement or subscribing to contribute to the effort!