RD@E have been at a couple of noteworthy events in the last two weeks, which I'm going to report on in this post.
DataCite Workshop: Describe, disseminate, discover: metadata for effective citation
On the 6th July I attended the next instalment of the DataCite workshop series, this time focusing on metadata for data citation. We heard from an array of speakers from different institutional roles, which gave some nice background to how we should/could be describing data. It was useful from the perspective of our project to see some examples of metadata profiles used by other data providers and resource discovery portals.
I was particularly enthused by David Shotton and team's work on turning the DataCite metadata schema (or parts of it) into an RDF ontology. This allows data citations to be described in a machine-readable form, i.e. it's linked data friendly. This kind of work is going to be a huge boon in years to come as LOD becomes an increasing priority. You can catch up on this and all the other presentations here.
This ties in with another recent event in Berlin (hosted by the German Data Forum), where Joachim Wackerow from GESIS presented on machine-actionable persistent identifiers. He made the point that the machine-actionable chain from PID through to rich metadata is not yet fully formed. As data providers, we need to think about how, rather than only resolving an identifier to a flat HTML page, we can also allow content negotiation on the DOI, so that a client can request structured, machine-readable metadata beyond the scope of the elements stored by DataCite. To do this, should we provide a secondary identifier in the DataCite metadata that references the rich metadata we hold? Something to consider.
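To make the idea of content negotiation on a DOI a bit more concrete, here is a minimal Python sketch of what a client request might look like. It is an illustration only, not something presented at either event: the DOI `10.1234/example` is a made-up placeholder, the helper names are hypothetical, and it assumes the resolver at `https://doi.org/` honours an `Accept` header asking for DataCite XML. Whether a given DOI returns richer metadata than the DataCite elements depends entirely on what the data provider has set up.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumptions for illustration: the public DOI resolver and the
# DataCite XML media type. Neither is named in the talks reported above.
DOI_RESOLVER = "https://doi.org/"
DATACITE_XML = "application/vnd.datacite.datacite+xml"


def build_metadata_request(doi, media_type=DATACITE_XML):
    """Build (but do not send) a request asking for metadata, not a landing page."""
    # The same URL a browser would use; only the Accept header differs.
    url = DOI_RESOLVER + quote(doi, safe="/")
    return Request(url, headers={"Accept": media_type})


def fetch_metadata(doi, media_type=DATACITE_XML, timeout=10):
    """Send the request and return the response body as text."""
    request = build_metadata_request(doi, media_type)
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Hypothetical placeholder DOI; swap in a real one before calling fetch_metadata().
    request = build_metadata_request("10.1234/example")
    print(request.full_url)
    print(request.get_header("Accept"))
```

The point of the sketch is that the identifier stays the same for humans and machines: a browser gets the landing page, while a client that sends a different `Accept` header could be handed structured metadata instead.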
Open Repositories 2012: EPrints User Group
Last Friday (13th!), RD@E presented to the OR2012 EPrints panel, a gathering of the EPrints community and a great place to exchange knowledge and develop ideas.
Our paper detailed progress on piloting the EPrints data repository. 'Data' was highlighted as a key theme of the conference in the closing keynote; see this rather neat Wordle, constructed from the text content of all the #OR2012 tagged tweets (thanks to Adam Field for this). It's great to be part of this major emerging theme.
The presentation was really well received by the audience, and generated a lot of questions(!). It seems that what we're doing is still very new to the repository scene, and quite ambitious. The key issue we raised is that research data collections can be much more complex than articles - EPrints will require customisation to meet these requirements. We've been making big steps in this direction of course, and in the coming months we'll be exploring how well the customisations work in the field too - with real researchers and real data! You can download the presentation from the UK Data Archive website here or view the embedded version below.
We had some really useful discussions outside of the sessions too - it was great to be able to talk in depth with the EPrints team about some of the challenges we've come up against. We're hoping this will result in a greater level of exchange between the different parties working on data customisations and plugins.