Sharing data any which way

Keen to open up your collections information to public view but held back by the state of your data? No need to dress to impress, says Kevin Gosling.

Sharing museum collections data has traditionally been a right faff. Before sending your records on to data-sharing platforms, you usually had to edit them to fit a prescribed format. Your collections management system (CMS) would then export these records according to a data-harvesting schema such as LIDO (Lightweight Information Describing Objects).

Even though your CMS might already be, or could be, configured to export according to such schemas, knocking your object records into the required shape could be time-consuming. This was especially the case if you wanted to send information to more than one place, such as your paintings to Art UK and your natural history specimens to the Global Biodiversity Information Facility.

Happily, things are moving on. Those diabolical schemas are still needed behind the scenes, but they should soon be the aggregator’s problem, not yours. In the scoping report of our recent feasibility study for DCMS on Mapping Digitised Collections in England, Knowledge Integration set out high-level design principles for the kind of national aggregator we think UK museums and their users need:

“In short, the proposed architecture describes an aggregator that could gather data in all the ways [currently available], and also be flexible enough to respond to emerging approaches too. It therefore aims to take data in whatever form, and by whatever means, contributing institutions can manage.

“The architecture also acknowledges that the incoming data will be messy, and that the aggregator will have to use various techniques and tools to mitigate this. Finally, the architecture proposes that, as well as being flexible about how data gets into the aggregator, it should also be possible for users (both human enquirers and other systems, such as third-party websites) to get the data they want through any of the means currently available.”

The demonstrator Knowledge Integration built for the feasibility study, and described in its final report, showed how these principles can be put into practice.

However, Art UK is going one better. It is currently working with seven pilot collections (including Tate, National Gallery, National Museums Wales and National Galleries Scotland), together with Knowledge Integration and Intelligent Heritage, to create a detailed requirements specification for a harvesting mechanism that Art UK will then seek funding for. The pilot is going well, and LIDO has been agreed on as the preferred data exchange format. Crucially, though, if incoming data is not supplied in that format, the harvesting mechanism will do the job of mapping and transforming it, which makes things easier for smaller participants. Collections Trust is involved in this exciting initiative and we will blog more about it as the project develops.
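For the technically curious, here is a rough idea of what "mapping and transforming" can mean in practice. It is a minimal Python sketch, not the Art UK harvester or real LIDO output: the source names, field names and sample records are all invented for illustration, and the "common" shape is just three made-up fields standing in for a proper exchange format.

```python
# Hypothetical per-source mappings: contributor's field name -> common field name.
# None of these names come from LIDO or from any real system.
FIELD_MAPS = {
    "gallery_cms": {
        "Title": "title",
        "Artist": "maker",
        "AccNo": "identifier",
    },
    "specimen_spreadsheet": {
        "Specimen name": "title",
        "Collected by": "maker",
        "Catalogue no.": "identifier",
    },
}

# The invented common record shape every contribution is mapped into.
COMMON_FIELDS = ("identifier", "title", "maker")


def to_common(record, source):
    """Map one contributed record into the common shape.

    Unknown source fields are ignored, missing ones stay as None,
    and stray whitespace in text values is tidied up.
    """
    mapping = FIELD_MAPS[source]
    common = {field: None for field in COMMON_FIELDS}
    for source_field, value in record.items():
        target = mapping.get(source_field)
        if target is not None and isinstance(value, str):
            cleaned = " ".join(value.split())
            common[target] = cleaned or None
    return common


# Two made-up records, each in its contributor's own format.
painting = {"Title": "  View of a  Harbour ", "Artist": "Unknown artist", "AccNo": "EX.1"}
specimen = {"Specimen name": "Example beetle", "Catalogue no.": "EX.2", "Shelf": "B4"}

mapped_painting = to_common(painting, "gallery_cms")
mapped_specimen = to_common(specimen, "specimen_spreadsheet")
```

The point is where the work sits: each contributor keeps its own field names, and the per-source mapping lives with the aggregator. A real harvester would need far more than this, such as handling dates, controlled vocabularies and repeated fields.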

So if the thought of plunging into LIDO is giving you cold feet about sharing your collections data, don’t worry. The next generation of aggregators will simply specify CAYA: come as you are.