Love Data Week – Thursday – Open Data

Open data can mean very different things to governments, citizens, researchers, and businesses. The people who created the Open Data Handbook describe it as “…data that can be freely used, re-used and redistributed by anyone – subject only, at most, to the requirement to attribute and share-alike.”

In 2016, the FORCE11 community released the FAIR Data Principles, which have driven the open data discussion across many research communities. In the EU, this movement has developed into an implementation effort through the GO FAIR Initiative. In the US, the Enabling FAIR Data Project is working to “develop standards that will connect researchers, publishers, and data repositories in the Earth, space, and environmental sciences to enable FAIR (findable, accessible, interoperable, and reusable) data on a large scale.”

A wide range of funders and publishers worldwide now support Open Data, so it is increasingly common to be required to make supporting data available online. There are numerous data repositories available online, as listed on re3data.org. These repositories will often support the FAIR principles, but there are some things you should consider:

Findable: Make sure your dataset can be given a persistent URL that you can share. A DOI is ideal for this, as it is unique to your data and comes with appropriate metadata to make it distinguishable. It also has the additional benefit of making your data trackable using tools like Altmetric.

Accessible: Make sure that your data can be found, both by people and by computers. When choosing a repository, check that others will be able to find the data easily without a fixed link, for example by simply using a search engine.

Interoperable: Check that the file types you are using are sustainable (i.e. not tied to one specific programme unless absolutely necessary) and that the repository is going to keep and protect your data long-term. It is also essential to write accompanying documentation for your data.

Reusable: The data should be well enough described that it can be linked or integrated with other data sources, and its metadata should be rich enough to enable proper citation.
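
If you like to script your checks before depositing, two of the points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a repository tool: the function names are made up for this post, the DOI in the example is a placeholder rather than a real dataset, and the list of programme-specific file extensions is a small, incomplete sample that you would want to adapt to your own field. The one fixed fact it relies on is that a DOI can be turned into a shareable link by putting it after https://doi.org/.

```python
from pathlib import Path

# A DOI becomes a persistent, shareable link when appended to the doi.org resolver.
DOI_RESOLVER = "https://doi.org/"

# Assumption: a small, incomplete sample of extensions tied to one programme
# (old Excel, SPSS, Stata). Adapt this set to the formats used in your field.
PROGRAMME_SPECIFIC = {".xls", ".sav", ".dta"}


def doi_to_url(doi: str) -> str:
    """Turn a DOI such as 'doi:10.1234/example' into a persistent URL."""
    cleaned = doi.strip()
    if cleaned.lower().startswith("doi:"):
        cleaned = cleaned[4:].strip()
    # Every DOI begins with the directory indicator "10."
    if not cleaned.startswith("10."):
        raise ValueError(f"Not a DOI: {doi!r}")
    return DOI_RESOLVER + cleaned


def flag_programme_specific(filenames: list[str]) -> list[str]:
    """Return the files whose extension is in the programme-specific sample."""
    return [
        name for name in filenames
        if Path(name).suffix.lower() in PROGRAMME_SPECIFIC
    ]


if __name__ == "__main__":
    # Placeholder DOI, for illustration only.
    print(doi_to_url("doi:10.1234/example-dataset"))
    print(flag_programme_specific(["readme.txt", "survey.SAV", "results.csv"]))
```

Any file the second function flags is a prompt to ask whether an open format such as CSV or plain text would serve just as well, not a rule that the file must be converted.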