What is data reuse?
Data reuse means using data for purposes other than those for which it was originally collected. Reuse of data is particularly important in science, as it allows different researchers to analyse the same data and publish findings independently of one another. Reusability is one key component of the FAIR principles.
Data that is well described, curated and shared under clear terms and conditions is more likely to be reused. Integration with other data sources is also important, since it can enable new, as yet unanticipated, uses for the data.
Why is data reuse important?
By reusing existing data you can:
- obtain reference data for your research
- avoid doing new, unnecessary experiments
- run analyses to verify that reported findings are correct, thereby making subsequent findings more robust
- make research more robust by aggregating results obtained from different methods or samples
- gain novel insights by connecting and meta-analysing datasets
What should be considered for data reuse?
Consider the following when reusing data:
- Explore different sources for reusable data. A good starting point is to look for value-added databases with curated content. Other possibilities include searching data deposition repositories for suitable datasets based on their annotation, or obtaining data directly from the author of a scientific article.
- Check under which terms and conditions the data is shared. Make sure that there is a licence, and that the licence gives you permission to do what you intend to do.
- Check whether there is sufficient metadata to enable data reuse. Some types of data can be straightforward to reuse (e.g. genome data), while others may require extensive metadata to interpret and reuse (e.g. gene expression experiment data).
- Assess the quality of the data. Is the data from a trusted source? Is it curated? Does the data adhere to a standard?
- Verify that the data has been ethically collected and that your reuse of the data conforms with the policies and regulations you are expected to follow. For personal (sensitive) data, there are usually legal and technical requirements that have to be met before the data can be accessed. Getting access to personal (sensitive) data will therefore involve additional steps.
- If the data has been updated, make sure to document which version of the data you are using. Also consider what impact the changes may have on your results.
- Cite the data properly. Include a persistent identifier (such as a DOI) in the citation if there is one.
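Several of the checks above (licence, version, persistent identifier, citation) can be recorded in a small script kept alongside your analysis. The following is only a minimal sketch of one way to do that, not a prescribed workflow: the `DatasetRecord` class, its field names, the helper functions, the citation layout and all example values (including the DOI `10.1234/example`) are hypothetical placeholders chosen for illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetRecord:
    """Hypothetical record of a reused dataset; field names are illustrative."""
    creators: str
    year: int
    title: str
    repository: str
    version: Optional[str] = None
    licence: Optional[str] = None
    doi: Optional[str] = None


def reuse_problems(record: DatasetRecord) -> List[str]:
    """Return the open issues to resolve before reusing the data."""
    problems = []
    if not record.licence:
        problems.append("no licence stated: permission to reuse is unclear")
    if not record.version:
        problems.append("no version recorded: document which copy you used")
    if not record.doi:
        problems.append("no persistent identifier available for the citation")
    return problems


def sha256_of(data: bytes) -> str:
    """Checksum of the downloaded bytes, to pin exactly which copy was analysed."""
    return hashlib.sha256(data).hexdigest()


def format_citation(record: DatasetRecord) -> str:
    """Build a simple citation string; adapt it to the style you must follow."""
    text = f"{record.creators} ({record.year}). {record.title}"
    if record.version:
        text += f" (version {record.version})"
    text += f" [Data set]. {record.repository}."
    if record.doi:
        text += f" https://doi.org/{record.doi}"
    return text


# Example with placeholder values only (not a real dataset or DOI).
example = DatasetRecord(
    creators="Doe, J.",
    year=2020,
    title="Example expression data",
    repository="Example Repository",
    version="1.2",
    licence="CC-BY-4.0",
    doi="10.1234/example",
)
for problem in reuse_problems(example):
    print("Check:", problem)
print(format_citation(example))
```

A script like this does not replace reading the licence text or the repository's own citation guidance; it only makes it harder to forget which version, licence and identifier applied to the data you actually analysed.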