What is data reuse?
Data reuse means using data for purposes other than those for which it was originally collected. Reuse is particularly important in science, as it allows different researchers to analyse the same data and publish findings independently of one another. Reusability is one of the key components of the FAIR principles.
Data that is well described, curated and shared under clear terms and conditions is more likely to be reused. Integration with other data sources is also important, since it can enable new, as yet unanticipated, uses for the data.
Why is data reuse important?
By reusing existing data you can:
- obtain reference data for your research;
- avoid doing new, unnecessary experiments;
- run analyses to verify that reported findings are correct, thereby making subsequent findings more robust;
- make research more robust by aggregating results obtained from different methods or samples;
- gain novel insights by connecting and meta-analysing datasets.
What should be considered for data reuse?
Before reusing existing data, check that the necessary conditions for reuse are met.
- Explore different sources of reusable data. A good starting point is to look for value-added databases with curated content. Other possibilities include searching data deposition repositories for suitable datasets based on their annotation, or obtaining data directly from the author of a scientific article.
- Check under which terms and conditions the data is shared. Make sure that there is a licence, and that the licence gives you permission to do what you intend to.
- Check whether there is sufficient metadata to enable data reuse. Some types of data can be straightforward to reuse (e.g. genome data), while others may require extensive metadata to interpret and reuse (e.g. gene expression experiment data).
- Assess the quality of the data.
  - Evaluate whether the data comes from a trusted source and whether it is curated.
  - Check whether the data adheres to a standard.
- Verify that the data has been ethically collected and that your reuse of the data conforms with policies and regulations you are expected to follow. For personal (sensitive) data, there are usually legal and technical requirements that have to be met before data can be accessed. Getting access to personal (sensitive) data will therefore involve additional steps.
- If the data you are reusing has been updated, make sure to document which version of the data you are using. Also consider what impact the changes may have on your results.
- Cite the data properly by including a persistent identifier (such as a DOI) in the citation, if there is one.
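Several of the checks above (licence, version, persistent identifier) are easier to follow through on if they are written down at the moment the data is downloaded. The following is a minimal Python sketch of one way to do that, not a prescribed method. The `ReuseRecord` class, its field names, the `record_reuse` helper and the citation layout are all hypothetical choices made for illustration; adapt them to the citation style and metadata standard your community expects.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import date
from pathlib import Path
from typing import Optional


@dataclass
class ReuseRecord:
    """Hypothetical record describing one reused dataset (illustrative fields)."""

    title: str
    creators: str
    year: int
    repository: str
    version: str          # which version of the data was used
    licence: str          # licence the data is shared under
    doi: Optional[str]    # persistent identifier, if the dataset has one
    sha256: str           # checksum of the file as downloaded
    accessed: str         # ISO date on which the file was obtained

    def citation(self) -> str:
        """Build a simple citation string; the DOI is appended when present."""
        text = (
            f"{self.creators} ({self.year}). {self.title} "
            f"(Version {self.version}) [Data set]. {self.repository}."
        )
        if self.doi:
            text += f" https://doi.org/{self.doi}"
        return text

    def to_json(self) -> str:
        """Serialise the record so it can be stored next to the data file."""
        return json.dumps(asdict(self), indent=2)


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_reuse(path: Path, *, title: str, creators: str, year: int,
                 repository: str, version: str, licence: str,
                 doi: Optional[str] = None) -> ReuseRecord:
    """Create a ReuseRecord for a downloaded file, stamping checksum and date."""
    return ReuseRecord(
        title=title,
        creators=creators,
        year=year,
        repository=repository,
        version=version,
        licence=licence,
        doi=doi,
        sha256=sha256_of(path),
        accessed=date.today().isoformat(),
    )
```

Calling `record_reuse(Path("data.csv"), ...)` with the dataset's details and saving the output of `to_json()` alongside the file keeps the version, licence and identifier together with the data. The checksum makes it possible to detect later whether the file you analysed differs from an updated release.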