Why should you assign a licence to your research data?
Loosely speaking, a licence defines what a user is allowed to do with a dataset. It can take into account ownership rights (copyright) as well as subject rights if the data describes human beings.
There are large differences between how copyrights and subject rights are to be addressed!
- Complying with copyright is primarily the responsibility of the user of the data. Copyright laws allow only the creator of a work to reproduce and use it. If anyone else wants to use the work, then that person needs explicit permission from the holder of the copyright. A copyright licence describes the nature of this agreement, and does not need a signature: the user can never deny the existence of any conditions, because without the licence they would not be able to use the work at all.
- Complying with subject rights is primarily the responsibility of the controller (frequently called the owner) of the data. In Europe, the GDPR is an important law specifying the rights of subjects, and it is the controller of the data who needs to ensure that any usage of the data has a legal basis: not only their own use of the data, but also its use by others. If others use data about human subjects, this requires contracts between the controller and those other parties. These contracts, unlike copyright licences, require a signature. Important contract forms are Data Transfer Agreements and Data Processing Agreements.
Licensing is an important aspect of meeting the R (reusable) principle in FAIR data management. As part of the publication process, you need to decide under which licence the data is released.
- If you produce a dataset and make it available to others, you should be explicit about what others are allowed to do with it. A document describing that is called a licence.
- If you are reusing a dataset that comes from somewhere, you will want to have a licence that explains what you can do with it. Without a licence, reusing a dataset could be setting you up for legal trouble.
- Note that jurisdictions differ in how their copyright laws treat data. In some jurisdictions, some data is explicitly not subject to copyright. An example is data describing the earth under the laws of the United States of America. Copyright law applies only to a “creative work”, and arguably, just collecting data does not involve sufficiently creative steps to claim copyright.
  - Relying on this as a reuser of data, however, is dangerous. Look for a licence and comply with it.
  - As a data producer, you should be aware that you may not be able to uphold licence restrictions in court, because it may be decided that the dataset is not copyrightable in the first place. It is therefore best to apply a permissive licence that does not assert strong copyright.
- Be sure of data ownership before publishing data.
  - Are there rights belonging to a third party?
- Make your research data available under an appropriate licence, which defines the degree of publicity and rights to use your data.
- Choose a licence that ensures your data is correctly attributed and makes the terms of reusing your data explicit to the user.
What licence should you apply to your research data?
This depends on what rights protect your research data. Which licence to choose might be governed by university policy or funders’ mandates. Research data can have varying degrees of publicity. There are circumstances in which data may be subject to restrictions, e.g. if datasets contain sensitive information.
- If possible, choose and apply the least restrictive licence to ensure the widest possible reuse.
- Remember that if you publish your data in a data repository of your choice, a licence agreement will be applied to your data.
- Repositories can be selected based on data licence and sharing policy by using re3data.org.
- Remember that the rights granted in a licence cannot be revoked once it has been applied.
- Apply to your data one of the recommended licences conformant to the Open Definition, so that your data can be shared and reused. The Open Definition sets out principles that define the meaning of “open” in relation to data and content.
- Creative Commons licences are the best-known open data licences and are available in human-readable and machine-readable forms, with different levels of permissions.
- The tools listed under “Relevant tools and resources” can help you find the right licence for your software and data.
- If your research data is a database or a dataset, consider putting it in the public domain by using the Creative Commons CC0 tool. CC0 lets you waive all your rights to the work (“No Rights Reserved”).
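As a rough illustration of what a machine-readable licence statement can look like, the sketch below attaches a licence entry to a dataset's metadata record. The metadata layout, the `add_licence` helper, and the example title and creator are hypothetical and not taken from any particular repository; the identifiers `CC0-1.0` and `CC-BY-4.0` are assumed here to follow the SPDX licence list, and the URLs point to the corresponding Creative Commons deeds.

```python
import json

# Illustrative subset of licences, keyed by SPDX-style identifier (an assumption
# of this sketch; a real repository may use its own vocabulary).
LICENCES = {
    "CC0-1.0": {
        "name": "Creative Commons Zero v1.0 Universal",
        "url": "https://creativecommons.org/publicdomain/zero/1.0/",
    },
    "CC-BY-4.0": {
        "name": "Creative Commons Attribution 4.0 International",
        "url": "https://creativecommons.org/licenses/by/4.0/",
    },
}


def add_licence(metadata: dict, licence_id: str) -> dict:
    """Return a copy of `metadata` with a machine-readable licence entry added."""
    if licence_id not in LICENCES:
        raise ValueError(f"Unknown licence identifier: {licence_id}")
    licence = LICENCES[licence_id]
    updated = dict(metadata)  # shallow copy, so the caller's dict is left untouched
    updated["licence"] = {
        "id": licence_id,
        "name": licence["name"],
        "url": licence["url"],
    }
    return updated


# Hypothetical dataset record released under CC0.
record = add_licence({"title": "Example dataset", "creator": "Jane Doe"}, "CC0-1.0")
print(json.dumps(record, indent=2))
```

Stating the licence as an identifier plus a URL, rather than only as free text, lets both people and software find out what reuse is permitted.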
Relevant tools and resources
|Tool or resource||Description||Tags||Registry|
|Choose a license||Choose an open source license||licensing, researcher, data manager, policy officer|
|Creative Commons License Chooser||It helps you choose the right Creative Commons license for your needs.||licensing, researcher, data manager, policy officer|
|data.world Data License list||Overview of typical licenses used for data resources||licensing, biomol sim|
|EUDAT licence selector wizard||EUDAT's wizard for finding the right licence for your data or code.||licensing, researcher, data manager, policy officer|
|How to License Research Data - DCC||Guidelines about how to license research data from Digital Curation Centre||licensing, researcher, data manager, policy officer|
|Open Definition Conformant Licenses||Licenses that are conformant with the principles laid out in the Open Definition.||licensing, researcher, data manager, policy officer|