Data Management

Resources for documenting, storing, and preserving research data

Data and Copyright

Generally, data have no copyright protection under United States law because one cannot copyright facts. Consider: a researcher can possess a set of daily temperature readings, but they cannot own those collected readings – daily temperature exists outside of human creation. Rather, copyright protects works that demonstrate some level of original human inquiry or creative expression (such as a scholarly publication interpreting daily temperature readings over an extended period of time).

Along this line of thinking, copyright protection may apply to a researcher’s particular arrangement or compilation of data, such as a database.* For example, if one creates a database of daily temperature readings, the creator has made personal choices in naming variables and relating them to one another. The raw data in the database are not protected; only the database container is. An end user has every right to say that the temperature was xyz degrees on xyz date.

*Note: not every database qualifies for copyright protection simply by virtue of being a database; the database must be sufficiently original.

Resource: https://datamanagement.hms.harvard.edu/share-publish/intellectual-property 

Licensing

Even though a dataset generally cannot be protected by copyright in the United States, it is good practice to license data through Open Data Commons or Creative Commons. Licensing encourages proper attribution and protects one’s data in countries where copyright does protect datasets. Licenses can be applied to any material (e.g., sound, text, image, multimedia, software) where some exploitation or usage rights exist.

The most widely used licenses for scientific content are Creative Commons licenses. In general, a CC BY license (requiring only attribution) is a good option for works such as articles, books, working papers, and reports, while a dedication to the public domain using CC Zero (CC0) is recommended for datasets and databases.

You can apply an Open Data Commons license to your database if the content is “factual” (e.g., daily temperature readings).

| Name | Summary | Details |
| --- | --- | --- |
| Public Domain Dedication and License (PDDL) | Public Domain for data/databases | “Places the data(base) in the public domain (waiving all rights)” |
| Attribution License (ODC-By) | “Attribution for data/databases” | Public can use data & create derivatives so long as they provide attribution to the original database |
| Open Database License (ODC-ODbL) | “Attribution and Share-Alike for Data/Databases” | Public can use data & create derivatives so long as they provide attribution to the original database; derivatives must be distributed using the same terms (attribution & share-alike) |
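The choice among the three Open Data Commons licenses comes down to two questions: must reusers credit the original database, and must derivatives carry the same terms? The sketch below encodes that decision in Python. The function name `choose_odc_license` and its parameters are hypothetical, invented for illustration; the sketch only restates the summaries above and is no substitute for reading the license texts.

```python
def choose_odc_license(require_attribution: bool, require_share_alike: bool) -> str:
    """Return the Open Data Commons license matching the stated conditions.

    Hypothetical helper: it mirrors the summary table only and ignores
    any legal nuance in the full license texts.
    """
    if require_share_alike:
        # ODC-ODbL covers both attribution and share-alike, so asking for
        # share-alike implies attribution as well.
        return "Open Database License (ODC-ODbL)"
    if require_attribution:
        return "Attribution License (ODC-By)"
    # No conditions at all: dedicate the database to the public domain.
    return "Public Domain Dedication and License (PDDL)"


if __name__ == "__main__":
    print(choose_odc_license(require_attribution=True, require_share_alike=False))
```

Because share-alike under ODC-ODbL always comes with attribution, the function checks that condition first and treats the attribution flag as irrelevant in that case.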

 

Choose a License for Software

Software shared publicly isn't truly open source unless it includes an appropriate license. By default, software (like other creative works) is protected by copyright, giving the creator exclusive rights. Without a license, others cannot legally use, copy, distribute, or modify the code.

To make software open source, you must include an open-source license.

You can choose a software license using, for example, choosealicense.com or the Open Source Initiative’s list of approved licenses.

After selecting a license, add the license text—customized with the author name(s) and year—to your software repository as a plain text LICENSE file.
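As a minimal sketch of that last step, the Python below fills the year and author name(s) into a license template and writes the result to a plain text LICENSE file. It assumes the template marks those spots with `[year]` and `[fullname]`; placeholder names vary between license templates, so adjust them to match yours. The function names `fill_license` and `write_license` are hypothetical.

```python
from datetime import date
from pathlib import Path
from typing import Optional, Sequence


def fill_license(template: str, authors: Sequence[str], year: Optional[int] = None) -> str:
    """Return the license text with year and author placeholders filled in."""
    if not authors:
        raise ValueError("at least one author name is required")
    if year is None:
        year = date.today().year
    # "[year]" and "[fullname]" are assumed placeholder names; change them
    # to whatever your chosen license template actually uses.
    return (
        template.replace("[year]", str(year))
        .replace("[fullname]", ", ".join(authors))
    )


def write_license(repo_dir: str, template: str, authors: Sequence[str],
                  year: Optional[int] = None) -> Path:
    """Write the filled-in license to a plain text LICENSE file in repo_dir."""
    path = Path(repo_dir) / "LICENSE"
    path.write_text(fill_license(template, authors, year), encoding="utf-8")
    return path
```

In practice, `template` would hold the full text of the license you selected, copied from its official source, and `repo_dir` would be the root of your software repository.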