A DOI, or Digital Object Identifier, is a unique and enduring link to a digital object. A DOI looks and works like a web URL, but it is persistent: unlike an ordinary bookmarked webpage, the resource it points to will not go missing. For example, here is a DOI from the study “Americans and the Arts [1973 - 1992]” (conducted by The National Research Center of the Arts):
https://doi.org/10.3886/ICPSR35575.v1
Be sure to include DOIs in your data citations if they are available. A DOI will make it much simpler for your readers to find a copy of the dataset.
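DOIs appear in more than one written form: as a full link (as in the example above) or with a "doi:" prefix (as in some citation styles). The short Python sketch below shows one way to turn either form into a clickable https://doi.org/ link. The function name doi_to_url, the list of prefixes, and the pattern used to recognize a DOI are illustrative assumptions, not an official validation rule.

```python
import re

# Prefixes commonly seen in front of a bare DOI.
# This list is an assumption for illustration and is not exhaustive.
_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(raw: str) -> str:
    """Return an https://doi.org/ link for a DOI written in any form above."""
    doi = raw.strip()
    lowered = doi.lower()
    for prefix in _PREFIXES:
        if lowered.startswith(prefix):
            # Drop the prefix and any space that followed it (e.g. "doi: 10...").
            doi = doi[len(prefix):].strip()
            break
    # Assumed shape: "10.", a numeric registrant code, a slash, then a suffix.
    if not re.match(r"^10\.\d{4,9}/\S+$", doi):
        raise ValueError(f"not a recognizable DOI: {raw!r}")
    return "https://doi.org/" + doi


print(doi_to_url("doi:10.3886/ICPSR35575.v1"))
```

Remember to remove any sentence-ending period that follows a DOI in a citation before passing it in; the sketch does not do this, because a period can also be a legitimate part of a DOI.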
When writing about data in a formal paper, remember:
Data is plural and thus requires a verb in the plural form (e.g. “The data show an increase in activity...”).
Datum (referring to a single data point) is singular and requires a verb in the singular form (e.g. “This datum shows a unique...”).
As with any other resource (e.g. books or journal articles), it’s important to cite the databases and datasets that contribute to your research. Properly citing data gives creators credit for their work, helps track the impact of the dataset, and facilitates data discovery and access.
In many cases, a data repository (such as ICPSR or Dryad) will provide recommended citation(s) for its datasets. You can simply copy and paste the citation(s) into your reference list.
If a data repository does not provide data citations, you can write your own citation. Not all style guides (e.g. MLA, Chicago) provide guidance on citing data. In such cases, it’s generally acceptable to cite data in the same way you would cite a research article according to that style guide. Regardless of the situation, try to include the following elements in your citation:
If you have your data’s DOI, you can use the DOI Citation Formatter to generate a reference.
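If you would rather request a formatted reference from a script, DOI content negotiation offers a similar service: you ask the doi.org resolver for a bibliography entry instead of the landing page. The sketch below is a minimal example that assumes the DOI's registration agency supports the text/x-bibliography media type and recognizes the style name you pass. The function names build_citation_request and fetch_citation are hypothetical helpers written for this example.

```python
import urllib.request


def build_citation_request(doi: str, style: str = "apa") -> urllib.request.Request:
    """Prepare a request asking the DOI resolver for a formatted citation."""
    url = "https://doi.org/" + doi
    # The Accept header asks for a bibliography entry rather than a web page.
    # The style name (e.g. "apa") is assumed to be one the service recognizes.
    headers = {"Accept": f"text/x-bibliography; style={style}"}
    return urllib.request.Request(url, headers=headers)


def fetch_citation(doi: str, style: str = "apa", timeout: float = 10.0) -> str:
    """Send the request and return the citation text (needs network access)."""
    request = build_citation_request(doi, style)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()
```

For example, fetch_citation("10.3886/ICPSR35575.v1") would request an APA-style reference for the study mentioned earlier. Always check the returned text against your style guide, since automatically generated references can omit or misformat elements.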
Several style guides have specific instructions for data citation. Here are a few sample citations.
APA (6th Edition)
Pew Hispanic Center. (2004). Changing channels and crisscrossing cultures: A survey of Latinos on the news media [Data file and code book]. Retrieved from http://pewhispanic.org/datasets/
APSA (Revised 2006)
Purdue University. 2007. Controversial Facilities in Japan, 1955-1995 [computer file] (Study #4725). ICPSR04725-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2007. doi:10.3886/ICPSR04725.
NLM (2nd Edition)
Entrez Genome [Internet]. Bethesda (MD): National Library of Medicine (US), National Center for Biotechnology Information. [date unknown]. Haloarcula marismortui ATCC 43049 plasmid pNG200, complete sequence; [cited 2007 Feb 27]. Available from: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=genome&cmd=Retrieve&dopt=Overview&list_uids=18013
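If you keep the details of your datasets in a spreadsheet or script, a small helper can assemble citations consistently. As a rough illustration, the sketch below reproduces the pattern of the APA (6th Edition) sample above: author, year, title, a bracketed description, and the retrieval URL. The function name and its parameter names are hypothetical, and the pattern is inferred from that single sample, so consult the style guide itself for other cases (multiple authors, DOIs, version numbers).

```python
def apa6_data_citation(author: str, year: int, title: str,
                       description: str, url: str) -> str:
    """Assemble a citation following the APA (6th Edition) sample's pattern."""
    # Pattern inferred from the sample: Author. (Year). Title [Description].
    # Retrieved from URL
    return f"{author}. ({year}). {title} [{description}]. Retrieved from {url}"


print(apa6_data_citation(
    author="Pew Hispanic Center",
    year=2004,
    title=("Changing channels and crisscrossing cultures: "
           "A survey of Latinos on the news media"),
    description="Data file and code book",
    url="http://pewhispanic.org/datasets/",
))
```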