Data Management

Resources for documenting, storing, and preserving research data

What is a Data Repository?

At the close of your project, when your data is in a final state and you are ready to share or publish your work, you are also ready to archive it. It is good practice to deposit your work in a repository that ensures the data’s long-term preservation and accessibility. So, what is a repository?

A data repository is a centralized place to store digital data, usually supported and maintained by an organization or institution, that preserves your data while also making it openly accessible to the public or to a subset of users, such as other researchers. Repositories are a good solution for those who want both long-term preservation of their data and a way to share it.

Where Can I Preserve and Share My Data?

There are many data repositories specific to a research domain. Use the Registry of Research Data Repositories to search or browse repositories by subject. Each listing includes information about sponsoring institutions, terms of data deposit and access, and scope. You might also consult the Open Access Directory's List of Disciplinary Repositories.

Notable, domain-specific repositories to be aware of include:

  • ICPSR: The Inter-university Consortium for Political and Social Research “maintains a data archive of more than 250,000 files of research in the social and behavioral sciences.” In addition to data archiving, ICPSR provides leadership/education in and conducts research on social science data curation and analysis.
  • DataONE: Data Observation Network for Earth. This repository hosts environmental research data, and also provides a variety of educational materials in data literacy and data management.
  • GenBank: The NIH genetic sequence database, an annotated collection of all publicly available DNA sequences.
  • NOAA National Centers for Environmental Information: NCEI provides environmental data, products, and services covering the depths of the ocean to the surface of the sun to drive resilience, prosperity, and equity for current and future generations.

General or discipline-agnostic repositories: examples include Figshare, Dryad, Open Science Framework, and Zenodo.

SJSU ScholarWorks is an institutional repository of the research, scholarship, and creative works of San José State University faculty, students, and staff. The repository can accept any file type for deposit, including research data. Deposited files are indexed in Google and Google Scholar, which improves worldwide discovery of one's work. ScholarWorks is a good option for hosting artefacts from all research phases—for example, SJSU ScholarWorks can host a copy of your research article, along with the data supporting that article. For more information, contact ScholarWorks Coordinators at scholarworks@sjsu.edu.

Dataverse is open-source software created and distributed by the Institute for Quantitative Social Science (IQSS) in collaboration with the Harvard University Library and Harvard University Information Technology. You can download the software and use it to host one or more “dataverses” (essentially, data repositories). Dataverse is a good option if you’d like extra control over the distribution of your research data, as the software allows you not only to provide your research data and documentation, but also to manage its delivery medium. Visit The Dataverse Project website for more information.

Alternatively, if you like Dataverse’s style and features but would rather not manage your own repository, you can create repositories and submit datasets through Harvard’s Dataverse.

Other ways to share your data include:

  • Online, via a personal or department-hosted website or data portal. This option will not archive and preserve your data, so you should have a separate preservation plan if the data needs to remain available longer than the lifespan of the website.

  • As supplementary materials or via a “data paper” in an appropriate journal. Check with journals about their data policies. Depending on the journal, this solution may not make the data openly available to the public.

Choose a Data Repository

Your choice of repository will depend on factors like:

  • Funder requirements
  • Anticipated publication venues (e.g., journals and conferences) and their data sharing policies
  • Institutional data policies
  • Repository platform features (Is the repository searchable? Does the repository provide metadata? Etc.)
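
One way to work through these factors is to write them down as a checklist and see which candidate repositories meet every item. The short Python sketch below illustrates the idea. It is only an illustration: the repository names, the requirement labels, and the helper functions `shortlist` and `missing` are all hypothetical, and the real answers should come from your funder, your journal, and each repository's own documentation.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """A candidate repository and the requirements it is known to satisfy."""
    name: str
    satisfies: set = field(default_factory=set)


def shortlist(candidates, requirements):
    """Return the candidates that satisfy every requirement, in input order."""
    required = set(requirements)
    return [repo for repo in candidates if required <= repo.satisfies]


def missing(repo, requirements):
    """Return the requirements a candidate does not satisfy, sorted by name."""
    return sorted(set(requirements) - repo.satisfies)


# Hypothetical candidates: replace with what you learn about real repositories.
candidates = [
    Repository("Repository A", {"funder_approved", "searchable", "provides_metadata"}),
    Repository("Repository B", {"searchable"}),
]

# Hypothetical requirements drawn from the factors listed above.
requirements = ["funder_approved", "provides_metadata"]

for repo in shortlist(candidates, requirements):
    print("Meets all requirements:", repo.name)

for repo in candidates:
    gaps = missing(repo, requirements)
    if gaps:
        print(repo.name, "is missing:", ", ".join(gaps))
```

A checklist like this does not replace judgment, but listing the gaps for each candidate can make a conversation with a librarian or funder more concrete.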

Having trouble deciding on a repository? Library Data Services (see Librarian email to the left) can help you choose a repository that best fits your needs.