Metadata is documentation about data. Good metadata describes data so that all research team members (including those not involved in data collection) can understand the material at hand. Similarly, good metadata allows people outside the project to understand a dataset well enough to reuse or replicate it. Metadata can range from informal (such as a readme file) to formal (such as an established metadata standard). Your human resources, fiscal resources, domain, and time will determine the best combination of metadata tools for your research project.
Metadata should document both contextual information about the study and data-specific information (also known as a codebook) about the study.
See ICPSR's Best Practices for Creating Metadata and Cornell's Guide to Writing "Readme" Style Metadata for more information.
There are established metadata standards (both discipline-specific and generalized) that you can apply to your research data. Established standards typically use a structured, machine-readable, and extensible syntax (such as XML) to annotate research data. The Digital Curation Centre provides a comprehensive catalog of metadata standards and accompanying tools to capture or store the metadata.
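To make "structured, machine-readable and extensible syntax" concrete, here is a minimal sketch in Python that writes a few descriptive fields as XML. It assumes Dublin Core as the generalized standard (this guide does not name a specific one), and the wrapper element `metadata`, the helper `build_record`, and all field values are hypothetical placeholders rather than a prescribed layout.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set (assumed here as an example
# of a generalized standard; your discipline may call for another one).
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_record(fields):
    """Build an XML record with one Dublin Core element per field."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


# Hypothetical study details, for illustration only.
record = build_record({
    "title": "Hypothetical Survey of Commuting Habits",
    "creator": "Example Research Team",
    "date": "2020-01-15",
    "description": "Responses to a questionnaire about daily commutes.",
})

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Because each field is a named element, software can read the record back (for example with `ET.fromstring`) without guessing at the layout, which is the main advantage over free-form notes.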
Because implementing a formal metadata standard can be resource-intensive (in both time and personnel), and because existing metadata formats may not fit your research data well, alternative documentation may be attractive.
One good option (particularly for internal use) is to create a readme file for each dataset. The readme should be a plain text file (.txt) and should separate important information with blank lines. As with formal metadata schemas, the readme will include both contextual and data-specific information about your study.
Cornell University provides a downloadable readme file template, which may be customized according to your needs.
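If you would rather generate a readme than fill one in by hand, the following Python sketch shows one possible approach: it writes a plain text file with a contextual section and a data-specific section separated by a blank line. The section titles, field names, and values are hypothetical and do not reproduce the Cornell template; adapt them to your own study.

```python
import tempfile
from pathlib import Path


def render_section(title, entries):
    """Render an upper-case title followed by one 'key: value' line per entry."""
    lines = [title.upper()]
    lines.extend(f"{key}: {value}" for key, value in entries.items())
    return "\n".join(lines)


def render_readme(contextual, data_specific):
    """Join the two sections with a blank line between them."""
    sections = [
        render_section("Contextual information", contextual),
        render_section("Data-specific information", data_specific),
    ]
    return "\n\n".join(sections) + "\n"


# Hypothetical study details, for illustration only.
contextual = {
    "Title": "Hypothetical Survey of Commuting Habits",
    "Contact": "Example Research Team",
    "Collection dates": "2020-01-15 to 2020-03-01",
}
# Hypothetical codebook entries: variable name, meaning, and missing-value code.
data_specific = {
    "age": "Respondent age in years (integer; -9 = missing)",
    "mode": "Main commute mode (1 = car, 2 = transit, 3 = bicycle, 4 = walk)",
}

readme_text = render_readme(contextual, data_specific)

# Written to a temporary folder here; in practice, save the file next to the dataset.
with tempfile.TemporaryDirectory() as folder:
    readme_path = Path(folder) / "README.txt"
    readme_path.write_text(readme_text, encoding="utf-8")
    print(readme_path.read_text(encoding="utf-8"))
```

Keeping the content in dictionaries makes it easy to reuse the same fields across several datasets while the output stays a plain .txt file that anyone can open.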