Welcome to Mozilla Science Lab's Open Data Primers!




  • Data dictionary - a text file that defines the field names and values in a dataset (the term is sometimes used interchangeably with “codebook”). It includes a list of all field names, a description of each field and its values (e.g. units of measurement, formulas used for calculation, abbreviations, value ranges), and the relationship of the fields to one another. Example of a data dictionary: http://www.utexas.edu/cola/redcap/_files/data_dictionary_example.jpg

  • Metadata schema - a set of rules for how to describe a certain type of information. There are many different metadata schemas, primarily organized by information format and/or discipline. See Digital Curation Centre’s List of Disciplinary Metadata for examples.

  • Permanent identifier - a permanent identifier (or PID) is a string of numbers and/or characters, frequently in the form of a URL, that points to the location of a resource. PIDs are set up so that even if the storage location of the resource changes over time (e.g. when data moves from one university server to another), the PID still points to the correct location. The DOI (Digital Object Identifier) is a commonly known type of PID.
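
To make the data dictionary entry above more concrete, here is a minimal sketch in Python of how a data dictionary could be written in machine-readable form and used to check a record. The field names (site_id, temp_c, depth_m), the allowed values and ranges, and the validate_record helper are all hypothetical and chosen only for illustration; they do not come from any particular dataset or tool.

```python
# Hypothetical data dictionary for a small field-survey dataset.
# Each entry records a description, the expected type, and (where it
# applies) the unit of measurement and the permitted values or range.
DATA_DICTIONARY = {
    "site_id": {
        "description": "Code for the sampling site",
        "type": str,
        "allowed": {"A", "B", "C"},
    },
    "temp_c": {
        "description": "Water temperature",
        "unit": "degrees Celsius",
        "type": float,
        "range": (0.0, 40.0),
    },
    "depth_m": {
        "description": "Depth at which the sample was taken",
        "unit": "metres",
        "type": float,
        "range": (0.0, 100.0),
    },
}


def validate_record(record, dictionary):
    """Return a list of problems found when checking one record."""
    problems = []
    for field, rule in dictionary.items():
        if field not in record:
            problems.append(f"{field}: missing")
            continue
        value = record[field]
        if not isinstance(value, rule["type"]):
            problems.append(f"{field}: expected {rule['type'].__name__}")
            continue
        if "allowed" in rule and value not in rule["allowed"]:
            problems.append(f"{field}: value not in allowed set")
        if "range" in rule:
            low, high = rule["range"]
            if not low <= value <= high:
                problems.append(f"{field}: outside range {low}-{high}")
    # Flag fields that the data dictionary does not define at all.
    for field in record:
        if field not in dictionary:
            problems.append(f"{field}: not defined in data dictionary")
    return problems


good = {"site_id": "A", "temp_c": 12.5, "depth_m": 3.0}
bad = {"site_id": "Z", "temp_c": 99.0}
print(validate_record(good, DATA_DICTIONARY))
print(validate_record(bad, DATA_DICTIONARY))
```

Keeping descriptions, units, and value ranges together in one structure means the same file can document the dataset for human readers and drive simple automated checks. A plain-text or CSV data dictionary serves the documentation purpose equally well.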