Data Reuse Checklist

Getting Started

This checklist is designed to help you understand what someone outside your research project (or you in 5-10 years) would need to know about your data in order to build on your work. For more information on preparing your data for reuse, check out our exercise on how to plan for data reuse.

Image attribution: Roche DG, Lanfear R, Binning SA, Haff TM, Schwanz LE, et al. (2014) Troubleshooting Public Data Archiving: Suggestions to Increase Participation. PLoS Biol 12(1): e1001779. doi:10.1371/journal.pbio.1001779

Checklist

WHAT

What is the title of the data set?

List a title that goes beyond just the filename. Be descriptive and provide context.

Example: “Migration patterns on Columbia River Delta” NOT “final.csv”

Are there any related research publications for this data? Are there any existing data sets that were used to create this data set?

Add a citation to any relevant publications for this data set with a link, preferably a persistent identifier. (For more information on persistent identifiers, see our References page.)

Example: Forstmann BU, et al. (2014) Multi-modal ultra-high resolution structural 7-Tesla MRI data repository. Scientific Data, 1:140050. (http://www.nature.com/articles/sdata201450)
Example: Keating JN, Donoghue PCJ (2016) Data from: Histology and affinity of anaspids, and the early evolution of the vertebrate dermal skeleton. Dryad Digital Repository. http://dx.doi.org/10.5061/dryad.k2qc4
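
A persistent identifier is easier to reuse when it is always written in the same form. The following is a minimal sketch, not a standard tool: `doi_to_url` is a hypothetical helper, and the set of prefixes it strips is an assumption based on the DOI forms that appear on this page.

```python
import re

# Hypothetical helper for illustration only; the accepted prefixes are an assumption.
_DOI_PREFIX = re.compile(r"^(?:https?://(?:dx\.)?doi\.org/|doi:)", re.IGNORECASE)


def doi_to_url(raw: str) -> str:
    """Return an https://doi.org/ link for a DOI written in a common form."""
    doi = _DOI_PREFIX.sub("", raw.strip())
    # A DOI begins with "10.", a numeric registrant code, and a slash.
    if not re.match(r"^10\.\d{4,9}/\S+$", doi):
        raise ValueError(f"not a recognizable DOI: {raw!r}")
    return f"https://doi.org/{doi}"


print(doi_to_url("http://dx.doi.org/10.5061/dryad.k2qc4"))
print(doi_to_url("doi:10.1371/journal.pbio.1001779"))
```

Anything that does not look like a DOI raises an error rather than being passed through, so a broken link is caught before it reaches the documentation.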

WHO

Who is responsible for the data?

List the principal investigator(s) or research group that collected or contributed to the data.

Example: Dr. Phoebe Marshwana, Agriculture Lab, Michigan State University

Who can answer questions about the data?

Consider a lab email or other contact method that won’t change as people move on in their careers.

Example: Climate Impacts Group, University of Washington: cig @ uw.edu

WHERE

Where was the data collected?

This can be a single site, multiple locations, or a geographic range. Use geographic coordinates, if possible.

Example: Skukuza, South Africa
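
If you record coordinates, it helps to state the units and reference system alongside them. The sketch below is one possible way to do that: `format_location` is a hypothetical helper, the choice of decimal degrees and WGS 84 is an assumption, and the coordinate values are illustrative placeholders rather than a surveyed position.

```python
def format_location(name: str, lat: float, lon: float) -> str:
    """Format a site name with latitude and longitude in decimal degrees."""
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    # Assumption: coordinates are decimal degrees in the WGS 84 reference system.
    return f"{name} ({lat:.4f}, {lon:.4f}; decimal degrees, WGS 84)"


# Illustrative placeholder values, not a real measurement.
print(format_location("Example site", -25.0, 31.5))
```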

Where does the data live?

Add a link to the repository where data is shared, preferably using a persistent identifier. (Need help finding a repository? See the "Resources" on our References page.)

Example: UCSD Digital Collections repository: http://dx.doi.org/10.6075/J08G8HM2

WHEN

When was the data collected? What time span does the data cover?

Use the international standard date format (ISO 8601: YYYY-MM-DD hh:mm:ss) and try to be as specific as possible.

Example: 2015-07-01 to 2015-12-31
Example: Collected: 2015-06. Data coverage: 1932 to 1944
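
Most programming languages can produce ISO 8601 dates directly, which avoids typing them by hand. The following is a small sketch using Python's standard `datetime` module; the `coverage` function is a hypothetical helper and the 06:30 timestamp is an illustrative value.

```python
from datetime import date, datetime


def coverage(start: date, end: date) -> str:
    """Describe a collection period as 'YYYY-MM-DD to YYYY-MM-DD'."""
    if end < start:
        raise ValueError("end date precedes start date")
    return f"{start.isoformat()} to {end.isoformat()}"


# The date range from the example above.
print(coverage(date(2015, 7, 1), date(2015, 12, 31)))

# A full timestamp, with a space between the date and the time.
print(datetime(2015, 7, 1, 6, 30, 0).isoformat(sep=" "))

# Dates written this way can be read back without guessing the day/month order.
print(date.fromisoformat("2015-12-31").month)
```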

HOW

How was the data collected?

Describe the steps taken to collect the data, including the instruments and software used.

Simplified example: Minimum and maximum observed temperatures for each day were calculated at morning high tide using CoolRead thermometers calibrated using the XYZ method.

How was the data processed?

Describe the steps taken to clean and analyze the data, including the tools and software used and how null or missing data were handled.

Example: Differences in site mortality were determined through survdiff tests performed using X software version 2.10.3. Comparisons with a p-value less than 0.05 (P < 0.05) were considered significantly different. Null data are coded as 777 and missing data as 999.

If you wrote code for processing the data, provide information on where it can be found.

Example: Raw data and scripts used in analysis are available in a GitHub repository: https://github.com/rjleveque/tohoku2011-paper1

How may this data be used by others?

Identify a license assigned to this data set. We recommend a CC0 1.0 Universal (CC0 1.0) license. (Not sure what the options are? See the "Resources" on our References page.)

Example: This data set is covered by the CC0 1.0 Universal (CC0 1.0) license. To the extent possible under law, Sharee Davis has waived all copyright and related or neighboring rights to this data set. This work is published from: United States

Feedback

If you have suggestions for improving this checklist, feel free to submit an issue on GitHub or contact Mozilla Science Lab at sciencelab@mozillafoundation.org. We’d like to see what you can do with this. Please fork and make it your own!