Open Data Training

Guide 2: Data Preservation & Archiving

The basics of getting your data ready for preservation and archiving.

Description

Intro

This training module is a very quick introduction to data management and documentation, both for newcomers and for those who know a bit but want to know more. This material was produced by Mozilla Science Lab, a program that encourages the use of open source practices and web technologies to do better science.

Level

Beginner/Novice

Student Prerequisites

None.

Total Time to Complete

About 1 hour, including transitions.

Learning Objectives

Identify common data management errors
Write human-readable metadata
Describe the benefits of machine-readable metadata

Content Outline

Workshop introductions and an introduction to our topics (5 minutes)
Share: Frustrations trying to understand someone else's data (10 minutes)
Common spreadsheet issues: an overview (10 minutes)
Metadata: a love letter to the future (10 minutes)
Writing a DATA-README (25 minutes)
Additional Resources & Wrap Up (5 minutes)

Instructor Guide

Instructor Prerequisites:

Familiarity with Open Data Training Primer 2
Close review of Instructor Guides and all supporting materials for this module

Topic 1: Introductions and Discussion about Open Data
(5 minutes - keep this brief if offered as part of a series/class)
- Introductions
  - Instructor (3 minutes)
    - Explain your background, how you became involved in open data.
  - Why this training, why Mozilla? (1 minute)
    - Introduce the training series and how it was created (collaboration, sprints, output of fellows program)
    - Structure of the session, content exploration through activities
    - Why MSL and Mozilla are involved, your relationship to Mozilla
  - Why open data now? (1 minute)
    - More data than ever; define types of data here
    - Pressure from funders, who want more impact from data
    - The web as sharing/collaboration tool
Topic 2: Share frustrations
(10 minutes)
- What problems have you encountered (or might you encounter) when trying to understand someone else’s data?
  1. Suggested follow-up question: Has there ever been a time you’ve struggled to understand data you produced yourself? (4 minutes)

    Depending on the size of the group, insights can be shared with the whole group or recorded on a whiteboard, etherpad, or other shared document.
Topic 3: Common Spreadsheet Issues
(10 minutes)
Use the list on the Quartz Guide to Bad Data as an example of problematic data formatting issues.

Project the messy dataset file from Data Carpentry at the front of the room or share the file so that students can open it on their personal computers.

At the whiteboard or in an etherpad, develop a list of problems that would need to be addressed with the data for it to be understood or reused.

If you’re working with learners from a particular field, choosing a messy dataset from that field works best.
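
If learners ask how problems like these could be spotted automatically, a minimal Python sketch such as the one below may help. The sample rows, column names, and the list of missing-value codes are all invented for illustration; they are not taken from the Data Carpentry file or the Quartz guide, and a real check would need to be adapted to the dataset at hand.

```python
import csv
import io
import re

# Hypothetical messy sample, invented for illustration only.
SAMPLE = (
    "plot,date,weight\n"
    "1,2013-07-16,40\n"
    "2,7/17/13,\n"
    "3,2013-07-18,-999\n"
    "4,2013-07-19,about 35\n"
)

# Assumed placeholder codes; real datasets may use others.
MISSING_CODES = {"-999", "NA", "N/A", "null"}
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def is_number(value):
    """Return True if the text can be read as a number."""
    try:
        float(value)
    except ValueError:
        return False
    return True


def find_issues(text):
    """Return (row_number, column, problem) tuples for a CSV string."""
    issues = []
    reader = csv.DictReader(io.StringIO(text))
    # Row 1 is the header, so data rows start at 2.
    for row_number, row in enumerate(reader, start=2):
        date = (row["date"] or "").strip()
        weight = (row["weight"] or "").strip()
        if not ISO_DATE.match(date):
            issues.append((row_number, "date", "not in YYYY-MM-DD form"))
        if weight == "":
            issues.append((row_number, "weight", "blank cell"))
        elif weight in MISSING_CODES:
            issues.append((row_number, "weight", "placeholder used for missing value"))
        elif not is_number(weight):
            issues.append((row_number, "weight", "text in a numeric column"))
    return issues


for issue in find_issues(SAMPLE):
    print(issue)
```

The point for discussion is that each flagged row is a decision someone reusing the data would otherwise have to guess at, which is why the list on the whiteboard matters.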
Topic 4: Metadata
(10 minutes)
The goals for this lesson are to help students understand the benefits of open data, how to encourage others to make their data open, and to identify what you can do with your own data to make it possible for someone to build on your work.

So what exactly is metadata? What does metadata need to include?
What is Metadata? (YouTube Video) (5 minutes)

Review Metadata, A Love Note to the Future example in Primer 2.
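
To make the benefit of machine-readable metadata concrete, a short sketch like the following could be shown. The record and its field names are hypothetical and do not follow any particular metadata standard; the idea is only that metadata stored in a structured form such as JSON can be read back and queried by a program without a person interpreting it.

```python
import json

# Hypothetical metadata record; field names are illustrative, not a standard.
record = {
    "title": "Example plot survey",
    "creator": "A. Researcher",
    "created": "2015-06-01",
    "variables": [
        {"name": "plot", "type": "integer", "unit": None},
        {"name": "weight", "type": "number", "unit": "grams"},
    ],
}

# Write the record out as JSON text, then read it back as a program would.
serialized = json.dumps(record, indent=2, sort_keys=True)
restored = json.loads(serialized)

# A program can answer questions directly, e.g. which variables have units.
units = {v["name"]: v["unit"] for v in restored["variables"] if v["unit"]}
print(units)
```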
Topic 5: Writing a DATA-README
(writing exercise, 25 minutes)
The DATA-README is a human-readable metadata document that students should get in the habit of making with every dataset they produce.

Activity: have students write a DATA-README for a project they’re working on. A template is available here.
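
For students who prefer to start from a script, the sketch below generates a plain-text DATA-README skeleton. The headings it writes (title, creator, date created, description, columns) are assumptions chosen for illustration and are not the contents of the linked template; the function name `build_data_readme` and the example values are likewise hypothetical.

```python
from datetime import date


def build_data_readme(title, creator, description, columns, created=None):
    """Return DATA-README text.

    `columns` maps each column name to a plain-language meaning.
    The headings used here are illustrative assumptions.
    """
    created = created or date.today()
    lines = [
        "DATA-README",
        "",
        f"Title: {title}",
        f"Creator: {creator}",
        f"Date created: {created.isoformat()}",
        "",
        "Description:",
        description,
        "",
        "Columns:",
    ]
    for name, meaning in columns.items():
        lines.append(f"  {name}: {meaning}")
    return "\n".join(lines) + "\n"


readme = build_data_readme(
    title="Example plot survey",
    creator="A. Researcher",
    description="Weights of animals caught in numbered plots.",
    columns={"plot": "plot identifier", "weight": "animal weight in grams"},
    created=date(2015, 6, 1),
)
print(readme)
```

Saving the result next to the data file keeps the description and the dataset together; students should still edit the text by hand so it reads well to a person.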
Topic 6: Resources and Wrap
(< 5 minutes)
Provide links to Primer 1 and other relevant resources.
- Nine simple ways to make it easier to (re)use your data
- Metadata Guide from Australian National Data Service (a simple working-level view of the needs, issues, and processes around metadata collection and creation; not discipline specific)
- Best Practices for Data Management (PDF) from DataONE: Section 5.4 (p. 5)
- Metadata Directory from Research Data Alliance, which provides a list of metadata standards used in various disciplines
