Writing NSF Data Management Plans

Introduction

Since 2011, the National Science Foundation (NSF) has required data management plans (DMPs) for incoming grant applications. This guide will help you understand the NSF’s data management requirements and write a useful, compliant plan. At first glance, a data management plan may seem like another box to check off in the grant application process; however, these plans are an increasingly important part of NSF grant applications and are being reviewed more thoroughly. To keep your application competitive, you’ll want a DMP that is as good as your research.

Be mindful that NSF directorates and divisions may have more specific requirements!

Quick Links to NSF Resources

The following are links to NSF’s general guidance:

Email us if you need assistance in writing a data management plan for your NSF grant.

Templates and Examples

For reference, check out the DMPTool’s list of templates and public examples.

Requirements

You have 2 pages – and only 2 pages – to write your data management plan. Let’s review the sections your data management plan might require to adhere to NSF’s policy.

Types

The best way to start your data management plan is to think about the types of data you’ll be collecting. When describing your data types, be detailed and specific so everyone is on the same page. For instance, if you know you’ll be using a particular piece of equipment or data collection methodology, be sure to mention it. The policy states you should include:

the types of data, samples, physical collections, software, curriculum materials, and other materials to be produced in the course of the project.

The types of data researchers work with are as diverse as the researchers themselves and their particular interests. Common data types range from numerical data to experimental data to images. A sample from Kimberly Anderson’s DMP on “REU Site: A Multidisciplinary Research Experience in Engineered Bioactive Interfaces and Devices” starts:

The data generated from this project will be of two types. Each REU student will generate experimental data specific to his/her research project related to Engineered Bioactive Interfaces and Devices. In addition, both quantitative and qualitative data will be generated that assesses the outcomes of the REU program.

Data and Metadata Standards

After you’ve figured out the data types, the next step is to think about the metadata standards you’ll be using. This section allows you to explain the specific file formats you’ll be using and why you’re using them.

What’s metadata? Metadata is the information about your data that another researcher or future collaborator will need to know before using your data. This can include, but is not limited to, who created the data, when it was created or collected, and a persistent identifier, such as a DOI (digital object identifier) or a URL (uniform resource locator). Other likely metadata includes your naming convention documentation, variable definitions, and data analysis documentation. The policy states:

the standards to be used for data and metadata format and content (where existing standards are absent or deemed inadequate, this should be documented along with any proposed solutions or remedies).

If there are no common standards for your field, you should document this and propose a possible solution. While metadata can be complex and time-consuming to create, much of it can be automated if it is planned before the data is gathered. A great quote that sums up metadata comes from cea+ and the Mozilla Science Lab:

Metadata is a love note to the future.[1]

For multi-year projects, it is important to remember that this “note to the future” is likely to be addressed to your future self. Although creating metadata takes time, the effort is almost always rewarded with clear project documentation and enhanced reproducibility of your research.
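Because much metadata can be generated automatically when it is planned ahead, a short script can capture the basics each time a data file is saved. The following is a minimal Python sketch of that idea, not an NSF requirement or a formal metadata standard. The function names (build_metadata, write_sidecar), the field names, and the ".metadata.json" file suffix are all hypothetical choices made for illustration; if your field has a metadata standard, use its field names instead.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def build_metadata(data_path, creator, identifier, variables):
    """Return a metadata record (a plain dict) describing one data file.

    The field names are illustrative assumptions, not a formal standard.
    """
    path = Path(data_path)
    content = path.read_bytes()
    return {
        "file_name": path.name,
        "creator": creator,
        "identifier": identifier,  # for example a DOI or URL, if one exists
        "variables": variables,    # variable names mapped to descriptions
        "size_bytes": len(content),
        # A checksum lets a future user confirm the file is unchanged.
        "sha256": hashlib.sha256(content).hexdigest(),
        # Time the record was written (UTC), not necessarily collection time.
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


def write_sidecar(data_path, **fields):
    """Write the metadata next to the data file as <name>.metadata.json."""
    path = Path(data_path)
    sidecar = path.with_name(path.name + ".metadata.json")
    record = build_metadata(path, **fields)
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar
```

Calling write_sidecar on a data file, with the creator, an identifier, and a dictionary of variable descriptions, would leave a small JSON file beside the data. A plain-text sidecar like this stays readable without special software, which matters when the reader is you several years from now.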

Access and Sharing

One of the goals of a data management plan is to increase researchers’ ability not only to share their own data but also to access others’ data. Ideally, this will help verify research results and help researchers build on the work of others. The NSF understands that not all data can be made available, such as data on human research subjects, patented data, and data involving nationally sensitive projects. Even so, much of the research that NSF sponsors can be shared with the public. NSF suggests you include:

policies for access and sharing including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements.

This section of your data management plan details the policies and ground rules you’ll use for sharing your data. Will your data be put in a data repository? Will there be an initial embargo on the access to your data? If your data won’t be publicly available, are there ways you can make it available to others? If so, how will that be done?

As you write this section of your data management plan, you’ll also likely want to think about how others can reuse and redistribute your data.

Reuse and Redistribution

Now we’re at the point where you’ll want to document how others can access your data and any restrictions on the usage of your data. If you are placing restrictions, this is the place to describe why and how others can comply with them. Your NSF data management plan should include:

policies and provisions for re-use, re-distribution, and the production of derivatives.

Or, more simply, you should document how others can use and share your data, as well as any new data sets that might be produced from your work. The NSF acknowledges that not all data can be made open and freely available. If you do need to limit the use of your data, you’ll need to clearly document and justify those limits.

Archiving and Preserving

The last part of the data management plan describes the long-term use of your data, which can be tricky for researchers who are used to thinking about their data within the life of a particular grant. In addition, you are unlikely to have additional funds for your data 5 years after your grant ends. The policy asks that you describe the

plans for archiving data, samples, and other research products, and for preservation of access to them.

The best way to archive and preserve your data is to partner with an existing service or institution. There are many data repositories, so selecting the right one is important. Many fields have data repositories that are well known to researchers in that discipline; however, not everyone is so lucky.

Maintain Your Plan

Hopefully your NSF grant application will be accepted and you’ll soon be on your way to starting your research project! As you begin your work, you’ll want to periodically revisit your data management plan to ensure that it still fits the needs of your work. Maintaining your DMP will help you follow through on your commitments to the NSF and help showcase your ability to properly manage your data in future grant applications.