Building a Data Curation Profile

The idea of Data Curation Profiles was developed by a team at Purdue University.  The purpose of a data profile is to provide an in-depth snapshot of the status of your data at any given time.  Purdue has released a Data Curation Profiles Toolkit that will guide you through the creation of a profile.  The ideal arrangement is for a data management specialist to work with the researcher on developing the profile, and for this reason many of the items in the toolkit are addressed to the data management specialist – but researchers can still use those tools for their own data.

The Data Curation Profile is not a Data Management Plan, although having a profile makes the creation of a DMP much easier.  The Data Curation Profile as described by the team at Purdue is exceptionally long and detailed.  For that reason, we here at BU have come up with a shorter, modified version.  This version consists of a list of questions about your data that you should try to answer with as much detail as you can.  Whether you choose to use the longer and more detailed Data Curation Profile from Purdue or use the shorter questionnaire provided on this page, creating a data profile will give you a better idea of the current status of your data, as well as alert you to areas in which you need to take action for proper data management.

The questionnaire is shown below, and is also available as a downloadable PDF.

  1. Please provide a general overview of the project and what you hope to accomplish.

  2. Please provide a brief description of the data.  Make sure to include all data and ephemera you want managed, not just the raw numbers.

  3. Approximately how many data files do you have now, and how many do you anticipate having at the end of the project?

  4. What is the average size of the data files you currently have?

  5. What format(s) are the data in? (MS Word, MS Excel, MySQL database, etc.)

  6. Who is the intended audience of this data?  If different audiences are expected for different types of data, please specify.

  7. Please fill in the enclosed chart on your sharing needs/concerns.

  8. Please describe briefly the way your data is currently organized: file name conventions, any existing metadata, units, etc.  Is your current system important for you to keep, or would you be willing to adjust to a more universally compatible metadata system?

  9. Who is the intellectual property owner of this data?

  10. What are the funding sources for your research?

  11. Please describe any conditions or constraints placed on the sharing of this data (mandatory dissemination agreements, confidentiality clauses, etc.).

  12. What specific software programs or tools were used in the collection and organization of this data?

  13. What specific software programs or tools are required to utilize this data (proprietary file formats, GIS, etc.)?

  14. Do you intend to publish the results of your research in an academic journal?  Do you intend for your data to be linked to this publication?

  15. Do you want to be able to obtain usage statistics for your data?  What measurements are most important to you?

  16. Where are your files currently stored?  Do you currently have backups of your data?

  17. What security measures are currently being used to control access to your data?

  18. What security measures do you require to control access to your data going forward?

  19. How long would you like your data to be preserved?  (If different types of data should be preserved for different time periods, please specify.)

  20. Would you like an embargo on access to your data?  (If different types of data require different embargoes, please specify.)

  21. What uses do you anticipate your data may be put to in the future?


| List each type of data here (planning documents, raw data, analysis, etc.) | Wouldn’t share with anyone | Would share only with my collaborators | Would share with others in my field | Would share with other academics outside my field | Would share with the general public |
| --- | --- | --- | --- | --- | --- |
|  |  |  |  |  |  |
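
The questions about how many data files you have, their average size, and their formats can be tedious to answer by hand for a large project.  The following is a minimal sketch, not part of the Purdue toolkit or the BU questionnaire, of how a short Python script might gather those figures.  It assumes all of your data sits under a single folder; the function name `inventory` and the folder name `my_project_data` are hypothetical placeholders, and file extensions are used only as a rough stand-in for format.

```python
from collections import Counter
from pathlib import Path


def inventory(root):
    """Summarize the files under `root`: count, average size, and extensions."""
    # Walk the folder recursively and keep regular files only.
    files = [p for p in Path(root).rglob("*") if p.is_file()]
    sizes = [p.stat().st_size for p in files]
    # Group by lower-cased extension; files without one are counted together.
    formats = Counter(p.suffix.lower() or "(no extension)" for p in files)
    return {
        "file_count": len(files),
        "average_size_bytes": sum(sizes) / len(sizes) if sizes else 0,
        "formats": dict(formats),
    }


if __name__ == "__main__":
    # "my_project_data" is a placeholder; point it at your own data folder.
    summary = inventory("my_project_data")
    print("Number of files:", summary["file_count"])
    print("Average size (bytes):", round(summary["average_size_bytes"]))
    for extension, count in sorted(summary["formats"].items()):
        print(f"  {extension}: {count} file(s)")
```

A file extension does not always reflect the true format, and data held in a database or on several machines will not be captured this way, so treat the output as a starting point for your answers rather than a complete record.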