BU Libraries Digital Preservation Policy

(draft December 2011)

Introduction

Boston University Libraries are committed to providing long-term access to the digital content they collect and curate. The Libraries’ long history of preserving traditional content informs their decisions about digital preservation. The Libraries recognize that collaborative approaches to preserving digital content are essential. Support for collaborative efforts such as Portico (http://www.portico.org/digital-preservation/) and LOCKSS (http://lockss.stanford.edu/lockss/Home) is an important part of the Libraries’ overall digital preservation strategy. The Libraries work with Boston University’s Information Services and Technology to provide secure storage and backup of locally hosted digital content. Participation in Portico and LOCKSS provides a mechanism for keeping multiple copies of the digital content in distributed repositories.

Digital content presents many challenges for preservation, not least rapidly changing file formats and obsolescent hardware and software. The Libraries adhere to best practices for digital preservation to ensure data accessibility, fixity, and usability. OpenBU, the University’s institutional repository, functions as the primary repository and focus for the Libraries’ preservation efforts. For each additional locally hosted collection of digital content, the Libraries define appropriate levels of preservation support.

Preservation Support Levels

The Libraries commit to maintaining the fixity (bitstream integrity) of all digital objects submitted to OpenBU. Each digital object is assigned a persistent identifier to provide persistent access. Secure storage and backup services are provided for all digital content. The Libraries will take reasonable steps to ensure the usability of the digital objects placed in OpenBU. Preservation steps include format migration, emulation, and normalization (actions that convert an unsupported format to a supported format). The preservation steps performed by the Libraries are determined by the file format of the digital objects. More extensive action will be taken to preserve the usability of objects in file formats that are open, fully disclosed, well documented, widely adopted, and most accessible for migration, emulation, or normalization actions. Fewer actions will be taken to preserve the usability of file formats that are proprietary and/or undocumented, and those that are considered working formats (e.g., Photoshop .psd) and/or are not widely adopted.
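Fixity is typically verified by comparing a checksum recorded at deposit with one recomputed later. The following is a minimal sketch of that idea, not a description of how OpenBU or DSpace implements it. The choice of SHA-256 and the function names `sha256_of` and `fixity_ok` are assumptions made for illustration; this policy does not name a checksum algorithm.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at `path`, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are never loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path, recorded_digest):
    """Return True if the file's current digest matches the digest recorded at deposit."""
    return sha256_of(path) == recorded_digest.lower()
```

A depositor could record `sha256_of(path)` before submitting a file and rerun `fixity_ok` on a downloaded copy; any change to the bitstream, even a single byte, produces a different digest.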

The support levels used in the tables below are:

  • Bit-Level: the Libraries commit to maintaining the files at the level of ones and zeroes, but not to maintaining format integrity. This level of support is generally assigned to proprietary formats.
  • Format: the Libraries commit to maintaining bitstream and format integrity. This level of support is generally assigned to formats (proprietary or not) for which open standards are available. For instance, if PDF as a format changes substantially, the Libraries will endeavor to convert all of their PDF bitstreams to the latest version. One exception is HTML, which changes so frequently that the Libraries do not have the resources to ensure its format integrity.

More information about file formats known and/or supported by DSpace (the software running OpenBU) is available here.

File Support and Preservation Best Practices

The following table details BU Libraries’ preservation support levels for commonly used file formats. Best practices for file formats are included.

TEXT AND MICROSOFT OFFICE FILE FORMATS

Format | File Extension | Support Level | Best Practices
HTML | .html, .htm | Bit-Level | Must include all other referenced files, including CSS files and any includes.
Microsoft Word | .doc | Bit-Level | Though acceptable for deposit, the best practice is to convert to PDF prior to deposit.
Microsoft PowerPoint | .ppt | Bit-Level | Disable all macros and other effects. Conversion to PDF is also an excellent option.
Microsoft Excel | .xls | Bit-Level | Disable all macros. You may also wish to export the dataset into a tab-delimited text file (.txt) prior to deposit.
PDF | .pdf | Format |
PostScript | .ps | Format |
Rich Text | .rtf | Bit-Level | For increased support, you may wish to consider conversion to PDF.
Plain Text | .txt | Format | It is recommended that .txt files be saved using the UTF-8 (Unicode) character encoding.
SGML | .sgm, .sgml | Bit-Level | Requires that the depositor also include the DTD along with the SGML file.
XML | .xml | Bit-Level | To ensure the best available support, include the DTD along with a well-formed XML file that is valid according to the included DTD.

IMAGE FILE FORMATS

Format | File Extension | Support Level | Best Practices
BMP | .bmp | Bit-Level |
GIF | .gif | Format |
JPEG | .jpg | Format | When possible, save using no compression.
JPEG2000 | .jp2 | Bit-Level | This file format holds much potential once tools and support become more widely available. Best practice is to save master files using no compression.
PNG | .png | Format |
Photo CD | .pcd | Bit-Level |
Photoshop | .psd | Bit-Level |
TIFF | .tif, .tiff | Format | Considered the best format for storing your master images. Best practice is to save these files with no compression.

AUDIO FILE FORMATS

Format | File Extension | Support Level | Best Practices
MPEG audio | .mp3 | Bit-Level |
Real Audio | .ra, .rm, .ram | Bit-Level | Proprietary format. It may be appropriate to convert to another format prior to deposit.
Wave | .wav | Format | Recommended format for capturing digital audio. This format can store all the data in an uncompressed form, and its wide use suggests long-term community support.
Windows Media Audio | .wma | Bit-Level |

VIDEO FILE FORMATS

Format | File Extension | Support Level | Best Practices
AVI | .avi | Bit-Level |
MPEG-1 | .mp1 | Bit-Level |
MPEG-2 | .mp2 | Bit-Level |
MPEG-4 | .mp4 | Bit-Level |
QuickTime | .mov | Bit-Level |
Windows Media Video | .wmv | Bit-Level |
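Depositors preparing many files may find it convenient to look up support levels by file extension. The sketch below transcribes the tables above into a dictionary. The names `SUPPORT_LEVELS` and `support_level`, and the "Unknown" result for unlisted extensions, are illustrative assumptions and not part of any BU Libraries tool.

```python
from pathlib import PurePath

# Support levels transcribed from the format tables in this policy.
SUPPORT_LEVELS = {
    # Text and Microsoft Office
    ".html": "Bit-Level", ".htm": "Bit-Level", ".doc": "Bit-Level",
    ".ppt": "Bit-Level", ".xls": "Bit-Level", ".pdf": "Format",
    ".ps": "Format", ".rtf": "Bit-Level", ".txt": "Format",
    ".sgm": "Bit-Level", ".sgml": "Bit-Level", ".xml": "Bit-Level",
    # Image
    ".bmp": "Bit-Level", ".gif": "Format", ".jpg": "Format",
    ".jp2": "Bit-Level", ".png": "Format", ".pcd": "Bit-Level",
    ".psd": "Bit-Level", ".tif": "Format", ".tiff": "Format",
    # Audio
    ".mp3": "Bit-Level", ".ra": "Bit-Level", ".rm": "Bit-Level",
    ".ram": "Bit-Level", ".wav": "Format", ".wma": "Bit-Level",
    # Video
    ".avi": "Bit-Level", ".mp1": "Bit-Level", ".mp2": "Bit-Level",
    ".mp4": "Bit-Level", ".mov": "Bit-Level", ".wmv": "Bit-Level",
}


def support_level(filename):
    """Return the support level for `filename`, or "Unknown" if its extension is not listed."""
    extension = PurePath(filename).suffix.lower()
    return SUPPORT_LEVELS.get(extension, "Unknown")
```

For example, `support_level("thesis.PDF")` returns "Format", while a Photoshop working file such as "poster.psd" maps to "Bit-Level", which suggests converting it to TIFF or PDF before deposit.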

Additional Information

Planning for preservation during the initial design phase of a digital project normally saves time. Before beginning any imaging or audio project, you may wish to review the following documents, each of which addresses current best practices in digital imaging or digital audio creation. Boston University librarians are eager to consult with you about your preservation planning. For assistance, please contact open-help@bu.edu.

Digital Imaging

California Digital Library

http://www.cdlib.org/inside/diglib/guidelines/bpgimages/cdl_gdi_v2.pdf

Collaborative Digitization Program

http://www.bcr.org/dps/cdp/best/index.html

Library of Congress

http://www.digitalpreservation.gov/formats/content/still_preferences.shtml

National Archives and Records Administration (NARA)

http://www.archives.gov/preservation/technical/guidelines.pdf

Digital Audio

Collaborative Digitization Program

http://www.bcr.org/dps/cdp/best/index.html

University of Michigan

http://deepblue.lib.umich.edu/bitstream/2027.42/40248/1/Audio-Best_Practice.pdf
