Preservation and Conversion Strategies at the Bentley Historical Library
The Bentley Historical Library is committed to the long-term preservation of and access to its digital collections. Because the library must contend with thousands of potential file formats, Digital Curation Services has adopted a three-tier approach to facilitate the preservation and conversion of digital content:
Tier 1: Materials produced in sustainable formats will be maintained in their original version.
Tier 2: Common "at-risk" formats will be converted to preservation-quality file types to retain important features and functionalities.
Tier 3: All other content will receive basic bit-level preservation.
This document provides further information on the Bentley Historical Library's accepted preservation formats and conversion strategies.
The library has identified a number of sustainable file formats that are widely used and/or non-proprietary, many of which have been recognized as international standards by bodies such as the International Organization for Standardization (ISO), ECMA International, and the Organization for the Advancement of Structured Information Standards (OASIS). The longevity of these formats has furthermore been acknowledged by various peer institutions and experts in the digital curation community, including the Library of Congress's National Digital Information Infrastructure and Preservation Program.
Digital materials stored in these file formats should remain usable to researchers and administrative units at the University of Michigan for the foreseeable future. The Bentley Historical Library will therefore preserve the original version of content stored in these sustainable formats at the time of accession. Digital Curation Services will monitor community best practices and technological advances in case a migration to alternative preservation formats should prove necessary.
|Media Type||Sustainable Preservation Formats|
|Audio Files||WAV: Waveform Audio File Format|
|AIFF: Audio Interchange File Format|
|MP3: Moving Picture Experts Group Layer 3 compression|
|FLAC: Free Lossless Audio Codec File|
|OGG: Ogg Vorbis Audio File|
|MIDI: Musical Instrument Digital Interface File (including SMF and XMF wrappers)|
|Office Documents and Text-Based Files||DOCX: MS Word Open XML Document|
|XLSX: MS Excel Open XML Document|
|PPTX: PowerPoint Open XML Presentation|
|PDF/A: Portable Document Format (Archival)|
|PDF: Portable Document Format|
|TXT: Plain Text File|
|RTF: Rich Text Format File|
|XML: Extensible Markup Language Data File|
|CSV: Comma Separated Values File|
|TSV: Tab Separated Values File|
|Database Files||CSV: Comma Separated Values File|
|SIARD: Software Independent Archiving of Relational Databases (open XML format)|
|MySQL SQL: Structured Query Language file (MySQL is an open source relational database management system)|
|Email Files||MBOX: Mailbox File|
|Raster Image Files||TIFF: Tagged Image File Format|
|JPEG/JFIF: Joint Photographic Experts Group JPEG Interchange Format File (lossy compression)|
|JPEG 2000: Joint Photographic Experts Group (lossless compression)|
|GIF: Graphics Interchange Format|
|PNG: Portable Network Graphics|
|Vector Image Files||SVG: Scalable Vector Graphics File|
|Video Files||MPEG-1/2: Moving Picture Experts Group|
|AVI: Audio Video Interleave File (uncompressed)|
|MOV: QuickTime Movie (uncompressed)|
|MP4: Moving Picture Experts Group (with H.264 encoding)|
|MJ2: Motion JPEG 2000|
|MXF: Material Exchange Format File (uncompressed)|
|DV: Digital Video File (non-proprietary)|
The digital curation community has long acknowledged the disadvantages posed by proprietary formats (which can be opened only with specific software) and by content encoded with "lossy" compression (i.e., compression that reduces the quality of the data to conserve space). The Bentley Historical Library will therefore convert the most common at-risk formats to preservation-quality sustainable formats. The original version of content will also be maintained alongside the preservation copy to ensure the authenticity of the Bentley Historical Library's digital collections. These conversion strategies reflect the policies and practices of peer institutions as well as the National Digital Information Infrastructure and Preservation Program.
Visit the Library of Congress Sustainability of Digital Formats site for more information on preservation issues and descriptions of preferred formats.
|Media Type||At-Risk Formats||Preservation Target|
|Audio Files||WMA: Windows Media Audio File||WAV Format (preferably Broadcast WAVE)|
|RA: Real Audio File|
|SND: Apple Sound File|
|AU: Sun Audio File|
|Office Documents and Text-Based Files||DOC: MS Word 1997-2003 Document||MS Office Open XML (OOXML) Format|
|PPT: MS PowerPoint 1997-2003 Presentation|
|XLS: MS Excel 1997-2003 Spreadsheet|
|Database Files||ACCDB or MDB: MS Access Database Files||SIARD Open XML Format|
|MS SQL Server Database Files|
|Oracle Database Files|
|Email Files||EML: Email Message File||MBOX Format|
|PST: Outlook Personal Information Store File|
|Eudora Mail and approx. 40 other formats|
|Raster Image Files||BMP: Windows Bitmap||TIFF Format|
|PSD: Adobe Photoshop Document|
|RAW: Raw Image Data File|
|FPX: FlashPix Bitmap|
|PCD: Kodak Photo CD Image|
|PCT: Apple Picture File|
|TGA: Targa Graphic|
|Vector Image Files||AI: Adobe Illustrator||SVG Format|
|WMF: Windows Metafile|
|PS: PostScript||PDF/A Format|
|EPS: Encapsulated PostScript|
|Video Files||SWF: Shockwave Flash||MPEG-4 (with H.264 encoding)|
|FLV: Flash Video|
|WMV: Windows Media Video|
|RV (or RM): Real Video|
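To make the three-tier logic concrete, the two tables above can be read as a lookup from format to tier and, for at-risk formats, to a preservation target. The following is a minimal sketch, not the library's actual tooling. The function name `classify` and the dictionaries are hypothetical, the format lists are partial subsets of the tables, and identifying a format by file extension is a simplifying assumption (for example, it cannot tell whether an AVI or MOV file is uncompressed, so those are left out).

```python
from pathlib import Path

# Partial, illustrative subset of the sustainable-format table (Tier 1).
# Assumption: formats are recognized by lowercase file extension only.
SUSTAINABLE = {
    "wav", "aiff", "flac", "docx", "xlsx", "pptx", "pdf", "txt",
    "rtf", "xml", "csv", "tsv", "mbox", "tiff", "png", "gif", "svg",
}

# Partial, illustrative subset of the at-risk table (Tier 2),
# mapping each extension to the preservation target named in the table.
AT_RISK_TARGETS = {
    "wma": "WAV", "ra": "WAV", "snd": "WAV", "au": "WAV",
    "doc": "OOXML", "ppt": "OOXML", "xls": "OOXML",
    "eml": "MBOX", "pst": "MBOX",
    "bmp": "TIFF", "psd": "TIFF", "tga": "TIFF",
    "ai": "SVG", "wmf": "SVG",
    "ps": "PDF/A", "eps": "PDF/A",
    "swf": "MPEG-4 (H.264)", "flv": "MPEG-4 (H.264)", "wmv": "MPEG-4 (H.264)",
}


def classify(filename):
    """Return (tier, preservation_target) for a file name.

    Tier 1: keep the original as-is (target is None).
    Tier 2: convert to the target, keeping the original alongside it.
    Tier 3: bit-level preservation only (target is None).
    """
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext in SUSTAINABLE:
        return (1, None)
    if ext in AT_RISK_TARGETS:
        return (2, AT_RISK_TARGETS[ext])
    return (3, None)
```

Under these assumptions, `classify("minutes.DOC")` falls into Tier 2 with an OOXML target, while a file with an unlisted or missing extension falls through to Tier 3. A production workflow would more plausibly rely on signature-based format identification than on extensions.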
Because it is infeasible to create conversion plans for the tens of thousands of formats in existence, the Bentley Historical Library will ensure that digital holdings in other formats (i.e. ones not specifically identified in this document) will receive bit-level preservation. The use of integrity checks and regular replacement of storage media (conducted by trusted partners in the University of Michigan Library Information Technology division and Information and Technology Services) will preserve the raw data stored in these files (i.e. the "stream" of 0s and 1s) in its original state. The library concedes that hardware or software obsolescence may reduce the functionality of these files or render them inaccessible. At the same time, the faithful preservation of the bitstreams will allow the library to take advantage of future developments in emulation technology.
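The integrity checks mentioned above are commonly implemented by recording a cryptographic checksum when a file is accessioned and recomputing it later to confirm that the bitstream is unchanged. The sketch below illustrates that idea only. The choice of SHA-256 and the function names `sha256_of` and `verify_fixity` are assumptions, since this document does not specify which algorithm or tools the library and its partners use.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 digest of a file, reading it in chunks
    so that large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps calling fh.read until it returns b"".
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, recorded_digest):
    """Return True if the file's current digest matches the digest
    recorded earlier (hypothetically, at the time of accession)."""
    return sha256_of(path) == recorded_digest
```

A mismatch indicates that the stored bits have changed since the digest was recorded, at which point the file would typically be restored from another copy. A match says nothing about whether the format can still be rendered, which is the limitation the paragraph above concedes.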
Please contact Digital Curation Services with questions or comments regarding the Bentley Historical Library's digital preservation and conversion strategies.