File Formats

Introduction

This document identifies file formats that the archival community recognizes as sustainable and well suited to long-term preservation. Where possible, create or maintain files in these preservation-quality formats, each of which has one or more of the following characteristics:

  • non-proprietary, so the format is not tied to one piece of software or the platform developed by a single company
  • “open,” in that key characteristics and specifications of the format are well-known and documented so that anyone can use or implement the format
  • widely used, so that the format will not soon become obsolete (as happened with formerly popular word processing formats like WordPerfect)
  • recognized as a standard by groups such as ISO, Ecma International, and OASIS, as well as the Library of Congress

By employing sustainable, preservation-quality formats, you (or your organization) greatly increase the likelihood that your content will remain accessible and fully functional into the foreseeable future.

Format Recommendations by Media Type

Each media type below is followed by its recommended formats.

Office Documents and Text-Based Files
  • DOCX: Microsoft Word Office Open XML Document (created in MS Office 2007 and later)
  • XLSX: Microsoft Excel Office Open XML Spreadsheet (created in MS Office 2007 and later)
  • PPTX: Microsoft PowerPoint Office Open XML Presentation (created in MS Office 2007 and later)
  • ODT: OpenDocument Text Document (created in OpenOffice)
  • ODS: OpenDocument Spreadsheet (created in OpenOffice)
  • ODP: OpenDocument Presentation (created in OpenOffice)
  • PDF/A: Portable Document Format (Archival)
  • TXT: Plain Text File (ANSI or UTF-8 encoded)
  • RTF: Rich Text Format File
  • XML: Extensible Markup Language Data File
  • CSV: Comma Separated Values File
  • TSV: Tab Separated Values File

Audio Files
  • WAV: Waveform Audio File Format
  • AIFF: Audio Interchange File Format
  • MP3: MPEG Audio Layer III (Moving Picture Experts Group; lossy compression)
  • FLAC: Free Lossless Audio Codec File
  • OGG: Ogg Vorbis Audio File

Video Files
  • MPEG-1/2: Moving Picture Experts Group (MPEG-1 or MPEG-2 video)
  • AVI: Audio Video Interleave File (uncompressed)
  • MOV: QuickTime Movie (uncompressed)
  • MP4: MPEG-4 (Moving Picture Experts Group, with H.264 encoding)
  • MJ2: Motion JPEG 2000
  • DV: Digital Video File (non-proprietary)

Raster (or Bitmap) Image Files
  • TIFF: Tagged Image File Format
  • JPEG/JFIF: JPEG File Interchange Format (Joint Photographic Experts Group; lossy compression)
  • JPEG 2000: Joint Photographic Experts Group 2000 (lossless compression)
  • GIF: Graphics Interchange Format
  • PNG: Portable Network Graphics

Vector Image Files
  • SVG: Scalable Vector Graphics File

Email Files
  • MBOX: Mailbox File
    NOTE: A major limitation of "free" web mail services such as Gmail, Yahoo, or Hotmail is that messages cannot easily be downloaded or exported to a different email client or to your desktop. Using Mozilla Thunderbird, Outlook, Apple Mail, or a similar client may allow you to save local copies of messages and remain platform-independent.

Database Files
  • CSV: Comma Separated Values File
  • SQL (MySQL): Structured Query Language File (MySQL is an open-source relational database management system)
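
As a rough illustration of how the recommendations above might be applied to an existing collection, the following Python sketch flags files whose extensions do not appear on the list. It is only a first-pass check under stated assumptions: the names RECOMMENDED_EXTENSIONS, classify, and find_unrecommended are hypothetical, and the mapping from formats to extensions (for example ".tif" or ".jp2") is an assumption rather than part of this guidance. It looks at file names only, so it cannot confirm that a .pdf is really PDF/A, that an .avi is uncompressed, or that an .mp4 uses H.264.

```python
from pathlib import Path

# Extensions assumed to correspond to the recommended formats listed above.
# This mapping is illustrative; adjust it to local naming conventions.
RECOMMENDED_EXTENSIONS = {
    "Office Documents and Text-Based Files": {
        ".docx", ".xlsx", ".pptx", ".odt", ".ods", ".odp",
        ".pdf", ".txt", ".rtf", ".xml", ".csv", ".tsv",
    },
    "Audio Files": {".wav", ".aif", ".aiff", ".mp3", ".flac", ".ogg"},
    "Video Files": {".mpg", ".mpeg", ".avi", ".mov", ".mp4", ".mj2", ".dv"},
    "Raster (or Bitmap) Image Files": {
        ".tif", ".tiff", ".jpg", ".jpeg", ".jp2", ".gif", ".png",
    },
    "Vector Image Files": {".svg"},
    "Email Files": {".mbox"},
    "Database Files": {".csv", ".sql"},
}


def classify(path):
    """Return the first media type whose extension set contains the
    file's extension (case-insensitive), or None if there is no match."""
    ext = Path(path).suffix.lower()
    for media_type, extensions in RECOMMENDED_EXTENSIONS.items():
        if ext in extensions:
            return media_type
    return None


def find_unrecommended(root):
    """Return a sorted list of files under root whose extension is not
    in any recommended set. Only file names are checked, not contents."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and classify(p) is None
    )
```

Calling find_unrecommended on a folder returns the files that may be worth converting or reviewing. Because CSV appears under two media types, classify reports the first one it finds; a tool that inspects file contents would be needed for anything stricter than this name-based check.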