Format

Using open file formats helps future-proof your data

A file format is a way of encoding information in a computer file so that an application can recognize and access it. The format is usually indicated by the file name extension, generally a full stop followed by three or four letters (such as .txt, .doc, .jpg, .mov). The extension tells the computer, for example, that a document contains text or that a file should be processed as a video. The choice of file format also matters because it can affect whether the file contents remain accessible after long-term storage.
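To illustrate how software can use the extension to decide what kind of content a file holds, here is a minimal sketch using Python's standard mimetypes module. The function name describe and the sample file names are hypothetical, chosen only for illustration. The lookup relies on the file name alone and never inspects the file contents.

```python
import mimetypes


def describe(filename: str) -> str:
    """Return the media type guessed from the file name extension alone."""
    # guess_type returns a (type, encoding) pair; type is None if unrecognized.
    media_type, _encoding = mimetypes.guess_type(filename)
    return media_type or "unknown"


if __name__ == "__main__":
    # Hypothetical file names used only to demonstrate the lookup.
    for name in ["notes.txt", "photo.jpg", "clip.mov"]:
        print(name, "->", describe(name))
```

Because the guess depends only on the name, renaming a file changes the answer without changing the data, which is one reason the extension should match the actual format.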

File formats are an essential consideration in data storage. Software and data storage technologies change quickly, and file formats can easily become obsolete or difficult to access. In general, it is recommended that data files be copied to new media every 2-5 years, especially if technology changes or if files begin to degrade.
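When files are copied to new media, one common way to confirm that a copy is intact is to compare checksums of the original and the copy. The sketch below assumes SHA-256 as the checksum; this page does not prescribe a particular method, and the function names sha256_of and copy_is_intact are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_is_intact(original: Path, copy: Path) -> bool:
    """Return True if both files have identical SHA-256 digests."""
    return sha256_of(original) == sha256_of(copy)
```

Recording the digest alongside the file at deposit time also makes it possible to detect degradation later, even when the original medium is no longer available for comparison.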

Open File Formats

Open file formats can be used by anyone. Choose open file formats to:

  • increase your ability to open and read your files in the future
  • make your data usable and accessible to more researchers immediately

Because the file specifications are publicly available, the open source software community can ensure that data stored in these file formats remain accessible over the long term.
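As a small illustration of saving tabular data in an open format, the following sketch writes rows to a UTF-8 encoded CSV file with Python's standard csv module. The function name write_csv and the sample column names are hypothetical. It is one possible approach, not a required workflow.

```python
import csv
from pathlib import Path


def write_csv(path, header, rows):
    """Write a header and data rows to a UTF-8 encoded CSV file."""
    # newline="" lets the csv module control line endings itself.
    with Path(path).open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)


if __name__ == "__main__":
    # Hypothetical sample data used only for demonstration.
    write_csv(
        "samples.csv",
        ["sample_id", "site", "temperature_c"],
        [[1, "Site A", 12.5], [2, "Site B", 9.8]],
    )
```

The resulting file is plain text, so it can be opened in any text editor or spreadsheet program without the software that produced it.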

Proprietary File Formats

Proprietary file formats work only with software provided by the vendor. Because the file specifications are not freely available, files in that format are typically unreadable once the software is no longer supported. Some research disciplines and industries treat a specific proprietary file format as a de facto standard, which you may wish to follow.

Recommended File Formats

  • Databases: XML, CSV
  • E-Books: EPUB
  • Images: JPG, PNG, PDF, TIFF, BMP
  • Sound: MP3, FLAC
  • Text: TXT, CSV, PDF/A, ASCII, UTF-8
  • Video: MPG, MOV, AVI
  • Spreadsheets: CSV
  • Medical Images: DICOM
Download our File Format Recommendations as a PDF
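The recommendations above can be turned into a simple check on file names. The sketch below is only an illustration: the mapping from each listed format to a file extension is an assumption (for example, .dcm for DICOM), and the function name categories_for is hypothetical. ASCII and UTF-8 are left out because they are character encodings and have no extension of their own, and a .pdf extension cannot confirm that a file is actually PDF/A.

```python
from pathlib import Path

# Extensions assumed for the formats recommended in the list above.
RECOMMENDED = {
    "Databases": {".xml", ".csv"},
    "E-Books": {".epub"},
    "Images": {".jpg", ".png", ".pdf", ".tiff", ".bmp"},
    "Sound": {".mp3", ".flac"},
    "Text": {".txt", ".csv", ".pdf"},
    "Video": {".mpg", ".mov", ".avi"},
    "Spreadsheets": {".csv"},
    "Medical Images": {".dcm"},
}


def categories_for(filename: str) -> list[str]:
    """Return the categories whose recommended extensions match the file."""
    extension = Path(filename).suffix.lower()
    return [name for name, extensions in RECOMMENDED.items()
            if extension in extensions]


if __name__ == "__main__":
    # Hypothetical file names used only for demonstration.
    for name in ["results.csv", "scan.dcm", "thesis.docx"]:
        matches = categories_for(name)
        print(name, "->", matches if matches else "not in the recommended list")
```

An empty result does not mean a file is unusable, only that its extension does not appear in the list above.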

 

Need help? Contact research.data@ubc.ca