Research Guides

Data Management

Sustainable formats

Formats more likely to be accessible in the future are:

  • Non-proprietary
  • Open, documented standard
  • In common use by the research community
  • Standard representation (ASCII, Unicode)
  • Unencrypted
  • Uncompressed

Consider migrating your data into a format with the above characteristics, in addition to keeping a copy in the original software format.
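As a rough illustration of the "standard representation" criterion, the sketch below checks whether a block of bytes decodes as ASCII or UTF-8. The function name classify_encoding and its three return labels are assumptions made for this example, not part of any library or of the guidance above. A file that decodes cleanly is not necessarily a well-structured text file, so treat the result as a first check only.

```python
def classify_encoding(data: bytes) -> str:
    """Return 'ascii', 'utf-8', or 'other' for a block of bytes.

    Hypothetical helper: the labels are illustrative, not a standard.
    """
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Neither ASCII nor valid UTF-8: likely binary or a legacy encoding.
        return "other"
```

To check a file on disk, read it in binary mode and pass the bytes in, for example classify_encoding(open(path, "rb").read()).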

Examples of preferred format choices:

  • PDF/A, not Word
  • ASCII, not Excel
  • MPEG-4, not QuickTime
  • TIFF or JPEG2000, not GIF or JPG
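The preferences listed above could be encoded as a small lookup to flag files worth migrating. This is only a sketch: the mapping from file extensions to formats (for example, treating .mov as QuickTime and .xlsx as Excel) is an assumption made for illustration, and suggest_format is a hypothetical name.

```python
from pathlib import PurePath

# Assumed extension-to-preferred-format mapping, based on the list above.
PREFERRED = {
    ".doc": "PDF/A",
    ".docx": "PDF/A",
    ".xls": "ASCII",
    ".xlsx": "ASCII",
    ".mov": "MPEG-4",
    ".gif": "TIFF or JPEG2000",
    ".jpg": "TIFF or JPEG2000",
    ".jpeg": "TIFF or JPEG2000",
}


def suggest_format(filename: str):
    """Return the preferred format for a file, or None if no suggestion applies."""
    suffix = PurePath(filename).suffix.lower()
    return PREFERRED.get(suffix)
```

A result of None means only that the extension is not in this small table, not that the format is sustainable.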

See the Library of Congress recommended file formats.

File Organization

Simmons College's guide.

NYU's guide.

MIT's guide.


Naming conventions

Directory structure naming 

When organizing files, the top-level directory should include the project title, a unique identifier, and the date (year). The substructure should follow a clear, documented naming convention; for example, one folder for each run of an experiment, each version of a dataset, and/or each person in the group.
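One possible way to generate such names consistently is sketched below. The separator choices (hyphens within a part, underscores between parts), the run-001 style subfolder names, and the function names are all assumptions for illustration; any documented convention works equally well. The code only computes paths and does not create anything on disk.

```python
import re
from pathlib import PurePosixPath


def slugify(text: str) -> str:
    """Lowercase the text and replace runs of non-alphanumerics with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def top_level_name(title: str, identifier: str, year: int) -> str:
    """Combine project title, unique identifier, and year into one folder name."""
    return f"{slugify(title)}_{slugify(identifier)}_{year}"


def project_layout(root: str, title: str, identifier: str, year: int, runs: int):
    """Return one subfolder path per experiment run under the top-level folder."""
    top = PurePosixPath(root) / top_level_name(title, identifier, year)
    return [top / f"run-{n:03d}" for n in range(1, runs + 1)]
```

For example, top_level_name("Soil Moisture Survey", "SMS-01", 2014) gives soil-moisture-survey_sms-01_2014. Zero-padded run numbers keep folders in order when sorted by name.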

File naming

  • Reserve the three-letter file extension for application-specific codes, for example, formats like .wrl, .mov, and .tif.
  • Identify the activity or project in the file name.
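The two rules above can be sketched as a small helper that builds a file name from a project, an activity, and an extension. The function names, the underscore separator, and the strict three-character check are assumptions for this example; the check simply mirrors the three-letter guideline above and would reject longer extensions. Dots inside the descriptive parts are replaced so that the only dot left is the one before the extension.

```python
import re


def clean(part: str) -> str:
    """Lowercase a name part and replace runs of non-alphanumerics with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", part.lower()).strip("-")


def build_file_name(project: str, activity: str, extension: str) -> str:
    """Build '<project>_<activity>.<ext>' with a three-character extension."""
    ext = extension.lower().lstrip(".")
    if not re.fullmatch(r"[a-z0-9]{3}", ext):
        raise ValueError(f"expected a three-character extension, got {extension!r}")
    # Keep only the non-empty descriptive parts.
    stem = "_".join(p for p in (clean(project), clean(activity)) if p)
    if not stem:
        raise ValueError("file name needs a project or an activity")
    return f"{stem}.{ext}"
```

For example, build_file_name("Soil Survey", "Calibration v2.1", ".TIF") gives soil-survey_calibration-v2-1.tif.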

Use free tools to help you:

File naming conventions for specific disciplines

DOE's Atmospheric Radiation Measurement (ARM) program