Data Management

This guide outlines the hows and whys of managing research data at the CUNY Grad Center.

Sustainable formats

Formats that are more likely to remain accessible in the future are:

  • Non-proprietary
  • Based on an open, documented standard
  • In common use by the research community
  • Encoded in a standard representation (ASCII, Unicode)
  • Unencrypted
  • Uncompressed
  • Software-agnostic, meaning they can be used with many different software packages, not just one

If you have to use proprietary formats, export or convert them to open file formats at the end of your project.

Consider migrating your data into a format with the above characteristics, in addition to keeping a copy in the original software format.

Examples of preferred format choices:

  • PDF/A, not Word
  • ASCII text, not Excel
  • MPEG-4, not QuickTime
  • TIFF or JPEG2000, not GIF or JPG
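
One way to act on these preferences is to scan a project folder and list the files that are still in a less sustainable format. The following is a minimal sketch in Python. The extension table is an illustrative assumption built from the examples above (it is not an official registry), and the function name flag_unsustainable is hypothetical.

```python
from pathlib import Path

# Less sustainable extension -> preferred alternative, based on the examples
# in this guide. The choice of extensions is an assumption for illustration.
PREFERRED_ALTERNATIVE = {
    ".doc": "PDF/A",
    ".docx": "PDF/A",
    ".xls": "ASCII text",
    ".xlsx": "ASCII text",
    ".mov": "MPEG-4",
    ".gif": "TIFF or JPEG2000",
    ".jpg": "TIFF or JPEG2000",
    ".jpeg": "TIFF or JPEG2000",
}


def flag_unsustainable(root):
    """Return (path, suggestion) pairs for files whose extension is in the table."""
    flagged = []
    for path in sorted(Path(root).rglob("*")):
        # Compare extensions case-insensitively so that .MOV matches .mov.
        suggestion = PREFERRED_ALTERNATIVE.get(path.suffix.lower())
        if suggestion and path.is_file():
            flagged.append((path, suggestion))
    return flagged


if __name__ == "__main__":
    for path, suggestion in flag_unsustainable("."):
        print(f"{path}: consider exporting to {suggestion}")
```

The script only reports candidates. The conversion itself is usually best done by exporting from the original software, and the original file should be kept alongside the converted copy.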

See the Library of Congress's recommended file formats.

File Organization

Tips:

  • Do not nest folders more than 2 levels deep
  • Give each folder a README with contextual information
  • Choose an organization strategy that works for you (thematic grouping, by file type, or by analysis)
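
The first two tips can be checked automatically. The sketch below walks a project folder and reports folders that are nested too deeply or that lack a README. It assumes that depth is counted from the project's top-level folder and that any file whose name starts with README counts. The function names are hypothetical.

```python
from pathlib import Path

MAX_DEPTH = 2  # "no more than 2 folders deep", counted below the project root (assumption)


def has_readme(folder):
    """True if the folder holds a file named README, README.txt, README.md, etc."""
    return any(
        item.is_file() and item.name.upper().startswith("README")
        for item in folder.iterdir()
    )


def audit_project(root, max_depth=MAX_DEPTH):
    """Return a list of human-readable problems found under root."""
    root = Path(root)
    subfolders = sorted(p for p in root.rglob("*") if p.is_dir())
    problems = []
    for folder in [root] + subfolders:
        relative = folder.relative_to(root)
        depth = len(relative.parts)  # 0 for the project root itself
        if depth > max_depth:
            problems.append(f"too deep ({depth} levels): {relative.as_posix()}")
        if not has_readme(folder):
            problems.append(f"missing README: {relative.as_posix()}")
    return problems


if __name__ == "__main__":
    for problem in audit_project("."):
        print(problem)
```

An empty list means the folder tree satisfies both tips under the assumptions stated above.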

See also the guides from Simmons College, NYU, and MIT.

Example:

-Project
--src (source files)
--data (raw data; should be read-only)
--results (processed data)
--docs (text documents and codebooks)
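
The example layout can be created with a short script. This is a sketch only: the README wording, the file name README.txt, and the function names are assumptions, and make_read_only simply clears the write permission bits, which is one of several ways to protect raw data.

```python
import stat
from pathlib import Path

# Folder name -> purpose, copied from the example layout in this guide.
LAYOUT = {
    "src": "source files",
    "data": "raw data (should be read only)",
    "results": "processed data",
    "docs": "text docs and codebooks",
}


def create_project(parent, name="Project"):
    """Create the example folder layout with a README in each folder."""
    project = Path(parent) / name
    for folder, purpose in LAYOUT.items():
        path = project / folder
        path.mkdir(parents=True, exist_ok=True)
        readme = path / "README.txt"
        if not readme.exists():  # never overwrite an existing README
            readme.write_text(f"{folder}: {purpose}\n", encoding="utf-8")
    return project


def make_read_only(folder):
    """Clear the write bits on every file under folder."""
    write_bits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    for path in Path(folder).rglob("*"):
        if path.is_file():
            path.chmod(path.stat().st_mode & ~write_bits)
```

A typical use would be to call create_project once at the start of a project, copy the raw data into the data folder, and then call make_read_only on that folder so later analysis steps cannot modify it by accident.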

 

 

Naming conventions

Directory structure naming

When organizing files, the top-level folder name should include the project title, a unique identifier, and the date (year). The substructure should follow a clear, documented naming convention, for example one folder for each run of an experiment, each version of a dataset, and/or each person in the group.
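
A naming convention is easier to keep consistent when folder names are generated rather than typed by hand. The sketch below shows one possible convention. The order of the parts, the underscore separator, the lowercase title, and the run_001 pattern are all assumptions chosen for illustration, and the project title and identifier in the comments are made up.

```python
import re


def slug(text):
    """Lowercase text and replace runs of non-alphanumeric characters with '_'."""
    return re.sub(r"[^A-Za-z0-9]+", "_", text).strip("_").lower()


def top_level_folder(title, identifier, year):
    """Build a top-level folder name from project title, identifier, and year."""
    year = int(year)
    if not 1000 <= year <= 9999:
        raise ValueError(f"expected a four-digit year, got {year}")
    return f"{slug(title)}_{identifier}_{year}"


def run_folder(index):
    """Build a zero-padded subfolder name for one run of an experiment."""
    return f"run_{index:03d}"


# top_level_folder("Harbor Sediment Survey", "P0042", 2024)
#   returns "harbor_sediment_survey_P0042_2024"
# run_folder(7) returns "run_007"
```

Zero-padding the run number keeps the folders in the right order when they are sorted alphabetically.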

File naming

  • Reserve the 3-letter file extension for application-specific codes that identify the format, such as .wrl, .mov, and .tif
  • Choose short file names that are recognizable to both humans and machines
  • Prefix your files with a date in YYYY-MM-DD format
  • Avoid special characters (such as * ? / \ : < > | &)
  • Avoid spaces in file names; use an underscore (_) instead
  • Identify the activity or project in the file name
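
Several of these rules can be checked by a script before files are shared or archived. The following sketch is one possible checker, not a standard tool. The 40-character limit is an assumed reading of "short", and the set of allowed characters (letters, digits, underscore, hyphen, and period) is an assumed reading of "avoid special characters".

```python
import re
from datetime import date

MAX_LENGTH = 40  # assumed threshold for a "short" file name


def check_filename(name, max_length=MAX_LENGTH):
    """Return a list of rule violations for a single file name."""
    problems = []
    if " " in name:
        problems.append("contains a space; use an underscore instead")
    # Anything other than letters, digits, '.', '_', '-' (and the space
    # reported above) is treated as a special character.
    if re.search(r"[^A-Za-z0-9._\- ]", name):
        problems.append("contains special characters")
    match = re.match(r"\d{4}-\d{2}-\d{2}", name)
    if not match:
        problems.append("does not start with a YYYY-MM-DD date")
    else:
        try:
            date.fromisoformat(match.group(0))
        except ValueError:
            problems.append("date prefix is not a real calendar date")
    if len(name) > max_length:
        problems.append(f"longer than {max_length} characters")
    return problems


# check_filename("2024-03-15_soil_ph_run1.csv") returns []
```

The checker cannot tell whether a name identifies the activity or project, so that rule still needs a human reviewer.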

Use free tools to help you:

http://www.bulkrenameutility.co.uk

http://renamer4mac.com

http://www.powersurgepub.com/products/psrenamer.html
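
For simple cases, a bulk rename can also be scripted. The sketch below is not related to the tools listed above. It builds a rename plan that replaces spaces with underscores, drops other special characters, and adds a date prefix where one is missing, and it renames nothing until the plan is applied. The function names and the choice to pass the date prefix in by hand are assumptions.

```python
import re
from pathlib import Path


def clean_name(name):
    """Replace spaces with underscores and drop other special characters."""
    path = Path(name)
    stem = path.stem.replace(" ", "_")
    stem = re.sub(r"[^A-Za-z0-9_-]", "", stem)
    stem = re.sub(r"_+", "_", stem).strip("_")
    return stem + path.suffix.lower()


def plan_renames(folder, date_prefix):
    """Return (old_path, new_path) pairs without touching any file."""
    plan = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file() or path.name.startswith("."):
            continue  # skip subfolders and hidden files
        new_name = clean_name(path.name)
        if not re.match(r"\d{4}-\d{2}-\d{2}_", new_name):
            new_name = f"{date_prefix}_{new_name}"
        if new_name != path.name:
            plan.append((path, path.with_name(new_name)))
    return plan


def apply_renames(plan):
    """Carry out a plan, refusing to overwrite an existing file."""
    for old, new in plan:
        if new.exists():
            raise FileExistsError(f"{new} already exists")
        old.rename(new)
```

Printing the plan and reviewing it before calling apply_renames is a cheap safeguard, since a bad rename across many files is tedious to undo.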

File naming conventions for specific disciplines

DOE's Atmospheric Radiation Measurement (ARM) program