Data Management

This guide outlines the hows and whys of managing research data at the CUNY Graduate Center.

Introduction

Good research data management practices will get you most of the way toward reproducible research, but differences between computing environments (operating systems, software versions, and so on) can still keep a project from being reproducible. Reproducibility means that independent people can use the same research materials and conditions to verify a claim, for example a reviewer re-running an author's code.
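To make the idea of environmental variability concrete, here is a minimal sketch that records the operating system, Python version, and installed package versions, and then compares two such records. It uses only the Python standard library. The function names `snapshot_environment` and `diff_environments` are hypothetical, chosen for this illustration, and the approach assumes a Python-based project; it is not a substitute for the tools listed in this guide.

```python
import json
import platform
from importlib import metadata


def snapshot_environment():
    """Record details that commonly differ from one machine to another."""
    packages = {}
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta is not None else None
        if name:  # skip distributions whose metadata is missing or broken
            packages[name] = dist.version
    return {
        "os": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        "python_version": platform.python_version(),
        "packages": dict(sorted(packages.items())),
    }


def diff_environments(a, b):
    """Return the entries that differ between two snapshots as (a, b) pairs."""
    differences = {}
    for key in ("os", "os_release", "machine", "python_version"):
        if a.get(key) != b.get(key):
            differences[key] = (a.get(key), b.get(key))
    pkgs_a = a.get("packages", {})
    pkgs_b = b.get("packages", {})
    for name in sorted(set(pkgs_a) | set(pkgs_b)):
        if pkgs_a.get(name) != pkgs_b.get(name):
            # A value of None means the package is absent on that side.
            differences["package:" + name] = (pkgs_a.get(name), pkgs_b.get(name))
    return differences


if __name__ == "__main__":
    # Print the snapshot as JSON so it can be saved alongside the project.
    print(json.dumps(snapshot_environment(), indent=2))
```

Saving the printed JSON next to your code gives a reviewer something to compare their own machine against: loading both snapshots and passing them to `diff_environments` lists the operating system, Python, or package differences that might explain a result that does not reproduce.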

The Practice of Reproducible Research

 

Tools

  • Containers - Lightweight, portable operating-system environments, well suited to short-term sharing. Example: Singularity
  • Packagers - Tools that bundle your work together with every dependency it needs. Example: ReproZip
  • Web-based IDEs - Development environments that run in the browser, usually backed by containers. Example: JupyterHub
  • Web-based replay - In-browser tools that let you re-run research hosted elsewhere. Example: Binder