dataspice R package

Create lightweight schema.org descriptions of your datasets.

Dataspice makes it easier for researchers to create basic, lightweight, and concise metadata files for their datasets by editing the kind of files they’re probably most familiar with: CSVs.

These CSV files can be edited directly, with the associated helper functions, or through dataspice Shiny apps that guide users through metadata collection.

These metadata CSV files can then be used to:

  • Make useful information available during analysis.
  • Create a helpful README webpage for the dataset.
  • Produce more complex metadata formats for richer description of datasets and to aid dataset discovery.

Metadata fields are based on Schema.org/Dataset and other metadata standards. They represent a lowest common denominator, which means converting between formats should be relatively straightforward.
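To illustrate why flat metadata tables convert easily to a richer format, here is a minimal sketch that turns two small CSV tables into a Schema.org Dataset description in JSON-LD. It is not dataspice's own code (the package does this in R), and the column names, sample values, and the `to_jsonld` helper are assumptions made up for this example.

```python
import csv
import io
import json

# Hypothetical contents of two flat metadata tables (illustrative values only).
BIBLIO_CSV = """title,description,license
Example survey,Counts from an example survey,CC-BY-4.0
"""

CREATORS_CSV = """name,affiliation,email
Ada Example,Example University,ada@example.org
Bo Sample,Sample Institute,bo@example.org
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def to_jsonld(biblio_text, creators_text):
    """Combine the flat tables into one Schema.org Dataset description."""
    biblio = read_rows(biblio_text)[0]  # a single row describes the whole dataset
    creators = [
        {
            "@type": "Person",
            "name": row["name"],
            "affiliation": {"@type": "Organization", "name": row["affiliation"]},
            "email": row["email"],
        }
        for row in read_rows(creators_text)
    ]
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": biblio["title"],
        "description": biblio["description"],
        "license": biblio["license"],
        "creator": creators,
    }


dataset = to_jsonld(BIBLIO_CSV, CREATORS_CSV)
print(json.dumps(dataset, indent=2))
```

Because each CSV column maps onto a single Schema.org property, the conversion is little more than renaming fields and nesting the creators, which is the kind of straightforward translation a lowest-common-denominator set of fields allows.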

Typical Dataspice workflow

This project was a team effort developed as part of rOpenSci unconf 2018 and is now available on CRAN. I’ve also developed a number of training materials that use it to teach data producers how to create minimal accompanying metadata in R.