extract_metadata - read column metadata from SPSS and Stata binary files, and format it as JSON

extract_metadata input-file output-file

extract_metadata reads column metadata from existing binary data files, so that readstat can produce new, column-compatible binary files from CSV input files. Both programs use JSON as a metadata interchange format.

The input-file should be a file with one of the following extensions:

.dta   Stata binary file, version 104 or newer
.sav   SPSS uncompressed binary file
.zsav  SPSS compressed binary file

In all cases, output-file should end in .json.
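
The naming rules above can be checked before extract_metadata is invoked. The following is a minimal sketch, not part of the tool itself: check_extract_args is a hypothetical helper, and it assumes the conventional extensions .dta, .sav, and .zsav for the three supported formats.

```shell
# Hypothetical helper, not part of extract_metadata.
# Returns 0 if the argument pair follows the naming rules, 1 otherwise.
check_extract_args() {
    input=$1
    output=$2

    # Assumption: the supported formats use the extensions .dta, .sav, .zsav.
    case "$input" in
        *.dta|*.sav|*.zsav) ;;
        *) echo "unsupported input extension: $input" >&2; return 1 ;;
    esac

    # The output file must end in .json.
    case "$output" in
        *.json) ;;
        *) echo "output file must end in .json: $output" >&2; return 1 ;;
    esac

    return 0
}
```

For example, `check_extract_args last-year.dta survey-metadata.json && extract_metadata last-year.dta survey-metadata.json` runs the extraction only when both names pass the check.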

Suppose you have a Stata file with last year's survey data, and want to produce a compatible Stata file containing this year's survey data. First, extract the metadata:

extract_metadata last-year.dta survey-metadata.json

Now apply it to this year's data, which is stored in a CSV file:

readstat this-year.csv survey-metadata.json this-year.dta

The first line of the CSV file should contain column names that match the column names in last-year.dta. If everything goes well, your new binary data set is stored in this-year.dta.
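
Because the CSV header must match the existing column names, it can help to list the header before running readstat. The sketch below uses a hypothetical helper, csv_header_names, built only from standard utilities; it assumes a plain comma-separated header line with no commas inside quoted names.

```shell
# Hypothetical helper: print the column names from the first line of a
# CSV file, one per line.
# Assumption: the header is comma-separated and no name contains a comma.
csv_header_names() {
    # Take the header line, drop carriage returns and double quotes,
    # then put each name on its own line.
    head -n 1 "$1" | tr -d '\r"' | tr ',' '\n'
}
```

Running `csv_header_names this-year.csv` prints one name per line, which can then be compared against the columns expected from last-year.dta.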


Copyright (C) 2012-2019 Evan Miller, and others where indicated.

01 February 2019