Analyses by Spielman and Van Riper for the December 2019 Workshop on 2020 Census Data Products: Data Needs and Privacy Considerations.
Our slides are here (on Dropbox). Note that this repository does not include code for the "off-spine" geographies described in the presentation.
In general this repository is a mess; it was rapidly constructed for the workshop. It contains the following code and data:
- Input: A national file containing all census tracts, their original 2010 Decennial Census estimates, and their corresponding estimates from the 2010 Differential Privacy Demonstration Data.
- Input: A census tract to Core Based Statistical Area (CBSA) crosswalk.
- Input: Files defining CBSA type.
- Output: A file comparing the differences between all adjacent pairs of census tracts in Washington, DC.
- Output: A GeoJSON for Washington, DC that shows the differences due to differential privacy (DP). The file also contains the maximum difference observed between a census tract and its neighbors.
- main.R: File to run most analyses and data prep scripts. Spits out some spatial data.
- tract_data_prep.R: Mushes together the geographic boundaries, tract data, and CBSA definitions.
- calc_local_diffs.R: Defines a series of functions for looking at tract-to-tract differences due to DP. Several utility functions are also defined to allow computing these functions in parallel at scale.
- moran_and_plot.R: Calculates Moran's I and runs tests of the spatial randomness of the errors. Spits out some plots.
- descriptive_plots.R: Some descriptive plots of the differences between the original and differentially private 2010 census tract data.
- segregation_under_dp.R: A markdown file with code to calculate segregation statistics.
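To give a sense of what a segregation statistic compared across the two data sets might look like, here is a minimal sketch in base R. This README does not say which statistics segregation_under_dp.R computes, so the dissimilarity index is used purely as a common example. The function name, column names, and counts below are all hypothetical and are not taken from the repository.

```r
# Dissimilarity index: D = 0.5 * sum(|a_i / A - b_i / B|) over tracts i,
# where a_i and b_i are the tract counts for two groups and A and B are
# the group totals. This is an illustrative choice of statistic.
dissimilarity <- function(group_a, group_b) {
  stopifnot(length(group_a) == length(group_b))
  0.5 * sum(abs(group_a / sum(group_a) - group_b / sum(group_b)))
}

# Hypothetical tract counts: original (sf) and differentially private (dp).
tracts <- data.frame(
  group_a_sf = c(100, 50, 10),
  group_b_sf = c(10, 50, 100),
  group_a_dp = c(98, 53, 9),
  group_b_dp = c(12, 47, 101)
)

d_sf <- dissimilarity(tracts$group_a_sf, tracts$group_b_sf)
d_dp <- dissimilarity(tracts$group_a_dp, tracts$group_b_dp)

# Change in the index when DP data replaces the original counts.
d_change <- d_dp - d_sf
```

Computing the same statistic on both versions of the counts and differencing them is one simple way to see how much DP noise moves a segregation measure.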
The figures folder contains lots of output; titles should be self-explanatory. All figures are generated by the included code.
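The tract-to-tract comparison described above can be sketched roughly as follows in base R. This is not the code in calc_local_diffs.R. It assumes the quantity of interest is the gap in DP error (DP estimate minus original estimate) between neighboring tracts, and every table, column name, and value here is hypothetical.

```r
# Hypothetical tract table: original (sf) and DP population counts.
tracts <- data.frame(
  geoid  = c("A", "B", "C", "D"),
  pop_sf = c(4000, 3500, 5000, 2800),
  pop_dp = c(4012, 3490, 4985, 2811),
  stringsAsFactors = FALSE
)

# Hypothetical adjacency table: one row per pair of neighboring tracts.
pairs <- data.frame(
  from = c("A", "A", "B", "C"),
  to   = c("B", "C", "C", "D"),
  stringsAsFactors = FALSE
)

# Error introduced by DP in each tract, indexed by tract id.
err <- setNames(tracts$pop_dp - tracts$pop_sf, tracts$geoid)

# Absolute difference in DP error for each adjacent pair.
pairs$diff <- unname(abs(err[pairs$from] - err[pairs$to]))

# Each pair contributes to both of its tracts, so stack both directions
# before taking the per-tract maximum over neighbors.
long <- data.frame(
  geoid = c(pairs$from, pairs$to),
  diff  = c(pairs$diff, pairs$diff),
  stringsAsFactors = FALSE
)
max_diff <- tapply(long$diff, long$geoid, max)
```

In practice the adjacency table would come from the tract geometries rather than being typed by hand, and the per-pair step is the part one would split across workers to run at scale.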