Scripts for populating the population factfinder database
This repo contains SQL
and bash scripts for creating, importing, and populating the NYC population factfinder postgresql database. The database is created using dokku-postgres, allowing for easy linking to the factfinder-api dokku app.
Create a clean postgres database called factfinder
using dokku-postgres
dokku postgres:create factfinder-$(date -d "today" +"%Y%m%d%H%M")
Load these into /data
before running the import scripts.
An easy way to get these files from your local machine to the server is to use scp
:
scp {localfile} {user}@{server}:{pathToRemoteFile}
decennial.zip
&acs.zip
- csv files that are output from the factfinder-acs-munge R script.decennial_dictionary.csv
- metadata for decennial queries.factfinder_metadata.csv
- metadata for acs queries.
create_tables.sql
- Creates new postgresql tables: decennial
, demographic
, social
, housing
, economic
, factfinder_metadata
, decennial_dictionary
This sql script can be executed using the dokku-postgres
connect
command, which runs the sql inside the running postgres container.
dokku postgres:connect factfinder < create_tables.sql
The remaining scripts make use of psql, which runs on the host machine but connects to the dokku postgresql instance. In order to connect, two environment variables must be set:
$PGPASSWORD
- the password that dokku-postgres
created for the database.
$PGIPADDRESS
- the local IP address of the postgresql container.
$PGDATABASE
- the database name (not to be confused with the dokku-postgres container name)
Find these by running dokku postgres:info factfinder
, and set them in the local environment.
import.sh
- imports the decennial and acs CSVs.
import_metadata.sh
- imports the two metadata tables;
create_indices.sh
- adds btree indices to the decennial table and each of the 4 acs tables.
Linking the dokku-postgres
container and the factfinder-api
app will add the database connection string to factfinder-api
as an environment variable: DATABASE_URL=postgres://postgres:{password}@{dockerhostname}:{port}/factfinder
.
This url is then used by the pg
node package to connect to the database.
dokku postgres:link factfinder factfinder-api