What heating/cooling setpoints are appropriate for the reference building in a high-performance residential standard?
Extract the data from the BigQuery dataset (7.6 TB) and get it into a format that staff at Mathis Consulting can analyze. The team determined that it wanted the data by state/province.
There are two tables in the dataset, "dyd" and "meta_data", which exist in the Google Cloud Platform. The "dyd" table is quite large and has many fields that are not needed; the "meta_data" table is smaller. The "Identifier" field exists in both tables and is the key on which the data will be merged.
- In BigQuery, extract the wanted columns from both tables and clean the data by removing duplicates and NULLs.
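In the actual workflow this cleaning happens in BigQuery; a minimal pandas sketch of the same logic, where the sample rows and the temperature column name are assumptions for illustration:

```python
import pandas as pd

# Hypothetical slice of the "dyd" table (column name beyond "Identifier" is assumed).
dyd = pd.DataFrame({
    "Identifier": ["a1", "a1", "b2", "c3"],
    "Indoor_AverageTemperature": [21.0, 21.0, None, 22.5],
})

# Keep only the wanted columns, then drop duplicate rows and rows with NULLs.
wanted = ["Identifier", "Indoor_AverageTemperature"]
clean = (
    dyd[wanted]
    .drop_duplicates()
    .dropna()
    .reset_index(drop=True)
)
print(len(clean))  # 2 (the duplicate "a1" row and the NULL row are removed)
```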
- Merge the two datasets on "Identifier".
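The merge on "Identifier" can be sketched in pandas as follows; the table contents and the non-key column names are invented for illustration:

```python
import pandas as pd

dyd = pd.DataFrame({
    "Identifier": ["a1", "b2"],
    "Indoor_AverageTemperature": [21.0, 22.5],  # assumed field name
})
meta_data = pd.DataFrame({
    "Identifier": ["a1", "b2"],
    "ProvinceState": ["TX", "ON"],  # assumed field name
})

# Inner join: keeps only identifiers present in both tables.
merged = dyd.merge(meta_data, on="Identifier", how="inner")
print(merged.columns.tolist())
```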
- Select the wanted dates and times (by month, and by day vs. night hours).
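A pandas sketch of the day/night and month selection; the timestamps are invented, and the day window of 08:00-19:59 is an assumed definition, not one stated in the project:

```python
import pandas as pd

# Hypothetical timestamps standing in for the merged extract.
df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2019-07-01 02:00", "2019-07-01 14:00", "2019-12-15 23:30",
    ]),
})

# Assumed split: day = 08:00-19:59, night = all other hours.
hours = df["timestamp"].dt.hour
df["period"] = hours.between(8, 19).map({True: "day", False: "night"})
df["month"] = df["timestamp"].dt.month

print(df[["month", "period"]])
```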
- Establish final queries based on state, year, and day/night.
- Export the tables to CSV files; these tables are very large.
- Bring the large tables exported from BigQuery into a Jupyter Notebook and aggregate the data for temperature control, heating, and cooling based on each identifier.
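The per-identifier aggregation in the notebook can be sketched as a groupby-mean; the sample values and every column name except "Identifier" are assumptions:

```python
import pandas as pd

# Illustrative merged data (column names beyond "Identifier" are assumed).
df = pd.DataFrame({
    "Identifier": ["a1", "a1", "b2"],
    "Indoor_AverageTemperature": [21.0, 23.0, 20.0],
    "Heating_Setpoint": [20.0, 20.0, 19.0],
    "Cooling_Setpoint": [24.0, 26.0, 25.0],
})

# Mean temperature-control, heating, and cooling values per identifier.
agg = df.groupby("Identifier").mean().reset_index()
print(agg)
```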
- Export CSV files for each state, month, and day/night hours.
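Writing one CSV per state/month/period combination can be done with a grouped loop; the data, column names, output directory, and file-naming scheme below are all illustrative assumptions:

```python
from pathlib import Path

import pandas as pd

df = pd.DataFrame({
    "ProvinceState": ["TX", "TX", "ON"],  # assumed column name
    "month": [7, 7, 7],
    "period": ["day", "night", "day"],
    "mean_temp": [24.1, 22.3, 21.8],
})

out_dir = Path("exports")
out_dir.mkdir(exist_ok=True)

# One CSV per state/month/day-night combination.
for (state, month, period), group in df.groupby(["ProvinceState", "month", "period"]):
    group.to_csv(out_dir / f"{state}_{month:02d}_{period}.csv", index=False)

print(sorted(p.name for p in out_dir.glob("*.csv")))
```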
- Add a checker to the notebook to ensure the data is correct per state.
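A minimal sketch of such a per-state checker, assuming an illustrative "ProvinceState" column name and invented sample data:

```python
import pandas as pd

# Illustrative per-state export.
df = pd.DataFrame({
    "ProvinceState": ["TX", "TX"],
    "Indoor_AverageTemperature": [24.1, 22.3],
})

def check_state_export(frame: pd.DataFrame, expected_state: str) -> None:
    """Raise if the export is empty, mixes states, or still contains NULLs."""
    assert not frame.empty, "export is empty"
    assert (frame["ProvinceState"] == expected_state).all(), "mixed states"
    assert not frame.isna().any().any(), "NULL values slipped through cleaning"

check_state_export(df, "TX")
print("TX export OK")
```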