https://sugatagh.github.io/dsml/projects/site-energy-usage-intensity-prediction/
-
Energy usage intensity (EUI) refers to the amount of energy used per square foot annually. It is calculated by dividing the total energy consumed by the building in a year by the total gross floor area. Like miles per gallon for cars, EUI is the prime indicator of the energy performance of a building.
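
As a minimal sketch of the definition above, the calculation can be written as a small function. The function name, the argument names, and the kBtu unit are assumptions made for illustration; the definition itself only requires annual energy divided by gross floor area.

```python
def site_eui(annual_energy_kbtu: float, gross_floor_area_sqft: float) -> float:
    """Return EUI: total annual energy divided by total gross floor area.

    The kBtu and square-foot units are assumed for this illustration.
    """
    if gross_floor_area_sqft <= 0:
        raise ValueError("gross floor area must be positive")
    return annual_energy_kbtu / gross_floor_area_sqft


# Hypothetical building: 5,000,000 kBtu per year over 50,000 square feet.
example_eui = site_eui(annual_energy_kbtu=5_000_000, gross_floor_area_sqft=50_000)
print(example_eui)  # 100.0
```

A lower value means less energy is used per unit of floor area, which is why the figure works as a comparison across buildings of different sizes.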
-
The EUI of a site or a building may depend on:
- building characteristics: floor area, facility type, etc.
- weather data for the location of the building: annual average temperature, annual total precipitation, etc.
-
In this project, we aim to predict the continuous variable site EUI, given the characteristics of the building and the weather data for the location of the building.
-
We carry out a detailed exploratory data analysis (EDA) of the dataset.
-
The observations obtained from EDA are used in the data preprocessing stages (consisting of missing data imputation and categorical data encoding) and feature engineering stages (consisting of feature extraction, data transformation, and binarization).
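
The preprocessing and feature engineering stages named above could look roughly like the following pandas sketch. The column names, the toy values, and the specific choices (median and mode imputation, one-hot encoding, a `log1p` transform, a zero threshold for binarization) are all assumptions for illustration, not the project's actual pipeline. Feature extraction is omitted because it depends on the real columns.

```python
import numpy as np
import pandas as pd

# Toy data with hypothetical column names; this is not the project's dataset.
df = pd.DataFrame({
    "facility_type": ["Office", "Warehouse", None, "Office"],
    "floor_area": [50_000.0, 120_000.0, 30_000.0, np.nan],
    "days_with_fog": [0.0, 12.0, np.nan, 3.0],
})

# Missing data imputation: median for numeric columns, mode for categorical.
df["floor_area"] = df["floor_area"].fillna(df["floor_area"].median())
df["days_with_fog"] = df["days_with_fog"].fillna(df["days_with_fog"].median())
df["facility_type"] = df["facility_type"].fillna(df["facility_type"].mode()[0])

# Data transformation: compress a right-skewed feature with log(1 + x).
df["log_floor_area"] = np.log1p(df["floor_area"])

# Binarization: turn a count into a 0/1 indicator (threshold is assumed).
df["has_fog"] = (df["days_with_fog"] > 0).astype(int)

# Categorical data encoding: one-hot encode the facility type.
features = pd.get_dummies(df, columns=["facility_type"], dtype=int)
print(features.columns.tolist())
```

In practice the imputation statistics would be computed on the training split only and then reused on validation and test data to avoid leakage.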
-
We employ the random forest, XGBoost, and CatBoost regressors to predict site EUI.
-
We apply hyperparameter tuning to the random forest algorithm, which appears to perform best among the baseline candidates.
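
One common way to tune a random forest is a randomized search with cross-validation, sketched below with scikit-learn. The synthetic data, the search space, the number of iterations, and the choice of `RandomizedSearchCV` are assumptions for illustration; the text above does not state which tuning method or parameter ranges were used.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

# Synthetic regression data standing in for the real features and site EUI.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Hypothetical search space; real ranges would come from experimentation.
param_distributions = {
    "n_estimators": [50, 100],
    "max_depth": [None, 5, 10],
    "min_samples_leaf": [1, 2, 4],
}

search = RandomizedSearchCV(
    estimator=RandomForestRegressor(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,                               # number of sampled configurations
    scoring="neg_root_mean_squared_error",  # higher is better, so RMSE is negated
    cv=3,
    random_state=0,
)
search.fit(X, y)

best_params = search.best_params_
best_rmse = -search.best_score_  # flip the sign back to a positive RMSE
print(best_params, best_rmse)
```

Randomized search is used here only because it keeps the example short; a grid search or a Bayesian optimizer would fit the same slot.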
-
The final model obtains a root mean square error (RMSE) score of $32.742930$, a mean absolute error (MAE) score of $17.944242$, and a coefficient of determination $(R^2)$ of $0.703486$.
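
For reference, the three reported metrics can be computed from true and predicted values as in the NumPy sketch below. The function name and the toy numbers are made up for illustration and are unrelated to the scores above.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, and R^2 for two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred

    rmse = float(np.sqrt(np.mean(residuals ** 2)))
    mae = float(np.mean(np.abs(residuals)))

    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot

    return {"rmse": rmse, "mae": mae, "r2": r2}


# Toy values, chosen only to make the arithmetic easy to follow.
metrics = regression_metrics(
    y_true=[100.0, 80.0, 60.0, 40.0],
    y_pred=[90.0, 85.0, 65.0, 40.0],
)
print(metrics)
```

RMSE penalizes large errors more heavily than MAE, so an RMSE well above the MAE, as in the reported scores, suggests that a minority of buildings are predicted with comparatively large errors.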