Project:

Final Project for de-zoomcamp 2022 (1st cohort)

Project Summary:

This project will aggregate historical Divvy Bike Data from the City of Chicago.

Technologies to be Used:

GCP VM Instance (Processing)
Terraform (Infrastructure as Code)
Airflow (Data Pipeline - ETL)
GCP Storage Bucket (Data Lake)
BigQuery (Data Warehouse)
dbt (Creating Analytical Views)

Problem Description:

While this data is freely available from the City of Chicago, it is divided by month and is in CSV format.

Combining the monthly files may reveal trends that would be missed when looking at a smaller subset of the data.
This project builds a resilient data pipeline for importing and aggregating the data, so that anyone who wishes to perform the same task can avoid repetitive data cleaning and importing.
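
As a rough illustration of the combining step described above, the sketch below concatenates several monthly CSV sources that share one header. It is not the project's pipeline code (the project uses Airflow for this): the function name `combine_monthly_csvs` and the two in-memory sample "files" are hypothetical, and only the standard library is used.

```python
import csv
import io


def combine_monthly_csvs(sources):
    """Concatenate rows from several CSV sources that share one header."""
    combined = []
    header = None
    for source in sources:
        reader = csv.DictReader(source)
        if header is None:
            # Remember the columns of the first file.
            header = reader.fieldnames
        elif reader.fieldnames != header:
            # Refuse to mix files whose columns differ.
            raise ValueError("monthly files do not share the same columns")
        combined.extend(reader)
    return header, combined


# Hypothetical samples standing in for two monthly files.
january = io.StringIO("ride_id,member_casual\nA1,member\nA2,casual\n")
february = io.StringIO("ride_id,member_casual\nB1,member\n")

header, rows = combine_monthly_csvs([january, february])
print(len(rows), "rows with columns", header)
```

Checking the header before appending is one simple way to catch a month whose columns differ from the others before it is loaded.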

Data:

The data to be used for this project can be found here - Divvy Bike Data

The data contains the following fields:

ride_id - Unique ID Assigned to Each Divvy Trip
rideable_type - Type of Vehicle Used
started_at - Start of Trip Date and Time
ended_at - End of Trip Date and Time
start_station_name - Name Assigned to Station the Trip Started at
start_station_id - Unique Identification Number of Station the Trip Started at
end_station_name - Name Assigned to Station the Trip Ended at
end_station_id - Unique Identification Number of Station the Trip Ended at
start_lat - Latitude of the Start Station
start_lng - Longitude of the Start Station
end_lat - Latitude of the End Station
end_lng - Longitude of the End Station
member_casual - Field with Two Values Indicating Whether the Rider has a Divvy Membership or Paid with Credit Card
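
To make the field list above concrete, here is a minimal sketch of how one row could be parsed into a typed record and used to derive a trip duration. The `Trip` class, the `parse_trip` helper, the sample values, and the timestamp format `YYYY-MM-DD HH:MM:SS` are all assumptions made for illustration; the station and coordinate fields are left out to keep the example short.

```python
from dataclasses import dataclass
from datetime import datetime

# Assumed timestamp layout; adjust if the source files differ.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


@dataclass
class Trip:
    ride_id: str
    rideable_type: str
    started_at: datetime
    ended_at: datetime
    member_casual: str

    @property
    def duration_minutes(self) -> float:
        """Trip length in minutes, derived from the two timestamps."""
        return (self.ended_at - self.started_at).total_seconds() / 60


def parse_trip(row: dict) -> Trip:
    """Build a Trip from a dict keyed by the column names listed above."""
    return Trip(
        ride_id=row["ride_id"],
        rideable_type=row["rideable_type"],
        started_at=datetime.strptime(row["started_at"], TIMESTAMP_FORMAT),
        ended_at=datetime.strptime(row["ended_at"], TIMESTAMP_FORMAT),
        member_casual=row["member_casual"],
    )


# Hypothetical row, not taken from the real dataset.
sample_row = {
    "ride_id": "EXAMPLE0001",
    "rideable_type": "classic_bike",
    "started_at": "2021-01-23 16:14:19",
    "ended_at": "2021-01-23 16:24:49",
    "member_casual": "member",
}

trip = parse_trip(sample_row)
print(trip.duration_minutes)  # 10.5
```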

Data Pipeline Diagram:

Data Visualizations

Data visualizations for this project can be found here: https://datastudio.google.com/reporting/ea3f603a-f8f5-4d0c-9664-7608835b8ddb

Walkthrough

A video walkthrough of the finished project

Video: Walk Through of Final Project

Reproduce (Test it yourself)

Follow the instructions here


MichaelShoemaker/shoemaker-de-zoomcamp-final-project
