
Open-source Uber data engineering project utilizing GCP, Python, Mage AI, and Looker Studio for comprehensive analytics on TLC Trip Record Data.


CtrlAltFit/Uber_data_engineering_project


Modern Data Engineering GCP Project ~ Uber

Welcome to the "Uber_data_engineering_project" repository! This open-source project performs data analytics on Uber data. Our goal is to provide a comprehensive analysis using Google Cloud Platform (GCP), Python, the Mage data pipeline tool, BigQuery, and Looker Studio.

Introduction

The project applies data engineering techniques to analyze Uber data. We use Google Storage, Python, a Compute Instance, the Mage data pipeline tool, BigQuery, and Looker Studio to build a data analytics pipeline.

Architecture

Our architecture incorporates various technologies to ensure a seamless flow of data processing. The key components include Google Storage, Compute Instance, BigQuery, Looker Studio, and the modern data pipeline tool provided by Mage AI.
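As a rough illustration of that flow, the sketch below chains three plain Python functions in a load, transform, export order. The function names, the sample rows, the zero-distance filter, and the in-memory list standing in for a BigQuery table are all assumptions made for illustration; this is not the project's actual pipeline code and it does not call the Mage or GCP APIs.

```python
from typing import Dict, List

Row = Dict[str, str]

# Hypothetical stand-in for a raw file in Google Storage (values are made up).
RAW_ROWS: List[Row] = [
    {"trip_distance": "2.5", "fare_amount": "9.0"},
    {"trip_distance": "0.0", "fare_amount": "2.5"},
    {"trip_distance": "7.1", "fare_amount": "22.0"},
]


def load() -> List[Row]:
    """Load step: a real pipeline would read the raw file from storage here."""
    return list(RAW_ROWS)


def transform(rows: List[Row]) -> List[Dict[str, float]]:
    """Transform step: cast text to numbers and drop zero-distance trips (an assumed rule)."""
    cleaned = []
    for row in rows:
        distance = float(row["trip_distance"])
        if distance > 0:
            cleaned.append(
                {"trip_distance": distance, "fare_amount": float(row["fare_amount"])}
            )
    return cleaned


def export(rows: List[Dict[str, float]], warehouse: List[Dict[str, float]]) -> int:
    """Export step: append rows to a list that stands in for a warehouse table."""
    warehouse.extend(rows)
    return len(rows)


warehouse_table: List[Dict[str, float]] = []
rows_written = export(transform(load()), warehouse_table)
print(rows_written)
```

Keeping each stage as a separate function means any one of them can be swapped for a real storage reader or warehouse writer without touching the others.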

Technology Used

  • Programming Language - Python

Google Cloud Platform

  1. Google Storage
  2. Compute Instance
  3. BigQuery
  4. Looker Studio

Modern Data Pipeline Tool - https://www.mage.ai/

Contribute to this open-source project - https://github.com/mage-ai/mage-ai

Dataset Used

TLC Trip Record Data

Yellow and green taxi trip records include fields capturing pick-up and drop-off dates/times, pick-up and drop-off locations, trip distances, itemized fares, rate types, payment types, and driver-reported passenger counts.

More information about the dataset can be found here:

  1. Website - https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page
  2. Data Dictionary - https://www.nyc.gov/assets/tlc/downloads/pdf/data_dictionary_trip_records_yellow.pdf
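To make the record layout concrete, here is a minimal sketch that parses a few trip records with the Python standard library and derives a trip duration. The column names follow the yellow taxi data dictionary; the sample rows are made up, and the CSV input and timestamp format are assumptions, so adjust them to match the file you actually download.

```python
import csv
import io
from datetime import datetime

# Made-up sample rows; only the column names come from the data dictionary.
SAMPLE_CSV = """tpep_pickup_datetime,tpep_dropoff_datetime,passenger_count,trip_distance,fare_amount
2016-03-01 00:00:00,2016-03-01 00:07:30,1,2.5,9.0
2016-03-01 00:10:00,2016-03-01 00:40:00,2,7.1,22.0
"""

# Assumed timestamp layout; check it against your copy of the data.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_trips(csv_text):
    """Parse CSV text into typed trip dictionaries with a derived duration."""
    trips = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        pickup = datetime.strptime(row["tpep_pickup_datetime"], TIMESTAMP_FORMAT)
        dropoff = datetime.strptime(row["tpep_dropoff_datetime"], TIMESTAMP_FORMAT)
        trips.append(
            {
                "pickup": pickup,
                "dropoff": dropoff,
                "passenger_count": int(row["passenger_count"]),
                "trip_distance": float(row["trip_distance"]),
                "fare_amount": float(row["fare_amount"]),
                # Duration is derived here, not a field in the source data.
                "duration_minutes": (dropoff - pickup).total_seconds() / 60,
            }
        )
    return trips


trips = parse_trips(SAMPLE_CSV)
for trip in trips:
    print(trip["trip_distance"], trip["duration_minutes"])
```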

Data Model

Credit: Special thanks to @darshilparmar for providing the Uber dataset.
