Crypto Statistics Data Engineering Project

This project processes and visualizes cryptocurrency data through a cloud-based architecture on Google Cloud Platform (GCP). It covers end-to-end workflows for data ingestion, transformation, and visualization, using Terraform for infrastructure setup, Mage for workflow orchestration, dbt for data transformation, and Metabase for interactive dashboards.

The goal is an automated, scalable solution for ingesting, transforming, and analyzing large volumes of cryptocurrency data in real time.

Technologies used

  • Cloud: Google Cloud Platform (GCP)
  • Infrastructure as Code (IaC): Terraform
  • Workflow orchestration: Mage
  • Data warehouse: Google BigQuery
  • Data lake: Google Cloud Storage
  • Data transformations: dbt (Data Build Tool)
  • Data visualization: Metabase

Data Ingestion DAG (Source -> Bucket)

[Diagram: Data Ingestion DAG]

ETL (Bucket -> DWH)

[Diagram: ETL DAG]

Dashboards

[Screenshot: Dashboard]

How to replicate this project?

1. Clone the repo

git clone https://github.com/lupusruber/crypto_stats.git

2. Create the needed infrastructure

cd terraform
terraform init
terraform plan
terraform apply
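
Terraform's Google provider needs credentials and a project id. The variable names below are assumptions; check the repo's variables.tf for the real ones. One way to wire it up:

# authenticate the Google provider via the service-account key
# (path from the Notes section below, relative to the terraform/ directory)
export GOOGLE_APPLICATION_CREDENTIALS="../keys/credentials.json"

# "project" and "region" are assumed variable names -- check variables.tf
terraform apply -var="project=YOUR_PROJECT_ID" -var="region=europe-west1"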

3. Get Mage and run the pipelines

docker run -it -p 6789:6789 -v $(pwd):/home/src mageai/mageai /app/run_app.sh mage start [project_name]

Copy the pipeline scripts into the container.
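
Since the container mounts the current directory at /home/src, dropping the scripts into the mounted folder also works; docker cp is an alternative for a running container. The source path pipelines/ below is an assumption, adjust it to wherever the scripts live in this repo:

# find the running Mage container's id
docker ps

# copy the repo's pipeline scripts into the Mage project (paths are illustrative)
docker cp pipelines/. <container_id>:/home/src/[project_name]/pipelines/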

4. Get dbt and run the models

For this project, dbt Cloud was used. Create a new dbt project, add the models from the repo to the project directory, and run the command:

dbt build

The staging models and fact tables should now be part of your BigQuery dataset.
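
If you would rather run dbt Core locally than dbt Cloud, a minimal sketch (assuming the BigQuery adapter and a profiles.yml that points your keys/credentials.json at the target dataset):

# install the BigQuery adapter (pulls in dbt Core)
pip install dbt-bigquery

# verify the BigQuery connection configured in ~/.dbt/profiles.yml
dbt debug

# compile, run, and test all models
dbt build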

5. Get Metabase and create dashboards

docker run -d -p 3000:3000 --name metabase metabase/metabase
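
The one-liner above keeps Metabase's application database inside the container, so dashboards are lost when the container is removed. A variant that persists it on a named volume, using Metabase's documented MB_DB_FILE setting:

# persist Metabase's internal H2 database on a named volume
docker run -d -p 3000:3000 --name metabase \
  -v metabase-data:/metabase-data \
  -e MB_DB_FILE=/metabase-data/metabase.db \
  metabase/metabase

Then open http://localhost:3000 and, when adding BigQuery as a database in the admin settings, upload keys/credentials.json as the service-account key.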

Create the dashboards using the data from BigQuery.

Notes

  • You need a GCP account
  • Create a service account and download its credentials (see the sketch after this list)
  • Store the credentials in [project_name]/keys/credentials.json; they are used by Terraform, Mage, dbt, and Metabase
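
A sketch of the service-account setup with the gcloud CLI; the account name is illustrative and the roles (bigquery.admin, storage.admin) are broad assumptions, scope them down to what your setup actually needs:

# create the service account (name is illustrative)
gcloud iam service-accounts create crypto-stats-sa \
  --display-name="crypto_stats service account"

# grant access to BigQuery and Cloud Storage (assumed roles, scope down as needed)
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
  --member="serviceAccount:crypto-stats-sa@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/bigquery.admin"
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
  --member="serviceAccount:crypto-stats-sa@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/storage.admin"

# download the key to the path the tools expect
gcloud iam service-accounts keys create [project_name]/keys/credentials.json \
  --iam-account=crypto-stats-sa@YOUR_PROJECT_ID.iam.gserviceaccount.com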
