Merge pull request #98 from DSC-McMaster-U/documentation

Documentation
DSC-McMaster-U · Nov 4, 2023 · 1ebdf95
2 parents facf94e + 28051ca
commit 1ebdf95

Showing 4 changed files with 132 additions and 82 deletions.
diff --git a/README.md b/README.md
@@ -1,28 +1,22 @@
-# Auto-ML
-<!-- TODO: add in list of top/active contributors -->
-
+# AutoMate-ML
 <!-- TODO: add in repo badges once project starts-->
-<!-- TODO: move some of the content to a specialized doc for developers, turn this doc into a 'how to use the service' type doc -->
+![Google Cloud](https://img.shields.io/badge/GoogleCloud-%234285F4.svg?style=for-the-badge&logo=google-cloud&logoColor=white) ![Kubernetes](https://img.shields.io/badge/kubernetes-%23326ce5.svg?style=for-the-badge&logo=kubernetes&logoColor=white) ![Terraform](https://img.shields.io/badge/terraform-%235835CC.svg?style=for-the-badge&logo=terraform&logoColor=white) ![Docker](https://img.shields.io/badge/docker-%230db7ed.svg?style=for-the-badge&logo=docker&logoColor=white) ![GitHub Actions](https://img.shields.io/badge/github%20actions-%232671E5.svg?style=for-the-badge&logo=githubactions&logoColor=white)
+
+![Figma](https://img.shields.io/badge/figma-%23F24E1E.svg?style=for-the-badge&logo=figma&logoColor=white) ![React](https://img.shields.io/badge/react-%2320232a.svg?style=for-the-badge&logo=react&logoColor=%2361DAFB) ![Next JS](https://img.shields.io/badge/Next-black?style=for-the-badge&logo=next.js&logoColor=white) ![FastAPI](https://img.shields.io/badge/FastAPI-005571?style=for-the-badge&logo=fastapi)
+
+## Contributors
+
+
 ## About the Project
 <!-- TODO: insert screenshot of application page-->
 Automation has been making its way into all industries and services, including machine learning! Automating the training and development of models makes machine learning more accessible to users with little to no background in ML. It takes the user's task, uses a training dataset to fit and tune models to the desired model metrics, and returns a functioning classification model to the user. Google Cloud provides an [AutoML](https://cloud.google.com/vertex-ai/docs/beginner/beginners-guide) service (which we will be taking major inspiration from), and they define it as below:
 
 > "AutoML enables developers with limited machine learning expertise to train high-quality models specific to their business needs. Build your own custom machine learning model in minutes."
 
-That sounds really hard to develop tho! Do not fear... for python libraries exist to make everything easier on us. This project does not require you to have background in ML, however, be prepared to gain some! We will also be working on a good amount of front-end/back-end development to push out user interface in the form of a web application. We will start off developing a minimum viable product using python libraries, like streamlit, pandas, and pycaret to do all the heavy lifting for us, and then iteratively build upon the project adding more custom features and tools to the tech stack. Python will just be our starting point; however, we will expand out into other languages/tools depending on our projects direction and our team member's interest/skills. 
 
-## Projected Roadmap
-| Sprint Number   | Goals |
-|-----------------|---------|
-|1| Minimum Viable Product: Streamlit powered data app |
-|2-4 | Improve user interface by swapping out streamlit for JavaScript/HTML/CSS/Flask or other more powerful web dev tools
-|2-4|Integrate Google Cloud's Vertex AI AutoML service to provide more accurate models |
-|5 and beyond | More features that we'll come up with as a team! 
 
-On a week-by-week basis, we will be tackling this broad roadmap through a list of specific tasks/features each contributor will take on. See the [open issues](https://github.com/DSC-McMaster-U/Auto-ML/issues) for a full list of proposed features and issues. 
 
-### Agile Development
-To give us all a taste of :star2: real :star2: software engineering, we will be mimicking an agile development environment with this project, having weekly(ish) sprints to push out small features. Features will show up as issues and will each get assigned to an individual at our weekly scrum. Since we are mimicking a professional dev team, we want to follow python (or other languages) style conventions defined [here](https://peps.python.org/pep-0008/), and we'll be using pylint (a static code analysis library for python) to ensure the code meets best practices. We also want to implement unit testing, so each time you want to submit a feature PR, we will require you to implement testing (try to maximize coverage using python's coverage library). We will not require unit testing of PRs where it would be unhelpful/redundant. 
+
 
 ## Project Challenges 
 Below is a list of challenges that we'll try to address over the course of our project, after developing our MVP. Some of them reflect industry level challenges involving ML services. They may show up as features/issues throughout our project, depending on what stage we're at.  
@@ -35,32 +29,6 @@ Below is a list of challenges that we'll try to address over the course of our p
 - Build our own automated model training service using PyTorch, i.e. implement the service from scratch - this will require members to have knowledge in ML and familiarity with PyTorch
 - Incorporate Federated Learning: this is related to maintaining user privacy with cloud training. Federated learning has been a hot topic in ML, so it'd be great for us as devs to get our hands on it. Check out [this](https://federated.withgoogle.com/) comic made by Google to learn a bit more about the benefits of federated learning. Read [this](https://blog.research.google/2017/04/federated-learning-collaborative.html?m=1) Google blog post to learn more if you're interested.
 
-## What you'll need to contribute:
-
-### Github
-Hopefully you're already accustomed to working with git, as it we will be hosting our project right in this repo. You should know how to clone a repo, commit changes, push and pull to the remote repository. You should also familiarize yourself with making pull requests, as that will be how you contribute to this project! If you've never made a PR, please complete [this](https://github.com/firstcontributions/first-contributions) tutorial and adopt the practice of making a branch to commit to any time you want to make a PR. More details on how we'll be organizing our project will come soon!
-
-### Other Dependencies
-We'll start off with a fully python project, so if you don’t already have python installed on your machine, download [here](https://www.python.org/downloads/) and set up a development environment - VS code is recommended but use what ever you like!
-
-To install the python libraries we'll be using for our MVP, run the below command in the directory containing this repo on your machine:
-
-```sh
-# Can be py or python3 depending on your system
-python -m pip install -r requirements.txt 
-```
-
-You're free to add to the requirements.txt file if you run into any new libraries you want to add to the project. You're also free to change the version number if you run into conflicts involving the libraries, as some of them may be dated/deprecated. To add to the requirements.txt file, on a new line in the file, simply add the install name of the project followed by '==' followed by the version number, like:
-
- `pycaret==3.0.4` 
-
-## Resources for MVP
-
-### Streamlit data app
-
-https://www.youtube.com/watch?v=ApxEBGbqTyQ&ab_channel=DataProfessor
-
-https://www.youtube.com/watch?v=xTKoyfCQiiU&t=1196s&ab_channel=NicholasRenotte
 
 
 

diff --git a/cloud-infra/.gitignore b/cloud-infra/.gitignore
@@ -28,7 +28,11 @@ override.tf.json
 
 # Include tfplan files to ignore the plan output of command: terraform plan -out=tfplan
 # example: *tfplan*
+*tfplan*
 
 # Ignore CLI configuration files
 .terraformrc
-terraform.rc
+terraform.rc
+
+# Secrets 
+credentials.json
diff --git a/cloud-infra/.terraform.lock.hcl b/cloud-infra/.terraform.lock.hcl
diff --git a/cloud-infra/main.tf b/cloud-infra/main.tf
@@ -1,6 +1,121 @@
+# Provider and Common Variables 
+variable "project" {
+  default = "automateml"
+}
+variable "region" {
+  default = "us-east1"
+}
+variable "zone" {
+  default = "us-east1-a"
+}
 provider "google" {
-  project = "automateml"
-  region  = "us-east1"
+  project     = var.project
+  region      = var.region
+  credentials = file("credentials.json")
+  zone        = var.zone
+}
+
+# Add GKE Service Account 
+# Grant only minimal roles, because this will be the default account used by requests
+resource "google_service_account" "GKE_tf_account" {
+  account_id   = "gke-tf-service-account"
+  display_name = "A Service Account For Terraform To Make GKE Cluster"
+}
+
+# Kubernetes Version
+variable "cluster_version" { 
+  default = "1.26"
+}
+
+# Setup Clusters 
+resource "google_container_cluster" "cluster" {
+  name               = "trail"
+  location           = var.zone
+  min_master_version = var.cluster_version
+  project            = var.project
+
+  # Ignore changes to min_master_version, because the actual version may differ from what Terraform expects
+  lifecycle {
+    ignore_changes = [
+      min_master_version,
+    ]
+  }
+
+  # Can't create a cluster without a node pool defined, so create the smallest possible pool and delete it immediately.
+  # Use separately managed node pools instead.
+  remove_default_node_pool = true
+  initial_node_count       = 1
+
+  # Enable Workload Identity 
+  # allows workloads in clusters to impersonate IAM service accounts 
+  workload_identity_config {
+    workload_pool = "${var.project}.svc.id.goog"
+  }
+}
+
+# Node Pool Definition
+resource "google_container_node_pool" "primary_preemptible_nodes" {
+  name       = "trial-zero-cluster-node-pool"
+  location   = var.zone
+  project    = var.project
+  cluster    = google_container_cluster.cluster.name
+  node_count = 1
+
+  # Setup autoscaling with min and max number of nodes
+  autoscaling {
+    min_node_count = 1
+    max_node_count = 5
+  }
+
+  version = var.cluster_version
+
+  # Node configuration definition: 
+  node_config {
+
+    preemptible  = true
+    machine_type = "e2-medium"
+
+    # Google recommends custom service accounts that have cloud-platform scope and 
+    # permissions granted via IAM Roles.
+
+    # Tie nodes to sa created above
+    service_account = google_service_account.GKE_tf_account.email
+    oauth_scopes = [
+      "https://www.googleapis.com/auth/cloud-platform"
+    ]
+
+    # Avoid regenerating the node pool every time we run this file
+    metadata = {
+      disable-legacy-endpoints = "true"
+    }
+  }
+
+  lifecycle {
+    ignore_changes = [
+      # Ignore changes to node_count, initial_node_count and version
+      # otherwise node pool will be recreated if there is drift between what 
+      # terraform expects and what it sees
+      initial_node_count,
+      node_count,
+      version
+    ]
+  }
+}
+
+
+# Older code: 
+/*
+resource "google_service_account_iam_member" "GKE_account_iam" {
+  service_account_id = google_service_account.GKE_account.name
+  role               = "roles/iam.serviceAccountUser"
+  member             = "user:jane@example.com"
+}
+
+# Allow the SA service account to use the default GCE account
+resource "google_service_account_iam_member" "gce-default-account-iam" {
+  service_account_id = data.google_compute_default_service_account.default.name
+  role               = "roles/iam.serviceAccountUser"
+  member             = "serviceAccount:${google_service_account.sa.email}"
 }
 
 resource "google_project_service" "run_api" {
@@ -43,42 +158,4 @@ resource "google_cloud_run_service_iam_member" "run_all_users" {
 output "service_url" {
   value = google_cloud_run_service.run_service.status.0.url
 }
-
-# Add Service Account 
-resource "google_service_account" "default" {
-  account_id   = "compute-service-account" #changing id causes forces new account
-  display_name = "Service Account for Compute Instance"
-}
-
-# Create new VM, Attach to Service Account - for later 
-/*
-resource "google_compute_instance" "default" {
-  name         = "my-test-vm"
-  machine_type = "n1-standard-1"
-  zone         = "us-central1-a"
-
-  boot_disk {
-    initialize_params {
-      image = "debian-cloud/debian-11"
-    }
-  }
-
-  // Local SSD disk
-  scratch_disk {
-    interface = "SCSI"
-  }
-
-  network_interface {
-    network = "default"
-
-    access_config {
-      // Ephemeral public IP
-    }
-  }
-
-  service_account {
-    email  = google_service_account.default.email
-    scopes = ["cloud-platform"] #  `cloud-platform` is recommended for avoid embedding secret keys or user credentials
-  }
-}
 */
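
The `project`, `region`, `zone`, and `cluster_version` variables added to `cloud-infra/main.tf` above all carry defaults, so they can be overridden without editing the file. A minimal sketch, assuming a hypothetical `terraform.tfvars` file placed next to `main.tf` (the file and the override values below are illustrative and not part of this PR):

```hcl
# terraform.tfvars (hypothetical, not part of this PR)
# Terraform loads a file with this name automatically from the working directory.
project         = "automateml"    # same as the default in main.tf
region          = "us-central1"   # illustrative override of the default "us-east1"
zone            = "us-central1-a" # keep the zone inside the chosen region
cluster_version = "1.26"          # same as the default in main.tf
```

Because the provider block reads `credentials.json` via `file("credentials.json")`, that file would still need to exist locally; it is excluded from version control by the `.gitignore` change in this PR.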