CI/CD for Machine Learning

A beginner's project on "Automating Training, Evaluation, and Deploying Models using GitHub Actions", provided by Abid Ali Awan and DataCamp.

Tutorial: https://www.datacamp.com/tutorial/ci-cd-for-machine-learning

Original repository: https://github.com/kingabzpro/CICD-for-Machine-Learning

Additions to the original project:

Add local pre-commit hooks
Add the GitHub Action 'pre-commit.ci lite'
Add Docker

Project Description

This project trains a random forest classifier with scikit-learn pipelines to build a drug classifier. The evaluation is done automatically using CML (Continuous Machine Learning). A web application is built with Gradio and deployed on the Hugging Face Hub.
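As a rough illustration of the training step described above, the following is a minimal sketch of a scikit-learn pipeline that preprocesses features and fits a random forest. It is not the project's train.py: the data is synthetic, and the column layout, labels, and hyperparameters are assumptions made only for this example.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data (assumption): two numeric columns and one
# integer-coded categorical column. This is NOT the project's dataset.
rng = np.random.default_rng(0)
n_samples = 300
numeric_a = rng.integers(18, 80, size=n_samples)
numeric_b = rng.uniform(5, 40, size=n_samples)
category_code = rng.integers(0, 3, size=n_samples)
X = np.column_stack([numeric_a, numeric_b, category_code]).astype(float)

# Hypothetical labelling rule so the classifier has something to learn.
y = np.where(numeric_b > 15, "drug_a", np.where(category_code == 0, "drug_b", "drug_c"))

# Scale the numeric columns and one-hot encode the categorical column.
preprocess = ColumnTransformer(
    [
        ("num", StandardScaler(), [0, 1]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), [2]),
    ]
)

# Chain preprocessing and the model so both are fitted and applied together.
pipe = Pipeline(
    [
        ("preprocess", preprocess),
        ("model", RandomForestClassifier(n_estimators=100, random_state=0)),
    ]
)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
pipe.fit(X_train, y_train)
accuracy = accuracy_score(y_test, pipe.predict(X_test))
print(f"Accuracy on the synthetic hold-out set: {accuracy:.2f}")
```

Bundling preprocessing and the model in one pipeline object means a single artifact can be saved after training and loaded by the web application, without repeating the preprocessing code there.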

The entire process, from training to deployment, is automated using GitHub Actions. Pushing code to the GitHub repository triggers training, evaluation, and deployment, leading to an updated web application, model, and results on Hugging Face (see https://huggingface.co/spaces/jonas-luehrs/Drug-Classification).

The Makefile includes commands to install Python packages (install), format code (format), run the training script (train), generate CML reports (eval), push the updated model and results to the "update" branch (update-branch), and upload the new model, results, and Gradio app to the Hugging Face Space (deploy).
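To make the report-generation step more concrete, here is a small sketch of how a Markdown report could be assembled from evaluation metrics and a plot before a tool such as CML posts it. The function name build_report, the metric names, the metric values, and the image path are all hypothetical placeholders; they are not taken from this project's Makefile or results.

```python
def build_report(metrics: dict, plot_path: str) -> str:
    """Assemble a small Markdown report from metric values and a plot path."""
    lines = ["## Model Metrics", ""]
    for name, value in metrics.items():
        lines.append(f"- {name}: {value:.2f}")
    lines += ["", "## Confusion Matrix Plot", "", f"![Confusion matrix]({plot_path})"]
    return "\n".join(lines) + "\n"


# Placeholder values for illustration only, not results from this project.
report = build_report({"Accuracy": 0.97, "F1": 0.95}, "./Results/model_results.png")
print(report)
```

Keeping the report as a plain Markdown string makes it easy to write to a file in the workflow and attach to a commit or pull request as a comment.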

Dataset

Drug Classification

CI/CD Pipeline

Web Application

Pre-Commit Hooks

Code quality is checked with pre-commit hooks, which keep the code quality consistent and the formatting uniform. To install the hooks, run the following commands:

pip install pre-commit
pre-commit install

This installs the pre-commit hooks in your local repository. The hooks run automatically before each commit; if any hook fails, the commit is aborted. You can skip the hooks by adding the --no-verify flag to your commit command.

The installed pre-commit hooks are:

black - Code formatter (Line length 100)
flake8 - Code linter (Selected rules)
isort - Import sorter
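The hooks listed above share one mechanism: each hook inspects the staged files and signals failure with a non-zero exit code, which is what causes the commit to be aborted. The sketch below imitates that behaviour with a simple line-length check against the limit of 100. It is only an illustration; the function names are hypothetical and this is not how black or flake8 are implemented.

```python
MAX_LINE_LENGTH = 100  # same limit as the formatter setting listed above


def find_long_lines(source: str, limit: int = MAX_LINE_LENGTH) -> list:
    """Return the 1-based numbers of all lines longer than `limit` characters."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if len(line) > limit
    ]


def run_hook(sources: dict) -> int:
    """Check every file's text; return 0 if all pass and 1 if any line is too long."""
    failed = False
    for filename, text in sources.items():
        for line_number in find_long_lines(text):
            print(f"{filename}:{line_number}: line longer than {MAX_LINE_LENGTH} characters")
            failed = True
    return 1 if failed else 0


# Hypothetical in-memory files standing in for staged files.
example_sources = {
    "ok.py": "x = 1\n",
    "long.py": "y = '" + "a" * 120 + "'\n",
}
exit_code = run_hook(example_sources)
```

A non-zero exit_code here corresponds to a failing hook. A formatter such as black differs in that it also rewrites the offending files, so the commit can be retried once the changes are staged.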

To check and autofix pull requests, the GitHub Action pre-commit.ci lite is used. To use it, you need to add it to this repository as a GitHub application. Here is an example of how the pre-commit-ci-lite bot autofixes a pull request.

Installation

Clone the repository:

git clone https://github.com/JonasLuehrs/mlops-workflow.git

Create a new virtual environment:

cd mlops-workflow
# Create a new virtual environment
python -m venv venv
# Activate environment for Linux
source venv/bin/activate
# Activate environment for Windows (Command Prompt; "source" is not needed)
venv\Scripts\activate
# Install packages
pip install -r requirements.txt

The pipeline needs to be executed at least once so that a trained drug classification model is available.

Run the app locally:

python ./App/drug_app.py

The Gradio app should now be accessible at http://localhost:7860.
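If you prefer to verify this from a script rather than a browser, the following sketch sends a single HTTP request to the address above and reports whether the server answered. The helper name app_is_up and the timeout value are assumptions made for this example; it only tests reachability and does not exercise the classifier.

```python
import urllib.error
import urllib.request


def app_is_up(url: str = "http://localhost:7860", timeout: float = 3.0) -> bool:
    """Return True if the URL answers with HTTP 200, False on any connection problem."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


if __name__ == "__main__":
    print("App reachable:", app_is_up())
```

The same check works for the Docker setup, since the container publishes the app on the same port.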

Run the app locally with Docker

Make sure that you have Docker installed.

Execute the following commands:

docker build -t gradio-app .
docker run -p 7860:7860 gradio-app

Similar to the first approach, you should be able to access the Gradio app at http://localhost:7860.
