A logical, reasonably standardized, but flexible project structure for MLOps.
- This template is presented in my tutorial: Structuring Your Machine Learning Project with MLOps in Mind.
- It is based on the Cookiecutter Data Science template.
- It has been adapted to follow the MLOps workflow described in my tutorial: here.
- You can find the project at: Project homepage.
- Python 2.7 or 3.5+
- Cookiecutter Python package >= 1.4.0: This can be installed with pip or conda, depending on how you manage your Python packages:
$ pip install cookiecutter
or
$ conda config --add channels conda-forge
$ conda install cookiecutter
$ cookiecutter https://github.com/Chim-SO/cookiecutter-mlops
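The same generation step can also be scripted from Python, which may be convenient in automated setups. The following is a minimal sketch, assuming the `cookiecutter` package is installed and that the template exposes a `repo_name` variable (as the `{{ cookiecutter.repo_name }}` placeholder in the directory structure suggests). The helper name `generate_project` is hypothetical and is not part of the template.

```python
from pathlib import Path

TEMPLATE_URL = "https://github.com/Chim-SO/cookiecutter-mlops"


def generate_project(repo_name, output_dir="."):
    """Generate a project from the template without interactive prompts."""
    # Imported lazily so this module can be loaded even where the
    # cookiecutter package is not installed.
    from cookiecutter.main import cookiecutter

    project_dir = cookiecutter(
        TEMPLATE_URL,
        no_input=True,  # accept the template defaults for every other prompt
        extra_context={"repo_name": repo_name},  # assumed template variable
        output_dir=output_dir,
    )
    return Path(project_dir)
```

Calling `generate_project("my-mlops-project")` (an illustrative name) needs network access to fetch the template and should return the path of the generated project.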
The directory structure of your new project looks like this:
{{ cookiecutter.repo_name }}/
├── LICENSE
├── README.md
├── Makefile                     # Makefile with commands like `make data` or `make train`
├── configs                      # Config files (models and training hyperparameters)
│   └── model1.yaml
│
├── data
│   ├── external                 # Data from third party sources.
│   ├── interim                  # Intermediate data that has been transformed.
│   ├── processed                # The final, canonical data sets for modeling.
│   └── raw                      # The original, immutable data dump.
│
├── docs                         # Project documentation.
│
├── models                       # Trained and serialized models.
│
├── notebooks                    # Jupyter notebooks.
│
├── references                   # Data dictionaries, manuals, and all other explanatory materials.
│
├── reports                      # Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures                  # Generated graphics and figures to be used in reporting.
│
├── requirements.txt             # The requirements file for reproducing the analysis environment.
└── src                          # Source code for use in this project.
    ├── __init__.py              # Makes src a Python module.
    │
    ├── data                     # Data engineering scripts.
    │   ├── build_features.py
    │   ├── cleaning.py
    │   ├── ingestion.py
    │   ├── labeling.py
    │   ├── splitting.py
    │   └── validation.py
    │
    ├── models                   # ML model engineering (a folder for each model).
    │   └── model1
    │       ├── dataloader.py
    │       ├── hyperparameters_tuning.py
    │       ├── model.py
    │       ├── predict.py
    │       ├── preprocessing.py
    │       └── train.py
    │
    └── visualization            # Scripts to create exploratory and results oriented visualizations.
        ├── evaluation.py
        └── exploration.py
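
After generating a project, it can be useful to confirm that the expected folders are in place, for example in a CI step. The following is a minimal sketch using only the standard library. The directory list is copied from the structure above, and the helper name `missing_dirs` is hypothetical rather than something the template provides.

```python
from pathlib import Path

# Directories taken from the structure shown above.
EXPECTED_DIRS = [
    "configs",
    "data/external",
    "data/interim",
    "data/processed",
    "data/raw",
    "docs",
    "models",
    "notebooks",
    "references",
    "reports/figures",
    "src/data",
    "src/models",
    "src/visualization",
]


def missing_dirs(project_root):
    """Return the expected directories that do not exist under project_root."""
    root = Path(project_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
```

An empty list from `missing_dirs("path/to/project")` means every listed directory exists. Otherwise the result names the missing ones, in the order they appear in `EXPECTED_DIRS`.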