This repository contains code for training MobileNetV3 for segmentation, as well as the default model for classification. Every module here is meant to be customized.
- Requirements
- Quick setup and start
- CNN architectures
- Loss functions
- Augmentations
- Training
- Convert to TensorFlow Lite
- Pretrained models
- Projects that use the MobileNetV3-segm model implementation
Machine with an NVIDIA GPU
NVIDIA driver >= 418
CUDA >= 10.1
Docker >= 19.03
NVIDIA Container Toolkit (https://github.com/NVIDIA/nvidia-docker)
- Clone the repo and build a Docker image using the provided Makefile and Dockerfile.
    git clone
    make build
- The final folder structure should be:
    Semantic-segmentation-with-MobileNetV3
    ├── data
    ├── notebooks
    ├── modules
    ├── train
    ├── Dockerfile
    ├── Makefile
    ├── requirements.txt
    ├── README.md
- The container can be started with a Makefile command. Training and evaluation are run in Jupyter notebooks, so start Jupyter Notebook once the container is running.
    make run
    jupyter notebook --allow-root
A MobileNetV3 backbone with Lite R-ASPP segmentation modules is implemented. The architecture can be found in modules/keras_models.py
Two losses are implemented: F-beta and FbCombinedLoss (F-beta combined with cross entropy). The loss functions can be found in modules/loss.py
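To make the idea concrete, here is a minimal NumPy sketch of a soft F-beta loss and a combined loss. It is an illustration only and not the code from modules/loss.py: the function names, the `ce_weight` parameter and the equal weighting of the two terms are assumptions made for this example.

```python
import numpy as np


def fbeta_loss(y_true, y_pred, beta=1.0, eps=1e-7):
    """Soft F-beta loss (1 - F_beta) computed from predicted probabilities."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    tp = np.sum(y_true * y_pred)          # soft true positives
    fp = np.sum((1.0 - y_true) * y_pred)  # soft false positives
    fn = np.sum(y_true * (1.0 - y_pred))  # soft false negatives
    b2 = beta ** 2
    fbeta = ((1.0 + b2) * tp + eps) / ((1.0 + b2) * tp + b2 * fn + fp + eps)
    return float(1.0 - fbeta)


def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean per-pixel binary cross entropy."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.clip(np.asarray(y_pred, dtype=np.float64), eps, 1.0 - eps)
    ce = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    return float(np.mean(ce))


def fb_combined_loss(y_true, y_pred, beta=1.0, ce_weight=0.5):
    """Weighted sum of both losses; the 0.5 weighting is an assumption."""
    fb = fbeta_loss(y_true, y_pred, beta)
    ce = binary_cross_entropy(y_true, y_pred)
    return (1.0 - ce_weight) * fb + ce_weight * ce
```

With `beta > 1` false negatives are penalized more heavily than false positives, and with `beta < 1` the opposite holds.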
The following augmentations are implemented: random rotation, random crop, scaling, horizontal flip, brightness, gamma and contrast adjustment, Gaussian blur, and noise.
Details of every augmentation may be found in modules/segm_transforms.py
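For segmentation, geometric augmentations have to be applied identically to the image and its mask, while photometric ones should touch the image only. The NumPy sketch below shows this for a random crop, a horizontal flip and a gamma change. It is not the code from modules/segm_transforms.py: all function names, the `(H, W, C)` float layout in `[0, 1]` and the parameter ranges are assumptions chosen for illustration.

```python
import numpy as np


def horizontal_flip(image, mask):
    """Flip image (H, W, C) and mask (H, W) along the width axis."""
    return image[:, ::-1], mask[:, ::-1]


def random_crop(image, mask, size, rng):
    """Cut the same size x size window out of both image and mask."""
    h, w = mask.shape[:2]
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    return (image[top:top + size, left:left + size],
            mask[top:top + size, left:left + size])


def adjust_gamma(image, gamma):
    """Photometric change: applied to the image only, never to the mask."""
    return np.clip(image, 0.0, 1.0) ** gamma


def augment(image, mask, rng, crop_size=224, flip_prob=0.5):
    """Apply crop, optional flip and gamma; the ranges are illustrative."""
    image, mask = random_crop(image, mask, crop_size, rng)
    if rng.random() < flip_prob:
        image, mask = horizontal_flip(image, mask)
    image = adjust_gamma(image, rng.uniform(0.7, 1.5))
    return image, mask
```

A call might look like `augment(image, mask, np.random.default_rng(0))`, where `image` is at least 224x224 pixels.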
The training process is implemented in the notebooks/train_mobilenet.ipynb notebook.
Provided that at least the PicsArt AI Hackathon dataset and the Supervisely Person Dataset are available, it is enough to run the notebook cells in order.
To convert this version of the MobileNetV3 model to TFLite, the optional argument "training" must first be removed from every batchnorm layer in the model. After that, the pretrained weights can be loaded and the notebook cells for automatic conversion can be executed.
The notebooks/convert2tflite.ipynb notebook contains sample conversion scripts with and without quantization.
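As a rough outline of what such a conversion involves, the sketch below uses the standard `tf.lite.TFLiteConverter` API from TensorFlow 2.x. It is not copied from notebooks/convert2tflite.ipynb; the helper names and the `quantize` flag are hypothetical, and it assumes `keras_model` is already built with the batchnorm change described above and has its weights loaded.

```python
def convert_to_tflite(keras_model, quantize=False):
    """Return the model serialized as a TFLite FlatBuffer (bytes)."""
    import tensorflow as tf  # imported lazily; requires TensorFlow 2.x

    converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
    if quantize:
        # Post-training quantization with the default optimization setting.
        # Whether the notebook uses this exact setting is an assumption.
        converter.optimizations = [tf.lite.Optimize.DEFAULT]
    return converter.convert()


def save_tflite(flatbuffer, path):
    """Write the converted model bytes to disk."""
    with open(path, "wb") as f:
        f.write(flatbuffer)
```

For example, `save_tflite(convert_to_tflite(model, quantize=True), "model_quant.tflite")` would write a quantized model, with `model` and the file name standing in for your own.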
Only person segmentation datasets were used for training models in this project: PicsArt AI Hackathon dataset and Supervisely Person Dataset.
Trained Keras model (input size 224x224 px) may be found here.
Trained model converted to a TensorFlow Lite FlatBuffer may be found here.
The same model but quantized after training may be downloaded via this link.
Note: The model was trained with TF 2.0, so it may show bugs or incompatibilities when used with the current TF version.
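A downloaded TFLite model can be run with the standard `tf.lite.Interpreter` API, roughly as sketched below. The function names and the model path are placeholders, and the preprocessing (RGB input scaled to `[0, 1]`) is an assumption that should be checked against the training notebook before relying on the output.

```python
import numpy as np


def preprocess(image_uint8):
    """Turn a (224, 224, 3) uint8 image into a float32 batch of one.

    Scaling to [0, 1] is an assumption, not confirmed by this repository.
    """
    x = image_uint8.astype(np.float32) / 255.0
    return x[np.newaxis, ...]


def run_tflite(model_path, batch):
    """Run one forward pass and return the raw output tensor."""
    import tensorflow as tf  # imported lazily; requires TensorFlow 2.x

    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]
    # Cast to the dtype the model expects before feeding the tensor.
    interpreter.set_tensor(inp["index"], batch.astype(inp["dtype"]))
    interpreter.invoke()
    return interpreter.get_tensor(out["index"])
```

The shape and meaning of the returned tensor depend on the model head, so inspect `get_output_details()` before thresholding it into a mask.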
- Real-time CPU person segmentation in video calls: repo