Fine-tuning the YOLOv5 object detection algorithm to automatically detect and count products on supermarket shelves, to help make sure all products are replenished.
For this task I used the SKU110K dataset. It consists of train (8232 images), validation (587 images) and test (2940 images) sets. The annotations for the images are provided as a CSV file with one row per bounding box, so the number of rows varies from image to image. An example of the CSV file is shown below.
- column 1 represents the image name.
- columns 2 and 3 represent the top-left corner (x1, y1) of the bounding box.
- columns 4 and 5 represent the bottom-right corner (x2, y2) of the bounding box.
- column 6 represents the class (since there is only one class, it is always `object`).
- columns 7 and 8 represent the image width and height respectively.
However, YOLOv5 expects a different annotation format. Roboflow.ai provides an interface that automatically converts the annotations to the YOLOv5 format and can also augment the images. Unfortunately, only up to 1000 images can be processed for free, so I used the first 1000 images of the train set and split them into train, validation and test sets using an 80-10-10 rule. In the preprocessing step I used only the resize option (416x416x3).
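Roboflow handled the conversion and the split in this project, but the same two steps can be sketched locally. The example below is a minimal sketch, not the code used here: it assumes the CSV column order listed above (x before y), and the function names, the fixed random seed and the shuffling before the split are all illustrative assumptions. YOLOv5 labels are one text line per box in the form `class x_center y_center width height`, with all four values normalized by the image size.

```python
import csv
import random
from collections import defaultdict


def row_to_yolo(x1, y1, x2, y2, img_w, img_h, class_id=0):
    """Convert pixel corner coordinates to one normalized YOLO label line."""
    x_center = (x1 + x2) / 2 / img_w
    y_center = (y1 + y2) / 2 / img_h
    width = (x2 - x1) / img_w
    height = (y2 - y1) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def csv_to_yolo_labels(csv_lines):
    """Group annotation rows by image name and return {image_name: [label lines]}.

    `csv_lines` is any iterable of CSV lines, e.g. an open file object.
    Assumed column order: name, x1, y1, x2, y2, class, image_width, image_height.
    """
    labels = defaultdict(list)
    for row in csv.reader(csv_lines):
        if not row:  # skip blank lines
            continue
        name, x1, y1, x2, y2, _cls, img_w, img_h = row
        labels[name].append(
            row_to_yolo(int(x1), int(y1), int(x2), int(y2), int(img_w), int(img_h))
        )
    return dict(labels)


def split_dataset(image_names, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle the names and split them into train, validation and test lists."""
    names = list(image_names)
    random.Random(seed).shuffle(names)  # seed is an arbitrary choice for repeatability
    n_train = round(len(names) * train_frac)
    n_val = round(len(names) * val_frac)
    return names[:n_train], names[n_train:n_train + n_val], names[n_train + n_val:]
```

With 1000 image names, `split_dataset` returns lists of 800, 100 and 100 names. Because the label values are normalized, they stay valid after a resize to 416x416 as long as the resize stretches the whole image rather than cropping or padding it.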
I used the yolov5s model configuration, changing only the number of classes to 1. I trained the model for 300 epochs with a batch size of 32. The training progress is shown below.
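The class-count change can also be scripted rather than edited by hand. The following is a minimal sketch, assuming the model configuration is a YAML file with a top-level `nc:` line (as in the `yolov5s.yaml` shipped with the YOLOv5 repository); the helper name `set_num_classes` is hypothetical and not part of YOLOv5.

```python
import re


def set_num_classes(yaml_text: str, num_classes: int) -> str:
    """Return the config text with the first top-level `nc:` value replaced."""
    return re.sub(
        r"^nc:\s*\d+",          # matches e.g. "nc: 80" at the start of a line
        f"nc: {num_classes}",
        yaml_text,
        count=1,
        flags=re.MULTILINE,
    )


# Example with an inline snippet standing in for the real config file
config = "# Parameters\nnc: 80  # number of classes\ndepth_multiple: 0.33\n"
single_class_config = set_num_classes(config, 1)
```

Only the `nc:` line changes; comments and all other parameters in the file are left as they were.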
Some of the predictions are shown below.