The idea behind this project is to use AI to detect a person's face and mark their attendance. (I expect this approach to be used heavily in the future.)
The dataset for this project consists of images taken with my phone on campus. (For personal reasons, I have not checked in the dataset images.)
For this project, I used the Keras-VGGFace module, which is built on top of VGG16, a state-of-the-art deep CNN model. Keras-VGGFace extracts the facial features from each image, and those features are then used to train a classifier.
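The two stages described above (feature extraction, then classifier training) could look roughly like the sketch below. It is not the code from train.py: the function names, the use of OpenCV for loading, and the choice of a linear SVM as the classifier are all assumptions made for illustration. It also assumes each image is already a reasonably tight face crop, since no face detection step is shown.

```python
INPUT_SIZE = (224, 224)  # input resolution used by the VGG16 variant


def extract_features(image_paths):
    """Return one VGGFace (VGG16) embedding per image path.

    Third-party imports are deferred so this file can be imported even
    where the heavy dependencies are not installed.
    """
    import cv2
    import numpy as np
    from keras_vggface.utils import preprocess_input
    from keras_vggface.vggface import VGGFace

    # include_top=False drops the original classification layers;
    # pooling="avg" yields one 512-dimensional vector per image.
    model = VGGFace(model="vgg16", include_top=False,
                    input_shape=INPUT_SIZE + (3,), pooling="avg")

    batch = []
    for path in image_paths:
        image = cv2.imread(str(path))  # OpenCV loads images as BGR
        if image is None:
            raise FileNotFoundError(f"could not read image: {path}")
        # preprocess_input expects RGB input, so convert first.
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        batch.append(cv2.resize(image, INPUT_SIZE).astype("float32"))

    # version=1 selects the preprocessing that matches the VGG16 weights.
    return model.predict(preprocess_input(np.array(batch), version=1))


def train_classifier(features, labels):
    """Fit a classifier on the embeddings (a linear SVM is an assumption)."""
    from sklearn.svm import SVC

    return SVC(kernel="linear", probability=True).fit(features, labels)
```

Because the pretrained network is only used as a fixed feature extractor, the part that is actually trained is the small classifier, which is why a few hundred images can be enough.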
- Keras
- Keras-VGGFace
- sklearn (scikit-learn)
- numpy
- h5py
- matplotlib
- seaborn
- cv2 (OpenCV)
- Save the training images, organized by class, inside the data/train folder.
- Run the following command inside the project directory:
$ python train.py
- Once training completes, run:
$ python face_detection.py
This command starts the webcam and begins detecting faces.
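Before running train.py, it can help to confirm that the images are where the script expects them. The sketch below assumes a layout of data/train/<class_name>/<image file>, with one subfolder per person; that layout, the helper name `count_images_per_class`, and the list of extensions are assumptions, not something taken from the training script.

```python
from pathlib import Path

# Assumed image extensions; adjust to match your own files.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def count_images_per_class(train_dir="data/train"):
    """Return {class_name: image_count} for a folder-per-class layout."""
    counts = {}
    for class_dir in sorted(Path(train_dir).iterdir()):
        if not class_dir.is_dir():
            continue  # ignore stray files next to the class folders
        counts[class_dir.name] = sum(
            1
            for item in class_dir.iterdir()
            if item.is_file() and item.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts
```

Calling `count_images_per_class()` from the project directory returns a dictionary of class names and image counts, which makes an empty or badly unbalanced class easy to spot before training.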
- The dataset contained 390 images distributed across 4 classes, of which 80% were used for training and the remaining 20% for validation.
- The classifier achieved 96% accuracy on the test set.
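With 390 images, an 80/20 split gives 312 images for training and 78 held out. A minimal sketch of such a split is shown below. It splits class by class so that every person appears in both parts; whether the original split was stratified like this is not stated, so treat that, along with the function name and the fixed seed, as assumptions.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Split sample indices into train/validation lists, class by class."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, val = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        val.extend(indices[cut:])
    return train, val
```

The function returns index lists rather than copies of the data, so the same split can be applied to both the feature vectors and their labels.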
Report
Confusion Matrix
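For readers unfamiliar with the term, the sketch below shows what a confusion matrix counts and how accuracy falls out of it. scikit-learn provides `confusion_matrix` and `classification_report` in `sklearn.metrics` for real use; this plain-Python version is only illustrative, and the class names in the usage note are made up.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(y_true, y_pred):
        matrix[index[truth]][index[guess]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0
```

For example, with true labels `["a", "a", "b", "b"]` and predictions `["a", "b", "b", "b"]`, the matrix is `[[1, 1], [0, 2]]` and the accuracy is 0.75. The off-diagonal entries show which people get mistaken for which.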
- I tried Google's InceptionV3 model, removing the final layer and customizing the fully connected layer for my use case. I was under the wrong impression that it would be good enough to detect features from my different dataset.
- I tried to avoid overfitting by dropping some of the neurons from the fully connected layer.
- I wasted a lot of time tweaking and retraining the model to customize it for my use case.
- Eventually, I realized the problem was not overfitting but underfitting: the number of images per class was far too small to train a deep CNN model.
I learned this the hard way, through trial and error, tweaking many hyperparameters and the model itself.
I hope you can avoid the mistakes I made.
Thank you.