This project was developed in the context of the Master's degree in Data Science at the Faculty of Sciences, University of Lisbon.
We implemented the training of a simple Convolutional Neural Network (CNN) in parallel, using CPU multiprocessing and GPU programming through CUDA.
For more details regarding our implementation, see the report.
The network architecture is very simple: a convolutional layer and a max pooling layer, followed by a fully connected layer for classification. The output layer applies softmax, with cross-entropy as the error function.
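As a reference for readers, here is a minimal NumPy sketch of that forward pass. The specific shapes (a 28x28 input, eight 3x3 filters, 2x2 pooling, ten output classes) are illustrative assumptions and may differ from the settings used in the experiments:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(image, filters):
    # Valid convolution of a 2D image with a bank of k x k filters.
    # filters has shape (num_filters, k, k); output (h-k+1, w-k+1, num_filters).
    h, w = image.shape
    n, k, _ = filters.shape
    out = np.zeros((h - k + 1, w - k + 1, n))
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            region = image[i:i + k, j:j + k]
            out[i, j] = np.sum(region * filters, axis=(1, 2))
    return out

def maxpool2(x):
    # 2x2 max pooling with stride 2.
    h, w, n = x.shape
    return x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2, n).max(axis=(1, 3))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical shapes: 28x28 image, 8 filters of size 3x3, 10 classes.
image = rng.standard_normal((28, 28))
filters = rng.standard_normal((8, 3, 3)) / 9
flat = maxpool2(conv2d(image, filters)).flatten()
weights = rng.standard_normal((flat.size, 10)) / flat.size
probs = softmax(flat @ weights)    # class probabilities
loss = -np.log(probs[3])           # cross-entropy loss for (hypothetical) true class 3
```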
We compared the training execution times of three versions: a sequential version implemented in NumPy, based on [1]; a parallel version on CPU; and a parallel version on GPU.
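As a rough illustration of how such a comparison can be made, a wall-clock timing wrapper like the one below is enough; `train_fn` is a hypothetical stand-in for any of the three training implementations, not the project's actual API:

```python
import time

def timed(train_fn, *args, **kwargs):
    """Run one training function and return its result and wall-clock time."""
    start = time.perf_counter()
    result = train_fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Example usage with a hypothetical training function:
# _, seconds = timed(train_sequential, images, labels, epochs=100)
```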
The network was trained to classify images into ten possible classes, using the MNIST handwritten digits dataset.
As can be seen in the following image, the networks learn well, reaching almost 90% accuracy after just 100 epochs.
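The loading code is not shown here, but one common way to obtain MNIST in Python is through scikit-learn (an assumption for illustration; the project may load the data differently):

```python
from sklearn.datasets import fetch_openml

# Downloads the 70,000 MNIST digits as 784-dimensional vectors.
X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
X = X.reshape(-1, 28, 28) / 255.0   # reshape to images, normalize pixels to [0, 1]
```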
The project code is structured in the following way:
project.ipynb: in this notebook, CNNs with different settings are trained sequentially, in parallel on the CPU, and in parallel on the GPU. The results of this experiment were saved in the df_results pickle file.
results.ipynb: the results from the experiment described above are analysed.
folder cnn: all the code necessary to train the CNN in the three ways described above (sequential, parallel on CPU, and parallel on GPU)
-> cnn.py: code for network initialization and some helper functions;
-> cnn_sequential.py: code for training sequentially on the CPU using NumPy;
-> cnn_parallel_cpu.py: code for training in parallel on the CPU using multiprocessing (see the sketch after this list);
-> cnn_parallel_cuda.py: code for training on the GPU. It includes all the CUDA functions/kernels developed (a kernel sketch appears at the end of this section).
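The exact parallelization strategy in cnn_parallel_cpu.py is documented in the report; as a generic illustration only, a common data-parallel pattern on the CPU is to split a mini-batch across worker processes and average the per-sample gradients. Every name and shape below (sample_gradient, batch_gradient_parallel, a single linear softmax layer standing in for the full CNN) is a hypothetical simplification:

```python
import numpy as np
from multiprocessing import Pool

def sample_gradient(args):
    # Gradient of the cross-entropy loss w.r.t. the weights of a single
    # linear softmax layer, for one (input, label) pair. A stand-in for
    # the full CNN backward pass.
    x, y, w = args
    z = x @ w
    probs = np.exp(z - z.max())
    probs /= probs.sum()
    probs[y] -= 1.0                      # dL/dz = probs - one_hot(y)
    return np.outer(x, probs)

def batch_gradient_parallel(xs, ys, w, workers=4):
    # Data parallelism: worker processes compute per-sample gradients,
    # and the results are averaged on the main process.
    with Pool(workers) as pool:
        grads = pool.map(sample_gradient, [(x, y, w) for x, y in zip(xs, ys)])
    return np.mean(grads, axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    xs = rng.standard_normal((32, 100))   # 32 samples, 100 features (hypothetical)
    ys = rng.integers(0, 10, size=32)
    w = rng.standard_normal((100, 10)) / 100
    w -= 0.01 * batch_gradient_parallel(xs, ys, w)   # one SGD step
```

Process-based parallelism is the usual choice over threads for Python training loops, since separate processes sidestep the global interpreter lock.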
folder tests:
-> conv_filter_example.ipynb: a simple example of the convolution process
-> test_sequential_vs_gpu.ipynb: a notebook containing code for one epoch of training, in both the CUDA GPU and sequential CPU versions. There you can see that the output of the training process is exactly the same for both versions, given the same randomly initialized weights.
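For readers unfamiliar with GPU kernels in Python, the sketch below shows the general shape of a convolution kernel written with Numba's CUDA support (an assumption for illustration; the actual kernels in cnn_parallel_cuda.py may be organized differently). Each thread computes one output pixel of a valid convolution:

```python
import numpy as np
from numba import cuda

@cuda.jit
def conv2d_valid(image, filt, out):
    # One thread per output pixel: thread (i, j) accumulates the dot
    # product of the filter with the image patch starting at (i, j).
    i, j = cuda.grid(2)
    if i < out.shape[0] and j < out.shape[1]:
        acc = 0.0
        for a in range(filt.shape[0]):
            for b in range(filt.shape[1]):
                acc += image[i + a, j + b] * filt[a, b]
        out[i, j] = acc

# Hypothetical sizes; requires Numba and a CUDA-capable GPU.
image = np.random.default_rng(0).standard_normal((28, 28)).astype(np.float32)
filt = (np.ones((3, 3)) / 9.0).astype(np.float32)
out = np.zeros((26, 26), dtype=np.float32)

threads_per_block = (16, 16)
blocks = ((out.shape[0] + 15) // 16, (out.shape[1] + 15) // 16)
conv2d_valid[blocks, threads_per_block](image, filt, out)
```

A comparison of the kind done in the notebook typically fixes the random seed for the weight initialization and then compares the CPU and GPU outputs (e.g., with np.allclose).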