Training a CNN in parallel

This project was developed as part of the Master's degree in Data Science at the Faculty of Sciences, University of Lisbon.

We implemented the training of a simple Convolutional Neural Network in parallel, using both CPU multiprocessing and GPU programming through CUDA (via the Numba compiler).

For more details on our implementation, see the report.

CNN architecture

The network architecture is very simple: a convolutional layer with max pooling, followed by a fully connected layer for classification. Softmax is applied at the output to produce the class probabilities.

[Figure: CNN architecture]
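For illustration, the forward pass of such a network can be sketched in NumPy. The shapes, names, and helper functions below are our own assumptions for the sketch, not code from the repository:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the class scores."""
    e = np.exp(z - z.max())
    return e / e.sum()

def forward(image, conv_filters, fc_weights):
    """One forward pass: valid convolution -> 2x2 max pooling -> fully connected -> softmax.

    image:        (H, W) grayscale input
    conv_filters: (n_filters, k, k) convolution kernels
    fc_weights:   (n_classes, n_filters * ((H-k+1)//2) * ((W-k+1)//2)) dense layer
    """
    n_f, k, _ = conv_filters.shape
    H, W = image.shape
    out_h, out_w = H - k + 1, W - k + 1

    # Valid convolution: one feature map per filter
    conv = np.zeros((n_f, out_h, out_w))
    for f in range(n_f):
        for i in range(out_h):
            for j in range(out_w):
                conv[f, i, j] = np.sum(image[i:i + k, j:j + k] * conv_filters[f])

    # 2x2 max pooling with stride 2 (an odd trailing row/column is dropped)
    pooled = conv[:, :out_h // 2 * 2, :out_w // 2 * 2]
    pooled = pooled.reshape(n_f, out_h // 2, 2, out_w // 2, 2).max(axis=(2, 4))

    # Fully connected layer followed by softmax over the classes
    return softmax(fc_weights @ pooled.ravel())
```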

Results

We compared the training execution time of three versions: the sequential version, developed in NumPy and based on [1], the parallel CPU version, and the parallel GPU version.

[Figure: execution time results for the three versions]
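A comparison like this can be reproduced with a simple timing harness. The training entry points named in the comments are hypothetical placeholders, not the repository's actual API:

```python
import time

def benchmark(train_fn, *args, n_runs=3, **kwargs):
    """Return the best wall-clock time (in seconds) over n_runs runs of train_fn."""
    times = []
    for _ in range(n_runs):
        start = time.perf_counter()
        train_fn(*args, **kwargs)
        times.append(time.perf_counter() - start)
    return min(times)

# Hypothetical entry points for the three versions:
# t_seq = benchmark(train_sequential, X_train, y_train)
# t_cpu = benchmark(train_parallel_cpu, X_train, y_train)
# t_gpu = benchmark(train_parallel_cuda, X_train, y_train)
```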

Network evaluation

The network was trained to classify images into ten possible classes, using the MNIST handwritten digits dataset.

As the following image shows, the networks learn well, reaching almost 90% accuracy after just 100 epochs.

[Figure: network evaluation, accuracy per epoch]
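Accuracy here is the usual top-1 metric. A minimal sketch of computing it from the network's softmax outputs (the array shapes are assumptions):

```python
import numpy as np

def accuracy(probs, labels):
    """Fraction of samples whose highest-probability class matches the true label.

    probs:  (n_samples, 10) softmax outputs
    labels: (n_samples,) integer class ids in 0..9
    """
    return np.mean(np.argmax(probs, axis=1) == labels)
```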

Code structure

The project code is structured as follows:

project.ipynb: in this notebook, CNNs with different settings are trained sequentially, in parallel on the CPU, and in parallel on the GPU. The results of this experiment are saved in the df_results pickle file.
results.ipynb: analyses the results from the experiment described above.

folder cnn: all the code necessary to train the CNN in the three ways: sequential, parallel on CPU, and parallel on GPU

    -> cnn.py: network initialization and some helper functions;
    -> cnn_sequential.py: sequential CPU training using NumPy;
    -> cnn_parallel_cpu.py: parallel training on CPU using multiprocessing;
    -> cnn_parallel_cuda.py: training on the GPU, including all the CUDA functions/kernels developed (a sketch of one such kernel follows this list).
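As a reference for the kernel style, here is a minimal Numba CUDA kernel for the valid-convolution step, written by us for illustration; it is not the repository's code, and it requires a CUDA-capable GPU:

```python
import numpy as np
from numba import cuda

@cuda.jit
def conv2d_valid_kernel(image, filt, out):
    """Each thread computes one output pixel of a valid 2D convolution."""
    i, j = cuda.grid(2)  # absolute thread coordinates in the 2D grid
    k = filt.shape[0]
    if i < out.shape[0] and j < out.shape[1]:
        acc = 0.0
        for u in range(k):
            for v in range(k):
                acc += image[i + u, j + v] * filt[u, v]
        out[i, j] = acc

# Example launch: one thread per output pixel.
image = np.random.rand(28, 28).astype(np.float32)
filt = np.random.rand(3, 3).astype(np.float32)
out = np.zeros((26, 26), dtype=np.float32)  # 28 - 3 + 1 = 26

threads_per_block = (16, 16)
blocks = ((out.shape[0] + 15) // 16, (out.shape[1] + 15) // 16)
conv2d_valid_kernel[blocks, threads_per_block](image, filt, out)
```

Numba automatically transfers the NumPy arrays to the device for the launch and copies them back after the kernel finishes.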

folder tests:

    -> conv_filter_example.ipynb: a simple example of the convolution process;
    -> test_sequential_vs_gpu.ipynb: a notebook containing the code for one epoch of training in both the CUDA GPU and sequential CPU versions, showing that, for the same randomly initialized weights, the output of the training process is exactly the same for both versions (a sketch of the comparison follows this list).
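The check performed in that notebook can be expressed as an element-wise comparison of the trained parameters; a minimal sketch, with assumed names:

```python
import numpy as np

def params_match(params_a, params_b):
    """True if every corresponding parameter array is element-wise identical.

    With rtol=0 and atol=0, np.allclose demands exact agreement, which is
    what the notebook reports for the sequential and CUDA versions.
    """
    return all(np.allclose(a, b, rtol=0.0, atol=0.0)
               for a, b in zip(params_a, params_b))
```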

References

[1] https://github.com/SkalskiP/ILearnDeepLearning.py/tree/master/01_mysteries_of_neural_networks/06_numpy_convolutional_neural_net

[2] https://github.com/WHDY/mnist_cnn_numba_cuda
