Deep Learning Project ETHZ HS23 by Jannek Ulm, Leander Diaz-Bone, Alexander Bayer, and Dennis Jüni. The final report can be found here.
We provide two Python environments for training the models and running the experiments. "environment_cuda" works for us on a Linux machine with CUDA 12; "environment_mps" works on recent Apple Silicon MacBooks and uses Apple's MPS GPU acceleration. For other device/CUDA combinations, the environment files may need to be adapted. When the training file is run, all pre-trained models required for the experiments are automatically saved in the models folder.
Most experiments require models pretrained on subsets of CIFAR-100. Because of their size, the pretrained models are not included in this repository, but they can easily be recomputed with this training file (both for the original ResNet-18 model and for the custom ResNet-18 model). All models were trained on 10 classes, chosen from 2 superclasses with indices between 0 and 9.
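CIFAR-100 groups its 100 fine classes into 20 superclasses of 5 classes each, so choosing 2 superclasses yields the 10 training classes mentioned above. A minimal sketch of that selection step (the mapping shown covers only two superclasses and follows the standard CIFAR-100 coarse-to-fine grouping; in practice the mapping would come from the dataset's coarse labels, and this is not the repository's actual code):

```python
def fine_classes_for_superclasses(superclass_to_fine, superclasses):
    """Collect the fine-class indices belonging to the chosen superclasses."""
    classes = []
    for sc in superclasses:
        classes.extend(superclass_to_fine[sc])
    return sorted(classes)

# Illustrative mapping for two CIFAR-100 superclasses
# (0: "aquatic mammals", 1: "fish"); each superclass holds 5 fine classes.
superclass_to_fine = {
    0: [4, 30, 55, 72, 95],
    1: [1, 32, 67, 73, 91],
}

# Two superclasses -> 10 fine classes to train on.
print(fine_classes_for_superclasses(superclass_to_fine, [0, 1]))
```

The resulting list of fine-class indices can then be used to filter the CIFAR-100 training and test splits down to the 10-class subset.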
All experiments from the paper are shown in final_experiments.ipynb.
- Testing the accuracy of a randomly initialized model on random tuples of superclasses with indices between 10 and 19.
- Testing the accuracy of a model pretrained on some superclasses with indices between 0 and 9 and then re-trained on a random tuple of superclasses with indices between 10 and 19.
- Testing the initialization with Gabor filters with 1, 2, 6, 10, and 17 layers being initialized.
- Testing the number of pretrained models used for clustering (with 10 clusters and 17 layers being initialized) with Euclidean and Fourier distance.
- Testing the number of clusters used for clustering (with 10 models and 17 layers being initialized) with Euclidean and Fourier distance.
- Testing the number of layers initialized for clustering (with 10 models and 10 clusters) with Euclidean and Fourier distance.
- Testing a randomly initialized custom ResNet-18.
- Testing a custom ResNet-18 which was clustered and permuted according to the alignment algorithm.
- Testing a random ResNet-18 model on the Tiny ImageNet dataset.
- Testing a pretrained ResNet-18 (pretrained on a subset of 10 CIFAR-100 superclasses) on the Tiny ImageNet dataset.
- Testing a clustered ResNet-18 model on the Tiny ImageNet dataset.
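One of the experiments above initializes the first convolutional layers with Gabor filters. A minimal sketch of generating one such filter using the standard real-valued Gabor formula (parameter names and defaults here are illustrative, not the repository's actual settings):

```python
import math

def gabor_kernel(size, theta, sigma=2.0, lambd=4.0, gamma=0.5, psi=0.0):
    """Build a size x size real Gabor kernel: a Gaussian envelope
    modulated by a cosine wave oriented at angle theta (radians)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates into the filter's orientation.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr**2 + gamma**2 * yr**2) / (2 * sigma**2))
            carrier = math.cos(2 * math.pi * xr / lambd + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

# Example: a 7x7 horizontal-orientation Gabor filter.
k = gabor_kernel(7, theta=0.0)
```

To initialize a convolutional layer, one kernel per output channel would be generated with varying orientations and scales and copied into the layer's weight tensor.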
All the validation and training accuracies during the run of these experiments were saved in tracked_params.
All plots from the paper were generated using parameters in tracked_params. The specific code used can be found in final_plotting.ipynb.
| Model | Epoch 5 | Epoch 15 |
|---|---|---|
| Random initialization | 28.246 | 30.924 |
| Pre-trained on CIFAR-100 | 28.338 | 30.268 |
| Clustered initialization | 31.1 | 35.374 |