[TLI] Transfer Learning Between Different Architectures Via Weights Injection 🐏 🐏 🐏

maciejczyzewski/tli-pytorch

Transfer Learning by Injection (TLI)

"Transfer Learning Between Different Architectures Via Weights Injection"

Introduction

This work presents a naive algorithm for parameter transfer between different architectures, using a computationally cheap injection technique that requires no data. The primary objective is to speed up the training of neural networks from scratch. The study found that transferring knowledge from any architecture was superior to Kaiming and Xavier initialization. In conclusion, the method converges faster, making it a drop-in replacement for classical initialization methods.
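The injection idea can be illustrated with a minimal sketch. Note that this only shows the trivial same-architecture base case (copying parameters whose names and shapes match); the actual TLI algorithm also matches layers between *different* architectures. `naive_inject` is a hypothetical helper for illustration, not part of `tli.py`:

```python
import torch
import torch.nn as nn

def naive_inject(student: nn.Module, teacher: nn.Module) -> int:
    """Copy every teacher parameter whose name and shape match a
    student parameter. Returns the number of parameters copied."""
    teacher_params = dict(teacher.named_parameters())
    copied = 0
    with torch.no_grad():  # in-place copy, no autograd tracking
        for name, param in student.named_parameters():
            src = teacher_params.get(name)
            if src is not None and src.shape == param.shape:
                param.copy_(src)
                copied += 1
    return copied

# Toy check: inject a small MLP into an identically shaped one.
student = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
teacher = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
n = naive_inject(student, teacher)  # copies 4 tensors: 2 weights + 2 biases
```

The interesting part of TLI is precisely what this sketch omits: deciding which teacher layer should initialize which student layer when names and shapes do not line up.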

How to use

Just copy `tli.py` into your project. There is only one main function, `apply_tli(model, teacher)`, where `teacher` can be, for example, the name of a pretrained model.

Example:

```python
from tli import apply_tli

apply_tli(model, teacher='tf_efficientnet_b0')
```

Results replication

WARNING: the final version of the algorithm is still being developed.

```shell
$ python3 research_run.py
```

Authors

This work will be developed further in collaboration with Kamil Piechowiak and Daniel Nowak as part of a bachelor's thesis at the Poznan University of Technology, Poznan, Poland.

Requirements

This code was tested on Python 3.x and PyTorch 1.7.0, but it should work on older versions as well.
