Refactor model compression examples #3326
Conversation
Tensorflow
""""""""""

.. autoclass:: nni.algorithms.compression.tensorflow.pruning.LevelPruner
@liuzhe-lz do we still support TensorFlow, with at least one pruner provided?
The tensorflow example was removed here: https://github.com/microsoft/nni/pull/3242/files#diff-8555e1a0ab0c25960a752bdb8741ae4de1d9ab10634740970a657a9ebff38c42
You have reviewed it 🙂
@liuzhe-lz @QuanluZhang Sorry for deleting by mistake 🥺
Uploaded the TensorFlow example naive_prune_tf.py.
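For context, a minimal sketch of how the TensorFlow LevelPruner could be wired up; the Keras model and the config values here are illustrative assumptions, not taken from naive_prune_tf.py:

    import tensorflow as tf
    from nni.algorithms.compression.tensorflow.pruning import LevelPruner

    # Toy Keras model standing in for the real example network.
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(10),
    ])

    # Prune 50% of the weights (the sparsity value and config keys are assumptions).
    config_list = [{'sparsity': 0.5, 'op_types': ['default']}]
    pruner = LevelPruner(model, config_list)
    model = pruner.compress()  # returns the model wrapped with pruning masks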
docs/en_US/Compression/Pruner.rst (outdated)
This is an one-shot pruner, which prunes filters with the smallest geometric median
better to add a little bit more description
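As a rough illustration of what that description covers, FPGM-style filter pruning might be used like the sketch below; the model, sparsity value, and v2 import path are assumptions, not text from the PR:

    import torch.nn as nn
    from nni.algorithms.compression.pytorch.pruning import FPGMPruner

    # Tiny placeholder network with convolution layers to prune.
    model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Conv2d(16, 32, 3))

    # Prune 50% of the filters in every Conv2d layer; filters closest to the
    # geometric median of their layer are treated as redundant.
    config_list = [{'sparsity': 0.5, 'op_types': ['Conv2d']}]
    pruner = FPGMPruner(model, config_list)
    model = pruner.compress()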
docs/en_US/Compression/Pruner.rst (outdated)
   :target: ../../img/l1filter_pruner.png
   :alt:

This is an one-shot pruner, which prunes the filters prunes filters in the **convolution layers**.
"prunes the filters prunes filters"?
Fixed it.
.. image:: ../../img/l1filter_pruner.png
   :target: ../../img/l1filter_pruner.png
   :alt:
so you think figure is not helpful?
Yes, figures look redundant.
docs/en_US/Compression/Pruner.rst (outdated)
@@ -471,7 +418,8 @@ PyTorch code

   pruner.update_epoch(epoch)

You can view :githublink:`example <examples/model_compress/pruning/model_prune_torch.py>` for more information.
You can view :githublink:`mnist example <examples/model_compress/pruning/basic_pruners_torch.py>` for more information.
there is no command for this one?
added
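(For reference, the added command presumably looks something like `python basic_pruners_torch.py --pruner l1filter --sparsity 0.5`; the flag names here are an assumption, the actual options are defined by the script's argparse setup.)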
@@ -4,39 +4,35 @@ Knowledge Distillation on NNI
KnowledgeDistill
----------------

Knowledge distillation support, in `Distilling the Knowledge in a Neural Network <https://arxiv.org/abs/1503.02531>`__\ , the compressed model is trained to mimic a pre-trained, larger model. This training setting is also referred to as "teacher-student", where the large model is the teacher and the small model is the student.
Knoiwledge Distillation (KD) is proposed in `Distilling the Knowledge in a Neural Network <https://arxiv.org/abs/1503.02531>`__\ , the compressed model is trained to mimic a pre-trained, larger model. This training setting is also referred to as "teacher-student", where the large model is the teacher and the small model is the student. KD is often used to fine-tune the pruned model.
knowledge
Fixed it.
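For readers unfamiliar with the technique, a minimal hand-written sketch of a typical KD objective (this is not the NNI KnowledgeDistill API; the temperature and weighting values are assumed hyper-parameters, not taken from the example):

    import torch.nn.functional as F

    def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
        # The student mimics the teacher's softened output distribution...
        soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                        F.softmax(teacher_logits / T, dim=1),
                        reduction='batchmean') * (T * T)
        # ...while still fitting the ground-truth labels.
        hard = F.cross_entropy(student_logits, labels)
        return alpha * soft + (1 - alpha) * hard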
@@ -1,9 +0,0 @@
AGPruner:
what is this file used for?
It is not used in the examples and looks redundant, so I removed it.
@@ -62,30 +62,6 @@ def get_data(dataset, data_dir, batch_size, test_batch_size):
])),
batch_size=batch_size, shuffle=False, **kwargs)
criterion = torch.nn.CrossEntropyLoss()
elif dataset == 'imagenet':
why is imagenet removed?
BTW, what kind of pruner is an auto pruner?
Suggest adding one more example that combines auto-tuning and a pruner, for example, tuning the sparsity.
Suggest adding a more detailed description at the top of each of these example code files.
ImageNet is not used in the comparison experiment, so I removed it for simplicity and clarity. I added more description at the top of the file, please review the latest version.
# Licensed under the MIT license.
'''
Examples for level pruner on mnist
'''
this is a very simple example right? suggest to rename it to "naive_example_torch.py"
do you think "naive_prune_torch" is better?
agree
# Licensed under the MIT license.
'''
Examples for basic pruners
'''
What is the difference between this file and basic_pruners_kd_torch.py?
Simplified the KD example, please review the latest version.
trialConcurrency: 1
trialGpuNumber: 0
tuner:
name: grid
the indent is strange
better to tell users how to start this experiment
Fixed it.
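(Presumably the doc now tells users to save the YAML as something like config.yml and start the experiment with `nnictl create --config config.yml`; the file name is an assumption, but `nnictl create` is the standard way to launch an NNI experiment.)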
pruner.compress()

# after testing
nni.report_final_results(acc)
would be better to add simple code to show how acc is created
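Something along these lines would address the comment; a hedged sketch of a test loop that produces `acc` (the model, device, and test_loader are assumed to exist in the example, and note the actual NNI API is `report_final_result`, singular):

    import torch
    import nni

    def test(model, device, test_loader):
        model.eval()
        correct = 0
        with torch.no_grad():
            for data, target in test_loader:
                data, target = data.to(device), target.to(device)
                pred = model(data).argmax(dim=1)
                correct += (pred == target).sum().item()
        return correct / len(test_loader.dataset)

    acc = test(model, device, test_loader)
    nni.report_final_result(acc)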
}

Then we need to modify our codes for few lines
The previous example manually choosed L2FilterPruner and pruned with a specified sparsity. Different sparsity and different pruners may have different effect on different models. This process can be done with NNI tuners.
"choosed" -> "chose"
docs/en_US/Compression/Pruner.rst (outdated)

Slim Pruner **prunes channels in the convolution layers by masking corresponding scaling factors in the later BN layers**\ , L1 regularization on the scaling factors should be applied in batch normalization (BN) layers while training, scaling factors of BN layers are** globally ranked** while pruning, so the sparse model can be automatically found given sparsity.

This is an one-shot pruner, which adds sparsity regularization on the scaling factors of batch normalization (BN) layers durting training to identify unimportant channels. . The channels with small scaling factor values will be pruned. For more details, please refer to `'Learning Efficient Convolutional Networks through Network Slimming' <https://arxiv.org/pdf/1708.06519.pdf>`__\.
typo: "channels. ."
Fixed the typos.
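For orientation, a minimal sketch of how Slim Pruner is typically configured; since it works on BN scaling factors, the config targets BatchNorm2d layers (the model, sparsity value, and import path are assumptions, and the real constructor may also require training arguments for the sparsity-regularized training):

    import torch.nn as nn
    from nni.algorithms.compression.pytorch.pruning import SlimPruner

    # Placeholder network containing Conv2d + BatchNorm2d pairs.
    model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.BatchNorm2d(16), nn.ReLU())

    # Channels are masked through the scaling factors of the BN layers.
    config_list = [{'sparsity': 0.5, 'op_types': ['BatchNorm2d']}]
    pruner = SlimPruner(model, config_list)
    model = pruner.compress()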
@@ -45,7 +45,7 @@ After training, you get accuracy of the pruned model. You can export model weigh

   pruner.export_model(model_path='pruned_vgg19_cifar10.pth', mask_path='mask_vgg19_cifar10.pth')

The complete code of model compression examples can be found :githublink:`here <examples/model_compress/pruning/model_prune_torch.py>`.
Please refer :githublink:`mnist example <examples/model_compress/pruning/mnist_torch.py>` for quick start.
there is no such file
Thanks for pointing it out. The link has been updated.
docs/en_US/Compression/Pruner.rst (outdated)
@@ -471,7 +416,12 @@ PyTorch code

   pruner.update_epoch(epoch)

You can view :githublink:`example <examples/model_compress/pruning/model_prune_torch.py>` for more information.
You can view :githublink:`mnist example <examples/model_compress/pruning/naive_example_torch.py>` for a quick start.
it is a little strange, why mention quick start here?
I want to show the simplest example for the usage of pruners. I will refactor the QuickStart.rst and remove the quick start from this file.
'''
Examples for automatic pruners
Example for supported automatic pruning algorithms.
In this example, we present the usage of automatic pruners (NetAdapt, AutoCompressPruner). L1, L2, FPGM pruners are aims for comparsion.
"are aims for comparison" -> "are also executed for comparison purpose"
Thanks, fixed it.
'''
NNI example for supported basic pruning algorithms.
In this example, we show the end-to-end pruning process: pre-training -> pruning -> fine-tuning.
Note that pruners use masks to simiulate the real pruning. In order to obtain a real compressed model, model speed up is required.
-> simulate
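Since the docstring notes that masks only simulate pruning, a hedged sketch of the speed-up step that turns the masked model into a genuinely smaller one (the input shape is a placeholder; the mask file name follows the export_model call quoted elsewhere in this review):

    import torch
    from nni.compression.pytorch import ModelSpeedup

    # model: the fine-tuned model with the original architecture still in place.
    dummy_input = torch.randn(1, 3, 32, 32)  # placeholder input shape
    m_speedup = ModelSpeedup(model, dummy_input, 'mask_vgg19_cifar10.pth')
    m_speedup.speedup_model()  # rewrites `model` into the compact architecture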
# Licensed under the MIT license.

'''
NNI exmaple for fine-tuning the pruend model with KD.
'pruend' -> 'pruned'
'''
NNI exmaple for fine-tuning the pruend model with KD.
Run basic_pruners_torch.py first to get the pruend model.
'pruend' -> 'pruned'
def get_model_optimizer_scheduler(args, device, train_loader, test_loader, criterion):
    if args.model == 'lenet':
        model = LeNet().to(device)
this is not right, we should use the masked model instead of the original model.
Sorry, I meant the KD part.
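In other words, before distillation the student should be the pruned/masked network rather than a fresh copy; a hedged sketch of what that could look like (the constructor, helper import, and file paths are assumptions for illustration):

    import torch
    from nni.compression.pytorch import apply_compression_results

    # Build the architecture, load the pruned weights, then re-apply the masks
    # so the KD student really is the masked model.
    model = VGG(depth=19)  # hypothetical constructor from the example
    model.load_state_dict(torch.load('pruned_vgg19_cifar10.pth'))
    apply_compression_results(model, 'mask_vgg19_cifar10.pth')
    # ... then fine-tune `model` as the student against the teacher.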
}

Then we need to modify our codes for few lines
The previous example manually chose L2FilterPruner and pruned with a specified sparsity. Different sparsity and different pruners may have different effect on different models. This process can be done with NNI tuners.
may have different effects...
Last, define our task and automatically tuning pruning methods with layers sparsity
Then, define a ``config`` file in YAML to automatically tuning model, pruning algorithm and sparisty.
sparisty -> sparsity
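A hedged sketch of what such a config file could look like, based on the snippet quoted above; the search space file name and trial command are assumptions:

    searchSpaceFile: search_space.json
    trialCommand: python3 basic_pruners_torch.py
    trialConcurrency: 1
    trialGpuNumber: 0
    tuner:
      name: grid
    trainingService:
      platform: local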
docs/en_US/Compression/Pruner.rst (outdated)
@@ -1,16 +1,15 @@
Supported Pruning Algorithms on NNI
===================================

We provide several pruning algorithms that support fine-grained weight pruning and structural filter pruning. **Fine-grained Pruning** generally results in unstructured models, which need specialized haredware or software to speed up the sparse network.** Filter Pruning** achieves acceleratation by removing the entire filter. We also provide an algorithm to control the** pruning schedule**.
We provide several pruning algorithms that support fine-grained weight pruning and structural filter pruning. **Fine-grained Pruning** generally results in unstructured models, which need specialized haredware or software to speed up the sparse network. **Filter Pruning** achieves acceleratation by removing the entire filter. Some pruning algorithms use one-shot method that prune weights at once based on an importance metric. Other pruning algorithms control the **pruning schedule** that prune weights during optimization, including some automatic pruning algorithms.
haredware -> hardware
acceleratation -> acceleration
docs/en_US/Compression/Pruner.rst (outdated)

Slim Pruner **prunes channels in the convolution layers by masking corresponding scaling factors in the later BN layers**\ , L1 regularization on the scaling factors should be applied in batch normalization (BN) layers while training, scaling factors of BN layers are** globally ranked** while pruning, so the sparse model can be automatically found given sparsity.

This is an one-shot pruner, which adds sparsity regularization on the scaling factors of batch normalization (BN) layers durting training to identify unimportant channels. The channels with small scaling factor values will be pruned. For more details, please refer to `'Learning Efficient Convolutional Networks through Network Slimming' <https://arxiv.org/pdf/1708.06519.pdf>`__\.
durting -> during?
Thanks! Fixed the typos!
Examples:
- mnist_torch.py for quick start.
- basic_pruners_torch.py
- basic_pruners_kd_torch.py
Doc: