---
title: Creating recipes with SparseML Recipe Template
metaTitle: Creating Sparsification Recipes with SparseML
metaDescription: Creating Sparsification Recipes with SparseML
index: 1000
---

# Creating Sparsification Recipes with SparseML Recipe Template

This page explains how to create recipes using the SparseML Recipe Template local API.

SparseML Recipe Template is a simple API for generating sparsification recipes on the fly. In the most basic use case, it takes in information about the desired pruning and quantization techniques and produces a matching recipe. Optionally, a path to a `torch.jit` loadable model can also be specified, in which case the produced recipe is specific to the architecture of that model. The sparsification recipes are produced in markdown or YAML format and can be used directly across all SparseML utilities. Learn more about sparsification recipes here.

SparseML Recipe Template is built for users who might otherwise create their recipes by hand, enabling more fine-grained control or custom modifiers when sparsifying a new model from scratch with a single API command.

## Overview

SparseML Recipe Template can optionally take in the path to a `torch.jit` loadable model for which the user wishes to generate a sparsification recipe; if no such model is provided, a generic recipe based on good defaults and the designated parameters is produced. If you are using the Python API, you may also supply an already instantiated PyTorch `Module`.
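For example, here is a minimal sketch of the Python API path. It assumes a `recipe_template` entry point importable from `sparseml.pytorch` that accepts pruning/quantization settings and an instantiated `Module`; the exact import path and argument names may differ across SparseML versions:

```python
# Hedged sketch: assumes sparseml.pytorch exposes a recipe_template entry
# point; check your SparseML version for the exact import and signature.
import torchvision.models as models
from sparseml.pytorch import recipe_template

# Any torch.nn.Module works; a torchvision ResNet-50 is used for illustration.
model = models.resnet50(pretrained=True)

# Generate a recipe tailored to the supplied model's architecture.
recipe = recipe_template(pruning="true", quantization=True, model=model)
print(recipe)
```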

This utility also provides different options for configuring the generated recipe according to the required sparsification techniques. Users are encouraged to adjust the recipes to their specific needs.

To learn more about usage, invoke the Recipe Template CLI with the `--help` option:

```bash
sparseml.recipe_template --help
```

The arguments configure different aspects of the recipe: whether it should include pruning and, if so, which algorithm to use; whether it should include quantization and, if so, which algorithm to use; and the learning rate schedule function type, among others.

Base Arguments:

```text
--pruning true|false (default)|ALGORITHM
    ALGORITHM: global magnitude pruning; gradual magnitude pruning (sparsity per layer);
               ACDC; Movement; MFAC; OBC
--quantization true|false (default)|TARGET_HARDWARE
    TARGET_HARDWARE: defaults to the QAT algorithm, e.g. vnni, tensorrt
--lr TYPE
    TYPE: constant, stepped, cyclic, exponential, linear
```
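For instance, a plausible invocation combining the arguments above (the flag values shown are illustrative; run `--help` for the exact accepted spellings):

```bash
# Illustrative: request a recipe with pruning and quantization enabled
# and a cyclic learning rate schedule.
sparseml.recipe_template --pruning true --quantization true --lr cyclic
```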

## Creating the Recipe Template

Once `sparseml.recipe_template` is run, a base recipe template will be created with the necessary top-level variables based on the compression methods specified (i.e., pruning, quantization). These templates will either be Python strings with `{variable}` placeholders left in for on-the-fly formatting, or functions that take in the desired variables and return the constructed template.
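As an illustration of the string form, a minimal sketch is shown below; the placeholder names are hypothetical, not the template's actual variables, though `EpochRangeModifier` and `GMPruningModifier` are standard SparseML modifiers:

```python
# Hypothetical sketch of a recipe template string with {variable}
# placeholders; the real templates' variable names may differ.
PRUNING_TEMPLATE = """
training_modifiers:
  - !EpochRangeModifier
    start_epoch: 0.0
    end_epoch: {num_epochs}

pruning_modifiers:
  - !GMPruningModifier
    params: __ALL_PRUNABLE__
    init_sparsity: {init_sparsity}
    final_sparsity: {final_sparsity}
    start_epoch: {pruning_start_epoch}
    end_epoch: {pruning_end_epoch}
"""

# Fill in the placeholders on the fly.
recipe = PRUNING_TEMPLATE.format(
    num_epochs=30.0,
    init_sparsity=0.05,
    final_sparsity=0.85,
    pruning_start_epoch=0.0,
    pruning_end_epoch=20.0,
)
```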

Notes on template construction:

- For QAT only, a constant pruning modifier with `__ALL_PRUNABLE__` will be added (see the sketch after this list).
- If a PyTorch model is supplied, pruning params will be selected using this helper function. If time allows, the following filters, among others, should be added:
  - Ignoring depthwise layers in MobileNet-like models (i.e., convs with `groups == num_channels`)
  - Ignoring the first and last layers for pruning
  - [stretch] Setting a minimum number of parameters a layer must have to be considered prunable in the recipe
- Variables and values in the template recipe should then be updated according to user input and the standards set in the comments above, along with the acceptance criteria.
- Different pruning modifiers may have different constructor arguments; this will need to be accounted for when swapping algorithms.
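A hedged sketch of the QAT-only case mentioned above, using SparseML's `ConstantPruningModifier` and `QuantizationModifier` (epoch values are placeholders):

```yaml
# Illustrative QAT-only excerpt: hold the sparsity of all prunable layers
# constant while quantization-aware training runs.
pruning_modifiers:
  - !ConstantPruningModifier
    params: __ALL_PRUNABLE__
    start_epoch: 0.0

quantization_modifiers:
  - !QuantizationModifier
    start_epoch: 0.0
```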

## Enabling the Recipe Template

The output will be a `recipe.md` file, with information added based on the intended usage.

The YAML frontmatter of the markdown recipe will be the generated recipe template. In the markdown section of the recipe, the user will be given instructions on how to plug this recipe into a Manager and run training-aware sparsification (and one-shot for testing) using SparseML.

Additionally, a printout of the settable recipe variables will be provided, with instructions on how to override them in the Manager API. [STRETCH] Parse documentation for each variable from comments in the recipe template, or [MINIMAL] direct the user to read the YAML section above.
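For reference, a minimal sketch of plugging a generated recipe into a Manager with SparseML's `ScheduledModifierManager`, including a recipe-variable override (`num_epochs` is an illustrative variable name, and the model/optimizer are stand-ins for your training setup):

```python
# Minimal sketch: load a generated recipe, override a recipe variable,
# and wrap an optimizer for a training-aware run.
from torch.nn import Linear
from torch.optim import SGD

from sparseml.pytorch.optim import ScheduledModifierManager

model = Linear(128, 10)                      # stand-in for your real model
optimizer = SGD(model.parameters(), lr=0.1)  # stand-in optimizer

manager = ScheduledModifierManager.from_yaml(
    "recipe.md",                              # the generated recipe file
    recipe_variables={"num_epochs": 50},      # illustrative variable name
)
optimizer = manager.modify(model, optimizer, steps_per_epoch=100)

# ... run the training loop as usual ...

manager.finalize(model)
```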

Additional Features:

- Sparse transfer mode (use a constant pruning modifier instead of a pruning modifier)
- Convenience functions to replace the Manager's `from_yaml` (e.g., `manager.sparsification(...)`)
- For LR modifiers (see the sketch after this list):
  - When pruning, cycle through `init_lr` and `final_lr`
  - When finetuning, cycle through `init_lr` and `final_lr`
  - When quantizing, hold constant at `final_lr`
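A hedged sketch of that LR behavior using SparseML's `LearningRateFunctionModifier` and `SetLearningRateModifier`; the `cyclic_linear` schedule, epoch boundaries, and LR values are illustrative and may differ in your SparseML version:

```yaml
# Illustrative excerpt: cycle the LR between init_lr and final_lr while
# pruning/finetuning, then hold it constant at final_lr during QAT.
training_modifiers:
  - !LearningRateFunctionModifier
    lr_func: cyclic_linear
    cycle_epochs: 5.0
    init_lr: 0.01
    final_lr: 0.0001
    start_epoch: 0.0
    end_epoch: 20.0

  - !SetLearningRateModifier
    learning_rate: 0.0001
    start_epoch: 20.0
```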

## Additional Recipe Template Modifiers

When using the Python API, you can also fine-tune specific parameters at a more detailed level, such as Epoch, LR, Pruning, QAT, and Modifiers, in addition to the base arguments, to configure the generated recipe to your needs.