# FMS Acceleration Framework Library

This contains the library code that implements the acceleration plugin framework, in particular the classes:

- `AccelerationFramework`
- `AccelerationPlugin`

The library is envisioned to:

- Provide a single integration point into Hugging Face.
- Manage `AccelerationPlugin`s in a flexible manner.
- Load plugins from a single configuration YAML, while enforcing compatibility rules on how plugins can be combined.

See the following resources:

## Using `AccelerationFramework` with HF Trainer

Begin by instantiating an `AccelerationFramework` object, passing a YAML configuration (say, via a `path_to_config`):

```python
from fms_acceleration import AccelerationFramework
framework = AccelerationFramework(path_to_config)
```

Plugins are configured automatically based on the configuration; for more details on how plugins are configured, see Configuration of Plugins below.
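
For illustration, the file at `path_to_config` is a YAML whose stanzas select and configure plugins. A minimal sketch, using the `auto_gptq` path registered in Configuration of Plugins below, might look like this (the settings under the stanza are placeholders):

```yaml
plugins:
    peft:
        quantization:
            auto_gptq:
                # settings here are passed to the plugin when instantiating;
                # stanzas for plugins that are not installed are passed through
                ...
```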

Some plugins may require a custom model loader (in place of the typical `AutoModel.from_pretrained`). In this case, call `framework.model_loader`:

```python
model = framework.model_loader(model_name_or_path, ...)
```

For example, in the GPTQ case (see the sample GPTQ QLoRA configuration), we require `model_name_or_path` to be custom loaded from a quantized checkpoint.

We provide a flag `framework.requires_custom_loading` to check if the plugins require custom loading.
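
For instance, the loading step can be guarded on this flag, falling back to the typical `AutoModel.from_pretrained` path mentioned above; a minimal sketch:

```python
from transformers import AutoModel

if framework.requires_custom_loading:
    # the installed plugins take over model loading,
    # e.g., to load from a quantized checkpoint
    model = framework.model_loader(model_name_or_path)
else:
    # otherwise use the typical Hugging Face loading path
    model = AutoModel.from_pretrained(model_name_or_path)
```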

Some plugins will also require the model to be augmented, e.g., by replacing layers with plugin-compliant PEFT adapters. In this case:

```python
# will also take in some other configs that may affect augmentation
# some of these args may be modified due to the augmentation
# e.g., peft_config will be consumed in augmentation, and returned as None
#       to prevent SFTTrainer from doing extraneous PEFT logic
model, (peft_config,) = framework.augmentation(
    model,
    train_args, modifiable_args=(peft_config,),
)
```

We also provide `framework.requires_augmentation` to check if augmentation is required by the plugins.
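
Similarly, the augmentation call can be guarded on this flag; this sketch simply reuses the call shown above:

```python
if framework.requires_augmentation:
    # peft_config may be consumed and returned as None (see comments above)
    model, (peft_config,) = framework.augmentation(
        model, train_args, modifiable_args=(peft_config,),
    )
```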

Finally, pass the model to the trainer:

```python
# e.g. using transformers.Trainer. Pass in model (with training enhancements)
trainer = Trainer(model, ...)

# call train
trainer.train()
```

That's all! The model will now reap all the acceleration speedups provided by the plugins that were installed!

## Configuration of Plugins

Each package in this monorepo:

- can be independently installed. Install only the libraries you need:

  ```shell
  pip install fms-acceleration/plugins/peft
  pip install fms-acceleration/plugins/unsloth # to be available in the near future
  ```
- can be independently configured. Each plugin is registered under a particular configuration path. E.g., the autogptq plugin is registered under the config path `peft.quantization.auto_gptq` (a sketch of registering a custom plugin follows this list):

  ```python
  AccelerationPlugin.register_plugin(
      AutoGPTQAccelerationPlugin,
      configuration_and_paths=["peft.quantization.auto_gptq"],
  )
  ```

  This means that it will be configured under that exact stanza:

  ```yaml
  plugins:
      peft:
          quantization:
              auto_gptq:
                  # everything under here will be passed to plugin
                  # when instantiating
                  ...
  ```
- When instantiating `fms_acceleration.AccelerationFramework`, it internally parses through the configuration stanzas. For plugins that are installed, it will instantiate them; stanzas for plugins that are not installed are simply passed through.

- `AccelerationFramework` will manage the plugins transparently for the user. The user only needs to call `AccelerationFramework.model_loader` and `AccelerationFramework.augmentation`.
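
To make the registration mechanism concrete, below is a minimal sketch of a hypothetical custom plugin. Only `AccelerationPlugin.register_plugin` and its `configuration_and_paths` argument appear in the text above; the plugin name, configuration path, and the `model_loader`/`augmentation` method signatures are assumptions modeled on the framework-level calls in the previous section, so check them against the installed package:

```python
# Hypothetical plugin, for illustration only. Method names and signatures are
# assumed to mirror the framework-level model_loader/augmentation calls above.
from fms_acceleration import AccelerationPlugin


class MyQuantAccelerationPlugin(AccelerationPlugin):
    def model_loader(self, model_name_or_path, **kwargs):
        # e.g., load the model from a quantized checkpoint
        ...

    def augmentation(self, model, train_args, modifiable_args):
        # e.g., swap layers for plugin-compliant PEFT adapters
        return model, modifiable_args


# register under a configuration path: everything under
# plugins.peft.quantization.my_quant in the YAML is passed to the plugin
AccelerationPlugin.register_plugin(
    MyQuantAccelerationPlugin,
    configuration_and_paths=["peft.quantization.my_quant"],
)
```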