Wanda #1834
Conversation
This is a lot of shared code with the SparseGPTModifier. If this needs to get pushed in quickly to support research, then sure, ship it! But if not, I'm uncomfortable pushing this since there's so much duplicated code across multiple files. Rather than moving the shared code around, could the Wanda modifier just inherit from SparseGPT and override functions as necessary? We do something similar for the SmoothQuant and LogQuant modifiers.
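To make the suggestion concrete, here is a minimal sketch of that inheritance pattern, using stand-in classes rather than the actual sparseml API (the real modifiers hook into the modifier lifecycle, which is elided here):

```python
import torch


class SparseGPTModifier:
    """Stand-in for the real modifier: owns the shared pruning loop."""

    def __init__(self, sparsity: float):
        self.sparsity = sparsity

    def score_weights(self, weight: torch.Tensor, inputs: torch.Tensor) -> torch.Tensor:
        # SparseGPT ranks weights with a Hessian-based saliency (elided here).
        raise NotImplementedError

    def compress(self, weight: torch.Tensor, inputs: torch.Tensor) -> torch.Tensor:
        # Shared logic: score weights, then zero the lowest-scoring fraction
        # within each output row.
        scores = self.score_weights(weight, inputs)
        k = int(weight.shape[1] * self.sparsity)
        if k == 0:
            return weight
        thresholds = scores.kthvalue(k, dim=1, keepdim=True).values
        return weight * (scores > thresholds)


class WandaPruningModifier(SparseGPTModifier):
    """Overrides only the saliency metric; the pruning loop is inherited."""

    def score_weights(self, weight: torch.Tensor, inputs: torch.Tensor) -> torch.Tensor:
        # Wanda score: |W_ij| * ||X_j||_2, i.e. weight magnitude scaled by the
        # L2 norm of the matching input feature over the calibration samples.
        return weight.abs() * inputs.norm(p=2, dim=0)
```

The real SparseGPT path also performs iterative weight updates rather than one-shot masking; the point is just that only `score_weights` would need to differ between the two modifiers.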
LGTM! One nit: I'm thinking we should move the obcq folder to pruning/obcq?
I agree, let's do that as a follow-up PR.
Initial implementation of Wanda; updated to use memory tricks similar to OBCQ `LayerCompressor` and `SparseGPT`.

Research paper link: https://arxiv.org/abs/2306.11695
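For context, the "memory tricks" presumably refer to the OBCQ-style sequential pattern: hold only one layer on the GPU while compressing it, then reuse its outputs as calibration inputs for the next layer. A rough sketch under that assumption (the helper names are illustrative, not the actual `LayerCompressor` API):

```python
import torch


@torch.no_grad()
def compress_layer_by_layer(layers, calib_inputs, compress_fn, device="cuda"):
    """Compress `layers` sequentially so peak GPU memory stays near one layer.

    `calib_inputs` is a list of CPU tensors captured at the model's first
    layer; `compress_fn(layer, inputs)` prunes the layer in place.
    """
    for layer in layers:
        layer.to(device)
        inputs = [x.to(device) for x in calib_inputs]
        compress_fn(layer, inputs)  # prune using this layer's captured inputs
        # Re-run the now-compressed layer so the next layer sees post-pruning
        # activations, then move everything back to CPU to free GPU memory.
        calib_inputs = [layer(x).cpu() for x in inputs]
        layer.cpu()
        torch.cuda.empty_cache()
    return calib_inputs
```

With toy `torch.nn.Linear` layers this runs as-is; the actual `LayerCompressor` presumably also handles input capture via hooks and per-layer bookkeeping.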
Smaller test recipe (targets just one layer):
The requested changes have been added in a stacked PR fashion; use the itemized list below to navigate:

- Terminal `ModuleCompressor` contract: #1885 (closed as changes were included in #1887)
- `LayerCompressor` contract: #1886 (closed as changes are included in #1887)

Major changes include: