
[Model Compression] admm pruner #4116

Merged
merged 7 commits into microsoft:master on Sep 22, 2021

Conversation

@J-shang (Contributor) commented Aug 26, 2021

No description provided.

@J-shang force-pushed the compression_v2_admm branch from 7b7bf11 to f312f7b on August 26, 2021 08:42
@J-shang marked this pull request as ready for review on August 27, 2021 02:46
@@ -655,3 +654,122 @@ def reset_tools(self):
            self.sparsity_allocator = Conv2dDependencyAwareAllocator(self, 0, self.dummy_input)
        else:
            raise NotImplementedError('Only support mode `normal`, `global` and `dependency_aware`')


class ADMMPruner(OneShotPruner):
@xiaowu0162 (Contributor) commented Aug 27, 2021

It seems that ADMMPruner has some of the functionality of PruningScheduler, e.g., it performs multiple iterations and keeps track of context/state data like Z and U (I'm not sure whether this is supposed to be handled by the "task" abstraction we discussed). How do we want to integrate ADMMPruner with the scheduler logic?

@J-shang (Contributor, Author)

Yes, ADMM has scheduler-like logic, but in fact it does not generate masks in each iteration. ADMM only resets the elements with small magnitudes to zero, and these elements will still be trained in the following iterations. Only in the last iteration does ADMM generate masks. So making ADMM a pruner may be more reasonable.
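To make the distinction concrete, a minimal sketch (illustrative only, not this PR's code; the tensor name and the 50% ratio are assumptions): zeroing small-magnitude weights in place keeps them trainable, while a mask derived at the end makes the sparsity permanent.

```python
import torch

weight = torch.randn(8, 8, requires_grad=True)

# Per-iteration ADMM step: reset the smallest-magnitude elements to zero
# in place. No mask is stored, so later gradient updates can make these
# elements non-zero again.
with torch.no_grad():
    k = weight.numel() // 2                                # zero out half
    threshold = weight.abs().flatten().kthvalue(k).values
    weight[weight.abs() <= threshold] = 0.

# Last iteration only: derive a persistent mask from the surviving weights.
mask = (weight != 0.).float()
```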

Contributor

I see. I just read the ADMM paper, and it seems that the iterations are for solving the optimization problem rather than for iterative pruning.

    def _validate_config_before_canonical(self, model: Module, config_list: List[Dict]):
        schema_list = [deepcopy(NORMAL_SCHEMA), deepcopy(INTERNAL_SCHEMA)]
        for schema in schema_list:
            schema.update({SchemaOptional('row'): And(float, lambda n: n > 0)})
@xiaowu0162 (Contributor) commented Aug 30, 2021

I recommend using 'rho' instead, since 'row' is confusing. But this involves a change to the original API, so we might not want to do that.

Contributor

But I do think "row" is very confusing.

@J-shang (Contributor, Author)

Yes, it's a good suggestion; rho is better. I will modify it.
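For illustration, a config entry using the renamed key might look like the sketch below; 'op_types' and 'sparsity' follow NNI's usual config_list conventions, and the exact value of rho is an assumption here (the schema above only requires a positive float).

```python
# Hypothetical config_list entry after renaming 'row' to 'rho'.
# 'rho' is the penalty coefficient of the ADMM augmented Lagrangian term.
config_list = [{
    'op_types': ['Conv2d'],
    'sparsity': 0.5,
    'rho': 1e-4,  # assumed value; must be a positive float per the schema
}]
```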

        self.training_epochs = training_epochs
        super().__init__(model, config_list)

        self.Z = {name: wrapper.module.weight.data.clone().detach() for name, wrapper in self.get_modules_wrapper().items()}
Contributor

Please use clear names rather than Z and U.

@J-shang (Contributor, Author)

U can be named `scaled_dual_variable`, but I have no good name for Z. The author rewrites the original problem as a two-block optimization, and Z can be seen as an auxiliary copy of the weight. I think Z is used because in ADMM the second optimization variable is conventionally denoted z:

    minimize    f(x) + g(z)
    subject to  Ax + Bz = c

And in compression, the problem is

    minimize    f({W_i}) + sum_i g_i(Z_i)
    subject to  W_i = Z_i,  i = 1, ..., N

where g_i is the indicator function of the sparsity constraint set for layer i.
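As a sketch of the roles these variables play (scaled-form ADMM; the projection logic below is an illustration, not this PR's implementation):

```python
import torch

def admm_zu_update(W: torch.Tensor, U: torch.Tensor, sparsity: float):
    """One scaled-form ADMM step for the Z- and U-updates (sketch).

    The W-update happens during normal training, with the extra penalty
    (rho / 2) * ||W - Z + U||^2 added to the task loss.
    """
    V = W + U
    # Z-update: Euclidean projection of W + U onto the sparsity constraint
    # set, i.e. keep the largest-magnitude elements and zero out the rest.
    k = int(V.numel() * sparsity)          # number of elements to zero out
    if k > 0:
        threshold = V.abs().flatten().kthvalue(k).values
        Z = torch.where(V.abs() > threshold, V, torch.zeros_like(V))
    else:
        Z = V.clone()
    # U-update: the scaled dual variable accumulates the residual W - Z.
    U = U + W - Z
    return Z, U
```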

Contributor

Got it.

@QuanluZhang merged commit 8b61e77 into microsoft:master on Sep 22, 2021
@J-shang deleted the compression_v2_admm branch on October 25, 2021 03:25