fix(speedup): refactor the execution logic of functions in speedup #5107
Conversation
docs/source/compression/pruning.rst
Outdated
@@ -105,4 +105,5 @@ In addition, for the convolutional layers that have more than one filter group,
``dependency-aware pruner`` will also try to prune the same number of channels for each filter group.
Overall, this pruner will prune the model according to the L1 norm of each filter and try to meet the topological constraints (channel dependency, etc.) to improve the final speed gain after the speedup process.

Operations that will be recognized as having channel dependencies: add/sub/mul/div, addcmul/addcdiv, logical_and/or/xor
.
?
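For context on the added doc line above: an elementwise add forces the layers feeding it to keep matching channel indices. A minimal sketch with a toy model (the module and names below are illustrative, not part of this PR):

import torch
import torch.nn as nn

class AddDependency(nn.Module):
    def __init__(self):
        super().__init__()
        # conv1 and conv2 are channel-dependent: their outputs are added
        # elementwise, so a pruner must prune the same output channels in both
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(3, 16, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv1(x) + self.conv2(x)

out = AddDependency()(torch.randn(1, 3, 8, 8))
print(out.shape)  # torch.Size([1, 16, 8, 8])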
@@ -725,8 +717,6 @@ def _build_graph(self):

        # associate module name with their trace graph nodes
        for node in graph.nodes():
            if node.kind() == CONSTANT_KIND:
                continue
When should we skip CONSTANT_KIND and when should we not? I found you did not remove all of the if node.kind() == CONSTANT_KIND: checks in the code.
if node.type == 'module':
    inputs_name = node.inputs
else:
    inputs_name = [val_node.debugName() for val_node in node.key_node.inputs()]
What is key_node? Maybe we need a doc to explain the important attributes of a node.
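Agreed that a doc would help. As a starting point, a hedged sketch of the attributes this diff relies on, inferred only from the snippet above (the exact semantics should be confirmed against the graph utilities):

class NodeGroupAttrsSketch:
    """Assumed semantics of the node attributes used in this diff.

    type : str
        'module' for nodes wrapping an nn.Module; other values for
        functional/prim nodes.
    inputs : list of str
        Pre-resolved input debug names, available for 'module' nodes.
    key_node : torch._C.Node
        The underlying TorchScript node of the group; for non-module nodes the
        input debug names are read as [v.debugName() for v in key_node.inputs()].
    """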
elif isinstance(obj, dict):
    return {k: recr_detacher(v) for k, v in obj.items()}
else:
    return obj
Can obj be a customized data type with a detach function?

try:
    return obj.detach()
except AttributeError:
    return obj
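To make the suggestion above concrete, a minimal sketch of the full recursive detacher with the duck-typed fallback (only the dict branch appears in the diff; the tensor and tuple/list branches are assumed from context):

import torch

def recr_detacher(obj):
    # recursively detach tensors nested inside tuples/lists/dicts
    if isinstance(obj, torch.Tensor):
        return obj.detach()
    elif isinstance(obj, (tuple, list)):
        return type(obj)(recr_detacher(v) for v in obj)
    elif isinstance(obj, dict):
        return {k: recr_detacher(v) for k, v in obj.items()}
    else:
        # duck-typed fallback: customized data types exposing detach()
        try:
            return obj.detach()
        except AttributeError:
            return obj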
@@ -297,16 +318,15 @@ def update_indirect_sparsity(self, node):
                debug_name = auto_infer.input_debugname[in_id]

                last_output = self.internal_result[debug_name]
                # if isinstance(last_output, torch.Tensor):
                # TODO what if last output is tuple/list of tensor
So what if the last output is a tuple/list of tensors?
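If the answer is to walk the container, a small sketch of applying a per-tensor update to a tensor, or to a (possibly nested) tuple/list of tensors (the helper name is illustrative, not code from this PR):

import torch

def for_each_tensor(output, fn):
    # apply fn to every tensor found in output, which may be a tensor,
    # a tuple/list of tensors, or a nested combination of both
    if isinstance(output, torch.Tensor):
        fn(output)
    elif isinstance(output, (tuple, list)):
        for item in output:
            for_each_tensor(item, fn)

# usage example: collect the shapes of all tensors in a nested output
last_output = (torch.randn(2, 3), [torch.randn(4), torch.randn(5)])
shapes = []
for_each_tensor(last_output, lambda t: shapes.append(tuple(t.shape)))
print(shapes)  # [(2, 3), (4,), (5,)]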
Totally refactored the old speedup in https://github.com/Louis-J/nni/tree/refactor_speedup
The speedup v2 is now runnable, so it is better to reassess whether refactoring the old speedup is suitable or not.
Description
In speedup, we use a non-general trick to handle the prim ops: we analyze the relationships between the prim ops, delete all the prim ops, and finally connect the non-prim ops directly. We also assume that all intermediate variables are tensors, tuples of tensors, or lists of tensors. This causes many problems in various models.
So it is better to actually execute all the prim ops. In this PR, I remove the deletion of prim ops and execute them recursively. The assumption about variable types is also removed, so the execution is now closer to the original one.
However, this changes a lot of code and is untested for now. More work is needed.
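To illustrate the direction, executing prim ops for real means mapping each prim:: node kind to its Python equivalent instead of deleting the node. A stand-alone sketch, not the code in this PR (the dispatch table and function name are invented for the example):

def execute_prim_node(kind, inputs):
    # illustrative dispatch for a few common TorchScript prim:: kinds
    if kind == 'prim::TupleConstruct':
        return tuple(inputs)
    elif kind == 'prim::ListConstruct':
        return list(inputs)
    elif kind == 'prim::TupleUnpack':
        return tuple(inputs[0])
    elif kind == 'prim::Constant':
        # a real implementation would read the value attribute from the node;
        # here we assume it was already extracted into inputs[0]
        return inputs[0]
    else:
        raise NotImplementedError(f'prim op {kind} is not handled in this sketch')

print(execute_prim_node('prim::TupleConstruct', [1, 2]))    # (1, 2)
print(execute_prim_node('prim::ListConstruct', [1, 2, 3]))  # [1, 2, 3]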
#5097
Test Options
Checklist
How to test