-
Notifications
You must be signed in to change notification settings - Fork 431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change functional surgery method return values to None #1543
Change functional surgery method return values to None #1543
Conversation
Can we instead return how many layers are changed? Would be very useful for agent bc then it can skip surgeries if they are no-ops |
This is an option. However, there are lots of algorithms which aren't surgery based, for which there is no clear number to return. e.g.: compute_ema. One priority should be consistency for users using the functional interface. Returning non-none will reduce this. We currently raise NoEffectWarnings for no-op model surgery algorithms; the agent could catch those warnings. |
Agreed with @nik-mosaic , having the methods return an
whereas |
IMO it's ok to return "something" for apply methods if it makes sense (e.g., number of layers replaced as @mvpatel2000 suggested) but not model if apply method is modifying the model inplace. Any user of apply_* method will look at least look at the API to see what it does and will be able to see return type/value. Therefore, consistency across the board for the apply APIs is, I think, too restrictive. |
I think returning an int would pretty clearly indicate what's going on here, and the docs can be clear this is in-place. It's a lot cleaner to just measure how many layers are changed vs. trying to catch warnings and interpret them. Basically +1 to Daya's points |
Per offline discussion, functional model surgery methods will return # of modified layers where applicable and None elsewhere. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may break code that uses functional version of algorithms and uses return values. I think that's ok though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, one request:
Add a note here (https://docs.mosaicml.com/en/v0.10.1/trainer/algorithms.html#model-surgery) explaining that all surgery methods return the number of instances replaced.
Also, do we need to add a test to enforce this API?
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
@nik-mosaic do we close this PR? |
In a December design discussion, we backtracked on our decision to return the number of layers modified. We decided that for user-friendliness, functional algorithms should return None instead. This PR is now ready to merge. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should go out on 13. Holding off for now since we're going to cut a 12.1 off dev
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to merge
Closes https://mosaicml.atlassian.net/browse/CO-988 and #1441 .
For consistency sake, and to ensure that the user knows the model is being mutated, all applicable functional model surgery methods return None, rather than some returning the model object and some returning None.