(WIP) Multi platform abstraction tweaks #1195
I was thinking about this a little bit in #1173, actually. It may be a good opportunity to rename some of these ops to make them clearer, even if they are still aliased for BC reasons. Possible examples:
raise NotImplementedError

class FourBitMatmul(ABC):
Something to think about here: nested quantization needs the KBitQuantization interface (quantize_blockwise/dequantize_blockwise).
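To make the dependency concrete, here is a rough sketch of how a 4-bit interface could lean on the k-bit one for nested quantization. Only the names `FourBitMatmul`, `KBitQuantization`, `quantize_blockwise` and `dequantize_blockwise` come from this thread; the signatures, the `compress_absmax`/`restore_absmax`/`matmul` methods and the toy list-based quantizer are assumptions made for illustration, not the PR's actual code.

```python
from abc import ABC, abstractmethod
from typing import List, Tuple


class KBitQuantization(ABC):
    """Assumed shape of the k-bit sub-interface (signatures are hypothetical)."""

    @abstractmethod
    def quantize_blockwise(
        self, values: List[float], blocksize: int
    ) -> Tuple[List[int], List[float]]:
        raise NotImplementedError

    @abstractmethod
    def dequantize_blockwise(
        self, codes: List[int], absmax: List[float], blocksize: int
    ) -> List[float]:
        raise NotImplementedError


class ToyInt8Quantization(KBitQuantization):
    """Toy pure-Python absmax quantizer to signed 8-bit codes (illustration only)."""

    def quantize_blockwise(self, values, blocksize):
        codes, absmax = [], []
        for start in range(0, len(values), blocksize):
            block = values[start:start + blocksize]
            # Per-block scale; fall back to 1.0 for an all-zero block.
            scale = max(abs(v) for v in block) or 1.0
            absmax.append(scale)
            codes.extend(round(v / scale * 127) for v in block)
        return codes, absmax

    def dequantize_blockwise(self, codes, absmax, blocksize):
        return [c / 127 * absmax[i // blocksize] for i, c in enumerate(codes)]


class FourBitMatmul(ABC):
    """Hypothetical 4-bit interface that holds a KBitQuantization for nesting."""

    def __init__(self, kbit: KBitQuantization):
        self.kbit = kbit

    def compress_absmax(self, absmax, blocksize):
        # Nested quantization: the per-block scales are quantized blockwise too.
        return self.kbit.quantize_blockwise(absmax, blocksize)

    def restore_absmax(self, codes, nested_absmax, blocksize):
        return self.kbit.dequantize_blockwise(codes, nested_absmax, blocksize)

    @abstractmethod
    def matmul(self, a, b):
        raise NotImplementedError


class ToyFourBitMatmul(FourBitMatmul):
    def matmul(self, a, b):
        raise NotImplementedError  # out of scope for this sketch
```

The point of the sketch is only that the 4-bit path cannot be fully independent: whichever backend implements it also needs a working blockwise quantizer for the scales.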
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Closing this, as it will be replaced by the upcoming torch.library refactor.
I'll fill in more details as I go along here, but for now I'm just publicly sharing what I'm working on. Feel free to already comment if anything catches your eye.
I've gathered extensive feedback from previous PRs (some of them closed) on this topic and am implementing my interpretation of what's needed, what Tim wants (API structure and sub-interfaces), and what the community suggested, e.g. tensor-driven dispatch and deferred initialization.
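For readers unfamiliar with the two community suggestions, a minimal sketch of the general idea follows. It is not the code in this PR: `FakeTensor`, `BackendRegistry`, `CpuBackend` and `double` are hypothetical names, and a plain string stands in for a real tensor's device so the example runs without any dependencies.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class FakeTensor:
    """Stand-in for a tensor; only a device type string and data are modelled."""
    device: str
    data: List[float]


class BackendRegistry:
    """Maps device types to backend factories; backends are built on first use."""

    def __init__(self):
        self._factories: Dict[str, Callable[[], object]] = {}
        self._instances: Dict[str, object] = {}

    def register(self, device_type: str, factory: Callable[[], object]) -> None:
        # Registering is cheap: nothing is constructed yet (deferred initialization).
        self._factories[device_type] = factory

    def for_tensor(self, tensor: FakeTensor):
        # Tensor-driven dispatch: the input's device decides which backend runs.
        key = tensor.device
        if key not in self._instances:
            if key not in self._factories:
                raise NotImplementedError(f"no backend registered for device {key!r}")
            self._instances[key] = self._factories[key]()
        return self._instances[key]


class CpuBackend:
    created = 0  # counts constructions, to make the lazy behaviour observable

    def __init__(self):
        CpuBackend.created += 1

    def double(self, tensor: FakeTensor) -> FakeTensor:
        return FakeTensor(tensor.device, [2 * v for v in tensor.data])


registry = BackendRegistry()
registry.register("cpu", CpuBackend)


def double(tensor: FakeTensor) -> FakeTensor:
    """Public op: callers never pick a backend explicitly."""
    return registry.for_tensor(tensor).double(tensor)
```

With this shape, importing the library does not touch any device runtime; a backend is only constructed the first time an op receives a tensor on that device, and unsupported devices fail with a clear error at call time.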
The plan is to ask the community for extensive feedback once I'm through with the changes I have in mind (aiming for this week), see how to continue from there, and then merge the Intel PR #1178 once it has been adapted to this slightly modified architecture.