Add aten mkldnn zero_ operator #20573
Conversation
Summary:
### mkldnn backward ops list:
- [ ] (#20567) Add aten mkldnn conv2d backward operator 💛
- [ ] (#20570) Add aten mkldnn backward ops: relu, linear and reshape 💛
- [ ] (#20571) Add aten mkldnn backward ops: max_pool2d, avg_pool2d and adaptive_avg_pool2d 💛
- [ ] (#20572) Add aten mkldnn batchnorm backward operator 💛
- [ ] (#20573) Add aten mkldnn zero_ operator 💛
- [ ] (#20575) Add mkldnn mul operator 💛

Pull Request resolved: #20575
Differential Revision: D15799529
Pulled By: bddppq
fbshipit-source-id: 4887d8ef1a0e316ad9db199b657d9481fc13e486
There are merge conflicts and failed CI jobs; could you rebase?
Force-pushed from 683a232 to e000509
@bddppq, I rebased it, but the failing cases are not related to my changes. Could you help take a look at them? Thanks!
auto n = x.get_nelems();
auto* x_ = static_cast<float *>(x.get_data_handle());
parallel_for(0, n, 2048, [x_](int64_t begin, int64_t end){
  vec256::map(
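The hunk above is cut off by the review UI at the `vec256::map(` call. As context for the discussion below, here is a hedged sketch of how a map-based zeroing loop of this shape could continue, assuming ATen's `vec256::map(vec_fun, out, in, size)` signature and the 2019-era `at::vec256` namespace; the wrapper name `zero_mkldnn_buffer` is hypothetical, and this is a reconstruction, not the verbatim PR code.

```cpp
#include <cstdint>
#include <ATen/Parallel.h>
#include <ATen/cpu/vec256/vec256.h>
#include <ATen/cpu/vec256/functional.h>

// Hypothetical wrapper; in the PR, x_ comes from the ideep tensor's raw buffer.
void zero_mkldnn_buffer(float* x_, int64_t n) {
  using Vec = at::vec256::Vec256<float>;
  at::parallel_for(0, n, 2048, [x_](int64_t begin, int64_t end) {
    at::vec256::map(
        [](Vec /*unused*/) { return Vec(0.f); },  // produce a zero vector for each chunk
        x_ + begin,                               // output: the same buffer, zeroed in place
        x_ + begin,                               // input: loaded but ignored (map's overhead)
        end - begin);
  });
}
```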
@cpuhrsch Is it safe to unconditionally use these vectorized utilities? Or do we need to guard them with some AVX macro checks?
Vec256 uses AVX and AVX2 guards and falls back to a compatibility version, so calling https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/cpu/vec256/functional.h#L99 without compiling it for AVX2 is still correct.
But since this isn't compiled with AVX2 for OSS builds, it might be slow for users. Why isn't this a kernel?
@cpuhrsch, if this is written as a kernel, the input cannot be a TensorIterator as in add_kernel; should it be written like SoftMaxKernel instead? Please give me some advice, thanks!
Inputs don't have to be TensorIterators. We prefer TensorIterator as the input for kernels because we use the same iterator for CPU and CUDA, but in your case you can implement a kernel that takes a Tensor as input.
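A minimal sketch of the structure this suggests, using ATen's standard DispatchStub mechanism (`DECLARE_DISPATCH` / `REGISTER_DISPATCH`) with a Tensor-taking signature. The names `mkldnn_zero_stub` and `mkldnn_zero_kernel` are hypothetical, and the kernel body is a scalar placeholder rather than the vectorized code discussed here.

```cpp
#include <algorithm>
#include <ATen/core/Tensor.h>
#include <ATen/native/DispatchStub.h>
#include <ATen/native/mkldnn/MKLDNNCommon.h>  // for itensor_from_mkldnn, as in the diff

namespace at { namespace native {

// Stub declaration (normally in a header), plus DEFINE_DISPATCH(mkldnn_zero_stub)
// once in a .cpp file.
using zero_fn = void (*)(Tensor&);
DECLARE_DISPATCH(zero_fn, mkldnn_zero_stub);

// Kernel implementation: lives in a .cpp under native/cpu so the build system
// compiles it once per CPU capability (scalar, AVX, AVX2, ...).
static void mkldnn_zero_kernel(Tensor& self) {
  ideep::tensor& x = itensor_from_mkldnn(self);
  auto* data = static_cast<float*>(x.get_data_handle());
  // Placeholder scalar fill; the real kernel would use vectorized stores.
  std::fill(data, data + x.get_nelems(), 0.f);
}

REGISTER_DISPATCH(mkldnn_zero_stub, &mkldnn_zero_kernel);

}} // namespace at::native
```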
Using map here is also inefficient; please consider writing something like vec256::fill, which would loadu a vector, fill it, and store it.
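For illustration, a hedged sketch of such a fill: broadcast the value into a single Vec256 register and store it repeatedly, avoiding map's load-transform-store round trip. The name `vec256_fill` is hypothetical, and Vec256's exact API (e.g. whether `size` is a constant or a function) has varied across PyTorch versions.

```cpp
#include <cstdint>
#include <ATen/cpu/vec256/vec256.h>

// Hypothetical fill in the spirit of the suggestion above.
template <typename scalar_t>
void vec256_fill(scalar_t* data, scalar_t value, int64_t n) {
  using Vec = at::vec256::Vec256<scalar_t>;
  const Vec v(value);                         // broadcast once; no loads needed at all
  int64_t d = 0;
  for (; d + Vec::size() <= n; d += Vec::size()) {
    v.store(data + d);                        // one vector store per chunk
  }
  for (; d < n; ++d) {                        // scalar tail for the remainder
    data[d] = value;
  }
}
```

With something like this in place, the parallel_for body above could call `vec256_fill(x_ + begin, 0.f, end - begin)` instead of map.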
@VitalyFedyunin, thanks for your suggestion, I will change the code.
@VitalyFedyunin, I have added a fill API in ideep; perhaps processing the opaque tensor directly is not a good fit for PyTorch. After talking with @jgong5, something like vec256::fill would only be useful for mkldnn tensors, so that approach may not bring a big benefit. Thanks!
@bddppq has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary:
### mkldnn backward ops list:
- [ ] (pytorch/pytorch#20567) Add aten mkldnn conv2d backward operator 💛
- [ ] (pytorch/pytorch#20570) Add aten mkldnn backward ops: relu, linear and reshape 💛
- [ ] (pytorch/pytorch#20571) Add aten mkldnn backward ops: max_pool2d, avg_pool2d and adaptive_avg_pool2d 💛
- [ ] (pytorch/pytorch#20572) Add aten mkldnn batchnorm backward operator 💛
- [ ] (pytorch/pytorch#20573) Add aten mkldnn zero_ operator 💛
- [ ] (pytorch/pytorch#20575) Add mkldnn mul operator 💚

Pull Request resolved: pytorch/pytorch#20573
Differential Revision: D15820477
Pulled By: bddppq
fbshipit-source-id: 35d95f5b4e013c8db1911f52148550a2e40a2e68