This is great, how can I learn to do it myself? #9
Great question! A good starting point is the Custom C++ and CUDA Extensions tutorial by Peter Goldsborough. To get started converting your own function, I would recommend doing the following:
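The common pattern, regardless of how the kernel itself is compiled, is to wrap it in a `torch.autograd.Function` with explicit `forward`/`backward` methods. A minimal sketch (the `Square` op and its pure-Python body are illustrative stand-ins; a real extension would call into compiled C++/CUDA kernels at the marked lines):

```python
import torch


class Square(torch.autograd.Function):
    """Illustrative custom op: y = x**2 with a hand-written gradient.

    In a real C++/CUDA extension, the bodies of forward/backward would
    dispatch to the compiled kernels instead of using torch ops.
    """

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)       # stash inputs needed for backward
        return x * x                   # stand-in for my_ext.forward(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        return 2 * x * grad_out        # stand-in for my_ext.backward(...)


x = torch.tensor([3.0], requires_grad=True)
y = Square.apply(x)
y.backward()
# x.grad is now tensor([6.0]) since dy/dx = 2x
```

Once this wrapper works with a pure-Python body, swapping in the compiled kernels is a localized change.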
A (perhaps easier) alternative would be leveraging numba. It is a similar approach, but the kernel can be written in Python and just-in-time compiled for the CPU or GPU. Maghoumi/pytorch-softdtw-cuda is a great example of this method (thanks to Mathieu for pointing it out). If done properly, you can likely achieve performance on par with the C++/CUDA implementation. I hope this helps. If enough people ask, I would be happy to write a blog post about the process! - Teddy
Thanks, Teddy, for the explanation. I hope more people show interest so we can see your blog post on this topic.
I'll leave it open for visibility :) |
Closing this, but keeping it pinned |
Thanks for sharing this code. Is it a rewritten version of https://github.com/google-research/fast-soft-sort? What is the benefit of this version compared to Google's fast-soft-sort?
What makes me really interested in this work is the implementation. I have found it hard to incorporate new operations into PyTorch with C++ and CUDA. I know it is a little too much to ask, but it would be greatly appreciated if you could write a tutorial (or make a video) on how to do this for other functions.
That would be a huge help!