-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement NVTXW export #54
Implement NVTXW export #54
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't had time to look at the NVTXW code yet, but here are my preliminary thoughts on the part of this that integrates into the prof-viewer code.
Overall this is dramatically better, thanks for doing this work. I appreciate that you work with the existing abstractions rather than working around them. There are a handful of places where I think it would be better to avoid so much hard-coding, but this is hopefully a relatively minor change from what you've got now.
There is one structural change you might consider, which I left a comment about in the source code. At the moment, you essentially read the entire data set into memory, interleave it into one massive table, and then de-interleave it as you write it back out. First of all this uses memory proportional to the size of the profile. Second, it just seems like a lot of work to do this (and probably inefficient). I think that an approach in which you match tiles with their corresponding meta tile, and then write that as a unit, would probably be more efficient. As a bonus it would avoid needing to have so many streams open on the NVTXW side. I think you can do this with fixed memory usage while still allowing a degree of parallelism by configuring the number of outstanding calls you allow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments on the NVTXW code itself.
278ca3e
to
2e257de
Compare
Updated to fix introduced clippy diagnostics |
I'm on vacation but I will hit the "run" button as long as you keep pushing changes. I'll get back to you with a review on August 26. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reviewed the Rust NVTXW crate.
Overall my highest level comment is that you may want to think about exactly how high- or low-level you want this interface to be. The current interface is serviceable and does, strictly speaking, remove unsafe
from user code. But the code that users have to write is still a bit questionable: ideally, users wouldn't have to think that hard about struct packing, CString
, and raw pointers at all. So this may be fine for now, but just recognize that as written it's not a great example of best-in-class Rust API wrappers for C.
I added some minor code comments to improve quality in various ways throughout the code, and suggested some more idiomatic practices.
Some other things you might think about for NVTXW, which I did not specifically address:
|
2e257de
to
77f22ee
Compare
Thanks for your feedback, it's really helpful both to make the code better and to build my Rust skills. I've updated the branch to address some items, and I will continue with the rest and update it further. Right now, I am particularly interested in feedback on my revised implementation of tile matching, using a single |
77f22ee
to
519e2fb
Compare
519e2fb
to
3c95db6
Compare
The CI is now failing with
The only thing I can think of that might have introduced this is specifying the version of bindgen in the crate's Cargo.toml. |
Try reverting to bindgen https://github.com/StanfordLegion/prof-viewer/actions/runs/10411469581/job/28835561058 |
d6a8e73
to
6d41688
Compare
947b741
to
8ed87b9
Compare
@elliottslaughter I've fixed the build on macOS. Thanks for bringing that to my attention. |
8ed87b9
to
435c521
Compare
I've implemented this request. |
84c51df
to
3726193
Compare
Other than the discussion about the enums in the NVTXW crate, I believe I have addressed all review feedback. Once I have resolved it and have your approval, I am prepared to proceed with the remaining steps for publishing the crate and updating the branches. We should be able to complete this within the week, in time for your merge window. |
45af5c7
to
9583a40
Compare
@evanramos-nvidia I wanted to address one point mentioned in your previous comment:
I posted the release timeline here on August 29: In particular, feature freeze was September 16. The intent of our release system is to follow a train model. The goals are (a) to run frequent releases so it matters less what specific release a feature gets into, and (b) merge features when they're ready (rather than sprint to get them ready by some deadline). Even though we have a cutoff date we usually shy away from large merges toward the end of the cycle anyway. Based on the fact that this is new code being added and we're past the feature freeze date, it would be my plan to merge this right after the release so that it's early in the next cycle. My understanding is that cuNumeric tracks the Let me know if you have any questions. |
@evanramos-nvidia are you ready to move forward with this? We can make this available to the cuNumeric folks as soon as it hits |
9583a40
to
c80bbfa
Compare
Updated to complete these steps. |
fcd668d
to
ca6c5ba
Compare
ca6c5ba
to
13f1867
Compare
Elliott, thank you for your help in reviewing this PR. It has been valuable to get your feedback and coaching since this is the first Rust project I've worked on. |
Yup, and thanks for putting in the work to make all the changes. It's been a long road but I think the code is in a much better place now, with a much better foundation for building future work. The Legion side MR got merged so this should be available to the Legate folks as soon as they update Legion. |
This PR contains the second iteration of our NVTXW export feature. The new code looks significantly different than before, keeping in mind the feedback you've given so far. We may still revise some details of these changes, particularly where the Rust code is perhaps too C-like, but overall it is ready for review. It is at a point where getting the code reviewed by someone with more Rust experience will help inform us how we want to finalize it.
I've marked this PR as a draft while we determine where we will put the crate. For review and CI purposes, I've set it up in the subdir
crate-tmp
, but this is temporary. If you have any feedback to offer on the crate in addition to the Legion Prof Viewer code itself, we would definitely appreciate that, since we are new to Rust and still learning.