-
Notifications
You must be signed in to change notification settings - Fork 340
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds active stat items for clean subtasks #2245
Conversation
accounts-db/src/active_stats.rs
Outdated
@@ -17,6 +27,16 @@ pub struct ActiveStats { | |||
#[derive(Debug, Copy, Clone)] | |||
pub enum ActiveStatItem { | |||
Clean, | |||
CleanCollectCandidates, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wow. this is a lot of granularity. my gut says we need half of these. otoh, what is the downside of all these? some noise? not sure. certainly helpful as we debug these oom situations. not sure how useful it is usually. since they are metrics it also seems that we don't need log files to find 'info' logs. I'm thinking we don't really need the 1/0 results. maybe if we just logged the 1 each time each phase started and we could look at the code to see there are no gaps between stages. that seems kind of hacky and just trying to save half the logs. each of these logs will be once per clean, so once per 100 slots. This is a rambling comment. In short, I'm reluctant to put in this much granularity. But, maybe it is hte right idea and I shouldn't be so conservative?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is a lot. I agree likely too much. I figured I'd start with "everything" and then we can reduce it. I plan to run this on the dev box to see where the oom happens. I figure for that use-case, more is better. Maybe it'll inform what parts can be reduced for this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've revived this PR and reduced the number of new items a bit. How's the new set look?
265363a
to
7da6044
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems a good set of stat to have.
I believe that these stats are reported every ~100 slots. right?
Yeah, whenever |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
c6f1803
7da6044
to
c6f1803
Compare
I had to resolve merge conflicts (removal of uncleaned_roots), so I rebased and force-pushed. No code was changed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm.
Problem
At startup there are OOM reports that we believe are caused by
clean
. But when there's an OOM, we don't get the metrics fromclean
to know which part was the culprit.Summary of Changes
Add "active stat" items for the clean subtasks.