Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: implement GC metrics collection without native(C++) modules #274

Merged
merged 6 commits into from
Feb 1, 2020

Conversation

yvasiyarov
Copy link
Contributor

I've implemented GC monitoring using perf_hooks builtin module (no dependencies on third party native modules).
Unfortunately perf_hooks module does not provide ability to measure heap size before GC, so I do not report amount of garbage collected( like https://github.com/SimenB/node-prometheus-gc-stats does)

But this implementation reports GC duration percentiles(with split by GC type) which can be quite useful.

Copy link
Collaborator

@zbjornson zbjornson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. @SimenB and @siimon might have some thoughts on aligning these metrics with node-prometheus-gc-stats. I like that this uses a summary for the pause duration instead of a counter, but I wonder if the two counter names should end with _total to match that module. That would cause trampling if both modules are used -- prom-client could detect if node-prometheus-gc-stats is loaded and not collect GC metrics itself maybe. (thinking out loud, not sure if these are good ideas or not.)

Tested locally, looks good. Have a CI failure though. I know SimenB will also ask for the changelog to be updated 🙂

lib/metrics/gc.js Show resolved Hide resolved
lib/metrics/gc.js Outdated Show resolved Hide resolved
@yvasiyarov
Copy link
Contributor Author

@zbjornson I've fixed unit tests. Regarding naming of metrics I do not have strong opinion.
From one side added metrics serve same function as https://github.com/SimenB/node-prometheus-gc-stats module, so it might be a good idea to use same names for metrics.

On the other side this metrics is not drop in replacement for node-prometheus-gc-stats . As I said some metrics is available only at node-prometheus-gc-stats (and this information is not available through perf_hooks interface). So there might be a reason to have both sets of metrics available at the same time.

If we will use same names for metrics - than prom-client will be conflicting with node-prometheus-gc-stats module.

@yvasiyarov
Copy link
Contributor Author

@SimenB what can I do to get this PR merged ?
if you want me to fix something, please, let me know.

@yvasiyarov
Copy link
Contributor Author

@zbjornson sorry for disturbing you, but how can I have this PR merged ?

@yvasiyarov yvasiyarov force-pushed the gc branch 2 times, most recently from 8c503fc to 99f65d3 Compare September 2, 2019 14:07
@siimon
Copy link
Owner

siimon commented Sep 8, 2019

I’m not a huge fan of adding checks in prom-client for detecting other libraries, in this case gc-stats. If we could go around it with different naming that would be great.

I’ve not read up on the native gc stats api, but what is preferred between this and the “old” solution? When should you use this and not the other? Should there be some guidance in the docs on this?

@zbjornson zbjornson force-pushed the master branch 2 times, most recently from d6f2cf6 to 29b6d86 Compare November 12, 2019 21:14
@zbjornson
Copy link
Collaborator

Re-reviewing... sorry for letting this sit.

The gc-stats module (used by node-prometheus-gc-stats) measures "bytes reclaimed" just by calling v8's GetHeapStatistics in the GC prologue and epilogue; it's not something v8 provides in the GC epilogue. I tested if process.memoryUsage().heapUsed could be used to approximate that, but it can't (there's no userland GC prologue so you're measuring newly allocated, live objects too). bytes_reclaimed is a useful metric; I think it would be worth opening an issue in nodejs/node suggesting it be added to perf_hooks.

(FWIW, the GC duration reported by perf_hooks and gc-stats is very similar.)

That means the current benefit of this PR is no native add-on for users to deal with and duration reported as a Summary (see also issue in node-prometheus-gc-stats to convert to a histogram), but the downside is no bytes_reclaimed metric.

One possible route forward is to essentially move node-prometheus-gc-stats into this module: add a note in the readme like "GC duration is available as a default metric. If you also desire the number of bytes reclaimed by each collection, you can install gc-stats and this module will use it automatically." Then this module would have try { require("gc-stats") } catch {} -- i.e. silently fall back to just using perf-hooks. Then we wouldn't have to worry about metric name conflicts between this and node-prometheus-gc-stats. I'm happy to help with this change if you'd like.

TBD is the _total suffix. I guess leave it off since that's best practice, although inconsistent with the rest of this module at the moment. Alternatively, add it for now and correct it in #284...

@zbjornson
Copy link
Collaborator

The gc-stats module (used by node-prometheus-gc-stats) measures "bytes reclaimed" just by calling v8's GetHeapStatistics in the GC prologue and epilogue

Now that I think about it, I think this method of determining bytes reclaimed is not accurate because v8's GC is concurrent. I've asked hooraybuffer on Twitter to confirm. Given that, I think this perf_hooks impl is the best route forward.

@yvasiyarov
Copy link
Contributor Author

@zbjornson thank you for spending your time on this PR and research you have made regarding "bytes reclaimed" metric.
Now, what should I do to make this PR merged ?
Do I need to add "_total" suffix or not ?

@zbjornson
Copy link
Collaborator

@siimon @SimenB how would you feel about deprecating node-prometheus-gc-stats due to the (likely) accuracy issue with "bytes reclaimed" mentioned above, and instead just monitoring GC duration in this module? (hooryabuffer didn't reply unfortunately.)

Copy link
Contributor

@sam-github sam-github left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in favour of allowing useful node.js builtin metrics to be exposed, and gc times are useful. Long gc times are a signal of ill health of a service. I don't think the names should be the same as the gc stats, I think they should be different (for coexistence and transition), and should be prom standard. These names should be run through the promcheck tool. If a histogram was used there wouldn't be a need for seperate count and summary metrics. Is there a reason to not use a prom histogram?

lib/metrics/gc.js Outdated Show resolved Hide resolved
lib/metrics/gc.js Outdated Show resolved Hide resolved
lib/metrics/gc.js Outdated Show resolved Hide resolved
lib/metrics/gc.js Outdated Show resolved Hide resolved
lib/metrics/gc.js Outdated Show resolved Hide resolved
lib/defaultMetrics.js Show resolved Hide resolved
@siimon
Copy link
Owner

siimon commented Jan 29, 2020

I see it as a great fit to deprecate the gc-stats module if it could be added into prom-client with no third party dependencies!

Sorry for taking so long on this, will try and keep an eye on this! @yvasiyarov if you could make the requested changes @sam-github pointed out that would be great and after that I think we can merge this one.

@sam-github thanks for taking your time on reviewing this!

@yvasiyarov
Copy link
Contributor Author

@siimon @sam-github thank you for the feedback. I will make requested changes in a few days

@yvasiyarov
Copy link
Contributor Author

@siimon @SimenB @zbjornson @sam-github guys, please, have a look at updated PR

Copy link
Contributor

@sam-github sam-github left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good. Sorry if I'm bikeshedding the details, its just that once its published its hard to change the details without being semver-major.

CHANGELOG.md Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
lib/metrics/gc.js Outdated Show resolved Hide resolved
lib/metrics/gc.js Outdated Show resolved Hide resolved
lib/metrics/gc.js Outdated Show resolved Hide resolved
lib/metrics/gc.js Outdated Show resolved Hide resolved
lib/metrics/gc.js Outdated Show resolved Hide resolved
@yvasiyarov
Copy link
Contributor Author

@sam-github thank you for your feedback, I really appreciate that. Please have another look at PR

Copy link
Contributor

@sam-github sam-github left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made two small suggestions, but this LGTM. Dependency on c++ addons is a continual source of pain for users, I think this is going to be really helpful.

CHANGELOG.md Outdated Show resolved Hide resolved
example/server.js Outdated Show resolved Hide resolved
@sam-github
Copy link
Contributor

Sorry, one other suggestion. The docs that got added to the CHANGELOG.md, I think they should go in the README.md, and the CHANGELOG.md should just have a oneliner: Add: feat: implement GC metrics collection without native(C++) modules.

yvasiyarov and others added 3 commits February 1, 2020 18:21
Co-Authored-By: Sam Roberts <vieuxtech@gmail.com>
Co-Authored-By: Sam Roberts <vieuxtech@gmail.com>
@yvasiyarov
Copy link
Contributor Author

@sam-github I've updated both CHANGELOG.md and README.md

Copy link
Contributor

@sam-github sam-github left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@siimon
Copy link
Owner

siimon commented Feb 1, 2020

🎉👍

@siimon siimon merged commit 9b4ef10 into siimon:master Feb 1, 2020
@siimon
Copy link
Owner

siimon commented Feb 1, 2020

Will try and merge some more PRs and make a release when I’m back on a regular computer on Monday 👍

@yvasiyarov yvasiyarov deleted the gc branch February 2, 2020 14:41
@TranBaVinhSon
Copy link

When will this pull request be released ? I need gc metrics in my project. Many thanks !

@SimenB
Copy link
Collaborator

SimenB commented Feb 17, 2020

After #333 lands

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants