Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix!: host-metrics system.cpu.utilization calculation fix #1741

Merged

Conversation

david-luna
Copy link
Contributor

Which problem is this PR solving?

this PR is fixing the calculation of system.cpu.utilization metrics which is currently returning values that are out of the range [0,1] specified in semantic conventions. See #1718

Short description of the changes

the calculation now is made based on the previous values returned from the os.cpus NodeJS API and the time it was called. Also Date.now static function is used to avoid unnecessary object creation (date objects) on each measurement.

@david-luna david-luna requested a review from a team October 16, 2023 11:27
@github-actions github-actions bot requested a review from legendecas October 16, 2023 11:27
@pichlermarc pichlermarc added bug Something isn't working priority:p2 Bugs and spec inconsistencies which cause telemetry to be incomplete or incorrect pkg:host-metrics labels Oct 16, 2023
@codecov
Copy link

codecov bot commented Oct 16, 2023

Codecov Report

Merging #1741 (7603e56) into main (de6156a) will decrease coverage by 0.40%.
Report is 1 commits behind head on main.
The diff coverage is 100.00%.

❗ Current head 7603e56 differs from pull request most recent head 768d62d. Consider uploading reports for the commit 768d62d to get more accurate results

@@            Coverage Diff             @@
##             main    #1741      +/-   ##
==========================================
- Coverage   91.44%   91.05%   -0.40%     
==========================================
  Files         143      133      -10     
  Lines        7273     6560     -713     
  Branches     1458     1308     -150     
==========================================
- Hits         6651     5973     -678     
+ Misses        622      587      -35     
Files Coverage Δ
packages/opentelemetry-host-metrics/src/enum.ts 100.00% <100.00%> (ø)
packages/opentelemetry-host-metrics/src/metric.ts 100.00% <100.00%> (ø)
...ges/opentelemetry-host-metrics/src/stats/common.ts 97.77% <100.00%> (+2.77%) ⬆️

... and 11 files with indirect coverage changes

Comment on lines 30 to 31
* The first time will return 0 for % masurements since there is not enough
* data to calculate it
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be possible to load cpus at startup and calculate the load from then? It would miss the first few milliseconds, but would avoid having a 0 measurement on each program restart.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would make testing harder but it is possible. I'll give it a try

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Luckily the test is setup in a way that is not affected by this data preload. The changes are pushed :)

packages/opentelemetry-host-metrics/test/metric.test.ts Outdated Show resolved Hide resolved
ensureValue(metric, { state: 'user', cpu: '0' }, 30247.935978659552);
ensureValue(metric, { state: 'system', cpu: '0' }, 21071.23374458153);
ensureValue(metric, { state: 'idle', cpu: '0' }, 124998.56618872957);
ensureValue(metric, { state: 'user', cpu: '0' }, 0.7);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the value rounded or is this just a coincidence that they're all 1 significant figure?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a coincidence. The data is mocked with packages/opentelemetry-host-metrics/test/mocks/cpu.json.

@david-luna david-luna changed the title fix: host-metrics system.cpu.utilization calculation fix fix!: host-metrics system.cpu.utilization calculation fix Oct 26, 2023
@david-luna david-luna requested a review from dyladan October 26, 2023 12:15
@david-luna
Copy link
Contributor Author

@legendecas @dyladan could you give another look please?

@@ -27,6 +27,11 @@ export enum METRIC_NAMES {
PROCESS_MEMORY_USAGE = 'process.memory.usage',
}

export enum METRIC_ATTRIBUTES {
Copy link
Contributor Author

@david-luna david-luna Oct 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added this enum to properly report the attributes with their full name as set on the semantic conventions

See: https://github.com/open-telemetry/semantic-conventions/blob/main/docs/system/system-metrics.md#metric-systemcputime

@legendecas
Copy link
Member

Thank you for working on this.

@david-luna
Copy link
Contributor Author

Thanks @legendecas :)

I'll prepare another PR for process.cpu.* metrics once this one is merged.

@legendecas legendecas enabled auto-merge (squash) November 7, 2023 03:20
@legendecas legendecas merged commit b9350d9 into open-telemetry:main Nov 7, 2023
10 checks passed
@dyladan dyladan mentioned this pull request Nov 6, 2023
@david-luna david-luna deleted the dluna/1718-fix-cpu-utilization branch November 7, 2023 11:08
jmcdo29 pushed a commit to jmcdo29/opentelemetry-js-contrib that referenced this pull request Nov 13, 2023
…metry#1741)

* tests: tune the fake clock for testing

* tests: tune the values to expect

* chore: lint autofix

* chore: improve os.cpus mock

* chore: revert implementation for process CPU usage

* chore: rename var

* chore: avoid having useless data upon 1st collection

* chore: remove comments

* chore: change system.cpu attribute names to match semantic conventions

* chore: update test

* chore: lint fix
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working pkg:host-metrics priority:p2 Bugs and spec inconsistencies which cause telemetry to be incomplete or incorrect
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants