Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor Windows perf data helper code #3871

Merged
merged 2 commits into from
Apr 3, 2017

Conversation

andrewkroh
Copy link
Member

@andrewkroh andrewkroh commented Mar 31, 2017

Refactoring of the Windows performance data code. There's more cleanup to come. See #3828.

  • Improve generated PDH code to return usable error codes.
  • Use the standard error interface in return values.
  • Use more restrictive param types in generated functions.
  • Create PdhErrno that interprets the PDH error codes and returns their human readable error string.

This PR is based on #3862, so it should be merged first to avoid duplicate code review.

// XXX: Sometimes counters return negative values. We can ignore
// this error. For an explanation see:
// https://www.netiq.com/support/kb/doc.php?id=7010545
if err == PDH_CALC_NEGATIVE_DENOMINATOR {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has to be if value.Err == PDH_CALC_NEGATIVE_DENOMINATOR {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. It would be nice to have test for this error in the future.

if !r.executed {
// Most counters require two sample values in order to compute a
// displayable value. So initial perform two samples.
r.query.Execute()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @andrewkroh. You wrote

but the first value will always be 0

This is because the time slot is to small between these two calls. I would prefer the following solution

if err := r.query.Execute(); err != nil {
		return nil, err
	}

	// Get the values.
	values, err := r.query.Values()
	if err != nil {
		return nil, errors.Wrap(err, "failed formatting counter values")
	}

	// Write the values into the map.
	result := common.MapStr{}
	var errs multierror.Errors

	for counterPath, value := range values {
		if value.Err != nil {
			// XXX: Sometimes counters return negative values. We can ignore
			// this error. For an explanation see:
			// https://www.netiq.com/support/kb/doc.php?id=7010545
			if value.Err == PDH_CALC_NEGATIVE_DENOMINATOR {
				result.Put(counterPath, 0)
				continue
			}
			if value.Err == PDH_INVALID_DATA {
				if !r.executed {
					// Most counters require two sample values in order to compute a
					// displayable value. So the first fetch for this counters return PDH_INVALID_DATA.
					// So check if this is the first fetch. If so just move on and dont't put that 0 value
					// into our result. If not this is maybe a real error
					r.executed = true
					continue
				}
			}
			errs = append(errs, value.Err)
			continue

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed up the error handling a bit to sort of match ^. If an error occurs it will report the metric as zero. But it uses some conditions to determine whether or not the error is reported in the event. There are two special cases, one for PDH_CALC_NEGATIVE_DENOMINATOR and one for PDH_INVALID_DATA on the first fetch.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to experiment and change it up. I primarily wanted to remove the sleep.

@martinscholz83
Copy link
Contributor

Hi @andrewkroh. These changes are really great!! I have left 2 comments about the first query. One thing i'm very interested in is how you get all these PDH error codes with that gcc command.

@martinscholz83
Copy link
Contributor

Ok, got it. These are define preprocessors coming from pdhmsg.h. And that -D command make them visible.

Refactoring of the Windows performance data code. There's more cleanup to come. See elastic#3828.

- Improve generated PDH code to return usable error codes.
- Use the standard error interface in return values.
- Use more restrictive param types in generated functions.
- Create PdhErrno that interprets the PDH error codes and returns their human readable error string.
@martinscholz83
Copy link
Contributor

Hi @andrewkroh. For error PDH_CALC_NEGATIVE_DENOMINATOR i currently write a test. But i think we should first merge your changes to not mix up things.

@ruflin ruflin merged commit 572bbfa into elastic:master Apr 3, 2017
@andrewkroh andrewkroh deleted the bugfix/perfmon-cgo branch July 5, 2017 19:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants