Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[vsphere] check revamp addressing several issues #3055

Merged
merged 16 commits into from
Jan 30, 2017
Merged

Conversation

truthbk
Copy link
Member

@truthbk truthbk commented Dec 2, 2016

What does this PR do?

This PR aims to address several issues with the check. Mostly focusing on the following points:

  • Identifying all relevant infrastructure elements - VMs, hosts, datastores, datacenters. etc. We had issues when due to the infrastructure hierarchy we weren't properly traversing the tree. By resorting to container views and allowing pyvmomi to do the traversal for us we should be guaranteed to view all infrastructure we're privileged enough to see.
  • Improving vCenter/vSphere 6+ compatibility.
  • Improving tagging.
  • Fixing a bug where we were submitting vsphere "rates" as rates instead of gauges, effectively computing the rate of rates. Vsphere rates are precomputed and should be submitted as gauges. This has been broken for a long time and might affect some customers, but we weren't doing the right thing.
  • Performance should have improved, however collecting all metrics is still CPU intensive.

Motivation

A general need to improve our vsphere check to better address customer requirements.

Copy link
Member

@yannmh yannmh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!


server_instance = self._get_server_instance(instance)
root_folder = server_instance.content.rootFolder
for resource_type in RESOURCE_TYPE_MAP:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't we keep

if i_key in self.morlist_raw and len(self.morlist_raw[i_key]) > 0:

?

Vsphere objects in self.morlist_raw[i_key][resource_type] are populated at the same time as the resource keys in self.morlist_raw[i_key].

Copy link
Member Author

@truthbk truthbk Dec 14, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I did that because now we pop() the elements per resource type in _cache_morlist_process(), before we knew when the list was empty, we were done. Now we may have processed all ComputeResources, or Datacenters, but not the VM's or whatever, so just checking for the length of the dictionary wouldn't be enough. Besides, we pop from the lists, not the keys. self.morlist_raw[i_key][resource_type] dicts are created the first time around, but they then exist through-out, I believe, so the len(self.morlist_raw[i_key]) == len(RESOURCE_TYPE_MAP) in subsequent iterations.

u"job_atomic: Exploring MOR %s: name=%s, class=%s",
obj, obj.name, obj.__class__
)
def _get_all_objs(content, vimtype, regexes=None, include_only_marked=False):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we should move _get_all_objs and build_resource_registry out of _discover_mor body?

@@ -582,15 +584,19 @@ def _discover_mor(self, instance, tags, regexes=None, include_only_marked=False)
If it's a node we want to query metric for, queue it in `self.morlist_raw` that
will be processed by another job.
"""
def _get_all_objs(content, vimtype, regexes=None, include_only_marked=False):
def _get_all_objs(content, vimtype, regexes=None, include_only_marked=False, tags=[]):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's avoid the mutable tags=[] as a default argument. It's easy to forget it and modify the value later on.

for c in container.view:
if not self._is_excluded(c, regexes, include_only_marked):
obj_list.append(c)
obj_list.append(dict(mor_type=vimtype, mor=c, hostname=c.name, tags=tags))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: NamedTuple for that. It's easier to use as we know that all keys exist, i.e.

vsphere_object.mtype

instead of

vsphere_object['mtype']

which is more ambiguous and can quickly become

vsphere_object.get('mtype')


mor['metrics'] = self._compute_needed_metrics(instance, available_metrics)
mor_name = str(mor['mor'])
if available_metrics:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if we don't check that? We were not doing it before.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, we probably don't need it, going to try with passing around the empty list.

interval = REAL_TIME_INTERVAL
else:
interval = None

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: we have some small code duplication, I wonder if it'd be nicer to have the interval stored once for all in mor, i.e.

dict(mor_type=vimtype, mor=c, hostname=c.name, tags=tags, interval=interval)

"vsphere.%s" % metric_name,
value,
hostname=mor['hostname'],
tags=['instance:%s' % instance_name]
tags=['instance:%s' % instance_name] + mor['tags']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is necessary. more['tags'] should be host tags, not metric tags.

@truthbk truthbk changed the title [vsphere][wip] check revamp addressing several issues [vsphere] check revamp addressing several issues Dec 14, 2016
@truthbk truthbk merged commit e309c52 into master Jan 30, 2017
truthbk added a commit to DataDog/integrations-core that referenced this pull request Feb 3, 2017
truthbk added a commit to DataDog/integrations-core that referenced this pull request Feb 3, 2017
truthbk added a commit to DataDog/integrations-core that referenced this pull request Feb 16, 2017
truthbk added a commit to DataDog/integrations-core that referenced this pull request Feb 16, 2017
@truthbk truthbk added this to the 5.12.0 milestone Feb 22, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants