Jenkins check should collect some basic metrics #567

Closed
conorbranagan opened this issue Jul 5, 2013 · 9 comments

@conorbranagan
Contributor

Including build times (tagged by build name?), a counter on build failures, etc.

@conorbranagan
Contributor Author

It looks like we're actually already capturing the build duration in _get_build_metadata so it's just a matter of turning this into a metric.
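A minimal sketch of what "turning this into a metric" might look like, assuming the metadata is a dict with a `duration` entry in milliseconds (as Jenkins records it in build.xml). The function name `build_duration_metric` and the metric name `jenkins.job.duration` are hypothetical here, not taken from the check itself.

```python
def build_duration_metric(metadata, job_name):
    """Return a (name, value_in_seconds, tags) tuple, or None if no duration.

    `metadata` stands in for whatever _get_build_metadata returns; the
    'duration' key and its millisecond unit are assumptions.
    """
    duration_ms = metadata.get("duration")
    if duration_ms is None:
        return None
    # Values parsed from XML arrive as strings, so convert before scaling.
    seconds = float(duration_ms) / 1000.0
    return ("jenkins.job.duration", seconds, ["job_name:" + job_name])
```

Inside an agent check, the returned tuple could then be passed on to the check's gauge call.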

@adamjt

adamjt commented Jul 5, 2013

Sorting by node would be great as well. For example, we only really care about build times on master, and want to filter out feature branch builds. This functionality exists in the Global Build Stats plugin.

@remh

remh commented Jul 8, 2013

taking it

@ghost ghost assigned remh Jul 8, 2013
@remh

remh commented Jul 8, 2013

@adamjt About the branch builds: are you talking about Git branches?

@adamjt

adamjt commented Jul 8, 2013

@remh Yes, sorry, nodes was a poor choice of words. My use case is that for alerting, the only thing I care about is a significant change in build time on the master branch. It would be very noisy if that alert metric was polluted with failures and long build times from feature branches, which are expected to be fairly unstable early on.
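One way to support this use case would be to tag each build metric with its branch, so an alert can be scoped to `branch:master`. The sketch below assumes the build metadata exposes a `branch` entry; that key and the helper name `build_tags` are hypothetical.

```python
def build_tags(job_name, metadata):
    """Build the tag list for a job's metrics.

    The 'branch' key is an assumption about the metadata; when it is
    absent, only the job name tag is emitted.
    """
    tags = ["job_name:%s" % job_name]
    branch = metadata.get("branch")
    if branch:
        tags.append("branch:%s" % branch)
    return tags
```

With tags like these, feature branch builds still report durations but can be excluded from the alert query.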

@remh remh closed this as completed in 6260603 Jul 9, 2013
@adamjt

adamjt commented Aug 8, 2013

@remh We got job duration metrics only once, for a handful of builds that happened a long time ago, and now we're seeing this in the logs:

ERROR (__init__.py:454): Check 'jenkins' instance #0 failed
Traceback (most recent call last):
  File "/usr/share/datadog/agent/checks/__init__.py", line 445, in run
    self.check(instance)
  File "/usr/share/datadog/agent/checks.d/jenkins.py", line 141, in check
    if output['result'] == 'SUCCESS':
KeyError: 'result'

@remh

remh commented Aug 8, 2013

Hmm, I'm not sure it's related to this new metric.
Can you send me the contents of the build.xml file for this job?
Make sure you remove any private information from it.

@adamjt

adamjt commented Aug 8, 2013

I'm not sure how to tell which build it's failing on. How would I go about that?

@benpatterson

Hi - We also see this error in our infrastructure. Log below.

2014-06-17 12:49:05 UTC | INFO | dd.collector | checks.collector(collector.py:347) | Finished run #4330. Collection time: 4.47s. Emit time: 0.09s
2014-06-17 12:51:43 UTC | INFO | dd.collector | checks.collector(agent_metrics.py:42) | CPU consumed (%) is high: 17.8, metrics count: 13, events count: 1
2014-06-17 12:52:22 UTC | INFO | dd.collector | checks.collector(collector.py:347) | Finished run #4340. Collection time: 4.72s. Emit time: 0.04s
2014-06-17 12:53:02 UTC | ERROR | dd.collector | checks.jenkins(__init__.py:515) | Check 'jenkins' instance #0 failed
Traceback (most recent call last):
  File "/usr/share/datadog/agent/checks/__init__.py", line 506, in run
    self.check(copy.deepcopy(instance))
  File "/usr/share/datadog/agent/checks.d/jenkins.py", line 159, in check
    if output['result'] == 'SUCCESS':
KeyError: 'result'

It appears to be intermittent (as you can see from previous runs captured in the log).

@adamjt @remh
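Both tracebacks fail on the same `output['result']` lookup, so the parsed metadata sometimes has no `result` entry. One possible cause, not confirmed in this thread, is that build.xml is read before the build has recorded a result. A defensive sketch, assuming `output` is a plain dict (the helper name `classify_build` is hypothetical):

```python
def classify_build(output):
    """Map a build's metadata to an alert type without raising KeyError.

    Returns None when no result is present, so the caller can skip the
    build and retry on a later run instead of failing the whole check.
    """
    result = output.get("result")
    if result is None:
        return None
    return "success" if result == "SUCCESS" else "error"
```

Skipping rather than raising keeps one incomplete build from failing the whole check instance.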
