[dev.icinga.com #11806] Checks are not executed anymore on command #4224
Comments
Updated by gbeutner on 2016-05-18 12:02:40 +00:00
Updated by hostedpower on 2016-05-18 12:21:26 +00:00
CPU usage is now spiking on the monitoring server as well. It's icinga2 that is consuming all the CPU.
top - 14:20:07 up 19 days, 23:18, 2 users, load average: 2.72, 2.72, 2.79
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
Updated by gbeutner on 2016-05-18 12:28:50 +00:00
Are you using command_endpoint?
Updated by gbeutner on 2016-05-18 12:31:59 +00:00
Nvm, I think I found the problem. Can you test whether this still happens with the latest master branch?
Updated by gbeutner on 2016-05-18 12:35:05 +00:00
Applied in changeset b99b373.
Updated by hostedpower on 2016-05-18 12:39:33 +00:00
Hi, I probably need to compile from source? It's a production server that only uses Debian packages. I'm not sure what the easiest way is to get back to a stable situation :( Jo
Updated by hostedpower on 2016-05-18 12:43:43 +00:00
PS: from what I can see, many, many checks are not working atm. Many checks should have been executed again but were not. I think it's so severe that a bugfix should be sent out through the apt packages as well asap ...
Updated by hostedpower on 2016-05-18 12:55:14 +00:00
PS PS: if this new setting is not used, I assume it works like before? Does this mean there is no limit, or how does it work? Most checks we use are indeed run on the remote servers themselves (using command_endpoint).
Updated by gbeutner on 2016-05-18 12:58:11 +00:00
We're going to release a fix for this tomorrow (i.e. 2.4.9). As a temporary workaround you can set the concurrent_checks option to a fairly large number (e.g. 4294967294).
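For reference, the workaround could look roughly like the following in the checker feature configuration. This is only a sketch: it assumes that `concurrent_checks` is set as an attribute of the `CheckerComponent` object and that the file is at the usual Debian package location, neither of which is stated in this thread. The value 10000 is the one reported to work later in this thread; 4294967294 was reported not to work.

```
/* Sketch of the temporary workaround, not an official recommendation.
 * Assumed location: /etc/icinga2/features-available/checker.conf
 */
object CheckerComponent "checker" {
  // Raise the limit on concurrently running checks (assumed attribute placement).
  concurrent_checks = 10000
}
```

The configuration can be validated with `icinga2 daemon -C` before restarting the icinga2 service so the change takes effect.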
Updated by hostedpower on 2016-05-18 13:14:03 +00:00
Hi, thanks a lot for the quick response and fix. I assume 4294967294 is too high since it didn't work. Setting it to 10000 seems to work for now, but the CPU usage is still quite high. I assume the new version fixes that as well?
Updated by hostedpower on 2016-05-18 13:38:01 +00:00
Just want to let you know that CPU usage keeps increasing and increasing, even with concurrent_checks = 10000:
top - 15:34:57 up 20 days, 33 min, 2 users, load average: 2.55, 1.62, 1.31
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
Sorry that I cannot apply the fix for now, I just want to be sure you take this into account as well with the fix coming tomorrow :) Wouldn't it be a good idea to keep the latest and one older version available in the debmon project packages? Not sure who's responsible for this, just a thought. Atm I cannot revert to 2.4.7 afaik, since the older packages have already been purged from debmon.
Updated by gbeutner on 2016-05-18 13:38:25 +00:00
Updated by mfrosch on 2016-05-18 18:53:30 +00:00
On v2.4.8-413-g4af6bde: after about an hour, all checks freeze, with no apparent CPU usage.
Updated by mfrosch on 2016-05-18 18:53:44 +00:00
Updated by gbeutner on 2016-05-19 05:05:13 +00:00
That's not the latest commit. Please test with b99b373 or later.
Updated by mfrosch on 2016-05-19 07:46:21 +00:00
Had to trigger Jenkins, watching my test system now.
Updated by hostedpower on 2016-05-19 08:44:39 +00:00
Checks halted here too, and CPU usage is very high as well :( I hope the fix solves all of this :)
Updated by gbeutner on 2016-05-19 11:25:04 +00:00
Applied in changeset 232c299.
Updated by gbeutner on 2016-05-19 11:31:10 +00:00
Updated by a2yp on 2016-05-19 12:24:54 +00:00
We still have this problem with versions:
=> Checks are no longer performed after a few minutes
Updated by hostedpower on 2016-05-19 13:04:42 +00:00
Same here, the issues are not fixed at all! Checks have not been refreshed since 12:45. This is terrible :( I would like to go back to 2.4.7 asap.
Updated by a2yp on 2016-05-19 13:12:56 +00:00
hostedpower wrote:
Unfortunately, this version is no longer available in debmon.
Updated by hostedpower on 2016-05-19 13:15:07 +00:00
Indeed, it would be better to keep some versions there :(
Updated by Isotop7 on 2016-05-19 14:28:36 +00:00
a2yp wrote:
Same here... IMHO this isn't resolved. EDIT: version 2.4.10 seems to fix it for me! The daemon has been running for 10 minutes straight with no problems whatsoever.
Updated by a2yp on 2016-05-19 15:04:21 +00:00
Version 2.4.10 seems to fix this issue. Here Icinga has been running without problems for about 40 minutes. THX
Updated by hostedpower on 2016-05-19 18:19:56 +00:00
Seems to be solved with 2.4.10 here as well! Did you revert all the changes or simply fix the issues somehow? :)
This issue has been migrated from Redmine: https://dev.icinga.com/issues/11806
Created by hostedpower on 2016-05-18 12:00:10 +00:00
Assignee: mfrosch
Status: Resolved (closed on 2016-05-19 11:25:04 +00:00)
Target Version: 2.4.9
Last Update: 2016-05-19 18:19:56 +00:00 (in Redmine)
Hi,
Since the latest release some checks don't seem to be executed anymore. I had an apt upgrade pending; I upgraded the systems and since then the status has not been updated.
Clicking on "Check now" on the service in the web GUI does not seem to do anything.
In the end I restarted the icinga2 daemon on the monitoring server, went back to the web GUI, clicked on "Check now", and finally the checks were executed.
A few hours later I tried the same for another host, but it failed again. Restarting icinga2 on the monitoring server fixed it again :|
I'm not sure if it could be related to this new feature: https://dev.icinga.org/issues/8137
Kind regards
Jo
Changesets
2016-05-18 12:30:36 +00:00 by gbeutner b99b373
2016-05-19 11:15:00 +00:00 by gbeutner 232c299
Relations: