[dev.icinga.com #12258] icinga2 IDO reload performance significant slower with latest snapshot release #4418
Comments
Updated by mfriedrich on 2016-07-29 15:16:27 +00:00
2.4.10 is not directly comparable, as the event which logs "finished reconnect" is not necessarily at the very end. If other queries were re-inserted into the query queue after it happened, you won't see them (at least not in your log summary). We've also changed how transactions are fired, so there will be more BEGIN/COMMIT queries in the current snapshot. One thing that's gone is the numerous DELETE and INSERT statements for group members; those are now an UPDATE and one INSERT if the row does not exist, plus the DELETE on the session token at the end. Does your schema have all the 2.5.0.sql changes applied? One interesting thing: your database is only processing half of the queries in the same time. I'd investigate on the database side (show processlist and slow query logging) whether something changed over there. I don't see the relation to #11501; that's a different topic (API) and not related to the IDO database.
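The "UPDATE, and one INSERT if not existing" pattern described above can be sketched as follows. This is only an illustration, not the Icinga 2 implementation: it uses Python's built-in `sqlite3` module in place of MySQL, and the table `group_members`, its columns, and the helper `upsert_member` are hypothetical names chosen for the example.

```python
import sqlite3


def upsert_member(conn, group_id, member_id, session_token):
    """Try an UPDATE first; fall back to a single INSERT if no row matched."""
    cur = conn.execute(
        "UPDATE group_members SET session_token = ? "
        "WHERE group_id = ? AND member_id = ?",
        (session_token, group_id, member_id),
    )
    # In SQLite, rowcount is the number of rows the UPDATE touched.
    # With MySQL the default affected-row count only includes rows whose
    # values actually changed, so this check needs care there.
    if cur.rowcount == 0:
        conn.execute(
            "INSERT INTO group_members (group_id, member_id, session_token) "
            "VALUES (?, ?, ?)",
            (group_id, member_id, session_token),
        )


# Hypothetical schema, kept in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE group_members ("
    "group_id INTEGER, member_id INTEGER, session_token INTEGER, "
    "PRIMARY KEY (group_id, member_id))"
)

upsert_member(conn, 1, 10, 100)  # first config dump: the row is inserted
upsert_member(conn, 1, 10, 200)  # next config dump: the row is updated in place

rows = conn.execute(
    "SELECT group_id, member_id, session_token FROM group_members"
).fetchall()
```

Compared with a DELETE followed by an INSERT for every member, a row that already exists costs one statement instead of two, which matches the reduced DELETE/INSERT counts mentioned in the comment.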
Updated by hvhaugwitz on 2016-07-29 17:01:03 +00:00 dnsmichi wrote:
Yes, at least the changes that come with the icinga2-ido-mysql package.
The database is on the same host as icinga2 and the only change was the upgrade from stable to snapshot release. The slow query log is empty.
With #11501 implemented there would be no need for reloads on host configuration changes ;-)
Updated by hvhaugwitz on 2016-07-29 17:34:26 +00:00
Attached the summaries of the queries (as discussed with gunnarbeutner via IRC).
Updated by tobiasvdk on 2016-07-30 19:17:40 +00:00 What I can also confirm is that icinga doesn't do as many queries/s as it did "before". AFAIR, in my case it was ~9000 q/s for ~2 minutes and then it dropped to ~7500 q/s. Now it's only 5500 - 6000 q/s (without the burst just after the reload).
Updated by mfriedrich on 2016-08-01 08:46:35 +00:00
Updated by mfriedrich on 2016-08-02 12:32:41 +00:00
Updated by mfriedrich on 2016-08-02 12:39:04 +00:00 I assume the related change with #11688, deleting any stale comment/downtime objects, turns into multiple DELETE statements for each object. Even with a database index that will still take some time. As we now have an upsert mechanism implemented for #12210, it makes sense to go the same route here with an additional session_token. Will link the follow-up ticket. Please test git commit cd5c936.
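The session_token idea from the comment above can be illustrated roughly like this: every object written during a config dump is stamped with the current token, and stale rows are then removed with one DELETE rather than with per-object DELETE statements. This is a sketch under assumptions, not the code from commit cd5c936: SQLite (via Python's `sqlite3`) stands in for MySQL, and the `downtimes` table, its columns, and `dump_downtimes` are hypothetical. Only the index name is borrowed from the thread.

```python
import sqlite3

# Hypothetical, simplified schema for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE downtimes (name TEXT PRIMARY KEY, session_token INTEGER)"
)
conn.execute(
    "CREATE INDEX idx_downtimes_session_del ON downtimes (session_token)"
)


def dump_downtimes(conn, names, session_token):
    """Stamp all current objects with the token, then sweep stale rows once."""
    for name in names:
        cur = conn.execute(
            "UPDATE downtimes SET session_token = ? WHERE name = ?",
            (session_token, name),
        )
        if cur.rowcount == 0:
            conn.execute(
                "INSERT INTO downtimes (name, session_token) VALUES (?, ?)",
                (name, session_token),
            )
    # One statement removes everything that was not seen in this dump.
    cur = conn.execute(
        "DELETE FROM downtimes WHERE session_token <> ?", (session_token,)
    )
    return cur.rowcount


removed_first = dump_downtimes(conn, ["dt-a", "dt-b", "dt-c"], session_token=1)
removed_second = dump_downtimes(conn, ["dt-a", "dt-c"], session_token=2)
remaining = [row[0] for row in conn.execute("SELECT name FROM downtimes ORDER BY name")]
```

In the second dump `dt-b` is no longer configured, so it keeps the old token and is removed by the final sweep. The index on `session_token` is what keeps that single DELETE cheap, which is why its absence from the schema file matters.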
Updated by mfriedrich on 2016-08-02 12:40:03 +00:00
Updated by hvhaugwitz on 2016-08-02 12:57:07 +00:00 dnsmichi wrote:
Thanks, will do the test later. It seems that in your commit the session_token column and the idx_downtimes_session_del index for the icinga_scheduleddowntime table are missing in lib/db_ido_mysql/schema/mysql.sql
Updated by mfriedrich on 2016-08-02 13:06:26 +00:00 Correct, thanks for noticing so fast. Shouldn't do 3 issues at the same time. Thanks, fixed.
Updated by hvhaugwitz on 2016-08-03 14:07:21 +00:00 Today I've tested v2.4.10-583-g4544014:
The 33276 'DELETE FROM icinga_scheduleddowntime' statements are still there. Apart from that the change has no measurable effect on the reconnection time of icinga2:
Updated by mfriedrich on 2016-08-03 14:24:06 +00:00 Thanks, corrected that with the downtimes delete. Other than that there are no significant changes, except for additional indexes for select/delete queries which may influence insert queries and their time to process. You're using quite a lot of custom vars and service groups; that's the only thing I can see for now from the query counts. Changing DELETE/INSERT into an upsert might have an influence, which could maybe be analysed from within MySQL itself (different strategy for query execution, or execution time).
Updated by hvhaugwitz on 2016-08-03 17:55:24 +00:00 I don't think the queries themselves are the problem here. They are more or less the same as with 2.4.10-1. Something has been changed in the snapshot version that causes icinga2 to process the configuration a lot more slowly than before. Please let me know if I can do anything to debug this issue further.
Updated by gbeutner on 2016-08-08 14:35:21 +00:00
Updated by mfriedrich on 2016-08-11 16:36:08 +00:00
Steps to reproduce
Generate lots of services and simulate a config dump (10k services, 100k custom attributes)
Test environment
2.4.10
There is a bug in 2.4.10 where the final DELETE session_token does not happen at the end, but somewhere before. We'll therefore take the last query into account before the normal program status update intervals happen. 80 seconds.
2.5.0
v2.4.10-639-gb3f4a4d 77 seconds.
Conclusion
The IDO config dump is nearly the same. 2.4.10 has several bugs with query priority ordering which might cause custom vars not being deleted, or the wrong assumption of the finished config dump while it actually isn't. While this issue helped to nail down the initial query differences, after all there is no noticeable performance drop. But it is the same or a bit faster even. I'm attaching the gzipped debug logs in case you're interested. I'm therefore closing the issue.
Updated by hvhaugwitz on 2016-08-16 17:43:10 +00:00 I'm a bit surprised that you closed this bug report without asking for feedback. But nevertheless the recent commits from gunnarbeutner fixing #12435 also fixed
Updated by mfriedrich on 2016-08-16 18:19:33 +00:00 You've built your arguments on assumptions about the log entries. I already mentioned that this method doesn't work. Of course I could have asked for a more in-depth analysis on your side. I just felt that this wasn't necessary, as I did not see any specific performance decreases in other test environments. Sorry if that came out the wrong way. I appreciate what you and others do with testing and additional help in making the next release great.
This issue has been migrated from Redmine: https://dev.icinga.com/issues/12258
Created by hvhaugwitz on 2016-07-29 13:50:28 +00:00
Assignee: mfriedrich
Status: Resolved (closed on 2016-08-11 16:36:08 +00:00)
Target Version: 2.5.0
Last Update: 2016-08-16 18:19:33 +00:00 (in Redmine)
The MySQL IDO reload performance is significantly slower with the latest icinga2
snapshot release (version: v2.4.10-565-ga3815e4) than with 2.4.10-1:
2.4.10-1:
v2.4.10-565-ga3815e4:
Though the number of queries during the reload is a little bit higher, the
current stable release needs only half of the time to reconnect to the
database.
As unfortunately #11501 will not be fixed in 2.5.0, it is important that
the IDO reload performance stays at least at the same level as before.
Some data about the setup:
We have a huge amount of custom variables:
Attachments
Changesets
2016-08-01 15:54:03 +00:00 by mfriedrich 1074102
2016-08-02 12:37:16 +00:00 by mfriedrich cd5c936
2016-08-02 13:05:21 +00:00 by mfriedrich 1ff6939
2016-08-03 14:15:22 +00:00 by mfriedrich 00f05a8
2016-08-11 15:43:39 +00:00 by mfriedrich d84872f
Relations: