[dev.icinga.com #8897] Duplicated scheduled downtimes created in cluster HA zone #2844

Closed
icinga-migration opened this issue Mar 30, 2015 · 16 comments · Fixed by #6820
Labels: area/distributed (Distributed monitoring: master, satellites, clients), bug (Something isn't working), ref/IP, ref/NC

@icinga-migration

This issue has been migrated from Redmine: https://dev.icinga.com/issues/8897

Created by dgoetz on 2015-03-30 07:52:53 +00:00

Assignee: (none)
Status: New
Target Version: Backlog
Last Update: 2016-11-09 14:52:13 +00:00 (in Redmine)

Icinga Version: 2.3.0
Backport?: Not yet backported
Include in Changelog: 1

Setup

  • HA master zone with 2 nodes
  • one config master with zones.d
  • ScheduledDowntime object configuration

Problem

Both cluster nodes sync their configuration, and each one schedules a new downtime from the ScheduledDowntime configuration objects.
Each node then sends the cluster event to the other, which creates an additional downtime with the same time window but a different id.

Proposed Solution

Check for the config_owner attribute in AddDowntime() and do not add the downtime if one already exists.
This is similar to how we prevent deleting downtimes at runtime which are generated from config objects.
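
The proposed check can be sketched roughly as follows. This is a minimal Python model under stated assumptions, not Icinga 2's actual C++ implementation: the `Node` and `Downtime` classes, the `add_downtime` signature, and the choice to match on host plus time window are all hypothetical and only illustrate the idea of skipping a config-owned downtime that is already present.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, Optional

_ids = count(1)  # hypothetical per-process id generator


@dataclass
class Downtime:
    host: str
    start: int
    end: int
    # Name of the ScheduledDowntime that generated this downtime;
    # empty for downtimes added manually at runtime (assumption).
    config_owner: str = ""
    id: int = field(default_factory=lambda: next(_ids))


class Node:
    """Hypothetical stand-in for one endpoint in the HA zone."""

    def __init__(self) -> None:
        self.downtimes: Dict[int, Downtime] = {}

    def add_downtime(self, host: str, start: int, end: int,
                     config_owner: str = "") -> Optional[Downtime]:
        # Only config-owned downtimes are deduplicated; runtime
        # downtimes are always accepted.
        if config_owner:
            for existing in self.downtimes.values():
                if (existing.config_owner, existing.host,
                        existing.start, existing.end) == (
                        config_owner, host, start, end):
                    return None  # already present, skip the duplicate
        downtime = Downtime(host, start, end, config_owner)
        self.downtimes[downtime.id] = downtime
        return downtime
```

With this sketch, a node that has already created the downtime from its own configuration would ignore the same downtime relayed by its peer, while two manually added downtimes with identical windows would both be kept.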

Original description

A Scheduled Downtime creates a Downtime on every host in the HA zone. This causes no problems apart from too many objects and bloated views.


Relations:

@icinga-migration
Author

Updated by mfriedrich on 2015-04-04 12:07:39 +00:00

  • Status changed from New to Feedback
  • Assigned to set to dgoetz

Can you elaborate in detail (configuration, logs) what you mean by that? I don't see a bug here.

@icinga-migration
Author

Updated by dgoetz on 2015-04-07 08:40:10 +00:00

I can try to get logs and config on Thursday, but it is simple to reproduce. Create a Scheduled Downtime in a high-availability zone and every Icinga 2 host will create the Downtime object on its own, so it will be duplicated (or more, because it is multiplied by the number of hosts).

(I have discussed this with Gunnar already and he confirmed it.)

@icinga-migration
Author

Updated by mfriedrich on 2015-06-18 08:57:14 +00:00

  • Category set to Cluster
  • Status changed from Feedback to New
  • Assigned to deleted dgoetz

@icinga-migration
Author

Updated by mfriedrich on 2015-07-02 13:50:19 +00:00

  • Subject changed from Scheduled Downtime created by every host in ha zone to Duplicated scheduled downtimes created in cluster HA zone
  • Description updated
  • Target Version set to Backlog

@icinga-migration
Author

Updated by mfriedrich on 2015-07-02 13:55:17 +00:00

The problem I still see is how to determine whether a downtime with the same content already exists. Do we compare by id, or by more of the information passed?
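
One way to frame the id-versus-content question: if each node generates its own id, ids can never match across nodes, but an id derived deterministically from the downtime's content would. The sketch below illustrates that alternative using Python's `uuid.uuid5`. It is an assumption for discussion, not how Icinga 2 assigns downtime ids; the namespace value, the `downtime_fingerprint` helper, and the choice of fields are all hypothetical.

```python
import uuid

# Arbitrary, hypothetical namespace; every node would need to share it.
DOWNTIME_NAMESPACE = uuid.UUID("12345678-1234-5678-1234-567812345678")


def downtime_fingerprint(config_owner: str, host: str, service: str,
                         start: int, end: int) -> str:
    """Derive a stable identifier from the downtime's content.

    Two nodes that build a downtime from the same ScheduledDowntime
    and the same time window compute the same value, so a plain
    comparison by id is enough to detect the duplicate.
    """
    content = "!".join([config_owner, host, service, str(start), str(end)])
    return str(uuid.uuid5(DOWNTIME_NAMESPACE, content))


# Content-derived ids agree across nodes for the same downtime.
on_master1 = downtime_fingerprint("backup-window", "web1", "", 100, 200)
on_master2 = downtime_fingerprint("backup-window", "web1", "", 100, 200)
same_downtime = on_master1 == on_master2
```

The trade-off is that any field left out of the fingerprint (author or comment, for example) cannot distinguish two otherwise identical downtimes, so the field list would have to be chosen deliberately.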

@icinga-migration
Author

Updated by frjaraur on 2015-08-27 10:55:13 +00:00

I don't know if it is related to something I have seen in my configuration.
We create a downtime for Development hosts, and when we check this configuration in icingaweb2 under Overview > Downtimes, I get 5 downtimes per host...
My environment has 2 masters in an HA cluster and 4 satellites. Why do we get 5 downtimes per host? Maybe one per satellite plus one for the master?

Thanks for Your Work,
Javier R.

@icinga-migration
Author

Updated by mfriedrich on 2015-09-12 09:08:22 +00:00

A different approach: Make downtimes a configuration object which is synced throughout the HA cluster and depends on its version. Only create/update it if the sender's version is newer than the current one.

That's part of #9927 and #9777.
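
The version rule described here can be sketched as below. This is a simplified Python model, not the implementation from the referenced issues: the `ConfigStore` class, the `apply_update` method, and the numeric version values are hypothetical and only show the "accept only if the sender's version is newer" comparison.

```python
from typing import Any, Dict, Tuple


class ConfigStore:
    """Hypothetical per-node store of synced objects, keyed by name."""

    def __init__(self) -> None:
        # name -> (version, attributes)
        self.objects: Dict[str, Tuple[float, Dict[str, Any]]] = {}

    def apply_update(self, name: str, version: float,
                     attrs: Dict[str, Any]) -> bool:
        """Create or update an object only if the sender's version is newer.

        Returns True when the update was applied, False when it was
        ignored because the local copy is the same age or newer.
        """
        current = self.objects.get(name)
        if current is not None and version <= current[0]:
            return False
        self.objects[name] = (version, dict(attrs))
        return True
```

Because the object is identified by name rather than by a locally generated id, a message that carries the same name and the same version as the local copy is dropped instead of producing a second downtime.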

@icinga-migration
Author

Updated by mfriedrich on 2015-09-12 09:08:31 +00:00

  • Relates set to 9927

@icinga-migration
Author

Updated by mfriedrich on 2015-09-12 09:08:37 +00:00

  • Relates set to 9777

@icinga-migration
Author

Updated by mfriedrich on 2016-03-04 15:54:13 +00:00

  • Parent Id set to 11313

@icinga-migration
Author

Updated by mfriedrich on 2016-11-09 14:52:13 +00:00

  • Parent Id deleted 11313

@icinga-migration icinga-migration added bug Something isn't working area/distributed Distributed monitoring (master, satellites, clients) labels Jan 17, 2017
@icinga-migration icinga-migration added this to the Backlog milestone Jan 17, 2017
@dnsmichi dnsmichi modified the milestone: Backlog Sep 13, 2017
@dnsmichi dnsmichi added the help wanted Extra attention is needed label Sep 26, 2017
@dnsmichi dnsmichi removed the help wanted Extra attention is needed label Apr 25, 2018
@dnsmichi
Contributor

Duplicate of #4272

@dnsmichi dnsmichi marked this as a duplicate of #4272 Apr 25, 2018
@dnsmichi
Contributor

dnsmichi commented Dec 4, 2018

Actually not a duplicate, re-opening.

@dnsmichi
Contributor

dnsmichi commented Dec 4, 2018

SDs paused

[Image: scheduled_downtime_ha_cluster_paused]

Downtimes just once

[Image: scheduled_downtime_ha_cluster_just_one_downtime]

@dnsmichi dnsmichi removed this from the 2.11.0 milestone Dec 5, 2018
@dnsmichi dnsmichi added this to the 2.10.3 milestone Dec 5, 2018
@dnsmichi
Contributor

dnsmichi commented Feb 11, 2019

ref/IP/9673

@dgoetz
Contributor

dgoetz commented Feb 25, 2019

ref/NC/590167
ref/NC/591721

3 participants