[network checks] Kill skip_event option (#1054)
zippolyte authored Mar 17, 2018
1 parent c4c84cc commit bbe194d
Showing 15 changed files with 21 additions and 202 deletions.
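
Migration note (a sketch, not part of this commit): once `skip_event` is removed from the checks, any `skip_event` key left in an instance no longer has a documented effect and can be dropped from existing configs. Assuming the YAML config has already been loaded into a plain Python dict with an `instances` list, a hypothetical helper such as `strip_skip_event` (the name is illustrative, not an Agent API) could do the cleanup:

```python
import copy


def strip_skip_event(config):
    """Return (cleaned_config, touched_names).

    `config` is assumed to be an already-parsed check config:
    a dict with an optional `instances` list of dicts.
    The input is not modified.
    """
    cleaned = copy.deepcopy(config)
    touched = []
    for instance in cleaned.get("instances") or []:
        if "skip_event" in instance:
            # The option is gone from the checks, so drop the stale key.
            del instance["skip_event"]
            touched.append(instance.get("name"))
    return cleaned, touched


# Example data modelled on the README snippets in this diff.
example = {
    "init_config": {},
    "instances": [
        {
            "name": "Example website (staging)",
            "url": "http://staging.example.com/",
            "skip_event": True,
        },
        {"name": "SSH check", "host": "jumphost.example.com", "port": 22},
    ],
}

cleaned, touched = strip_skip_event(example)
print(touched)
```

Returning the list of touched instance names makes it easy to review which instances were changed before writing the config back to disk.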
2 changes: 0 additions & 2 deletions dns_check/datadog_checks/dns_check/dns_check.py
@@ -128,8 +128,6 @@ def _get_tags(self, instance):
def report_as_service_check(self, sc_name, status, instance, msg=None):
tags = self._get_tags(instance)

instance['skip_event'] = True

if status == Status.UP:
msg = None

4 changes: 3 additions & 1 deletion http_check/CHANGELOG.md
@@ -1,13 +1,14 @@
# CHANGELOG - http_check

1.4.1 / Unreleased
2.0.0 / Unreleased
==================

### Changes

* [BUGFIX] Make import of default certificate file relative rather than absolute
Fixes loading problem on Windows, and/or allows check to be installed in other
location
* [DEPRECATION] Remove the `skip_event` option from the check. See [#1054][]

1.4.0 / 2018-02-13
==================
@@ -78,3 +79,4 @@
[#758]: https://github.com/DataDog/integrations-core/issues/758
[@xkrt]: https://github.com/xkrt
[#905]:https://github.com/DataDog/integrations-core/pull/905
[#1054]:https://github.com/DataDog/integrations-core/pull/1054
7 changes: 1 addition & 6 deletions http_check/README.md
@@ -25,10 +25,8 @@ instances:
# days_warning: 28 # default 14
# days_critical: 14 # default 7
# timeout: 3 # in seconds. Default is 1.
skip_event: true # Default is false, i.e. emit events instead of service checks. Recommend to set to true.
- name: Example website (staging)
url: http://staging.example.com/
skip_event: true
```

The HTTP check has more configuration options than many checks — many more than are shown above. Most options are opt-in, e.g. the Agent will not check SSL validation unless you configure the requisite options. Notably, the Agent _will_ check for soon-to-expire SSL certificates by default.
@@ -54,7 +52,6 @@ See the [sample http_check.yaml](https://github.com/DataDog/integrations-core/bl
| `check_certificate_expiration` | When `check_certificate_expiration` is enabled, the service check will check the expiration date of the SSL certificate. Note that this will cause the SSL certificate to be validated, regardless of the value of the `disable_ssl_validation` setting. |
| `days_warning` & `days_critical` | When `check_certificate_expiration` is enabled, these settings will raise a warning or critical alert when the SSL certificate is within the specified number of days from expiration. |
| `headers` | This parameter allows you to send additional headers with the request. Please see the [example YAML file](https://github.com/DataDog/integrations-core/blob/master/http_check/conf.yaml.example) for additional information and caveats. |
| `skip_event` | When enabled, the check will not create an event. This is useful to avoid duplicates with a server side service check. This defaults to `false`. |
| `skip_proxy` | If set, the check will bypass proxy settings and attempt to reach the check url directly. This defaults to `false`. |
| `allow_redirects` | This setting allows the service check to follow HTTP redirects and defaults to `true`.
| `tags` | A list of arbitrary tags that will be associated with the check. For more information about tags, please see our [Guide to tagging](/guides/tagging/) and blog post, [The power of tagged metrics](https://www.datadoghq.com/blog/the-power-of-tagged-metrics/) |
@@ -92,9 +89,7 @@ See [metadata.csv](https://github.com/DataDog/integrations-core/blob/master/http

### Events

Older versions of the HTTP check only emitted events to reflect site status, but now the check supports service checks, too. However, emitting events is still the default behavior. Set `skip_event` to true for all configured instances to submit service checks instead of events.

The Agent will soon deprecate `skip_event`, i.e. the HTTP check will only support service checks.
The HTTP check does not include any event at this time.

### Service Checks

6 changes: 0 additions & 6 deletions http_check/conf.yaml.example
@@ -117,12 +117,6 @@ instances:
# Host: alternative.host.example.com
# X-Auth-Token: SOME-AUTH-TOKEN

# The (optional) skip_event parameter will instruct the check to not
# create any event to avoid duplicates with a server side service check.
# This default to False.
#
skip_event: true

# The (optional) skip_proxy parameter would bypass any proxy settings enabled
# and attempt to reach the the URL directly.
# If no proxy is defined at any level, this flag bears no effect.
2 changes: 1 addition & 1 deletion http_check/datadog_checks/http_check/__init__.py
@@ -2,6 +2,6 @@

HTTPCheck = http_check.HTTPCheck

__version__ = "1.4.1"
__version__ = "2.0.0"

__all__ = ['http_check']
79 changes: 1 addition & 78 deletions http_check/datadog_checks/http_check/http_check.py
@@ -27,7 +27,7 @@
match_hostname

# project
from checks.network_checks import EventType, NetworkCheck, Status
from checks.network_checks import NetworkCheck, Status
from config import _is_affirmative
from util import headers as agent_headers

@@ -359,83 +359,6 @@ def send_status_down(loginfo, message):

return service_checks

# FIXME: 5.3 drop this function
def _create_status_event(self, sc_name, status, msg, instance):
# Create only this deprecated event for old check
if sc_name != self.SC_STATUS:
return
# Get the instance settings
url = instance.get('url', None)
name = instance.get('name', None)
nb_failures = self.statuses[name][sc_name].count(Status.DOWN)
nb_tries = len(self.statuses[name][sc_name])
tags = instance.get('tags', [])
tags_list = []
tags_list.extend(tags)

# Only add the URL tag if it's not already present
if not filter(re.compile('^url:').match, tags_list):
tags_list.append('url:%s' % url)

# Get a custom message that will be displayed in the event
custom_message = instance.get('message', "")
if custom_message:
custom_message += " \n"

# Let the possibility to override the source type name
instance_source_type_name = instance.get('source_type', None)
if instance_source_type_name is None:
source_type = "%s.%s" % (NetworkCheck.SOURCE_TYPE_NAME, name)
else:
source_type = "%s.%s" % (NetworkCheck.SOURCE_TYPE_NAME, instance_source_type_name)

# Get the handles you want to notify
notify = instance.get('notify', self.init_config.get('notify', []))
notify_message = ""
if notify:
notify_list = []
for handle in notify:
notify_list.append("@%s" % handle.strip())
notify_message = " ".join(notify_list) + " \n"

if status == Status.DOWN:
# format the HTTP response body into the event
if isinstance(msg, tuple):
code, reason, content = msg

# truncate and html-escape content
if len(content) > 200:
content = content[:197] + '...'

msg = u"%d %s\n\n%s" % (code, reason, content)
msg = msg.rstrip()

title = "[Alert] %s reported that %s is down" % (self.hostname, name)
alert_type = "error"
msg = u"%s %s %s reported that %s (%s) failed %s time(s) within %s last attempt(s)."\
" Last error: %s" % (notify_message, custom_message, self.hostname,
name, url, nb_failures, nb_tries, msg)
event_type = EventType.DOWN

else: # Status is UP
title = "[Recovered] %s reported that %s is up" % (self.hostname, name)
alert_type = "success"
msg = u"%s %s %s reported that %s (%s) recovered" \
% (notify_message, custom_message, self.hostname, name, url)
event_type = EventType.UP

return {
'timestamp': int(time.time()),
'event_type': event_type,
'host': self.hostname,
'msg_text': msg,
'msg_title': title,
'alert_type': alert_type,
"source_type_name": source_type,
"event_object": name,
"tags": tags_list
}

def report_as_service_check(self, sc_name, status, instance, msg=None):
instance_name = self.normalize(instance['name'])
url = instance.get('url', None)
2 changes: 1 addition & 1 deletion http_check/manifest.json
@@ -11,7 +11,7 @@
"mac_os",
"windows"
],
"version": "1.4.1",
"version": "2.0.0",
"guid": "eb133a1f-697c-4143-bad3-10e72541fa9c",
"public_title": "Datadog-HTTP Check Integration",
"categories":["web", "network"],
1 change: 0 additions & 1 deletion snmp/datadog_checks/snmp/snmp.py
@@ -59,7 +59,6 @@ def __init__(self, name, init_config, agentConfig, instances):
for instance in instances:
if 'name' not in instance:
instance['name'] = self._get_instance_key(instance)
instance['skip_event'] = True

self.generators = {}

7 changes: 7 additions & 0 deletions tcp_check/CHANGELOG.md
@@ -1,5 +1,12 @@
# CHANGELOG - tcp_check

2.0.0 / Unreleased
==================

### Changes

* [DEPRECATION] Remove the `skip_event` option from the check. See [#1054][]

1.0.0 / 2017-03-22
==================

4 changes: 0 additions & 4 deletions tcp_check/README.md
@@ -20,7 +20,6 @@ instances:
- name: SSH check
host: jumphost.example.com # or an IPv4/IPv6 address
port: 22
skip_event: true # if false, the Agent will emit both events and service checks for this port; recommended true (i.e. only submit service checks)
collect_response_time: true # to collect network.tcp.response_time. Default is false.
```

@@ -31,7 +30,6 @@ Configuration Options
* `port` (Required) - Port to be checked. This will be included as a tag: `url:<host>:<port>`.
* `timeout` (Optional) - Timeout for the check. Defaults to 10 seconds.
* `collect_response_time` (Optional) - Defaults to false. If this is not set to true, no response time metric will be collected. If it is set to true, the metric returned is `network.tcp.response_time`.
* `skip_event` (Optional) - Defaults to false. Set to true to skip creating an event. This option will be removed in a future version and will default to true.
* `tags` (Optional) - Tags to be assigned to the metric.

[Restart the Agent](https://docs.datadoghq.com/agent/faq/agent-commands/#start-stop-restart-the-agent) to start sending TCP service checks and response times to Datadog.
@@ -71,8 +69,6 @@ The TCP check does not include any event at this time.

Returns DOWN if the Agent cannot connect to the configured `host` and `port`, otherwise UP.

Older versions of the TCP check only emitted events to reflect changes in connectivity. This was eventually deprecated in favor of service checks, but you can still have the check emit events by setting `skip_event: false`.

To create alert conditions on this service check in the Datadog app, click **Network** on the [Create Monitor](https://app.datadoghq.com/monitors#/create) page, not **Integration**.

## Troubleshooting
6 changes: 0 additions & 6 deletions tcp_check/conf.yaml.example
@@ -17,12 +17,6 @@ instances:
# - customtag1
# - customtag2

# The (optional) skip_event parameter will instruct the check to not
# create any event to avoid duplicates with a server side service check.
# This default to False.
#
skip_event: true

# - name: My second service
# host: 127.0.0.1
# port: 80
2 changes: 1 addition & 1 deletion tcp_check/datadog_checks/tcp_check/__init__.py
@@ -2,6 +2,6 @@

TCPCheck = tcp_check.TCPCheck

__version__ = "1.0.0"
__version__ = "2.0.0"

__all__ = ['tcp_check']
58 changes: 1 addition & 57 deletions tcp_check/datadog_checks/tcp_check/tcp_check.py
@@ -7,7 +7,7 @@
import time

# project
from checks.network_checks import EventType, NetworkCheck, Status
from checks.network_checks import NetworkCheck, Status


class BadConfException(Exception):
@@ -100,62 +100,6 @@ def _check(self, instance):
self.log.debug("%s:%s is UP" % (addr, port))
return Status.UP, "UP"

# FIXME: 5.3 remove that
def _create_status_event(self, sc_name, status, msg, instance):
# Get the instance settings
host = instance.get('host', None)
port = instance.get('port', None)
name = instance.get('name', None)
nb_failures = self.statuses[name][sc_name].count(Status.DOWN)
nb_tries = len(self.statuses[name][sc_name])

# Get a custom message that will be displayed in the event
custom_message = instance.get('message', "")
if custom_message:
custom_message += " \n"

# Let the possibility to override the source type name
instance_source_type_name = instance.get('source_type', None)
if instance_source_type_name is None:
source_type = "%s.%s" % (NetworkCheck.SOURCE_TYPE_NAME, name)
else:
source_type = "%s.%s" % (NetworkCheck.SOURCE_TYPE_NAME, instance_source_type_name)

# Get the handles you want to notify
notify = instance.get('notify', self.init_config.get('notify', []))
notify_message = ""
if notify:
notify_list = []
for handle in notify:
notify_list.append("@%s" % handle.strip())
notify_message = " ".join(notify_list) + " \n"

if status == Status.DOWN:
title = "[Alert] %s reported that %s is down" % (self.hostname, name)
alert_type = "error"
msg = """%s %s %s reported that %s (%s:%s) failed %s time(s) within %s last attempt(s).
Last error: %s""" % (notify_message,
custom_message, self.hostname, name, host, port, nb_failures, nb_tries, msg)
event_type = EventType.DOWN

else: # Status is UP
title = "[Recovered] %s reported that %s is up" % (self.hostname, name)
alert_type = "success"
msg = "%s %s %s reported that %s (%s:%s) recovered." % (notify_message,
custom_message, self.hostname, name, host, port)
event_type = EventType.UP

return {
'timestamp': int(time.time()),
'event_type': event_type,
'host': self.hostname,
'msg_text': msg,
'msg_title': title,
'alert_type': alert_type,
"source_type_name": source_type,
"event_object": name,
}

def report_as_service_check(self, sc_name, status, instance, msg=None):
instance_name = self.normalize(instance['name'])
host = instance.get('host', None)
2 changes: 1 addition & 1 deletion tcp_check/manifest.json
@@ -11,7 +11,7 @@
"mac_os",
"windows"
],
"version": "1.0.0",
"version": "2.0.0",
"guid": "c514029e-0ed8-4c9f-abe5-2fd4096726ba",
"public_title": "Datadog-TCP Check Integration",
"categories":["network", "web"],