-
Notifications
You must be signed in to change notification settings - Fork 813
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[redis] new post 2.8 replication metrics #1350
Conversation
@@ -215,6 +218,16 @@ def _check_db(self, instance, custom_tags=None): | |||
self.warning("{0} key not found in redis".format(key)) | |||
self.gauge('redis.key.length', 0, tags=key_tags) | |||
|
|||
# Save the replication delay for each slave | |||
for key in info.keys(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you shouldn't use .keys () as it would create a copy of the keys.
You should just do:
for key in info
Looks great besides the nitpicks and the comments i told you offline! Thanks a lot! |
5a93dd3
to
52d68fa
Compare
master_offset = info.get('master_repl_offset') | ||
if slave_offset and master_offset and master_offset - slave_offset >= 0: | ||
delay = master_offset - slave_offset | ||
slave_tags = tags + ['slave_ip:%s' % info[key]['ip']] if 'ip' in info[key] else tags |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be more readable this way:
slave_tags = tags
for slave_tag in ('ip', 'port'):
if slave_tag in info[key]:
slave_tags.append('slave_{0}:{1}'.format(slave_tag, info[key][slave_tag]))
@hkaj thanks for taking a stab at it.
Thanks 👍 |
@LeoCavaille Sorry for the delay. I think the last commits address your comments, please let me know if anything else is missing. |
New metrics: redis.replication.backlog_histlen shows the current replication backlog size in bytes used from the allocated repl-backlog-size in the config. redis.replication.master_repl_offset this metric is available on masters, it shows the current offset of the replication process on the master in bytes redis.replication.slave_repl_offset idem for the slaves. redis.replication.delay on masters only, measures the difference between the last offset reported by the slave and the current replication offset, which indicates how far your slave is behind. This metric is tagged by slave_ip, slave_port and slave_id (internal ID in the master). Also added on Travis a replication config between the two instances we have to assert the presence of this new metric. Bonus: now the redis instances that were launched by the CI are cleaned properly by the cleanup task. see issue #915
d395c61
to
8cc0cb7
Compare
def check(self, instance): | ||
if (not "host" in instance or not "port" in instance) and not "unix_socket_path" in instance: | ||
if ("host" not in instance or "port" not in instance) and "unix_socket_path" not in instance: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure that's exactly what we want, see https://gist.github.com/LeoCavaille/801f5bb9288a94ca8d94
Especially the 4th example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@LeoCavaille I only moved the "not" keywords to the right place, that didn't change the behavior, which I think is what we want.
As long as there's a socket_path we're good, if there is no socket path, there must be a host and a port. We probably don't want to raise an exception if a user switch from a host:port couple to a socket and forget to remove the host from the config.
See https://github.com/DataDog/dd-agent/blob/haissam/redis-repl-metrics/checks.d/redisdb.py#L103-L107 for the use case.
Except the last small nitpick on this config check, it looks good. Just waiting to hear back from you on that and then I'll merge it! Thanks |
👍 |
[redis] new post 2.8 replication metrics
[redis] add new replication metrics for >= 2.8 instances
New metrics:
redis.replication.backlog_histlen
shows the current replication backlog size in bytes used from the
allocated
repl-backlog-size
in the config.redis.replication.master_repl_offset
this metric is available on masters, it shows the current offset of
the replication process on the master in bytes
redis.replication.slave_repl_offset
idem for the slaves.
redis.replication.delay
on masters only, measures the difference between the last offset
reported by the slave and the current replication offset, which
indicates how far your slave is behind.
This metric is tagged by
slave_ip
,slave_port
andslave_id
(internal ID in the master).
Also added on Travis a replication config between the two instances we
have to assert the presence of this new metric.
Bonus: now the redis instances that were launched by the CI are cleaned
properly by the
cleanup
task.see issue #915