Ping Input Plugin adds a success entry to Influxdb even during failure #4549

Closed
prashanthjbabu opened this issue Aug 13, 2018 · 8 comments
Labels
bug unexpected problem or unintended behavior

@prashanthjbabu
Contributor

prashanthjbabu commented Aug 13, 2018

Relevant telegraf.conf:

[[inputs.ping]]
urls = ["8.8.8.8"] # required

System info:

prash@prash-laptop:/Downloads/telegraf/usr/bin$ ./telegraf -version
Telegraf v1.8.0
5f8c983 (git: master 5f8c983)

Description:

The Telegraf ping plugin does not support certain versions of ping. For example, the BusyBox version of ping is unsupported because it lacks the interval option (-i), which the ping plugin passes. In such cases, a metric is still written to InfluxDB, as shown below:
2018-08-13T11:34:10Z E! Error in plugin [inputs.ping]: Fatal error processing ping output: 192.168.1.1

ping,host=3ebf7bd6fcd0,url=192.168.1.1 result_code=0i 1534160050000000000

2018-08-13T11:34:20Z E! Error in plugin [inputs.ping]: Fatal error processing ping output: 192.168.1.1

ping,host=3ebf7bd6fcd0,url=192.168.1.1 result_code=0i 1534160060000000000

2018-08-13T11:34:30Z E! Error in plugin [inputs.ping]: Fatal error processing ping output: 192.168.1.1

As seen above, result_code=0, which means success according to the documentation. This is misleading, since it implies a successful ping even though ping did not run at all.

Steps to reproduce:

  1. Start an Alpine Docker container: docker run -ti alpine sh
  2. Copy the telegraf binary and a config file that enables the ping input into the container.
  3. Run telegraf; you will see the output above.

Expected behavior:

result_code should indicate an error instead of success.
Example:
ping,host=e3e8019bc54b,url=192.168.1.1 result_code=2i 1534160890000000000
where result_code=2 would indicate a ping error.

Actual behavior:

result_code is 0, indicating a successful ping.

Additional info:

@prashanthjbabu
Contributor Author

This issue is addressed in PR #4550

@glinton
Contributor

glinton commented Aug 14, 2018

Can you paste your full [[inputs.ping]] config please? It shouldn't be setting the interval unless ping_interval is set.

@prashanthjbabu
Contributor Author

prashanthjbabu commented Aug 18, 2018

Here's the entire telegraf.conf file that I'm using

/ # cat /tmp/telegraf.conf
[[inputs.ping]]
urls = ["www.google.com"]
[[outputs.file]]
files = ["stdout"]

The error thrown is:

/bin/ping: unrecognized option: i
BusyBox v1.28.4 (2018-05-30 10:45:57 UTC) multi-call binary.

Usage: ping [OPTIONS] HOST

Send ICMP ECHO_REQUEST packets to network hosts

-4,-6		Force IP or IPv6 name resolution
-c CNT		Send only CNT pings
-s SIZE		Send SIZE data bytes in packets (default 56)
-t TTL		Set TTL
-I IFACE/IP	Source interface or IP address
-W SEC		Seconds to wait for the first response (default 10)
		(after all -c CNT packets are sent)
-w SEC		Seconds until ping exits (default:infinite)
		(can exit earlier with -c CNT)
-q		Quiet, only display output at start
		and when finished
-p HEXBYTE	Pattern to use for payload

Looking at the code, it is evident that PingInterval has a default value of 1.0:

return &Ping{
	pingHost:     hostPinger,
	PingInterval: 1.0,
	Count:        1,
	Timeout:      1.0,
	Deadline:     10,
}

And the -i option is set whenever PingInterval is greater than 0:

if p.PingInterval > 0 {
	args = append(args, "-i", strconv.FormatFloat(p.PingInterval, 'f', -1, 64))
}

I tried changing the default PingInterval value to 0.0 and it worked, although I'm not sure whether doing so has any side effects.

@glinton
Contributor

glinton commented Aug 21, 2018

You are correct, I only read the Readme which says it defaults to 0. Setting it to 0 will be fine for your case. I'm going to leave this open and rename it as an issue with the ping readme.

@glinton glinton added the bug unexpected problem or unintended behavior label Aug 21, 2018
@prashanthjbabu
Contributor Author

So, just to be clear: I only need to set ping_interval to 0.0 explicitly in telegraf.conf, and it will work with BusyBox and other ping implementations that lack some options (like -i for interval).

@glinton
Contributor

glinton commented Aug 22, 2018

Correct, but there is still the issue as you originally described it: telegraf records a valid ping metric instead of failing on the error.

@prashanthjbabu
Contributor Author

Yes, that's correct! The PR mentioned earlier addresses that. Once it goes in, this should be resolved.

@danielnelson danielnelson added this to the 1.7.4 milestone Aug 23, 2018
@danielnelson
Contributor

Closed in #4550
