Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Telegraph Docker Input Plugin : per docker running status support #4259

Merged
merged 8 commits into from
Jun 18, 2018
Merged
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 16 additions & 0 deletions plugins/inputs/docker/docker.go
Original file line number Diff line number Diff line change
Expand Up @@ -435,6 +435,22 @@ func (d *Docker) gatherContainer(
}
}
}
if info.State != nil {
statefields := map[string]interface{}{
"status": info.State.Status,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have this value as a tag on all docker_container_* measurements.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this be too much of duplication since the same information is across multiple measurements?

Copy link
Contributor

@danielnelson danielnelson Jun 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a little bit of duplication, but it seems useful to be able to "group by" status on any of the container measurements.

Side note: If I could do it all over from scratch I would just have a single docker_container measurement, but now it's not worth the disruption.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went through the code to add status for each container . It can get tricky for cases like CPU(where there is a per-cpu option) and Network(where there is a per interface option). Also since these collectors are all over the place , we would have to make a separate "docker inspect" API call at each of these places . It seems a bit inefficient considering how it currently is designed . Would you still like me to do it that way?
To handle the "group by" feature you requested , we could plug in "container_id" into this which allows you to match with it.

Copy link
Contributor

@danielnelson danielnelson Jun 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, not if we need to make extra API calls. Let's just add it to the new measurement as a tag and we can try to fish it through to other measurements later on, if needed.

Copy link
Contributor Author

@prashanthjbabu prashanthjbabu Jun 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've fixed this one ! I've added it as a tag.. So now it comes not only for docker_container_status , but also for cpu,mem etc

Attached some logs :

docker_container_status,container_image=ubuntu,container_name=vibrant_benz,container_status=running,container_version=unknown,deviceid=mydeviceid,engine_host=raspberrypi,server_version=18.05.0-ce started_at="2018-06-14T05:48:53.266176036Z",finished_at="0001-01-01T00:00:00Z",oomkilled=false,pid=5278i,exitcode=0i 1528956922000000000
docker_container_mem,container_image=ubuntu,container_name=vibrant_benz,container_status=running,container_version=unknown,deviceid=mydeviceid,engine_host=raspberrypi,server_version=18.05.0-ce limit=0i,usage=0i,max_usage=0i,usage_percent=0,container_id="f85ef98b08bde3ad6a93fadcacfa33a57a1f833119cce0309cadb15c45df5cf4" 1528956921000000000
docker_container_cpu,container_image=ubuntu,container_name=vibrant_benz,container_status=running,container_version=unknown,cpu=cpu-total,deviceid=mydeviceid,engine_host=raspberrypi,server_version=18.05.0-ce usage_total=102424740i,usage_in_usermode=60000000i,usage_system=10927430000000i,throttling_periods=0i,throttling_throttled_time=0i,usage_in_kernelmode=20000000i,throttling_throttled_periods=0i,container_id="f85ef98b08bde3ad6a93fadcacfa33a57a1f833119cce0309cadb15c45df5cf4",usage_percent=0 1528956921000000000
docker_container_cpu,container_image=ubuntu,container_name=vibrant_benz,container_status=running,container_version=unknown,cpu=cpu0,deviceid=mydeviceid,engine_host=raspberrypi,server_version=18.05.0-ce usage_total=7975104i,container_id="f85ef98b08bde3ad6a93fadcacfa33a57a1f833119cce0309cadb15c45df5cf4" 1528956921000000000

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unit tests seem to be failing after adding "container_status" as a tag . I'm not quite sure how to deal with this. Any ideas?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've fixed this . All tests pass now.

"running": info.State.Running,
"paused": info.State.Paused,
"restarting": info.State.Restarting,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to save the running, paused, restarting, dead values, since this is all encompassed by the status field.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point . I will remove them....

"oomkilled": info.State.OOMKilled,
"dead": info.State.Dead,
"pid": info.State.Pid,
"exitcode": info.State.ExitCode,
"error": info.State.Error,
"startedat": info.State.StartedAt,
"finishedat": info.State.FinishedAt,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unsure if error, startedat, finishedat are interesting enough to include, I feel like many people will want to exclude them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel startedat,finishedat is important since it gives useful information on the uptime of the container or even how long ago the container died(based on "status") . If you still think its not needed , we can take it off!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding error . I will remove it..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, lets just rename them started_at and finished_at and convert them to unixnano format (unix epoch timestamp in nanosecond precision). This is the format we have been using recently for encoding timestamps into a field.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've renamed this to started_at and finished_at .
I have a query regarding converting the time from RFC3339 format (Example : 2018-06-14T05:48:53.266176036Z ) to Unix Nano format . Golang has the option to convert using the following code :
t1, err := time.Parse(time.RFC3339,"0001-01-01T00:00:00Z")
and then use
t1.UnixNano()
Here's some sample output of the timestamps :

docker_container_status,container_image=ubuntu,container_name=vibrant_benz,container_status=running,container_version=unknown,deviceid=mydeviceid,engine_host=raspberrypi,server_version=18.05.0-ce started_at="2018-06-14T05:48:53.266176036Z",finished_at="0001-01-01T00:00:00Z",oomkilled=false,pid=5278i,exitcode=0i 1528956922000000000

In the above case the docker is still running so it has a valid start time , although its endtime is invalid which is "0001-01-01T00:00:00Z" . If i use the above go code to convert this to unix nano , I get "-6795364578871345152" . As you can see it returns a long negative invalid value . Would this be okay? How else would you like me to handle this ?If this is fine then I will go ahead and commit this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should be able to use time.IsZero(), and if true skip adding the field.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks ! I've done that.

}
acc.AddFields("docker_container_status", statefields, tags, time.Now())
}

if info.State.Health != nil {
healthfields := map[string]interface{}{
Expand Down