Add tracing in etcd server #11166

Closed

jingyih opened this issue Sep 18, 2019 · 5 comments

jingyih (Contributor) commented Sep 18, 2019

Today, the etcd server emits a generic warning if a request takes too long to be applied. If a range request takes over 100ms to finish, the server generates a log message like this:
W | etcdserver: read-only range request "key:\"foo\" " with result "range_response_count:0 size:5" took too long (1.310910984s) to execute

When this happens, it is usually not easy to track down the exact cause. There are multiple open-source issues about this in both the etcd and Kubernetes repos. I think the most interesting request type is range, because: A) it is usually served frequently, and B) a client can ask for a lot of keys in a single request, which often takes a very long time to finish.

When a range request takes too long, it would be good to know the time spent on each step of the request lifecycle (a rough sketch of the idea follows the list below). Example steps could be:

  • Get read index from raft leader
  • Wait until the server’s local backend store finishes applying the raft entry corresponding to the leader’s read index.
  • Check user’s permission on the requested keys
  • Get range result from mvcc store
    • Range keys in in-memory index tree
    • Range keys in underlying bolt db
  • Filter and sort key value pairs
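A minimal sketch of how such step-level tracing could be recorded is shown below. Everything here (the trace/step types, the step labels, the simulated delays) is illustrative only, not etcd's actual implementation:

```go
package main

import (
	"fmt"
	"time"
)

// step is one recorded phase of a request.
type step struct {
	name string
	took time.Duration
}

// trace is a hypothetical per-request tracer: it records how long each
// phase takes relative to the previous one. Illustration only.
type trace struct {
	op    string
	start time.Time
	last  time.Time
	steps []step
}

func newTrace(op string) *trace {
	now := time.Now()
	return &trace{op: op, start: now, last: now}
}

// Step records the time elapsed since the previous recorded step.
func (t *trace) Step(name string) {
	now := time.Now()
	t.steps = append(t.steps, step{name: name, took: now.Sub(t.last)})
	t.last = now
}

func main() {
	t := newTrace("range")

	// Simulated phases of a range request, mirroring the list above.
	time.Sleep(10 * time.Millisecond) // get read index from the raft leader
	t.Step("agreement among raft nodes before linearized reading")

	time.Sleep(5 * time.Millisecond) // permission check on the requested keys
	t.Step("authenticate and check permission")

	time.Sleep(20 * time.Millisecond) // in-memory index lookup + bolt db range
	t.Step("range keys from the mvcc store")

	time.Sleep(2 * time.Millisecond) // filter and sort key-value pairs
	t.Step("filter and sort key-value pairs")

	for _, s := range t.steps {
		fmt.Printf("trace[%s] step %q took %v\n", t.op, s.name, s.took)
	}
	fmt.Printf("trace[%s] total %v\n", t.op, time.Since(t.start))
}
```

Each Step call records the time elapsed since the previous one, so the log for a slow request would show exactly which phase dominated the total latency.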

At the same time, we need to make sure that (see the sketch after this list):

  • Tracing does not generate too many entries in the log.
  • Tracing does not noticeably degrade server performance.
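One way to satisfy both constraints is to collect the step timings cheaply in memory and only emit them when the whole request exceeded a threshold, e.g. the same 100ms used by the existing warning. A rough sketch, again with hypothetical names:

```go
package main

import (
	"fmt"
	"time"
)

// stepTiming is a (name, duration) pair collected during a request.
type stepTiming struct {
	name string
	took time.Duration
}

// logIfLong prints the collected step timings only when the whole request
// exceeded the threshold, so fast requests add nothing to the log and the
// steady-state cost is just recording a few timestamps in memory.
func logIfLong(op string, total, threshold time.Duration, steps []stepTiming) {
	if total < threshold {
		return // fast requests stay silent
	}
	fmt.Printf("trace[%s] took too long (%v); step breakdown:\n", op, total)
	for _, s := range steps {
		fmt.Printf("  %-55s %v\n", s.name, s.took)
	}
}

func main() {
	// Example timings for a slow range request (made-up numbers).
	steps := []stepTiming{
		{"agreement among raft nodes before linearized reading", 12 * time.Millisecond},
		{"range keys from the mvcc store", 950 * time.Millisecond},
		{"filter and sort key-value pairs", 3 * time.Millisecond},
	}
	logIfLong("range", 965*time.Millisecond, 100*time.Millisecond, steps)
}
```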
jingyih (Contributor, Author) commented Sep 18, 2019

I think @YoyinZyc has already started working on this feature.

jfbai commented Dec 4, 2019

@jingyih @YoyinZyc Hi, is this trace log enabled by default? I did not observe a trace log along with the "took too long" message.

My etcd server version is 3.4.2, upgraded from 3.3.10.

YoyinZyc (Contributor) commented Dec 4, 2019

> @jingyih @YoyinZyc Hi, is this trace log enabled by default? I did not observe a trace log along with the "took too long" message.
>
> My etcd server version is 3.4.2, upgraded from 3.3.10.

Could you provide more details about the request you made? Basically, tracing is enabled for range, put and compact requests. Btw, it works only when you enable --logger=zap, which is not the default currently.
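For what it's worth, when etcd is run embedded in a Go program rather than as a binary, the same logger selection should be available on the embed config; the sketch below assumes Config.Logger is the programmatic counterpart of the --logger flag (an assumption, not verified against a specific release):

```go
package main

import (
	"log"

	"go.etcd.io/etcd/embed" // import path used by the etcd 3.4 series
)

func main() {
	cfg := embed.NewConfig()
	cfg.Dir = "default.etcd"

	// Select the structured zap logger; request tracing for range/put/compact
	// is only emitted with this logger, mirroring the --logger=zap flag.
	// (Assumption: Config.Logger is the programmatic counterpart of --logger.)
	cfg.Logger = "zap"

	e, err := embed.StartEtcd(cfg)
	if err != nil {
		log.Fatal(err)
	}
	defer e.Close()

	<-e.Server.ReadyNotify() // block until the server is serving
	log.Println("etcd is ready; slow range/put/compact requests will emit trace logs")
	select {} // keep the embedded server running
}
```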

jfbai commented Dec 5, 2019

> @jingyih @YoyinZyc Hi, is this trace log enabled by default? I did not observe a trace log along with the "took too long" message.
>
> My etcd server version is 3.4.2, upgraded from 3.3.10.

> Could you provide more details about the request you made? Basically, tracing is enabled for range, put and compact requests. Btw, it works only when you enable --logger=zap, which is not the default currently.

@YoyinZyc Got it, thanks a lot. :)

In my case, etcd is used as the Kubernetes (k8s) backend storage, so range and put operations are the ones we care about most.

stale bot commented Apr 6, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
