Pull Queries: Heartbeat: Don't log errors when heartbeating dead nodes #4807

Closed
vpapavas opened this issue Mar 17, 2020 · 0 comments · Fixed by #4809
@vpapavas

Is your feature request related to a problem? Please describe.
KsqlTarget LL:202 logs an error whenever an async HTTP request fails. A server's HeartbeatAgent keeps sending heartbeats to all previously discovered servers, whether they are alive or dead. If a server is dead, the async request fails and returns an error code, which is logged. This spams the log because 1) heartbeating happens very often and 2) these are not really errors, so they do not belong in the log.

Background on why the heartbeat agent sends heartbeats to dead servers: Assume two servers, A and B. On startup, if server B is delayed for some reason and does not send heartbeats, A will mark it as dead. If the agent did not send heartbeats to dead servers, A would stop sending to B, and B would in turn mark A as dead. Now both nodes have marked each other as dead and neither sends heartbeats to the other, so we essentially have a deadlock. That is why servers send heartbeats even to servers that are marked dead.
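
To make the deadlock concrete, here is a minimal Python sketch of the two-server scenario above. It is a toy model, not ksqlDB code: the function `simulate`, its parameters, and the rule "a peer is alive only if its heartbeat arrived in the current round" are all assumptions made for illustration. The real agent presumably uses time windows and missed-heartbeat thresholds.

```python
def simulate(heartbeat_dead_peers, rounds=6, b_silent_rounds=2):
    """Toy model of servers A and B exchanging heartbeats in lock-step rounds.

    heartbeat_dead_peers: if True, a server keeps heartbeating peers it has
        marked dead; if False, it skips them (the policy that deadlocks).
    b_silent_rounds: hypothetical number of startup rounds during which B is
        delayed and sends nothing.
    Returns {"A": does A consider B alive, "B": does B consider A alive}.
    """
    peer = {"A": "B", "B": "A"}
    thinks_peer_alive = {"A": True, "B": True}

    for rnd in range(rounds):
        received = {"A": False, "B": False}
        for sender in ("A", "B"):
            if sender == "B" and rnd < b_silent_rounds:
                continue  # B is delayed at startup and sends no heartbeat
            if not heartbeat_dead_peers and not thinks_peer_alive[sender]:
                continue  # skip a peer that this sender believes is dead
            received[peer[sender]] = True
        # Assumed liveness rule: alive only if a heartbeat arrived this round.
        thinks_peer_alive = dict(received)

    return thinks_peer_alive


if __name__ == "__main__":
    print("skip dead peers:     ", simulate(heartbeat_dead_peers=False))
    print("heartbeat everybody: ", simulate(heartbeat_dead_peers=True))
```

In this model, skipping dead peers leaves both servers permanently marked dead once B's startup delay ends, while heartbeating every known peer lets the pair recover as soon as B starts sending.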

Describe the solution you'd like
Do not log an error when a heartbeat fails because the target server is down.
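
One possible shape for this, sketched in Python rather than in the project's own code: treat a connection failure as the expected outcome of heartbeating a dead node and log it at DEBUG, while anything else is still logged as an error. The names `send_heartbeat`, `send`, and the logger name are hypothetical, and using `ConnectionError` as the "server is down" signal is an assumption. This is not necessarily how the linked fix handles it.

```python
import logging

logger = logging.getLogger("heartbeat_sketch")


def send_heartbeat(peer, send):
    """Send one heartbeat to `peer` using the callable `send`.

    Returns True on success and False on any failure.
    """
    try:
        send(peer)
        return True
    except ConnectionError as exc:
        # Expected when the peer is down: keep it out of the error log.
        logger.debug("Heartbeat to %s failed, peer unreachable: %s", peer, exc)
        return False
    except Exception:
        # Anything else is unexpected and still deserves an error entry.
        logger.exception("Unexpected failure while heartbeating %s", peer)
        return False
```

With the logger at its default level, heartbeats to a dead peer produce no output, but the failures remain visible by switching the logger to DEBUG when diagnosing cluster membership problems.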
