Delay child disconnect update #18712

Merged
stelfrag merged 4 commits into netdata:master from delay-disconnect-update on Oct 9, 2024

Conversation

Collaborator

@stelfrag stelfrag commented Oct 8, 2024

Summary

When a child node disconnects from a parent, delay sending the disconnect message to the cloud, on the expectation that the child is restarting.

The parent will use information from the child (start and shutdown times) to calculate when the child is expected to reconnect before sending out the message to the cloud.

It will wait (start + shutdown) * 1.25 seconds, capped at a maximum of 30 seconds, before sending the message.
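The delay rule above can be sketched as a small helper. This is only an illustration of the arithmetic, not the code from this PR: the function name `disconnect_delay_seconds`, the constant `MAX_DISCONNECT_DELAY_S`, whole-second units, and rounding up are all assumptions.

```c
#include <stdint.h>

/* Hypothetical cap, taken from the "max of 30 seconds" in the summary. */
#define MAX_DISCONNECT_DELAY_S 30u

/*
 * Estimate how long the parent waits before reporting a child disconnect.
 * start_s and shutdown_s are the child's reported start and shutdown
 * durations in seconds (hypothetical parameter names).
 */
uint32_t disconnect_delay_seconds(uint32_t start_s, uint32_t shutdown_s)
{
    /* Use 64-bit arithmetic so the sum and the multiplication cannot overflow. */
    uint64_t restart_s = (uint64_t)start_s + shutdown_s;

    /* 125% of the restart time, rounded up to a whole second (x * 5 / 4). */
    uint64_t delay_s = (restart_s * 5 + 3) / 4;

    return delay_s > MAX_DISCONNECT_DELAY_S ? MAX_DISCONNECT_DELAY_S
                                            : (uint32_t)delay_s;
}
```

For example, a child that reports 4 seconds to start and 4 seconds to shut down would give a 10 second wait, while any child whose combined time is 24 seconds or more would hit the 30 second cap.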

thiagoftsm
thiagoftsm previously approved these changes Oct 8, 2024
Contributor

@thiagoftsm thiagoftsm left a comment

PR is working as expected. LGTM!

@stelfrag
Collaborator Author

stelfrag commented Oct 9, 2024

Dealing with an issue on older systems...

Assume the child will return at 125% of that time
Add timers to handle scheduling of the node update events to the cloud
Handle faster reconnection and reschedule timer to correct state
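The commit notes above describe a deferred update that is dropped if the child returns early. A minimal sketch of that idea follows, assuming a polled deadline rather than whatever timer mechanism the PR actually uses; the type `pending_disconnect_t` and all function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-child state: is a disconnect update waiting to be sent? */
typedef struct {
    bool     pending;     /* a delayed disconnect update is scheduled */
    uint64_t deadline_s;  /* time (seconds) at which it should be sent */
} pending_disconnect_t;

/* Child dropped: schedule the cloud update instead of sending it now. */
void child_disconnected(pending_disconnect_t *p, uint64_t now_s, uint32_t delay_s)
{
    p->pending = true;
    p->deadline_s = now_s + delay_s;
}

/* Child came back: cancel any scheduled update.
 * Returns true if a pending disconnect was suppressed. */
bool child_reconnected(pending_disconnect_t *p)
{
    bool was_pending = p->pending;
    p->pending = false;
    return was_pending;
}

/* Called periodically: returns true exactly once, when the deadline has
 * passed without a reconnect and the disconnect should be reported. */
bool disconnect_due(pending_disconnect_t *p, uint64_t now_s)
{
    if (!p->pending || now_s < p->deadline_s)
        return false;
    p->pending = false;
    return true;
}
```

In this sketch a reconnect before the deadline means the cloud never sees the disconnect, and a second disconnect simply schedules a fresh deadline.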
@thiagoftsm
Contributor

It looks like our sqlite has an issue:

In function 'sqlite3Strlen30',
    inlined from 'sqlite3ColumnSetColl' at /home/thiago/Netdata/tests_netdata/src/database/sqlite/sqlite3.c:121492:10:
/home/thiago/Netdata/tests_netdata/src/database/sqlite/sqlite3.c:34720:28: warning: 'strlen' reading 1 or more bytes from a region of size 0 [-Wstringop-overread]
34720 |   return 0x3fffffff & (int)strlen(z);
      |                            ^~~~~~~~~

We should update it when possible.

Contributor

@thiagoftsm thiagoftsm left a comment

Everything is fine. On both the local dashboard and the cloud, the child was shown as stale.

@stelfrag stelfrag merged commit 9bf07d0 into netdata:master Oct 9, 2024
140 checks passed
@stelfrag stelfrag deleted the delay-disconnect-update branch October 9, 2024 21:19
2 participants