Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Use run_interval for log_summary #2472

Merged
merged 1 commit into from
Apr 15, 2020
Merged

Conversation

mfornet
Copy link
Member

@mfornet mfornet commented Apr 15, 2020

Temporary solution for #2463

@gitpod-io
Copy link

gitpod-io bot commented Apr 15, 2020

Copy link
Collaborator

@bowenwang1996 bowenwang1996 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure whether this is intended. I think the point is that client actor should not be blocked but it is less of an issue if log is delayed.

@maxzaver
Copy link
Contributor

I am not sure whether this is intended. I think the point is that client actor should not be blocked but it is less of an issue if log is delayed.

Printing a string into stdout should be lightweight so it should not be blocking anything.

@mfornet
Copy link
Member Author

mfornet commented Apr 15, 2020

I am not sure whether this is intended. I think the point is that client actor should not be blocked but it is less of an issue if log is delayed.

@bowenwang1996 This PR address the annoying issue of delayed logs when client actor is blocked. I think we should approve this, since this is the correct way to schedule summary logs.

@mfornet mfornet requested a review from bowenwang1996 April 15, 2020 23:14
@nearprotocol-bulldozer nearprotocol-bulldozer bot merged commit b129c63 into master Apr 15, 2020
@nearprotocol-bulldozer nearprotocol-bulldozer bot deleted the tmp-fix-2463 branch April 15, 2020 23:57
bowenwang1996 added a commit that referenced this pull request Apr 27, 2020
#2472 is supposed to make log summary run regularly and not get blocked by something else on the event loop. However, it didn't actually achieve that. What's worse, I think it caused the correct number of calls to be scheduled, but not executed at the same time. When the event loop is busy, it can end up executing 10 calls at the same time, which results in the confusing logs:
```
Apr 22 05:37:57.914  INFO stats: # 3166239 Downloading headers 48% -/10  6/6/40 peers ⬇ 11.7kiB/s ⬆ 22 B/s 0.00 bps 0 gas/s CPU: 110%, Mem: 50.2 MiB    
Apr 22 05:37:57.915  INFO stats: # 3166239 Downloading headers 48% -/10  6/6/40 peers ⬇ 11.7kiB/s ⬆ 22 B/s 0.00 bps 0 gas/s CPU: NaN%, Mem: 50.2 MiB    
Apr 22 05:37:57.916  INFO stats: # 3166239 Downloading headers 48% -/10  6/6/40 peers ⬇ 11.7kiB/s ⬆ 22 B/s NaN bps 9223372.04 Tgas/s CPU: NaN%, Mem: 50.2 MiB    
Apr 22 05:38:48.839  INFO stats: # 3166239 Downloading headers 48% -/10  7/7/40 peers ⬇ 26.1kiB/s ⬆ 1.8kiB/s 0.00 bps 0 gas/s CPU: 108%, Mem: 52.5 MiB    
Apr 22 05:38:48.840  INFO stats: # 3166239 Downloading headers 48% -/10  7/7/40 peers ⬇ 26.1kiB/s ⬆ 1.8kiB/s NaN bps 9223372.04 Tgas/s CPU: NaN%, Mem: 52.5 MiB
```
This is because time elapsed between two `log_summary` calls is less than one millisecond and it causes some divide by zero errors. This PR therefore reverts the change in #2472.

Test plan
---------
Run a node locally to see that there is no more `NaN`s.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants