implement unclean-shutdown marker #21893
Example execution
And after yet another
I think in general it would be more useful to log each shutdown separately. E.g.
You can also use PrettyDuration so the age string doesn't get too long, and I'd also use
Otherwise we only know that there were 5 shutdowns, but we don't know whether they are recent or old, so it's not particularly useful. We might also add some form of cleanup so we don't keep too many old events?
Yes, if we show many we definitely need to clean them up. So, keep the most recent 5? 10?
10 recent + a counter of how many older ones there were?
Ok!
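A rough sketch of what "log each shutdown separately, plus a counter for the older ones" could look like. The `markerList` type, the `describe` helper and the "Unclean shutdown detected" message are assumptions made for illustration, not the code in this PR; the plain `time.Duration` string stands in for PrettyDuration so the block has no dependencies.

```go
package main

import (
	"fmt"
	"time"
)

// markerList is a hypothetical stand-in for the stored markers:
// recent timestamps (Unix seconds) plus a count of older, discarded ones.
type markerList struct {
	Discarded uint64
	Recent    []uint64
}

// describe returns one line per recorded marker with its age relative to now
// (Unix seconds), preceded by a summary line if older markers were discarded.
func describe(m markerList, now uint64) []string {
	var lines []string
	if m.Discarded > 0 {
		lines = append(lines, fmt.Sprintf("Old unclean shutdowns found count=%d", m.Discarded))
	}
	for _, ts := range m.Recent {
		var age time.Duration
		if now > ts { // guard against unsigned underflow on clock skew
			age = time.Duration(now-ts) * time.Second
		}
		stamp := time.Unix(int64(ts), 0).UTC().Format(time.RFC3339)
		lines = append(lines, fmt.Sprintf("Unclean shutdown detected time=%s age=%s", stamp, age))
	}
	return lines
}

func main() {
	m := markerList{Discarded: 3, Recent: []uint64{1_599_990_000, 1_599_999_940}}
	for _, line := range describe(m, 1_600_000_000) {
		fmt.Println(line)
	}
}
```

With one line per marker, a reader can tell at a glance whether the unclean shutdowns are recent or old, while the counter keeps the total visible after cleanup.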
The very first time it runs, it prints out
nitpicks
core/rawdb/accessors_metadata.go
Outdated
var uncleanShutdowns ucmList
// Read old data
if data, err := db.Get(uncleanShutdownKey); err != nil {
	log.Warn("Error reading USM", "error", err)
Perhaps: Failed to read the unclean shutdown markers?
if discards > 0 {
	log.Warn("Old unclean shutdowns found", "count", discards)
}
for _, tstamp := range uncleanShutdowns {
It's inaccurate, actually. It's the timestamp of the START, not the CRASH. Usually geth will run for a very long time.
Can we have a background thread that updates the shutdown marker, e.g. every minute? Cost-wise it's basically no overhead, and at least it would reduce the bias.
Maybe we could do that in a follow-up PR?
For now I can replace "time" with "booted", to clarify that it's the startup timestamp, not the crash timestamp.
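For reference, the periodic-update idea raised above might be sketched as follows. This is only an illustration of the suggestion, not part of this PR: `markerStore` is a hypothetical in-memory stand-in for the database entry, and `refresh` is an assumed helper name.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// markerStore is a hypothetical stand-in for the persisted marker.
type markerStore struct {
	mu       sync.Mutex
	lastSeen uint64 // Unix seconds of the last refresh
}

// touch records the given Unix timestamp as the last time the node was alive.
func (s *markerStore) touch(unix uint64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.lastSeen = unix
}

// get returns the last recorded timestamp.
func (s *markerStore) get() uint64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.lastSeen
}

// refresh updates the marker on every tick until quit is closed.
func refresh(s *markerStore, interval time.Duration, quit <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case now := <-ticker.C:
			s.touch(uint64(now.Unix()))
		case <-quit:
			return
		}
	}
}

func main() {
	s := &markerStore{}
	quit := make(chan struct{})
	done := make(chan struct{})
	go func() {
		refresh(s, 10*time.Millisecond, quit) // a real node would use e.g. time.Minute
		close(done)
	}()
	time.Sleep(50 * time.Millisecond)
	close(quit)
	<-done
	fmt.Println("last seen alive:", s.get())
}
```

With such a loop the stored timestamp would lag the actual crash by at most one interval, instead of by the full uptime of the process.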
core/rawdb/accessors_metadata.go
Outdated
// Add a new (but cap it)
uncleanShutdowns.Recent = append(uncleanShutdowns.Recent, uint64(time.Now().Unix()))
if l := len(uncleanShutdowns.Recent); l > crashesToKeep+1 {
	uncleanShutdowns.Recent = uncleanShutdowns.Recent[l-crashesToKeep-1:]
That's a pretty fancy way to discard the first element; wouldn't Recent[1:] suffice?
(I assume that only one element can be added/discarded, since the code uses Discarded++ rather than Discarded += len(Recents)-crashesToKeep.)
Myeah, I intended for it to handle the case where we discard more than one, e.g. if we decide to change the limit to 5 or there's some mishap.
So the proper thing to do is rather to increase uncleanShutdowns.Discarded appropriately. Good catch!
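A minimal sketch of that fix, under assumptions: `markerList` and `push` are hypothetical names, and the limit of `crashesToKeep+1` simply mirrors the diff snippet under review. The point is that the discard counter grows by however many entries are trimmed, not by one.

```go
package main

import "fmt"

// crashesToKeep is set to 10 here purely for illustration.
const crashesToKeep = 10

// markerList is a hypothetical stand-in for the stored marker list.
type markerList struct {
	Discarded uint64
	Recent    []uint64
}

// push appends a timestamp and trims the list to crashesToKeep+1 entries,
// adding the number of trimmed entries to Discarded.
func (m *markerList) push(ts uint64) {
	m.Recent = append(m.Recent, ts)
	if excess := len(m.Recent) - (crashesToKeep + 1); excess > 0 {
		m.Discarded += uint64(excess)
		m.Recent = m.Recent[excess:]
	}
}

func main() {
	// Simulate a list written under a larger limit: 15 existing entries.
	m := markerList{}
	for i := uint64(1); i <= 15; i++ {
		m.Recent = append(m.Recent, i)
	}
	m.push(16)
	fmt.Println("kept:", len(m.Recent), "discarded:", m.Discarded, "oldest kept:", m.Recent[0])
}
```

Computing the excess once and using it for both the counter and the slice keeps the two in step even if the limit is lowered later.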
LGTM
Replaces #21133
Some todos :
https://github.com/ethereum/go-ethereum/pull/21133/files#r437253799: