Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #20859 added unsafe shutdown detector #21133

Closed
wants to merge 2 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 0 additions & 1 deletion cmd/geth/main.go
Original file line number Diff line number Diff line change
Expand Up @@ -360,7 +360,6 @@ func geth(ctx *cli.Context) error {
// miner.
func startNode(ctx *cli.Context, stack *node.Node) {
debug.Memsize.Add("node", stack)

// Start up the node itself
utils.StartNode(stack)

Expand Down
13 changes: 13 additions & 0 deletions eth/backend.go
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@ import (
"runtime"
"sync"
"sync/atomic"
"time"

"github.com/ethereum/go-ethereum/accounts"
"github.com/ethereum/go-ethereum/accounts/abi/bind"
Expand Down Expand Up @@ -231,6 +232,15 @@ func New(ctx *node.ServiceContext, config *Config) (*Ethereum, error) {
return nil, err
}

// Check for invalid shutdown
invalidShutdown, _ := chainDb.Get([]byte("unsafe-shutdown"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you're using unsafe-shutdown in quite a few places and in different packages, it might make sense to move this into core/rawdb/schema.go. We have a lot of other metadata keys defined there.

if invalidShutdown != nil {
log.Error("unsafe shutdown detected", "time", string(invalidShutdown))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please capitalize the log message. Also please use a proper time value, not a pre-stringified version.

}
// Create an invalid shutdown in database in case the app crashed
if err = chainDb.Put([]byte("unsafe-shutdown"), []byte(time.Now().String())); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pls use time.Now().Unix() to store the current timestamp instead of a localized string

log.Warn("Failed to record possible future unsafe shutdown", "err", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's safe to return nil, err here. Although it's not a critical system, if writing to the database failed something's going to go belly up either way.

}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR uses a single unsafe-shutdown key to store a single timestamp. That's a good start, but we need a bit more information, otherwise quirky scenarios get lost. Notably, it's important to retain past crashes too, so a user might see if there was a fault recently, not just in the last restart (user's tend to restart their nodes a few times until attempting to investigate an issue, so a restart cannot delete the cause).

A simplistic suggestion would be to use a slice of timestamps instead of a single timestamp as the value of the entry. When Geth starts, you can append the current time, and on shutdown, you can pop off the tail. That way on startup you can actually list all the recent crashes, not just the last one.

return eth, nil
}

Expand Down Expand Up @@ -567,6 +577,9 @@ func (s *Ethereum) Stop() error {
s.lesServer.Stop()
}

if err := s.chainDb.Delete([]byte("unsafe-shutdown")); err != nil {
log.Error("err", err)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll need to move this down, right above chaindb.Close(). The biggest offender for an unclean shutdown is the blockchain shutdown mechanism.

// Then stop everything else.
s.bloomIndexer.Close()
close(s.closeBloomHandler)
Expand Down
15 changes: 15 additions & 0 deletions les/client.go
Original file line number Diff line number Diff line change
Expand Up @@ -168,6 +168,16 @@ func New(ctx *node.ServiceContext, config *eth.Config) (*LightEthereum, error) {
log.Warn("Ultra light client is enabled", "trustedNodes", len(leth.handler.ulc.keys), "minTrustedFraction", leth.handler.ulc.fraction)
leth.blockchain.DisableCheckFreq()
}

// Check for invalid shutdown
invalidShutdown, _ := chainDb.Get([]byte("unsafe-shutdown"))
if invalidShutdown != nil {
log.Error("unsafe shutdown detected", "time", string(invalidShutdown))
}
// Create an invalid shutdown in database in case the app crashed
if err = chainDb.Put([]byte("unsafe-shutdown"), []byte(time.Now().String())); err != nil {
log.Warn("Failed to record possible future unsafe shutdown", "err", err)
}
return leth, nil
}

Expand Down Expand Up @@ -303,6 +313,11 @@ func (s *LightEthereum) Stop() error {
s.txPool.Stop()
s.engine.Close()
s.eventMux.Stop()
// Delete the unsafe shutdown from DB if shutting down safely
if err := s.chainDb.Delete([]byte("unsafe-shutdown")); err != nil {
log.Error("err", err)
}

s.chainDb.Close()
s.wg.Wait()
log.Info("Light ethereum stopped")
Expand Down