Looong startup time with no indication in log of what's happening. #2778
Comments
Thanks @dan-da for the extensive report, I really appreciate the time you took to root-cause this.
This might be related to the DDoS attack that electrum servers are currently suffering from, I'm not too familiar with how
We might instead opt for a lower
Some calls are definitely necessary, i.e., checking that we are on the correct chain and what the current blockheight is so we know where to start from, so moving all calls below the fork might not be such a good idea (it'd be forking and exiting immediately, without the option of displaying an error message to the user, whereas now it'll tell you some things before forking, which is a better user experience imho). But the call to
@cdecker thx for looking at this.
ok, I'll check into that if it keeps happening, thx for the lead.
I think that a bitcoind-rpc-timeout option would be a good idea, yes. However, I would like to focus your attention on what I perceive to be the biggest issue here, which is that the log was not giving me any indication of what operation was being attempted (and blocking). So I wasted a bunch of time googling "We seem to be missing gossip messages", restarting many times, and generally going in circles. A log message prior to each call would fix this. I realize it is a lot more verbose, but hopefully could be an option. Or it could be that at lower log levels, only the pre rpc-call messages are logged, and at log-level=debug the end calls (with duration) are also printed.
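To make the bitcoind-rpc-timeout idea a bit more concrete, here is a minimal sketch in Python (not lightningd's actual C code). It assumes the rpc call is made by spawning a command-line client as a child process; `run_with_timeout` and its parameters are hypothetical names used only for illustration.

```python
import subprocess


def run_with_timeout(argv, timeout_s):
    """Run a command, giving up after timeout_s seconds.

    Hypothetical helper: returns (stdout, None) on success,
    or (None, error_message) on timeout or non-zero exit.
    """
    try:
        done = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left hanging.
        return None, f"{argv[0]} timed out after {timeout_s} s"
    if done.returncode != 0:
        return None, f"{argv[0]} exited with status {done.returncode}"
    return done.stdout, None
```

With something like this, a stalled estimatesmartfee would surface as an explicit error after a bounded wait instead of blocking startup for an arbitrary time.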
I don't see any other rpc-calls being made prior to forking in the log on this particular startup, but I take your word for it. Regarding user experience, it is a bad one when it just hangs without reporting anything in the shell or log. I think it would be just fine if it just forked and exited immediately so that the json-rpc api is available. Any bitcoind (comm) errors can be reported via getinfo call or via debug.log. So I would suggest that the main error to report at the shell is if something prevents the fork and/or rpc server from coming up.
yes please! :-)
environment: Running against spruned on ubuntu. Both spruned and lightning running over tor.
When I start lightningd, it does not quickly exit back to command-line (enter daemon mode) as it is supposed to. Instead, it just seems to hang. In debug.log, there are only two entries re "creating db" and "missing gossip messages".
After trying several things, including removing ~/.lightning entirely, I just let it stay that way. After approx 15 minutes, there is a line that estimatesmartfee rpc exited with status 1 and finally the daemon starts up.
Okay, so a few observations here:
Seemingly spruned is taking a long time to serve estimatesmartfee for some reason, though if I try the same command now, it is instant.
I can't be certain that the time is spent waiting for spruned, as no message is logged when the rpc call begins, only upon return. And further, the log message does not indicate start or duration time of the call, only the end time. This is useless for profiling.
edit: I see that duration in milliseconds is included in the log message that is (finally) printed after the call. note: using log-level=debug.
Clearly lightningd is making an rpc call to bitcoind/spruned before it forks into daemon mode. I'm not sure why that is necessary. Anyway, it has the potential to stall lightning-cli (rpc api) for an arbitrary amount of time at startup.
Suggestions:
when log-level=debug, make a log entry before each rpc call, as well as after. Thus, it becomes more clear what is happening if something gets stuck.
If possible, move the call to estimatesmartfee to after the point at which the daemon forks.
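The first suggestion could look roughly like the following. This is a minimal sketch in Python rather than lightningd's own C code; `logged_call` and the logger name `rpc` are hypothetical, and the message format is only an illustration, not what lightningd actually prints.

```python
import logging
import time

log = logging.getLogger("rpc")  # hypothetical logger name


def logged_call(name, fn, *args):
    """Log before and after calling fn, including the elapsed time."""
    # The "starting" line is what makes a hung call visible in the log.
    log.debug("%s: starting", name)
    start = time.monotonic()
    try:
        return fn(*args)
    finally:
        # Runs on both success and failure, so the end is always recorded.
        elapsed_ms = (time.monotonic() - start) * 1000
        log.debug("%s: finished after %.0f ms", name, elapsed_ms)
```

If a call blocks, the last line in the log is then the "starting" entry for that call, which points directly at the operation that is stuck.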
And of course I am open to any ideas/explanations what may be going on here to make things so slow, and how to fix it on my end.