Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Wallet] Retry if geth fails to init because of no internet #5700

Merged
merged 11 commits into from
Nov 5, 2020

Conversation

etuleu
Copy link
Contributor

@etuleu etuleu commented Nov 4, 2020

Description

We explicitly crash the app if geth fails to initialize on startup. There are a few different cases how that can happen but one way to cause it is to start the app in airplane mode. This causes a network failure which immediately sends the user to the "Oops" screen.

The fix is to implement a retry. Of course this can go on forever if the network connection never comes back, so we could implement a limited number of retries. I don't think that is a good design though as we should never see the Oops screen or intentionally kill the app. Instead we should provide hints to the user as to how they might fix it. In this case we do have a Connecting banner, but we could do more such as showing a popup maybe. In general it is hard to tell why the app cannot connect. For example there might be a port block and the network is fine otherwise. I have some ideas here, such as using google firebase to check network connectivity, but that's for another time.

Other changes

There are other geth failure modes such as when the timeout hits. This is a bit trickier to fix because we would need to cancel the initial init and start a new one. I will leave that for another PR after discussing with @jeanregisser how to implement that.

Tested

Steps to reproduce on simulator or real device (TODO: write an e2e test):

  1. kill app
  2. turn off wifi if simulator, or go to airplane mode on device
  3. open app

Related issues

Backwards compatibility

Yes.

Copy link
Contributor

@jeanregisser jeanregisser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking into this! 👍
This is in a critical path, can you please update the test for this?

let restartAppAutomatically: boolean = false
let errorContext: string

switch (failureResult) {
case GethInitOutcomes.NETWORK_ERROR_FETCHING_GENESIS_BLOCK: {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we'd have a similar issue when fetching the genesis block on the first app launch if the network is down.
Right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly! also the other remaining case is geth timeout. I wanted to discuss with you possible ways to recover from those. I tried using the same code path for those errors as well but we run into multiple geth inits happening at the same time. We need a clean way to cancel before we init a new one.

Or, if geth itself has any sort of timeouts, we need to make our own timeouts at least as large as those, so that when we retry we know that the previous one stopped trying. Does that make sense? I would love to hear your thoughts and what we've already tried in the past.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes let's continue the discussion. Can you open an issue so we can address these edge cases in a separate PR? 🙏

Copy link
Contributor

@jeanregisser jeanregisser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@jeanregisser jeanregisser added the automerge Have PR merge automatically when checks pass label Nov 5, 2020
@mergify mergify bot merged commit 2bfaecb into master Nov 5, 2020
@mergify mergify bot deleted the etuleu/oops-fix branch November 5, 2020 10:44
@Lss-Ankit
Copy link

Lss-Ankit commented Nov 9, 2020

@etuleu @jeanregisser Verified with Test flight build v1.4.0 (28) and Android play store (internal build) v1.4.0 (1004294317) while doing testing on mentioned build we have observed Oops something went wrong screen on one time on a Friday after that we have not observed such screen.

Also, tried with give steps but did not observed the oops screen.

  1. kill app
  2. turn off wifi if simulator, or go to airplane mode on device
  3. open app

Actual result of above steps:

  • "Unable to load your feed please try again later" message is shown when user launch the app with airplane mode

However, we will update the ticket if we faced Oops screen again any time.

FYI, Informed this issue here as an observation as facing issue randomly

@Lss-Ankit
Copy link

Lss-Ankit commented Nov 9, 2020

@etuleu Just observed oops screen but not with exact give steps.

  1. kill app
  2. go to airplane mode on device
  3. open app. stay on app for 7-10 min and observed.

Build: Android play store (internal build) v1.4.0 (1004294317)
Device: Google Pixel XL (8.1)
Please find the logs which may helps you to fix the issue.
celo_logs (1).txt

@etuleu
Copy link
Contributor Author

etuleu commented Nov 9, 2020

Thank you for testing this Ankit! From the logs I don't see a crash or anything like that. Did you turn the internet connection back on at some point?

@Lss-Ankit
Copy link

Hi @etuleu Yes, I did not observed any crash but observed Oops screen when launching the app without internet connection.
Yes, I turned on the internet connection after some time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
automerge Have PR merge automatically when checks pass wallet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Intermittent Crashes on iOS when App is Backgrounded
3 participants