Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: offline state recovery for Filter subscription #2049

Merged
merged 26 commits into from
Aug 28, 2024

Conversation

weboko
Copy link
Collaborator

@weboko weboko commented Jun 27, 2024

Problem

In case light node goes offline some steps should be taken to recover as per RFC recommendation

Solution

Implement recovery by doing ping and re-subscription if needed.

Notes

@weboko weboko force-pushed the weboko/fitler-offline branch from ba8867c to 4c446d5 Compare July 17, 2024 11:08
@weboko weboko marked this pull request as ready for review July 17, 2024 11:11
@weboko weboko requested a review from a team as a code owner July 17, 2024 11:11
@weboko weboko changed the title feat!: improve offline state handling for Filter subscription feat: improve offline state handling for Filter subscription Jul 17, 2024
@weboko weboko changed the title feat: improve offline state handling for Filter subscription feat: offline state recovery for Filter subscription Jul 17, 2024
Copy link

github-actions bot commented Jul 17, 2024

size-limit report 📦

Path Size Loading time (3g) Running time (snapdragon) Total time
Waku node 86.83 KB (+0.43% 🔺) 1.8 s (+0.43% 🔺) 3.1 s (+78.51% 🔺) 4.8 s
Waku Simple Light Node 154.52 KB (+0.28% 🔺) 3.1 s (+0.28% 🔺) 4.5 s (+56.03% 🔺) 7.5 s
ECIES encryption 22.83 KB (0%) 457 ms (0%) 750 ms (-25.4% 🔽) 1.3 s
Symmetric encryption 22.33 KB (0%) 447 ms (0%) 703 ms (-59.36% 🔽) 1.2 s
DNS discovery 72.27 KB (0%) 1.5 s (0%) 2 s (+7.09% 🔺) 3.4 s
Peer Exchange discovery 73.75 KB (0%) 1.5 s (0%) 2.3 s (+25.3% 🔺) 3.7 s
Local Peer Cache Discovery 67.57 KB (0%) 1.4 s (0%) 2.5 s (+8.06% 🔺) 3.9 s
Privacy preserving protocols 38.91 KB (0%) 779 ms (0%) 1.6 s (-16.92% 🔽) 2.4 s
Waku Filter 81.27 KB (+0.25% 🔺) 1.7 s (+0.25% 🔺) 2.3 s (+26.88% 🔺) 4 s
Waku LightPush 78.6 KB (0%) 1.6 s (0%) 2.4 s (+65.55% 🔺) 3.9 s
History retrieval protocols 79.1 KB (0%) 1.6 s (0%) 2.7 s (+13.29% 🔺) 4.3 s
Deterministic Message Hashing 7.31 KB (0%) 147 ms (0%) 427 ms (+65.74% 🔺) 573 ms

Copy link
Collaborator

@danisharora099 danisharora099 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we introduce some tests that validate the diff for this? We can do this by mocking disconnection

packages/sdk/src/protocols/filter.ts Outdated Show resolved Hide resolved
packages/sdk/src/protocols/filter.ts Outdated Show resolved Hide resolved
@weboko weboko force-pushed the weboko/fitler-offline branch from a5217ae to cf3974d Compare August 20, 2024 13:26
})
);
public isConnected(): boolean {
if (globalThis?.navigator && !globalThis?.navigator?.onLine) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is tricky. I made it that was as to comply with behavior of

const storedPeersData = localStorage.getItem("waku:peers");

when if we run localStorage discovery in nodejs it won't fail tho it won't work

the thing is even though we don't support nodejs - there are some people running js-waku in such env

@danisharora099
Copy link
Collaborator

Re your comment of adding tests within the scope of #2078, it'll be great to have the tests strictly for this change of the PR

@weboko weboko requested a review from danisharora099 August 23, 2024 10:12
Copy link
Collaborator

@danisharora099 danisharora099 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have thoughts about a few things here:

1/
The idea seems to be to watch for the navigator:offline/online event, renew (and resubscribe) to subscriptions. Here, I have a few questions:

  • should we instead attempt redials to the peers to confirm if they lost connectivity? - Are we certain the node didn't lose connection to the peer & simply to the Filter subscriptions?
  • If it only lost the subscription, should we really renew the peer instead of simply resubscribing to the same set of peers? (perhaps check the susbcription status first before renewal?)

2/
Regarding testing the offline/online status

even if we consider using a mock for window in test environment AND make it seems like it is offline - it won't be enough as js-waku node won't drop connection to peers so that filter subscription will just continue working;
Maybe I am missing something but I don't see a way to test navigator.onLine changes

We should stub the offline/online event emission, along with stubbing relevant peer disconnections. This assumes that the peer indeed lost subscriptions/connections.

3/
This is more of a reminder than it is a suggestion: renewals and resubscriptions also happen regularly at an interval. It's probably better to do it promptly when observability over the online/offline status events.
No suggestion here, I think this is indeed a right direction to consider.

Please let me know what you think @weboko

--
Side note: some smaller unrelated diffs seem to be included including change of method names, movement of methods, new internal APIs which is why this is also probably more confusing to me (excuse me if that is the case), would be helpful to make the diff leaner

packages/sdk/src/protocols/filter.ts Show resolved Hide resolved
packages/sdk/src/protocols/filter.ts Outdated Show resolved Hide resolved
packages/sdk/src/protocols/filter.ts Outdated Show resolved Hide resolved
packages/sdk/src/protocols/filter.ts Outdated Show resolved Hide resolved
packages/sdk/src/protocols/filter.ts Show resolved Hide resolved
@weboko
Copy link
Collaborator Author

weboko commented Aug 26, 2024

should we instead attempt redials to the peers to confirm if they lost connectivity?

if connectivity is lost then no redials will help as internet is absent, in case it was some application level we could try to redial, I don't see it helping in that case

If it only lost the subscription, should we really renew the peer instead of simply resubscribing to the same set of peers?

that's a good point and in this PR instead of renewing right away - we filter.ping peers first and those that fail are renewed.

We should stub the offline/online event emission, along with stubbing relevant peer disconnections. This assumes that the peer indeed lost subscriptions/connections.

added test covering this one in connection manager. I don't see any value of covering it in filter as filter already has isConnected/waku:connected tests 3b4d203

@weboko weboko requested a review from danisharora099 August 26, 2024 11:42
Copy link
Collaborator

@danisharora099 danisharora099 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unblocking; please feel free to include whatever you think will be helpful in this PR, rest can be followed up as separate PRs
thank you!

@danisharora099
Copy link
Collaborator

if connectivity is lost then no redials will help as internet is absent, in case it was some application level we could try to redial, I don't see it helping in that case

i meant when the application gains connectivity again, perhaps instead of directly renewing and/or resubscribing, redialing the same peer would be beneficial first

Comment on lines 223 to 226
// @ts-expect-error: mocking blobal object
globalThis.navigator = {};
// @ts-expect-error: mocking global object
globalThis.navigator.onLine = true;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be better to mock this with Sinon

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried wit Sinon originally but then understood that under current structure it is better to properly place eventEmitter and mocked object. It is due to us not needing to check access (i.e that function was called and how many times) and that due to this test being e2e we want to see how code works when browser API is getting triggered (i.e flow of actions).

Maybe you can share an example? I can follow up with moving it to Sinon

@weboko weboko merged commit eadb85a into master Aug 28, 2024
10 of 11 checks passed
@weboko weboko deleted the weboko/fitler-offline branch August 28, 2024 16:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants