Skip to content
This repository has been archived by the owner on May 22, 2023. It is now read-only.

tBTC chain extension: Incporporating the new Ethereum event subscription API and background event pull loop #679

Merged
merged 13 commits into from
Feb 2, 2021

Conversation

pdyraga
Copy link
Member

@pdyraga pdyraga commented Jan 26, 2021

Refs #680

Depends on #671
Depends on keep-network/tbtc#750

Overview

There are two major changes to Ethereum subscriptions proposed here, as a result of the changes implemented in keep-network/keep-common#63:

  • new subscription API with OnEvent and Pipe functions,
  • background monitoring loop pulling past events from the chain.

The first change should allow implementing some handler logic easier and to avoid complex logic leading to bugs such as keep-network/keep-core#1333 or keep-network/keep-core#2052. ECDSA keep client is currently not that much affected by this chain but this may change when proper event deduplication will be implemented.

The second change should improve client responsiveness for operators running their nodes against Ethereum deployments that are not very reliable on the event delivery front. This should hopefully improve SLA of some mainnet operators of ECDSA client.

New API

Event subscription API has been refactored in keep-network/keep-common#63 to resemble the proposition from keep-network/keep-core#491. The new event subscription mechanism allows installing event callback handler function with OnEvent function as well as piping events from a subscription to a channel with Pipe function.

Example usage of OnEvent:

handlerFn := func(
    submittingMember common.Address,
    conflictingPublicKey []byte,
    blockNumber uint64,
) {
   // (...)
}
subscription := keepContract.ConflictingPublicKeySubmitted(
	nil, // default SubscribeOpts
	nil, // no filtering on submitting member
).OnEvent(handlerFn)

The same subscription but with a Pipe:

sink := make(chan *abi.BondedECDSAKeepConflictingPublicKeySubmitted)

subscription := keepContract.ConflictingPublicKeySubmitted(
	nil, // default SubscribeOpts
	nil, // no filtering on submitting member
).Pipe(sink)

Currently, all our event subscriptions in ECDSA client use function handlers and the code has been adjusted to the new API but is still using function handlers. This may or may not change in the future depending on an individual use case.

Background monitoring loop

Some nodes in the network are running against Ethereum setups that are not particularly reliable in delivering events. Events are not delivered, nodes are not starting key-generation, or are not participating in redemption signing. Another problem is the stability of the event subscription mechanism (see #663). If the web socket connection is dropped too often, the resubscription mechanism is not enough to receive events emitted when the connection was in a weird, stale state.

To address this problem, keep-network/keep-common#63 introduces a background loop periodically pulling past events from the chain next to a regular watchLogs subscription. How often events are pulled and how many blocks are taken into account can be configured with SubscribeOpts parameters.

This way, even if the event was lost by watchLogs subscription for whatever reason, it should be pulled by a background monitoring loop later. An extremely important implication of this change is that handlers should have logic in place allowing them to de-duplicate received events even if a lot of time passed between receiving the original event and the duplicate.

The original version of tBTC deposit monitoring code - without the modification here - was mostly safe. There is 24 hours monitoring period eliminating duplicate events within this time frame.

Just to be absolutely safe, in case 24 hours time period is exceeded for a duplicate, I added an initial state check in this PR (see ff611ec). Before the monitoring starts on start function signal, we check the initial deposit state whether
it's an expected one. If not, we do not start the monitoring.

@pdyraga pdyraga changed the base branch from master to resubscriptions-dance January 26, 2021 08:54
@pdyraga pdyraga changed the base branch from resubscriptions-dance to pipe-it January 26, 2021 08:55
Updated tbtc dependency version to the one with regenerated contract
bindings with keep-common changes about resubscription and background
subscription monitoring process.
Adjusted the code to the regenerated Go contract bindings in tbtc
repository. The regenerated bindings contains keep-common improvements
around event resubscription mechanism and background event monitoring
loop fetching events from the chain periodically in case some event was
dropped by websocket subscription.
Provided ConfirmWithTimeoutDefaultBackoff function that will let us use
default values for backoff and max backoff, similarly to how we do with
DoWithDefaultRetry.

Increased timeout for redemption signature state check from 30 seconds
to 60 seconds, just in case.
After Ethereum event subscription refactoring, it may happen that the
same event is emitted multiple times. A duplicate can be emitted right
after the original event or a long time after the original event has
been emitted.

The original version of tBTC deposit monitoring code - without the
modification here - was mostly safe. There is 24 hours monitoring period
eliminating duplicate events within this time frame.

Just to be absolutely safe, in case 24 hours time period is exceeded for
a duplicate, we are adding initial state check. Before monitoring starts
on start function signal, we check the initial deposit state whether
it's an expected one. If not, we do not start the monitoring.
pkg/client/event_deduplicator.go Outdated Show resolved Hide resolved
pkg/utils/wrappers.go Outdated Show resolved Hide resolved
pkg/client/client.go Outdated Show resolved Hide resolved
pkg/client/client.go Outdated Show resolved Hide resolved
pkg/extensions/tbtc/tbtc.go Show resolved Hide resolved
@pdyraga pdyraga changed the title tBTC chain extension: Incporporating the new ethereum event subscription API and background event pull loop tBTC chain extension: Incporporating the new Ethereum event subscription API and background event pull loop Jan 29, 2021
Base automatically changed from pipe-it to master January 29, 2021 15:35
nkuba added a commit that referenced this pull request Jan 29, 2021
Incorporating the new Ethereum event subscription API and background event pull loop

Refs #680

Depends on #663
Depends on keep-network/keep-common#63

# Overview 

There are two major changes to Ethereum subscriptions proposed here, as a result of the changes implemented in keep-network/keep-common#63:
- new subscription API with `OnEvent` and `Pipe` functions,
- background monitoring loop pulling past events from the chain.

The first change should allow implementing some handler logic easier and to avoid complex logic leading to bugs such as keep-network/keep-core#1333 or keep-network/keep-core#2052. ECDSA keep client is currently not that much affected by this chain but this may change when proper event deduplication will be implemented.

The second change should improve client responsiveness for operators running their nodes against Ethereum deployments that are not very reliable on the event delivery front. This should hopefully improve SLA of some mainnet operators of ECDSA client.


# New API

Event subscription API has been refactored in keep-network/keep-common#63 to resemble the proposition from keep-network/keep-core#491. The new event subscription mechanism allows installing event callback handler function with `OnEvent` function as well as piping events from a subscription to a channel with `Pipe` function.

Example usage of `OnEvent`:
```
handlerFn := func(
    submittingMember common.Address,
    conflictingPublicKey []byte,
    blockNumber uint64,
) {
   // (...)
}
subscription := keepContract.ConflictingPublicKeySubmitted(
	nil, // default SubscribeOpts
	nil, // no filtering on submitting member
).OnEvent(handlerFn)
```

The same subscription but with a `Pipe`:
```
sink := make(chan *abi.BondedECDSAKeepConflictingPublicKeySubmitted)

subscription := keepContract.ConflictingPublicKeySubmitted(
	nil, // default SubscribeOpts
	nil, // no filtering on submitting member
).Pipe(sink)
```

Currently, all our event subscriptions in ECDSA client use function handlers and the code has been adjusted to the new API but is still using function handlers. This may or may not change in the future depending on an individual use case.

# Background monitoring loop

Some nodes in the network are running against Ethereum setups that are not particularly reliable in delivering events. Events are not delivered, nodes are not starting key-generation, or are not participating in redemption signing. Another problem is the stability of the event subscription mechanism (see #663). If the web socket connection is dropped too often, the resubscription mechanism is not enough to receive events emitted when the connection was in a weird, stale state.

To address this problem, keep-network/keep-common#63 introduces a background loop periodically pulling past events from the chain next to a regular `watchLogs` subscription. How often events are pulled and how many blocks are taken into account can be configured with `SubscribeOpts` parameters. 

This way, even if the event was lost by `watchLogs` subscription for whatever reason, it should be pulled by a background monitoring loop later. An extremely important implication of this change is that handlers should have logic in place allowing them to de-duplicate received events even if a lot of time passed between receiving the original event and the duplicate. 

This part has been implemented in `event_deduplicator.go` for four events:
- opened keep (key generation requested),
- redemption signature requested,
- keep closed,
- keep archived.

tBTC-specific events are covered separately in #679.

The only event that is not covered with deduplication is conflicting public key submitted event but it does not look to me as something that needs to be deduplicated as the subscription is canceled immediately after this event is received.
@pdyraga pdyraga marked this pull request as ready for review February 1, 2021 10:08
The previous approach assumed that if the given deposit address is
cached in monitoredDepositsCache, shouldMonitorDeposit returns true
withoug executing any further checks.

Together with acquireMonitoringLock and releaseMonitoringLock it worked
fine for most of the cases but not for all of them. Specifically, for
the case when the deposit has already transitioned to the next state, we
could run into a situation when monitoring lock has already been
released and shouldMonitorDeposit returns true just because the client
has a member in that deposit, without executing the initial state check.

The order of cache checks has been changed in this commit to address it.

Also, renamed monitoredDepositsCache to memberDepositsCache, and
notMonitoredDepositsCache to notMemberDepositsCache to state it clearly
those caches hold information whether the client has or has not a member
in the given deposit.

The new approach is:
1. Check if the given deposit is in notMonitoredDepositsCache. If so, we
are sure the client is not interested in monitoring this deposit no
matter what - it has no member in it.

2. Check initial state of the deposit to do not start monitoring for
duplicate events from the past. If initial state is not as expected, the
client is not interested in monitoring this deposit as the event
triggering the monitoring is something from the past that does not match
the current chain state.

3. Check if the given deposit is in memberDepositCache. If so, start the
monitoring. This check allows to do not repeate getSignerIndex call. If
the deposit is not in memberDepositCache, execute getSignerIndex call,
check members, update caches and return true or false depending if the
client has a member in the deposit.
@lukasz-zimnoch
Copy link
Member

LGTM! I've tested and haven't found any problems. Specifically, I've checked:

  • starting all the monitorings (retrieve pubkey, provide redemption signature, provide redemption proof) on appropriate events
  • stopping the retrieve pubkey monitoring when the key has been retrieved by a third-party
  • stopping the provide redemption signature monitoring when the signature has been provided by a third-party
  • stopping the provide redemption proof monitoring when the proof has been provided by a third-party
  • retrieving the public key on retrieve pubkey timeout
  • submitting redemption signature on provide redemption signature timeout
  • increasing redemption fee on provide redemption proof timeout
  • resubscription and monitoring behavior after Ethereum node crash
  • reaction on event duplicates
  • usual keygen and signing

@lukasz-zimnoch lukasz-zimnoch merged commit f240e81 into master Feb 2, 2021
@lukasz-zimnoch lukasz-zimnoch deleted the pipe-it-all branch February 2, 2021 12:12
@pdyraga pdyraga added this to the v1.7.0 milestone Feb 2, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants