feat(swaps): recover crashed swap deals #1081
True, but isn't that exactly what Moshe's breaking swap tests are supposed to do? Either way, it's close, so I think we should unify these.
Here's a summary of what I'm trying to cover in this PR. I'm going to push updated code later today that covers the pending payment cases more thoroughly. The notable gap is how we can claim or settle a raiden payment upon resuming xud. As for edge cases where lnd or raiden crash while xud is running (if a payment goes through without us getting a response, for instance), I think those should be covered in a separate PR, since the scope of this PR is already quite large.
The updated approach tracks any outgoing payments that are still pending when xud restarts. If an outgoing payment is still in flight and we do not have the preimage for it, we add it to a set of "pending" swaps and check on it at a scheduled interval until we can determine whether it has failed or succeeded. I set the timer to trigger every 5 minutes to check on any swaps that are still pending. A new Swaps that time out during the
The key missing piece is that Raiden currently does not expose an API call to push a preimage to claim an incoming payment; for now, we print a warning to the log instead. One solution to this shortcoming would be to have Raiden continuously query the resolver endpoint if it does not receive a response, in which case once xud learns of the preimage it can pass it to raiden the next time there is a resolve request for that hash.
Another consideration would be to list any pending swaps via the rpc layer as a way to inform users that xud is monitoring pending payments. This could be added to the
The next step is to add simulation tests to ensure that this functionality is working as expected, since this is hard to test manually. I'll add that to this PR in a separate commit.
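The pending-swap monitoring described above could be sketched roughly as follows. This is an illustration only, not the code in this PR: `PendingSwapTracker`, `PaymentStatus`, the `LookupPayment` signature, and the two callbacks are hypothetical names made up for the example.

```typescript
// Hypothetical status shape; the real swap client types may differ.
type PaymentStatus =
  | { state: 'succeeded'; preimage: string }
  | { state: 'failed' }
  | { state: 'pending' };

// Assumed lookup signature: resolves the current status of an outgoing payment by hash.
type LookupPayment = (rHash: string) => Promise<PaymentStatus>;

class PendingSwapTracker {
  /** Payment hashes of swaps whose outgoing payment is still in flight. */
  public readonly pendingSwaps = new Set<string>();
  private timer?: ReturnType<typeof setInterval>;

  constructor(
    private lookupPayment: LookupPayment,
    private onSucceeded: (rHash: string, preimage: string) => void,
    private onFailed: (rHash: string) => void,
  ) {}

  public add = (rHash: string) => {
    this.pendingSwaps.add(rHash);
  }

  /** Starts the periodic check; intervalMs is in milliseconds. */
  public beginTimer = (intervalMs: number) => {
    if (!this.timer) {
      this.timer = setInterval(() => { void this.checkPendingSwaps(); }, intervalMs);
    }
  }

  public stopTimer = () => {
    if (this.timer) {
      clearInterval(this.timer);
      this.timer = undefined;
    }
  }

  /** Checks every pending swap once and resolves the ones with a final status. */
  public checkPendingSwaps = async () => {
    // Copy the set so that deleting entries while iterating is safe.
    for (const rHash of Array.from(this.pendingSwaps)) {
      let status: PaymentStatus;
      try {
        status = await this.lookupPayment(rHash);
      } catch {
        continue; // lookup failed; keep the swap and retry on the next tick
      }
      if (status.state === 'succeeded') {
        this.pendingSwaps.delete(rHash);
        this.onSucceeded(rHash, status.preimage);
      } else if (status.state === 'failed') {
        this.pendingSwaps.delete(rHash);
        this.onFailed(rHash);
      }
      // still pending: leave it in the set for the next check
    }
  }
}
```

Keeping a payment in the set when the lookup itself throws means a temporarily unreachable swap client does not cause a swap to be dropped from monitoring.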
lib/swaps/SwapRecovery.ts
public beginTimer = () => {
  if (!this.pendingSwapsTimer) {
    this.pendingSwapsTimer = setInterval(this.checkPendingSwaps, 300000);
Maybe add a comment that this is in ms
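One possible way to address this, shown only as a sketch: pull the literal into a named constant whose name and doc comment carry the unit. The constant name here is hypothetical and not taken from the PR.

```typescript
/** Interval between checks on pending swaps, in milliseconds (5 minutes). */
const PENDING_SWAP_RECHECK_INTERVAL_MS = 5 * 60 * 1000;
```

Writing the value as `5 * 60 * 1000` instead of `300000` also makes the 5 minute duration readable at a glance.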
Looks like the logic that detects pending HTLCs for lnd is working. 👍
I simulated an xud crash immediately after sending a payment to the taker:
deal.rPreimage = await swapClient.sendPayment(deal);
console.log('maker got preimage', deal.rPreimage);
process.exit(1); // xud "crashes" after sending payment
// return deal.rPreimage;
The taker claimed the payment successfully. When I restarted maker's xud node I got the following output:
[1] 30/08/2019 14:37:25.460 [SWAPS] info: recovering swap deal 5df992aee84c500633500827bcd5e98a8c84e339996d50acddd4fb7d59cbeace
[1] 30/08/2019 14:37:25.462 [SWAPS] error: TypeError: this.invoices[methodName] is not a function
[1] at Promise (/home/ar/xud/dist/lndclient/LndClient.js:113:42)
[1] at new Promise (<anonymous>)
[1] at LndClient.unaryInvoiceCall (/home/ar/xud/dist/lndclient/LndClient.js:108:20)
[1] at LndClient.listPayments (/home/ar/xud/dist/lndclient/LndClient.js:587:25)
[1] at LndClient.<anonymous> (/home/ar/xud/dist/lndclient/LndClient.js:562:41)
[1] at Generator.next (<anonymous>)
[1] at /home/ar/xud/dist/lndclient/LndClient.js:7:71
[1] at new Promise (<anonymous>)
[1] at __awaiter (/home/ar/xud/dist/lndclient/LndClient.js:3:12)
[1] at LndClient.lookupPayment (/home/ar/xud/dist/lndclient/LndClient.js:561:41)
[1] at SwapRecovery.<anonymous> (/home/ar/xud/dist/swaps/SwapRecovery.js:76:69)
[1] at Generator.next (<anonymous>)
[1] at /home/ar/xud/dist/swaps/SwapRecovery.js:7:71
[1] at new Promise (<anonymous>)
[1] at __awaiter (/home/ar/xud/dist/swaps/SwapRecovery.js:3:12)
[1] at SwapRecovery.recoverDeal (/home/ar/xud/dist/swaps/SwapRecovery.js:51:38)
[1] at swapDealInstances.forEach (/home/ar/xud/dist/swaps/Swaps.js:64:39)
[1] at Array.forEach (<anonymous>)
[1] at Swaps.<anonymous> (/home/ar/xud/dist/swaps/Swaps.js:61:31)
[1] at Generator.next (<anonymous>)
[1] at fulfilled (/home/ar/xud/dist/swaps/Swaps.js:4:58)
@erkarl I moved the type and fixed the issue you ran into, thanks for catching that.
@sangaman after testing this with your updated code I'm getting the following output for maker's xud:
However, when I check the channel balances it looks like the maker did not receive any LTC. I also checked the balance at the lnd level. Looking at the
That's strange, as this means settle invoice should have been called. I'll see if I can reproduce.
@erkarl @kilrau I fixed the issue with the recovered payment not settling correctly, and added a new suite of "instability" simulation tests that use a new instability branch. It wasn't easy but it simulates exactly the cases we are trying to address with this PR and the tests pass consistently. See the second commit message and the updated top post of this PR for more details.
@sangaman looks like I'm now able to successfully recover and claim the payment when the maker's xud crashes after sending the payment. 💯 It's really nice that we now have simulation tests in place for these scenarios. My only concern is the maintainability of the Also, should we add the instability tests to our CI pipeline in
I think keeping the I'm neutral on whether it should be part of the travis tests, I followed the
This commit attempts to recover swap deals that were interrupted due to a system or `xud` crash. In the case where we are the maker and have attempted to send payment for the second leg of the swap, we attempt to query the swap client for the preimage of that payment in case it went through. We can then use that preimage to try to claim the payment from the first leg of the swap. In case the payment is known to have failed, we simply attempt to close any open invoices and mark the swap deal as having errored. If an outgoing payment is still in flight and we do not have the preimage for it, we add it to a set of "pending" swaps and check on it on a scheduled interval until we can determine whether it has failed or succeeded. A new `SwapRecovery` class is introduced to contain the logic for recovering interrupted swap deals and for tracking swaps that are still pending. Any pending swaps are listed in the `GetInfo` response. Raiden currently does not expose an API call to push a preimage to claim an incoming payment or to reject an incoming payment, instead we print a warning to the log for now. The recovery attempts happen on `xud` startup by looking for any swap deals in the database that have an `Active` state. This commit includes a suite of test cases for the newly added functionality. Closes #1079.
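The recovery decision described in this commit message can be summarized in a small sketch. It is a simplified illustration under stated assumptions, not the `SwapRecovery` implementation: `InterruptedDeal`, `OutgoingPaymentResult`, `RecoveryAction`, and `decideRecoveryAction` are hypothetical names, and the real class also has to talk to the swap clients and the database.

```typescript
// Hypothetical, simplified view of an Active swap deal found in the database on startup.
interface InterruptedDeal {
  rHash: string;
  role: 'maker' | 'taker';
  /** Whether we had attempted to send the second-leg payment before the crash. */
  attemptedPayment: boolean;
}

// Assumed result of asking the swap client about the outgoing payment.
type OutgoingPaymentResult =
  | { state: 'succeeded'; preimage: string }
  | { state: 'failed' }
  | { state: 'pending' };

type RecoveryAction =
  | { kind: 'claim'; preimage: string } // settle the incoming first-leg payment
  | { kind: 'fail' }                    // close open invoices, mark the deal as errored
  | { kind: 'monitor' };                // add to the pending set and re-check on a timer

function decideRecoveryAction(deal: InterruptedDeal, payment?: OutgoingPaymentResult): RecoveryAction {
  // Only a maker that tried to pay the second leg may have funds at stake.
  if (deal.role !== 'maker' || !deal.attemptedPayment || !payment) {
    return { kind: 'fail' };
  }
  switch (payment.state) {
    case 'succeeded':
      return { kind: 'claim', preimage: payment.preimage };
    case 'failed':
      return { kind: 'fail' };
    case 'pending':
      return { kind: 'monitor' };
  }
}
```

Treating a missing payment result as a failure is an assumption of this sketch; a real implementation may prefer to keep monitoring when the swap client cannot be reached.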
This adds a new suite of simulation tests to test how `xud` responds to instability. It simulates crashes immediately after the maker sends payment as part of a swap. When the maker comes back online, it will fetch the preimage for the successful payment and use it to settle the incoming payment from the interrupted swap. This tests two key cases, one where the maker's payment goes through while the maker is offline, and another where the payment goes through only after the maker has come back online. In the latter case, the maker must detect that the payment is in a pending state and monitor it in case it goes through. Closes #1183.
Closes #1079 & #1183. This builds off of PR #1080.
This commit attempts to recover swap deals that were interrupted due to a system or `xud` crash. In the case where we are the maker and have attempted to send payment for the second leg of the swap, we attempt to query the swap client for the preimage of that payment in case it went through. We can then use that preimage to try to claim the payment from the first leg of the swap. In all other cases, we simply attempt to close any open invoices and mark the swap deal as having errored.
Raiden currently does not expose an API call to query for the preimage of a completed payment. Any pending swaps are listed in the `GetInfo` response.
The recovery attempts happen once on `xud` startup by looking for any swap deals in the database that have an `Active` state.
I haven't tested this code yet but it is ready to review, if I have time I will also start adding some test cases. This may be a good candidate for a simulation test, since recovering funds in the case of xud crashing while waiting for the `sendPayment` response depends substantially on lnd.
This commit includes a suite of test cases for the newly added functionality. It also adds a new suite of simulation tests to test how `xud` responds to instability. It simulates crashes immediately after the maker sends payment as part of a swap. When the maker comes back online, it will fetch the preimage for the successful payment and use it to settle the incoming payment from the interrupted swap.
This tests two key cases, one where the maker's payment goes through while the maker is offline, and another where the payment goes through only after the maker has come back online. In the latter case, the maker must detect that the payment is in a pending state and monitor it in case it goes through.