Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

channeldb+routing: refactor payment lifecycle #6683

Closed

Conversation

yyforyongyu
Copy link
Member

@yyforyongyu yyforyongyu commented Jun 28, 2022

Depends on,

This PR refactors the payment lifecycle to provide cleaner layers of abstractions so it’s easier to understand and reason about them. Changes included,

  • Remove duplicate logic re payment state in the payment lifecycle and treat the PaymentStatus as the single source of truth.
  • Remove shardHandler as its responsibility is too vague.
  • Decomposed launchShard so requesting route, registering, and sending HTLC are separated since they deal with different subsystems like PaymentSession, ControlTower and htlcswitch. Previously the errors returned from different systems were mixed which made it hard to distinguish a terminal error vs a temp error. By separating them we can now easily decide when to fail the lifecycle, fail or retry the payment.
  • Fail an HTLC attempt before failing a payment, if applicable. This makes sure we won’t attempt another HTLC while the payment is being marked as failed.

TODOs

  • refactor SendToRoute
  • fix unit tests in routing
  • add mo unit tests for the new methods
  • update itest to reflect the new statuses

@yyforyongyu yyforyongyu added code health Related to code commenting, refactoring, and other non-behaviour improvements routing refactoring labels Jun 28, 2022
@Roasbeef Roasbeef added this to the v0.16.0 milestone Jul 1, 2022
@Roasbeef Roasbeef requested review from Roasbeef and bhandras August 24, 2022 20:18

// PaymentStatus is a memory representation of the current status of the
// payment.
type PaymentStatus byte
Copy link
Contributor

@joostjager joostjager Sep 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an awesome PR. So many smaller and bigger improvements.

It is fairly large though. For review it is probably beneficial to isolate the trivial changes (renames, moves) and minor refactors such as exitWithErr in a preparatory PR? The more involved changes such as the addition of a new channel and state machine can then receive exclusive focus in a part 2. Also potentially saves rebase headaches.

// | false | false | true | yes | StatusFailed |
// | false | false | true | no | StatusAttemptsFailed |
// | false | false | false | yes | StatusFailed |
// | false | false | false | no | StatusInitiated |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would almost express this table in code and look up the status directly from that rather than implementing it as a switch statement.

StatusFailed PaymentStatus = 7

// StatusPendingError is the status where a payment has conflict states
// across its HTLCs and has inflight HTLCs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does it end up with conflict states and is this an internal error?

func (m *MPPayment) Terminated() bool {
return m.Status == StatusFailed || m.Status == StatusSucceeded
return m.Status == StatusFailed || m.Status == StatusSucceeded ||
m.Status == StatusError
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit message says pure move, but this looks like a functional change. Different commit?

routing/payment_lifecycle.go Outdated Show resolved Hide resolved
// Once the result is collected, we signal it by writing an
// empty struct to `attemptResultChan`. We do this by first
// freeing the channel in a non-blocking read.
select {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting use of channels here. I wonder if is more intuitive if this goroutine is just always sending, and the receiver (next commit) does an extra for { select { ... } } to discard any extra signals if they would be waiting.

routing/payment_lifecycle.go Outdated Show resolved Hide resolved
@lightninglabs-deploy
Copy link

@bhandras: review reminder
@Roasbeef: review reminder
@yyforyongyu, remember to re-request review from reviewers when ready

@yyforyongyu yyforyongyu marked this pull request as draft September 20, 2022 07:37
@yyforyongyu yyforyongyu force-pushed the refactor-payment branch 2 times, most recently from 7ff8dd8 to cafb615 Compare November 17, 2022 22:58
@yyforyongyu yyforyongyu force-pushed the refactor-payment branch 2 times, most recently from 81b2199 to 496f767 Compare November 19, 2022 09:29
@yyforyongyu yyforyongyu force-pushed the refactor-payment branch 2 times, most recently from 139f01c to ee0be30 Compare January 16, 2023 06:52
@joostjager
Copy link
Contributor

This PR refactors the payment lifecycle to provide cleaner layers of abstractions so it’s easier to understand and reason about them.

For context, is this a pure refactor PR or are there actual functional changes for users? If that is the case, it might be useful to describe the visible problems that are fixed in the PR description to emphasise the relevance.

@yyforyongyu
Copy link
Member Author

For context, is this a pure refactor PR or are there actual functional changes for users? If that is the case, it might be useful to describe the visible problems that are fixed in the PR description to emphasise the relevance.

Def. Will update it once it's out of draft mode.

if reason == nil {
return nil
// Fail the attempt.
_, err = p.failAttempt(attempt, sendErr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commit looks to be the first real fix. Perhaps it is worth splitting off all the previous renaming/moving in a separate pr to get that out of the way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah was thinking about the same thing. will do!

This commit removes the `launchOutcome` and `shardResult` and uses
`attemptResult` instead. This struct is also used in `failAttempt` so we
can future distinguish critical vs non-critical errors when handling
HTLC attempts.
This commit adds a new method `registerAttempt` to take care of creating
and saving an htlc attempt to disk.
This commit starts handling switch error inside `sendAttempt` when an
error is returned from sending the HTLC. To make sure the updated
`HTLCAttempt` is always returned to the callsite, `handleSwitchErr` now
also returns a `attemptResult`.
@yyforyongyu yyforyongyu force-pushed the refactor-payment branch 7 times, most recently from bb6da1b to cd0d456 Compare March 8, 2023 16:09
This commit removes the method `launchShard` and splits its original
functionality into two steps - first create the attempt, second send the
attempt. This enables us to have finer control over "which error is
returned from which system and how to handle it".
This commit refactors the `resumePayment` method by adding the methods
`checkTimeout` and `requestRoute` so it's easier to understand the flow
and reason about the error handling.
This commit makes sure we only fail attempt inside `handleSwitchErr` to
ensure the orders in failing payment and attempts. It refactors
`collectResult` to return `attemptResult`, and expands `handleSwitchErr`
to also handle the case where the attemptID is not found.
This commit adds a new struct, `stateStep`, to decide the workflow
inside `resumePayment`.

It also refactors `collectResultAsync` introducing a new channel
`resultCollected`. This channel is used to signal the payment
lifecycle that an HTLC attempt result is ready to be processed.
This commit adds a new interface method `TerminalInfo` and changes its
implementation to return an `*HTLCAttempt` so it includes the route for
a successful payment. Method `GetFailureReason` is now removed as its
returned value can be found in the above method.
The old payment lifecycle is removed due to it's not "unit" -
maintaining these tests probably takes as much work as the actual
methods being tested, if not more so. Moreover, the usage of the old
mockers in current payment lifecy test is removed as it re-implements
other interfaces and sometimes implements it uniquely just for the
tests. This is bad as, not only we need to work on the actual interface
implementations and test them , but also re-implement them again in the
test without testing them!
This commit adds more mockers to be used in coming unit tests and
simplified the mockers to be more straightforward.
Thus adding following unit tests can be a bit easier.
This commit adds unit tests for `resumePayment`. In addition, the
`resumePayment` has been split into two parts so it's easier to be
tested, 1) sending the htlc, and 2) collecting results. As seen in the
new tests, this split largely reduces the complexity involved and makes
the unit test flow sequential.

This commit also makes full use of `mock.Mock` in the unit tests to
provide a more clear testing flow.
@yyforyongyu
Copy link
Member Author

Closed and replaced it with a new one as the comments here are very outdated.

@yyforyongyu yyforyongyu deleted the refactor-payment branch March 10, 2023 09:32
@saubyk saubyk removed this from the v0.16.1 milestone Apr 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
code health Related to code commenting, refactoring, and other non-behaviour improvements refactoring routing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants