Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[3/3]: Blinded Route Error Handling #8485

Merged
merged 17 commits into from
Apr 26, 2024

Conversation

carlaKC
Copy link
Collaborator

@carlaKC carlaKC commented Feb 17, 2024

This PR is the third and final in the route blinding forwarding series, including:

  • Conversion of errors in a blinded route to INVALID_ONION_BLINDING
  • Itest coverage for error handling.
  • Advertizing of optional blinding feature bit.

Depends on #8160

Update: Writing up a summary of my understanding of error handling in LND / how this relates to route blinding because I got this conceptually wrong on my first attempt.

Description

TL;DR for route blinding errors:

  • Introduction node should always return an UpdateFailHTLC with invalid_onion_blinding
  • Relaying node should always return an UpdateFailMalformedHTLC with invalid_onion_blinding

In LND, there are a few places where a HTLC can fail:

  1. Incoming Add: processRemoteAdds fails to add the incoming HLTC -> sendHTLCError
  2. Switch: handlePacketForward fails to transmit the incoming HTLC to the outgoing link -> failAddPacket
  3. Outgoing Add: handleDownstreamUpdateAddHtlc fails to add the outgoing HTLC -> FailAdd
  4. Downstream Failure: handleUpstreamMsg receives a failure from peer, propagated to incoming link -> FailHTLC

Following review from @bitromortac, approach has been updated to handle all of these errors on the incoming link so that we don't have to scatter handling all over the codebase (see below for previous approach).

Something that's key for this PR is that we need to know when we're in a blinded route when we're handling the error on the incoming link so that we send back the correct message. We could do this with a bool propagated through the switch, but there are some pesky edge cases around restarts/ on-chain resolutions where we don't have this information on hand (started with this approach - full branch here).

This PR takes the approach of using the existing ErrorEncrypterType to store a "marker" struct (which just wraps our existing implementation) so that we have the information available to us when we're resolving the HTLC across restarts. We're already storing the type for mock vs real encryptor, so this persistence costs us nothing extra. This trick is very helpful because we can just restore the circuit on the packet on re-forwrad through the switch and we'll have full information about the type of forward this was available.

Previous Approach There are a few subtleties about this error handling that are important: * There are *a lot* of different paths that errors can take, so we have to account for the switch required in route blinding in a few places * With the exception of (1) where we've just handled the onion, we only handle error encryption (when we're the source of the error) _in the switch_, so we can't just handle errors on the incoming link because we don't have our encrypted available (and surfacing it is a bad layering violation) * Malformed errors are handled on the outgoing link, and converted into regular failures

With all this in mind, the following approach is taken:

  • For (1/2), simply switch out the error on the spot since we have what we need (and it's a unique path)
  • For (3/4) convert all blinded errors to UpdateFailHTLC on the outgoing link, errors will be appropriately encrypted in the switch as required
  • When the error reaches the original incoming link, switch it out for a UpdateMalformedHTLC if we are a relaying node, leave as-is otherwise (which will appropriately transmit both introduction node errors and regular unblinded errors)

This does lead to an admittedly odd flow for relaying nodes that get an error from their peer: receive malformed -> convert to failure -> convert back to malformed, but we don't transmit malformed errors across the switch (and I don't think that we should). And while we're sending this message to the peer, we handle everything else exactly like a failure so I think it's worth leaving flow as as .

Copy link
Contributor

coderabbitai bot commented Feb 17, 2024

Important

Auto Review Skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This update introduces significant enhancements to the Lightning Network, focusing on improving privacy and functionality through the support of blinded payments. By simplifying configuration and adding new features for route blinding, the changes aim to enhance anonymity and usability. The modifications span across configuration, documentation, feature management, error handling, and testing, reflecting a comprehensive effort to support more secure and private transactions.

Changes

Files Change Summary
config.go Removed declaration for ProtocolOptions in DefaultConfig.
docs/release-notes/release-notes-0.18.0.md Enhanced support for forwarding blinded payments.
feature/... (default_sets.go, deps.go, manager.go) Added support for lnwire.RouteBlindingOptional and dependencies.
htlcswitch/... (circuit.go, hop/..., link.go, mock.go, switch.go) Improved error handling and functionality for blinded routes.
itest/... (list_on_test.go, lnd_route_blinding_test.go) Added tests for blinded routes and error handling scenarios.
lncfg/... (protocol.go, protocol_integration.go) Removed redundant functions for route blinding configuration.
lnrpc/... (lightning.proto, .swagger.json, peersrpc/peers.swagger.json, routerrpc/router.swagger.json) Added support for ROUTE_BLINDING_REQUIRED and OPTIONAL.
lnwire/features.go, server.go Added feature bits for blinded payments and updated server configuration.

Possibly related issues

  • [feature]: Support Payment to Multiple Blinded Paths #8363: The enhancements made in this PR could potentially address the objectives of extending LND to support payment to multiple blinded paths, as it introduces foundational support for blinded payments and improved error handling in blinded routes. The focus on privacy and usability aligns with the issue's goals.

🐰 A Poem by CodeRabbit

In a network, lightning fast,
Privacy's shadow, overcast.
Blinded routes, now take the stage,
In the quest for privacy, a new age.
🌟 Through code, we weave a safer net,
Anonymity, a promise set.
🚀 To the future, we hop with glee,
In a world more private, we aim to be.


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@carlaKC
Copy link
Collaborator Author

carlaKC commented Feb 17, 2024

Opening this as a draft because I still want to test this against other implementations, but basic functionality is in place.

@saubyk saubyk added this to the v0.18.0 milestone Feb 18, 2024
@carlaKC carlaKC force-pushed the 7298-3-forwardblindedroutes branch 3 times, most recently from b7b9bc3 to 6b7c6d2 Compare February 21, 2024 01:01
@carlaKC carlaKC force-pushed the 7298-3-forwardblindedroutes branch 3 times, most recently from 9d21a0c to f430ed4 Compare March 7, 2024 20:07
@carlaKC
Copy link
Collaborator Author

carlaKC commented Mar 7, 2024

Hold off on review here - ran into an impl issue while testing :(

Update: ready for review!

lnwire/blinding_point.go Outdated Show resolved Hide resolved
@carlaKC carlaKC force-pushed the 7298-3-forwardblindedroutes branch from 34cb0f6 to fe61ef2 Compare March 8, 2024 22:47
@carlaKC carlaKC marked this pull request as ready for review March 9, 2024 22:49
@carlaKC carlaKC force-pushed the 7298-3-forwardblindedroutes branch 3 times, most recently from f8a2c5b to cc09f5f Compare March 11, 2024 15:32
@carlaKC carlaKC force-pushed the 7298-3-forwardblindedroutes branch 2 times, most recently from d8a1835 to bb5380f Compare March 15, 2024 21:09
Copy link
Collaborator

@ellemouton ellemouton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow I learned a lot here 🤯 Thanks so much for the detailed PR description - it gave me a great blueprint of the relevant switch & link code.

Most (if not all?) of my comments are questions mostly for my understanding.
This all looks really good!!

htlcswitch/hop/iterator.go Outdated Show resolved Hide resolved
htlcswitch/link.go Outdated Show resolved Hide resolved
htlcswitch/link.go Outdated Show resolved Hide resolved
@@ -1456,10 +1456,31 @@ func (s *Switch) checkCircularForward(incoming, outgoing lnwire.ShortChannelID,
// The ciphertext will be derived from the failure message proivded by context.
// This method returns the failErr if all other steps complete successfully.
func (s *Switch) failAddPacket(packet *htlcPacket, failure *LinkError) error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still making my way through the diff so this might be answered in a later commit but: what about local switch failures for where we are the intro node? - could this be done in a catch all here by having a isBlinding field in htlcPacket?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cause at this point we have already parsed the onion so would know if a blinding point is embedded there

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment is wrong here (sorry!) - by the time we get here we're looking at our outgoing HTLC so the blinding point will always be set if we're part of a blinded route

// blindedRouteFailure looks up the originally added htlc, and provides a
// switched out failure message if it was part of a blinded route.
func (l *channelLink) blindedRouteFailure(htlcIdx uint64) (
lnwire.FailureMessage, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we do "lnwire.FailureMessage, bool, error" just to stick to the "if a bool is returned, check that before referencing" pattern? just to avoid accidental nil pointer dereferences in future.

@@ -2036,6 +2036,20 @@ func (l *channelLink) handleUpstreamMsg(msg lnwire.Message) {
}
}

// Lookup the original htlc, if it was part of a blinded route
// switch out our error.
blindingFailure, err := l.blindedRouteFailure(msg.ID)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so just checking my understanding: if our peer is behaving correctly, the case lnwire.CodeInvalidBlinding: case above will always be overwritten here right? cause they only send Malformed back if we are also part of the path

Copy link
Collaborator Author

@carlaKC carlaKC Mar 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if our peer is behaving correctly, the case lnwire.CodeInvalidBlinding: case above will always be overwritten here

Yeah. Didn't want to tie the blinded conversion to getting the Malformed (in case of buggy/bad peers) so I think this is the simplest way of handling it?

📓 Note to self to add a comment here.

@@ -2083,9 +2097,36 @@ func (l *channelLink) handleUpstreamMsg(msg lnwire.Message) {
}
}

// Lookup the original htlc, if it was part of a blinded route
// switch out our error.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another understanding check: if we receive an UpdateFailHTLC from our peer when we are also in the blinded route, then they are breaking the rules right? Cause they should only send us UpdateFailMalformed... right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah 👿

htlcswitch/link.go Show resolved Hide resolved
@carlaKC carlaKC force-pushed the 7298-3-forwardblindedroutes branch 2 times, most recently from a29ea3c to 4cde71c Compare March 19, 2024 15:30
Copy link
Collaborator

@bitromortac bitromortac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow I learned a lot here 🤯 Thanks so much for the detailed PR description - it gave me a great blueprint of the relevant switch & link code.

Can second that, great PR 🙏, especially the itests! Will need to look closer into spec compliance and the general flow.

htlcswitch/hop/payload.go Outdated Show resolved Hide resolved
htlcswitch/link.go Outdated Show resolved Hide resolved
itest/list_on_test.go Show resolved Hide resolved
itest/lnd_route_blinding_test.go Show resolved Hide resolved
@carlaKC carlaKC force-pushed the 7298-3-forwardblindedroutes branch from 4cde71c to 40e969c Compare March 20, 2024 15:04
carlaKC added 11 commits April 25, 2024 09:15
…ypes

This commit moves all our validation related to the presence of fields
into ValidateParsedPayloadTypes so that we can handle them in a single
place. We draw the distinction between:
- Validation of the payload (and the context within it's being parsed,
  final hop / blinded hop etc)
- Processing and validation of encrypted data, where we perform
  additional cryptographic operations and validate that the fields
  contained in the blob are valid.

This helps draw the line more clearly between the two validation types,
rather than splitting some payload-releated blinded hop processing
into the encrypted data processing part. The downside of this approach
(vs doing the blinded path payload check _after_ payload validation)
is that we have to pass additional context into payload validation
(ie, whether we got a blinding point in our UpdateAddHtlc - as we
already do for isFinalHop).
When handling blinded errors, we need to know whether there was a
blinding key in our payload when we successfully parsed our payload
but then found an invalid set of fields. The combination of
parsing and validation in NewPayloadFromReader means that we don't know
whether a blinding point was available to us by the time the error is
returned.

This commit splits parsing and validation into two functions so that
we can take a look at what we actually pulled of the payload in between
parsing and TLV validation.
We need to know what role we're playing to be able to handle errors
correctly, but the information that we need for this is held by our
iterator:
- Whether we had a blinding point in update add (blinding kit)
- Whether we had a blinding point in payload

As we're now going to use the route role return value even when our
err!=nil, we rename the error to signal that we're using less
canonical golang here.

An alternative to this approach is to attach a RouteRole to our
ErrInvalidPayload. The downside of that approach is:
- Propagate context through parsing (whether we had updateAddHtlc)
- Clumsy handling for errors that are not of type ErrInvalidPayload
Introduce two wrapper types for our existing SphinxErrorEncrypter
that are used to represent error encrypters where we're a part of a
blinded route. These encrypters are functionally the same as a sphinx
encrypter, and are just used as "markers" so that we know that we
need to handle our error differently due to our different role.

We need to persist this information to account for restart cases where
we've resovled the outgoing HTLC, then restart and need to handle the
error for the incoming link. Specifically, this is relevant for:
- On chain resolution messages received after restart
- Forwarding packages that are re-forwarded after restart

This is also generally helpful, because we can store this information
in one place (the circuit) rather than trying to reconstruct it in
various places when forwarding the failure back over the switch.
Create our error encrypter with a wrapped type if we have a blinding
point present. Doing this in the iterator allows us to track this
information when we have both pieces of information available to us,
compared to trying to handle this later down the line:
- Downstream link on failure: we know that we've set a blinding point
  for out outgoing HTLC, but not whether we're introduction or not
- Upstream link on failure: once the failure packet has been sent
  through the switch, we no longer know whether we were the introduction
  point (without looking it up / examining our payload again /
  propagating this information through the switch).
Set obfuscator for use in blinded error handling when we forward
failures through the switch.
@carlaKC carlaKC force-pushed the 7298-3-forwardblindedroutes branch from 777b36c to c64c012 Compare April 25, 2024 14:14
@carlaKC carlaKC requested a review from bitromortac April 25, 2024 14:14
Copy link
Collaborator

@bitromortac bitromortac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🎉 awesome work!

There are some itest failures, all failing at SendToRouteV2 in the on chain to blinded test. It seems to be a too long test for the DefaultTimeout together with the call. A way to circumvent would be to call the SendToRouteV2 on the Router with a larger timeout.

@carlaKC carlaKC force-pushed the 7298-3-forwardblindedroutes branch from c64c012 to cdc18ce Compare April 25, 2024 18:02
@carlaKC
Copy link
Collaborator Author

carlaKC commented Apr 25, 2024

It seems to be a too long test for the DefaultTimeout together with the call.

Bumped here, hopefully a minute is long enough for the GH potato - will see how this run goes and increase further if anything flakes!

@Roasbeef
Copy link
Member

Fixed a race that popped up in the uint tests here (unrelated) here: #8694

@Roasbeef
Copy link
Member

Roasbeef commented Apr 25, 2024

With that aside, there is a failure of a new itest added in this PR (neutrino backend):

--- FAIL: TestLightningNetworkDaemon/tranche03/145-of-154/neutrino/on_chain_to_blinded (40.61s)

@carlaKC carlaKC force-pushed the 7298-3-forwardblindedroutes branch from cdc18ce to 696652b Compare April 26, 2024 13:23
@carlaKC
Copy link
Collaborator Author

carlaKC commented Apr 26, 2024

With that aside, there is a failure of a new itest added in this PR (neutrino backend):

--- FAIL: TestLightningNetworkDaemon/tranche03/145-of-154/neutrino/on_chain_to_blinded (40.61s)

Thought our timeout might be too slow, bumping up timeout to larger window so that there's time for CI to resolve on-chain.

For tests where our payments require an on-chain resolution,
provide longer timeout and return cancel functions.
@carlaKC carlaKC force-pushed the 7298-3-forwardblindedroutes branch from 696652b to e826d5e Compare April 26, 2024 14:57
carlaKC added 4 commits April 26, 2024 11:35
This itest adds a test that we still propagate blinded errors back
properly after a restart with an on-chain resolution. The test also
updates our sendpayment timeout to longer so that there's time to
resolve the on chain claim.
@carlaKC carlaKC force-pushed the 7298-3-forwardblindedroutes branch from e826d5e to 6572f5c Compare April 26, 2024 15:35
@Roasbeef
Copy link
Member

Made #8696 to address that sqlite race.

@Roasbeef Roasbeef merged commit add2691 into lightningnetwork:master Apr 26, 2024
24 of 27 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

5 participants