Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zeroconf: Less pedantic interpretation of lightning/bolts#910 #5275

Merged
merged 36 commits into from
Jul 4, 2022

Conversation

cdecker
Copy link
Member

@cdecker cdecker commented May 19, 2022

Just yolo channel_ready messages whenever we feel like it :-)

Closes #5237

@cdecker cdecker self-assigned this May 19, 2022
Copy link
Contributor

@vincenzopalazzo vincenzopalazzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 6d3bf09

@cdecker
Copy link
Member Author

cdecker commented May 19, 2022

ACK 6d3bf09

Why are you asking what is clearly an incomplete draft of the functionality? 😅

@vincenzopalazzo
Copy link
Contributor

Why are you asking what is clearly an incomplete draft of the functionality? sweat_smile

Was thinking that you were divided the json code in a different PR to divide the work that you are doing in #5237 I got confused by the commit message :) I ignored the PR title! My bad!

@cdecker
Copy link
Member Author

cdecker commented May 20, 2022

Nope, I always publish PRs as drafts in order to get early feedback and CI runs ensuring that my PR is not requiring refactoring once I have all the features in. I'll mark this as ready to review once it's... ready to review :-)

@cdecker cdecker force-pushed the zeroconf-v2 branch 6 times, most recently from 0f0a2aa to 5104906 Compare May 25, 2022 09:49
@cdecker cdecker force-pushed the zeroconf-v2 branch 2 times, most recently from e79365a to a5484a0 Compare May 25, 2022 15:13
@cdecker cdecker force-pushed the zeroconf-v2 branch 2 times, most recently from 2c34249 to 2749121 Compare June 7, 2022 17:34
@cdecker cdecker marked this pull request as ready for review June 9, 2022 15:08
@cdecker
Copy link
Member Author

cdecker commented Jun 9, 2022

This is ready for review, and has been interop checked last week at the LNdev meetup 👍

@t-bast
Copy link

t-bast commented Jun 10, 2022

Great, I've been looking at the various implementations over the past few days, and it seems like the interop test was quite light. I'd like to do more extensive interop testing, is this branch the right way to go for cln? Should I just ignore #5237?

I've tried to test this against eclair and compiled this branch with experimental features activated. My cln node correctly advertises zero_conf and scid_alias. However, when I try to open a channel cln rejects it saying it doesn't support the channel_type. I've tried the following combinations, all were rejected:

  • anchor_outputs + zero_conf (public)
  • anchor_outputs + scid_alias (private)
  • anchor_outputs + scid_alias + zero_conf (private)

Is there something I need to do to make cln accept them?

I'd also like to test the case where cln is funder: how do I specify a channel_type with zero_conf and/or scid_alias in fundchannel?

@cdecker
Copy link
Member Author

cdecker commented Jun 21, 2022

Great, I've been looking at the various implementations over the past few days, and it seems like the interop test was quite light. I'd like to do more extensive interop testing, is this branch the right way to go for cln? Should I just ignore #5237?

I've tried to test this against eclair and compiled this branch with experimental features activated. My cln node correctly advertises zero_conf and scid_alias. However, when I try to open a channel cln rejects it saying it doesn't support the channel_type. I've tried the following combinations, all were rejected:

  • anchor_outputs + zero_conf (public)
  • anchor_outputs + scid_alias (private)
  • anchor_outputs + scid_alias + zero_conf (private)

Is there something I need to do to make cln accept them?

I'd also like to test the case where cln is funder: how do I specify a channel_type with zero_conf and/or scid_alias in fundchannel?

Following a discussion with @bluematt, I opted to not add the zeroconf at all to the channel_type since from his explanation it was pretty much just there for lnd's use. Signalling support for it in init is sufficient, and the channel_type should therefore not include neither the option_scid_alias nor option_zeroconf from my current reading of the spec.

Notice that #5237 did include negotiation in the channel_type, but was deemed too pedantic by other implementors (and less flexible by me, as it required static configuration to opt into zeroconf and scid_alias based on the connection open, rather than at channel negotiation).

We could just be permissive in what we accept in the open_channel message, but I'd prefer not to add a quirks mode in the first place, and rather be explicit in what we allow. In this case: init tells us that we support zeroconf, and then we just send the channel_ready (former funding_locked) whenever we feel like it, no need to double-opt-in again via the open_channel.

@t-bast
Copy link

t-bast commented Jun 21, 2022

Signalling support for it in init is sufficient, and the channel_type should therefore not include neither the option_scid_alias nor option_zeroconf from my current reading of the spec.

It's totally fine for us if you don't include the feature bit in the channel_type you send. If we want to accept a 0-conf channel from you (most likely based on a node_id whitelist that the node operator manages), we'll send channel_ready early. However, you shouldn't reject a channel_type that sets it (if you support the feature), since the spec says:

Each basic type has the following variations allowed:
  - `option_scid_alias` (bit 46)
  - `option_zeroconf` (bit 50)

Even though you're allowed to not actually use 0-conf and wait for confirmations before you send your channel_ready, regardless of what's inside the channel_type.

@cdecker
Copy link
Member Author

cdecker commented Jun 22, 2022

But what's the point of including the options in the channel type, they have no discernible effect and if they had they'd be broken since as we discussed on the spec PR, there is no way for the accepter to have a say in the negotiation.

@cdecker
Copy link
Member Author

cdecker commented Jun 22, 2022

That being said I'm happy to implement checks for variations for the sake of compatibility 👍

@t-bast
Copy link

t-bast commented Jun 22, 2022

IIRC it's mostly lnd folks who wanted to have these options added to the channel_type. We added it to eclair since the spec is currently leaning that way, but I really don't mind if we remove them, our code was cleaner without them and I agree with you that:

  • we'll always advertise scid_alias and want to use it for private channels whenever our peer also advertises it (no need for a channel_type for that)
  • we'll accept 0-conf channels based on a whitelist that each node operator is free to configure however they want, so we don't really need it in the channel_type either

But if we all agree this is the right way forward, we should amend the spec to remove these options from channel_type entirely. @Roasbeef @Crypt-iQ WDYT?

@Roasbeef
Copy link

Roasbeef commented Jun 22, 2022

IMO removing it just makes code less uniform: everything we have today depends on the channel type to know what type of channel to open/w. I know we've disagreed on this in the past (I'm a feature bit and channel type maximalist), but imo, having the channel type means things can fail fast: you try to open a zero conf channel to me, it has the channel type, I reject it. Otherwise, you open the channel to me, pay chain fees, then realize that I didn't actually want to have it be zero conf and you're stuck waiting for a confirmation. In the happy case, both sides echo back the same channel type and have their expectations aligned.

Re the white list: we do something similar, there's a dynamic predicate/hook over a streaming RPC call that users can use to arbitrarily accept/deny channels or use diff policies/params with a given peer. If this isn't registered, then it'll reject all zero conf channels.

Re scid_alias, we support upgrading channel: so a user can have an unadvertised channel, then run w/ the new flag and we'll send the alias over.

One reason we interpret zero conf vs zero conf+aliases differently (and also zero conf but public), is that these channels have additional implications w.r.t the gossip layer: the node needs to keep track of both the fake alias and the real one, to ensure that one doesn't unnecessarily leak out (I send a failure back but it has my real scid alias...oops it leaked now). There're also some other edge cases we ran into related to validating a new policy announcement from our channel peer, which may or may not use the real scid.

cdecker added 28 commits July 4, 2022 12:48
`alias_local` is generated locally and sent to the peer so it knows
what we're calling the channel, while `alias_remote` is received by
the peer so we know what to include in routehints when generating
invoices.
This is used in order to ensure zeroconf doesn't just wait for the
first confirmation despite mindepth being set to 0.
This is for the local channel announcement that'll not leave this
host, as it doesn't have signatures.
We locally generate an update with our local alias, and get one from
the peer with the remote alias, so we need to add them both. We do so
only if using the alias in the first place though.
This is needed for us to transition to CHANNELD_NORMAL for zeroconf
channels, i.e., channels where we don't have a short channel ID yet.

We'll have to call lockin_complete a second time, once we learn the
real scid.
We don't trigger on depth=0 since that'd give us bogus blockheights
and pointers into the chain, instead we defer until we get a first
confirmation. This extracts some of the logic from `lockin_complete`,
into the depth change listener (not the remote funding locked since at
that point we're certainly locked in and we don't really care about
that for bookkeeping anyway).
The direction only depends on the ordering between node_ids, not the
short_channel_id, so we can include it and it won't change. This was
causing some trouble loading the `channel_hints` in the `pay` plugin.
We'll use one of the two, and we reuse the `scid` field, since we
don't really care that much which one it is.
We use this in a couple of places, when we want to refer to a channel
by its `short_channel_id`, I'm moving this into a separate function
primarily to have a way to mark places where we do that.
Not only can the outgoing edge be a zeroconf channel, it can also be
the incoming channel. So we revert to the usual trick of using the
local alias if the short_channel_id isn't known yet.

We use the LOCAL alias instead of the REMOTE alias even though the
sender likely used the REMOTE alias to refer to the channel. This is
because we control the LOCAL alias, and we keep it stable during the
lifetime of the channel, whereas the REMOTE one could change or not be
there yet.
We need to switch to the real short_channel_id at 6 confirmations, and
gossip messages should reflect that in order to be valid to others.
The spec explicitly asks for the first point, while we were using the
most recent one. This worked fine before zeroconf, but with zeroconf
it can happen.
Ran into this with a zeroconf channel, without confs, that was
disconnected.
We used to agree up on the `minimum_depth` with the peer, thus when
they told us that the funding locked we'd be sure we either have a
scid or we'd trigger the state transition when we do. However if we
had a scid, and we got a funding_locked we'd trust them not to have
sent it early. Now we explicitly track the depth in the channel while
waiting for the funding to confirm.

Changelog-Fixed: channeld: Enforce our own `minimum_depth` beyond just confirming
With zeroconf we have to duplicate the `local_channel_announcement`
since we locally announce the aliased version, and then on the first
confirmation we also add the funding scid version.
I had to parse quite a few of these files debugging zeroconf, so I
thought it might be nice to have direct access here.

Changelog-Added: pyln-testing: Added utilities to read and parse `gossip_store` file for nodes.
Needed so we can blank optional bits when comparing channel_types.
If zeroconf was negotiated we'll add it to the basic channel
type. Similarly we'll accept it if it was negotiated too.
If we have no real short-channel-id this is the best we can do. Use
the local one since we can be sure we have assigned one.
@cdecker cdecker merged commit 669bca4 into ElementsProject:master Jul 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants