BOLT7: extend channel range queries with optional fields #557

Merged on Sep 16, 2019 (11 commits)

@sstone
Collaborator

sstone commented Jan 23, 2019

This is a new pull request that supersedes #519.

It addresses issues with the original proposal, mainly that it defined a new set of messages, adding complexity to a simple gossip protocol that we knew was limited in the first place.

This proposal does not add new messages or feature bits, and is fully compatible with existing implementations. Instead of defining new messages, it extends existing ones with additional data that will be ignored by nodes which do not implement extended queries (see BOLT #1).

Nodes that support extended queries will append an additional extended query flag to their query_channel_range queries. If the receiver supports extended queries and understands this flag, it will append the requested additional data to its reply_channel_range message.

There is currently only one type of additional data: one timestamp and one checksum per channel_update.
The checksum is a simple Adler32 checksum computed over the channel_update with timestamp and signature omitted.
Together they can be used to avoid querying channel_updates that are older than the ones you already have, or that are newer but don't include new information.

Nodes can then append additional data to their query_short_channel_ids messages: one flag per short channel id, specifying what they would like to receive (channel_announcement, and/or one or both channel_updates).
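
The decision rule described above can be sketched as follows. This is a minimal Python illustration, not code from any implementation: the `UpdateInfo` type and the `want_update` and `compute_flags` helpers are hypothetical, the flag bit values are assumptions, and treating a zero timestamp as "no update for this direction" is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed flag bits (hypothetical names; the values are an assumption).
INCLUDE_CHANNEL_ANNOUNCEMENT = 1
INCLUDE_CHANNEL_UPDATE_1 = 2
INCLUDE_CHANNEL_UPDATE_2 = 4


@dataclass(frozen=True)
class UpdateInfo:
    """Timestamp and checksum of one channel_update (one direction)."""
    timestamp: int
    checksum: int


def want_update(ours: Optional[UpdateInfo], theirs: UpdateInfo) -> bool:
    """Return True if the peer's channel_update looks worth downloading."""
    if theirs.timestamp == 0:
        # Assumption: zero means the peer has no update for this direction.
        return False
    if ours is None:
        # We have nothing for this direction, so anything is new.
        return True
    # Skip updates that are older, or newer but carry the same content.
    return theirs.timestamp > ours.timestamp and theirs.checksum != ours.checksum


def compute_flags(known_channel: bool,
                  ours_1: Optional[UpdateInfo], ours_2: Optional[UpdateInfo],
                  theirs_1: UpdateInfo, theirs_2: UpdateInfo) -> int:
    """Build the per-channel query flag from local and remote information."""
    flags = 0
    if not known_channel:
        flags |= INCLUDE_CHANNEL_ANNOUNCEMENT
    if want_update(ours_1, theirs_1):
        flags |= INCLUDE_CHANNEL_UPDATE_1
    if want_update(ours_2, theirs_2):
        flags |= INCLUDE_CHANNEL_UPDATE_2
    return flags
```

A flag of 0 means nothing needs to be queried for that channel, so its short channel id can be left out of the query altogether.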

@sstone
Collaborator Author

sstone commented Jan 24, 2019

For completeness, there is an even simpler solution: leave query_channel_range as is (no extra flag to signify that you want extended data), and if the receiver supports extended queries they always include extended data in their reply_channel_range.

I chose instead to explicitly signal support for extended queries in query_channel_range, which is more bandwidth efficient (replies won't include extended data unless you ask for it).

@pm47
Collaborator

pm47 commented Mar 11, 2019

@rustyrussell does this correspond to feature option_fec_gossip in #571?
edit: nevermind

@pm47
Collaborator

pm47 commented Mar 11, 2019

For those interested, this feature is now implemented on endurance: 03933884aaf1d6b108397e5efe5c86bcf2d8ca8d2f700eda99db9214fc2712b134@34.250.234.192:9735

@sstone
Collaborator Author

sstone commented Mar 12, 2019

There is no need for a specific feature bit for this: additional "extended" data appended to query messages will simply be ignored by nodes which do not implement extended queries (see BOLT 1). It's one of the reasons why this PR is much better than my first proposal: extended queries are completely optional and compatible with current implementations and deployed nodes.

option_fec_gossip refers to new messages to efficiently sync routing tables based on set reconciliation techniques (IBLT, minisketch, ...) that have not been defined yet.

@pm47
Collaborator

pm47 commented Mar 19, 2019

Here are a few test vectors, hope that helps:

{
  "msg" : {
    "type" : "QueryChannelRange",
    "chainHash" : "0f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206",
    "firstBlockNum" : 100000,
    "numberOfBlocks" : 1500
  },
  "hex" : "01070f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206000186a0000005dc"
}
{
  "msg" : {
    "type" : "QueryChannelRange",
    "chainHash" : "0f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206",
    "firstBlockNum" : 35000,
    "numberOfBlocks" : 100,
    "extendedQueryFlags_opt" : "TIMESTAMPS_AND_CHECKSUMS"
  },
  "hex" : "01070f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206000088b80000006401"
}
{
  "msg" : {
    "type" : "ReplyChannelRange",
    "chainHash" : "0f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206",
    "firstBlockNum" : 756230,
    "numberOfBlocks" : 1500,
    "complete" : 1,
    "shortChannelIds" : {
      "encoding" : "UNCOMPRESSED",
      "array" : [ "0x0x142", "0x0x15465", "0x69x42692" ]
    },
    "optionExtendedQueryFlags_opt" : "TIMESTAMPS_AND_CHECKSUMS"
  },
  "hex" : "01080f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206000b8a06000005dc01001900000000000000008e0000000000003c69000000000045a6c401"
}
{
  "msg" : {
    "type" : "ReplyChannelRange",
    "chainHash" : "0f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206",
    "firstBlockNum" : 1600,
    "numberOfBlocks" : 110,
    "complete" : 1,
    "shortChannelIds" : {
      "encoding" : "COMPRESSED_ZLIB",
      "array" : [ "0x0x142", "0x0x15465", "0x4x3318" ]
    },
    "optionExtendedQueryFlags_opt" : "TIMESTAMPS_AND_CHECKSUMS"
  },
  "hex" : "01080f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206000006400000006e01001601789c636000833e08659309a65878be010010a9023a01"
}
{
  "msg" : {
    "type" : "ReplyChannelRange",
    "chainHash" : "0f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206",
    "firstBlockNum" : 122334,
    "numberOfBlocks" : 1500,
    "complete" : 1,
    "shortChannelIds" : {
      "encoding" : "UNCOMPRESSED",
      "array" : [ "0x0x12355", "0x7x30934", "0x70x57793" ]
    },
    "optionExtendedQueryFlags_opt" : "TIMESTAMPS_AND_CHECKSUMS",
    "extendedInfo_opt" : {
      "array" : [ {
        "timestamp1" : 164545,
        "checksum1" : 1111,
        "timestamp2" : 948165,
        "checksum2" : 2222
      }, {
        "timestamp1" : 489645,
        "checksum1" : 3333,
        "timestamp2" : 4786864,
        "checksum2" : 4444
      }, {
        "timestamp1" : 46456,
        "checksum1" : 5555,
        "timestamp2" : 9788415,
        "checksum2" : 6666
      } ]
    }
  },
  "hex" : "01080f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e22060001ddde000005dc01001900000000000000304300000000000778d6000000000046e1c1010030000282c100000457000e77c5000008ae000778ad00000d0500490ab00000115c0000b578000015b300955bff00001a0a"
}
{
  "msg" : {
    "type" : "ReplyChannelRange",
    "chainHash" : "0f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206",
    "firstBlockNum" : 500,
    "numberOfBlocks" : 100,
    "complete" : 1,
    "shortChannelIds" : {
      "encoding" : "COMPRESSED_ZLIB",
      "array" : [ "0x18x54897", "0x74x47820", "0x69x42692" ]
    },
    "optionExtendedQueryFlags_opt" : "TIMESTAMPS_AND_CHECKSUMS",
    "extendedInfo_opt" : {
      "array" : [ {
        "timestamp1" : 164545,
        "checksum1" : 1111,
        "timestamp2" : 948165,
        "checksum2" : 2222
      }, {
        "timestamp1" : 489645,
        "checksum1" : 3333,
        "timestamp2" : 4786864,
        "checksum2" : 4444
      }, {
        "timestamp1" : 46456,
        "checksum1" : 5555,
        "timestamp2" : 9788415,
        "checksum2" : 6666
      } ]
    }
  },
  "hex" : "01080f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206000001f40000006401001a01789c63600002a16b85208ac16bd71930edbaec08002c7804d9010030000282c100000457000e77c5000008ae000778ad00000d0500490ab00000115c0000b578000015b300955bff00001a0a"
}
{
  "msg" : {
    "type" : "QueryShortChannelIds",
    "chainHash" : "0f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206",
    "shortChannelIds" : {
      "encoding" : "UNCOMPRESSED",
      "array" : [ "0x0x142", "0x0x15465", "0x69x42692" ]
    }
  },
  "hex" : "01050f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206001900000000000000008e0000000000003c69000000000045a6c4"
}
{
  "msg" : {
    "type" : "QueryShortChannelIds",
    "chainHash" : "0f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206",
    "shortChannelIds" : {
      "encoding" : "COMPRESSED_ZLIB",
      "array" : [ "0x0x4564", "0x2x47550", "0x69x42692" ]
    }
  },
  "hex" : "01050f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206001801789c63600001c12b608a69e73e30edbaec0800203b040e"
}
{
  "msg" : {
    "type" : "QueryShortChannelIds",
    "chainHash" : "0f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206",
    "shortChannelIds" : {
      "encoding" : "UNCOMPRESSED",
      "array" : [ "0x0x12232", "0x0x15556", "0x69x42692" ]
    },
    "queryFlags_opt" : {
      "encoding" : "COMPRESSED_ZLIB",
      "array" : [ 1, 2, 4 ]
    }
  },
  "hex" : "01050f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e22060019000000000000002fc80000000000003cc4000000000045a6c4000c01789c6364620100000e0008"
}
{
  "msg" : {
    "type" : "QueryShortChannelIds",
    "chainHash" : "0f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206",
    "shortChannelIds" : {
      "encoding" : "COMPRESSED_ZLIB",
      "array" : [ "0x0x14200", "0x0x46645", "0x69x42692" ]
    },
    "queryFlags_opt" : {
      "encoding" : "COMPRESSED_ZLIB",
      "array" : [ 1, 2, 4 ]
    }
  },
  "hex" : "01050f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206001801789c63600001f30a30c5b0cd144cb92e3b020017c6034a000c01789c6364620100000e0008"
}
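
To illustrate the compatibility argument made earlier in the thread (trailing data is ignored by nodes that do not understand it), here is a minimal Python sketch that decodes the two query_channel_range vectors above. The function name and the choice to return the trailing bytes are hypothetical; the field layout is simply read off the hex strings.

```python
import struct

QUERY_CHANNEL_RANGE = 0x0107  # message type: first two bytes of the vectors
FIXED_FORMAT = ">H32sII"      # type, chain_hash, first_blocknum, number_of_blocks


def decode_query_channel_range(raw: bytes):
    """Decode the fixed fields and hand back any trailing bytes untouched."""
    msg_type, chain_hash, first_blocknum, number_of_blocks = struct.unpack_from(
        FIXED_FORMAT, raw)
    if msg_type != QUERY_CHANNEL_RANGE:
        raise ValueError(f"unexpected message type {msg_type:#06x}")
    # A node without extended queries would simply drop these bytes.
    trailing = raw[struct.calcsize(FIXED_FORMAT):]
    return chain_hash.hex(), first_blocknum, number_of_blocks, trailing


CHAIN_HASH = "0f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206"
plain = bytes.fromhex("0107" + CHAIN_HASH + "000186a0" + "000005dc")
extended = bytes.fromhex("0107" + CHAIN_HASH + "000088b8" + "00000064" + "01")

print(decode_query_channel_range(plain))
print(decode_query_channel_range(extended))
```

The first vector decodes to block 100000 and 1500 blocks with no trailing bytes. The second decodes to block 35000 and 100 blocks, with the single trailing byte 0x01 carrying the extended query flag.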

@pm47
Collaborator

pm47 commented Mar 20, 2019 via email

@rustyrussell
Collaborator

OK, so I've implemented this. I've kept the existing logic to send node_announcements for any channel_announcement we send (uniqified). Am testing the vectors now...

@rustyrussell
Collaborator

OK, I've got a fixup which I applied to your INV gossip PR, which should have gone here instead. Will push. Also, rebased onto v1.0, otherwise I get unrelated regressions...

@rustyrussell rustyrussell force-pushed the bolt7-extended-channel-queries branch from 1e9483f to 8eeb1c5 Compare April 1, 2019 03:32
@rustyrussell
Collaborator

... and another proposed cleanup (no semantic changes, just integrating the requirements better, I think).

@rustyrussell rustyrussell added the Meeting Discussion Raise at next meeting label Apr 1, 2019
@sstone
Collaborator Author

sstone commented Apr 4, 2019

This PR has been discussed in our dev meetings a few times and there are still open points; I'll try and summarise them here. They were raised mostly by @Roasbeef and @cfromknecht. If I've missed anything, please add a question/comment which I'll try and address here.

Why do we need this? We could use heuristics, be more optimistic on the sender's side and work with a "good enough" routing table without trying to get all updates asap. Better heuristics could also be used to broadcast enable/disable updates and reduce "flapping" gossip.

It's very true. But this PR really is a fix for current channel queries, which are very good for learning about channels you don't have but almost unusable for updating the ones you already have. Basically they're just a way of reconciling your list of channel ids with your peer's. Then what? You've learned nothing about what has changed. If you do nothing, you will get updates as your payments fail with an "update" error, but it's bad UX, especially for mobile nodes when users turn them on, make a single or just a few payments, then turn them off again. And gossip filters (which say “send me everything that is more recent than timestamp X”) don’t really help when you’re offline most of the time since you don’t know what you’re missing to begin with.

I think adding timestamps, and a flag to query specific items and not everything, should not be controversial, especially since now they're completely optional. Do we all agree on that?

Why do we also need a checksum? We should use better heuristics to fix the flapping channels issue.

Because then you have the option of ignoring updates that don't change routing policies (or simply flagging them as low priority). Again, better heuristics for broadcasting these updates, as well as stricter peer selection/banning heuristics, would help a lot, but having the option to know that an update does not really change anything *before* you've downloaded it is very useful (and this proposal is very cheap from a bandwidth/cpu point of view).

Should we use TLV for the new optional fields?

I'm now having second thoughts about this one. The current proposal is simple and consistent, and I think that using TLV would just slow things down and would not bring us anything. But I'll give it a try.

@sstone sstone force-pushed the bolt7-extended-channel-queries branch from 9a747a3 to f322ffc Compare April 26, 2019 17:17
@sstone
Collaborator Author

sstone commented Apr 26, 2019

I've updated this PR:

  • optional fields now use TLV format
  • timestamps and checksums are now independent

These changes borrow heavily from Rusty's tlv branch (mistakes are mine of course :))

Here are the updated test vectors:

{
  "msg" : {
    "type" : "QueryChannelRange",
    "chainHash" : "0f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206",
    "firstBlockNum" : 100000,
    "numberOfBlocks" : 1500,
    "extensions" : [ ]
  },
  "hex" : "01070f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206000186a0000005dc"
}
{
  "msg" : {
    "type" : "QueryChannelRange",
    "chainHash" : "0f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206",
    "firstBlockNum" : 35000,
    "numberOfBlocks" : 100,
    "extensions" : [ "WANT_TIMESTAMPS | WANT_CHECKSUMS" ]
  },
  "hex" : "01070f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206000088b800000064010103"
}
{
  "msg" : {
    "type" : "ReplyChannelRange",
    "chainHash" : "0f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206",
    "firstBlockNum" : 756230,
    "numberOfBlocks" : 1500,
    "complete" : 1,
    "shortChannelIds" : {
      "encoding" : "UNCOMPRESSED",
      "array" : [ "0x0x142", "0x0x15465", "0x69x42692" ]
    }
  },
  "hex" : "01080f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206000b8a06000005dc01001900000000000000008e0000000000003c69000000000045a6c4"
}
{
  "msg" : {
    "type" : "ReplyChannelRange",
    "chainHash" : "0f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206",
    "firstBlockNum" : 1600,
    "numberOfBlocks" : 110,
    "complete" : 1,
    "shortChannelIds" : {
      "encoding" : "COMPRESSED_ZLIB",
      "array" : [ "0x0x142", "0x0x15465", "0x4x3318" ]
    }
  },
  "hex" : "01080f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206000006400000006e01001601789c636000833e08659309a65878be010010a9023a"
}
{
  "msg" : {
    "type" : "ReplyChannelRange",
    "chainHash" : "0f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206",
    "firstBlockNum" : 122334,
    "numberOfBlocks" : 1500,
    "complete" : 1,
    "shortChannelIds" : {
      "encoding" : "UNCOMPRESSED",
      "array" : [ "0x0x12355", "0x7x30934", "0x70x57793" ]
    },
    "timestamps" : {
      "encoding" : "UNCOMPRESSED",
      "timestamps" : [ {
        "timestamp1" : 164545,
        "timestamp2" : 948165
      }, {
        "timestamp1" : 489645,
        "timestamp2" : 4786864
      }, {
        "timestamp1" : 46456,
        "timestamp2" : 9788415
      } ]
    },
    "checksums" : {
      "checksums" : [ {
        "checksum1" : 1111,
        "checksum2" : 2222
      }, {
        "checksum1" : 3333,
        "checksum2" : 4444
      }, {
        "checksum1" : 5555,
        "checksum2" : 6666
      } ]
    }
  },
  "hex" : "01080f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e22060001ddde000005dc01001900000000000000304300000000000778d6000000000046e1c1011900000282c1000e77c5000778ad00490ab00000b57800955bff031800000457000008ae00000d050000115c000015b300001a0a"
}
{
  "msg" : {
    "type" : "ReplyChannelRange",
    "chainHash" : "0f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206",
    "firstBlockNum" : 122334,
    "numberOfBlocks" : 1500,
    "complete" : 1,
    "shortChannelIds" : {
      "encoding" : "COMPRESSED_ZLIB",
      "array" : [ "0x0x12355", "0x7x30934", "0x70x57793" ]
    },
    "timestamps" : {
      "encoding" : "COMPRESSED_ZLIB",
      "timestamps" : [ {
        "timestamp1" : 164545,
        "timestamp2" : 948165
      }, {
        "timestamp1" : 489645,
        "timestamp2" : 4786864
      }, {
        "timestamp1" : 46456,
        "timestamp2" : 9788415
      } ]
    },
    "checksums" : {
      "checksums" : [ {
        "checksum1" : 1111,
        "checksum2" : 2222
      }, {
        "checksum1" : 3333,
        "checksum2" : 4444
      }, {
        "checksum1" : 5555,
        "checksum2" : 6666
      } ]
    }
  },
  "hex" : "01080f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e22060001ddde000005dc01001801789c63600001036730c55e710d4cbb3d3c080017c303b1012201789c63606a3ac8c0577e9481bd622d8327d7060686ad150c53a3ff0300554707db031800000457000008ae00000d050000115c000015b300001a0a"
}
{
  "msg" : {
    "type" : "QueryShortChannelIds",
    "chainHash" : "0f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206",
    "shortChannelIds" : {
      "encoding" : "UNCOMPRESSED",
      "array" : [ "0x0x142", "0x0x15465", "0x69x42692" ]
    },
    "extensions" : [ ]
  },
  "hex" : "01050f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206001900000000000000008e0000000000003c69000000000045a6c4"
}
{
  "msg" : {
    "type" : "QueryShortChannelIds",
    "chainHash" : "0f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206",
    "shortChannelIds" : {
      "encoding" : "COMPRESSED_ZLIB",
      "array" : [ "0x0x4564", "0x2x47550", "0x69x42692" ]
    },
    "extensions" : [ ]
  },
  "hex" : "01050f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206001801789c63600001c12b608a69e73e30edbaec0800203b040e"
}
{
  "msg" : {
    "type" : "QueryShortChannelIds",
    "chainHash" : "0f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206",
    "shortChannelIds" : {
      "encoding" : "UNCOMPRESSED",
      "array" : [ "0x0x12232", "0x0x15556", "0x69x42692" ]
    },
    "extensions" : [ {
      "encoding" : "COMPRESSED_ZLIB",
      "array" : [ 1, 2, 4 ]
    } ]
  },
  "hex" : "01050f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e22060019000000000000002fc80000000000003cc4000000000045a6c4010c01789c6364620100000e0008"
}
{
  "msg" : {
    "type" : "QueryShortChannelIds",
    "chainHash" : "0f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206",
    "shortChannelIds" : {
      "encoding" : "COMPRESSED_ZLIB",
      "array" : [ "0x0x14200", "0x0x46645", "0x69x42692" ]
    },
    "extensions" : [ {
      "encoding" : "COMPRESSED_ZLIB",
      "array" : [ 1, 2, 4 ]
    } ]
  },
  "hex" : "01050f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206001801789c63600001f30a30c5b0cd144cb92e3b020017c6034a010c01789c6364620100000e0008"
}
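
As a reading aid for the TLV-based vectors above, here is a minimal Python sketch that decodes the uncompressed reply_channel_range vector carrying both timestamps and checksums. All function names are hypothetical. The TLV type numbers (1 for timestamps, 3 for checksums) and the header layout are read off the hex, and the type/length reader assumes the BigSize encoding from BOLT 1, which is indistinguishable from a single byte for the small values used here.

```python
import struct
import zlib

HEADER = ">H32sIIBH"  # type, chain_hash, first_blocknum, number_of_blocks, complete, len


def read_bigsize(buf: bytes, pos: int):
    """Read one BigSize integer (assumed encoding); return (value, new_pos)."""
    first = buf[pos]
    if first < 0xFD:
        return first, pos + 1
    width = {0xFD: 2, 0xFE: 4, 0xFF: 8}[first]
    end = pos + 1 + width
    return int.from_bytes(buf[pos + 1:end], "big"), end


def decode_array(encoded: bytes) -> bytes:
    """Strip the leading encoding byte: 0 = uncompressed, 1 = zlib."""
    encoding, payload = encoded[0], encoded[1:]
    if encoding == 0:
        return payload
    if encoding == 1:
        return zlib.decompress(payload)
    raise ValueError(f"unknown encoding {encoding}")


def format_scid(scid: bytes) -> str:
    """Render 8 bytes as <block>x<tx index>x<output index>."""
    block = int.from_bytes(scid[0:3], "big")
    tx_index = int.from_bytes(scid[3:6], "big")
    output_index = int.from_bytes(scid[6:8], "big")
    return f"{block}x{tx_index}x{output_index}"


def pairs(data: bytes):
    """Split a run of big-endian u32 values into (node 1, node 2) pairs."""
    values = struct.unpack(f">{len(data) // 4}I", data)
    return list(zip(values[0::2], values[1::2]))


def decode_reply_channel_range(raw: bytes) -> dict:
    _type, _chain, first, count, complete, ids_len = struct.unpack_from(HEADER, raw)
    pos = struct.calcsize(HEADER)
    ids = decode_array(raw[pos:pos + ids_len])
    pos += ids_len
    tlvs = {}
    while pos < len(raw):  # the TLV stream runs to the end of the message
        tlv_type, pos = read_bigsize(raw, pos)
        length, pos = read_bigsize(raw, pos)
        tlvs[tlv_type] = raw[pos:pos + length]
        pos += length
    return {
        "first_blocknum": first,
        "number_of_blocks": count,
        "complete": complete,
        "scids": [format_scid(ids[i:i + 8]) for i in range(0, len(ids), 8)],
        # Timestamps carry an encoding byte; checksums are a raw array here.
        "timestamps": pairs(decode_array(tlvs[1])) if 1 in tlvs else None,
        "checksums": pairs(tlvs[3]) if 3 in tlvs else None,
    }


VECTOR = bytes.fromhex(
    "0108"
    "0f9188f13cb7b2c71f2a335e3a4fc328bf5beb436012afca590b1a11466e2206"
    "0001ddde" "000005dc" "01"
    "0019" "00" "0000000000003043" "00000000000778d6" "000000000046e1c1"
    "0119" "00" "000282c1" "000e77c5" "000778ad" "00490ab0" "0000b578" "00955bff"
    "0318" "00000457" "000008ae" "00000d05" "0000115c" "000015b3" "00001a0a"
)

print(decode_reply_channel_range(VECTOR))
```

The decoded short channel ids, timestamp pairs and checksum pairs can be compared field by field with the JSON description of the same vector.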

Collaborator

@cfromknecht cfromknecht left a comment

great work @sstone! really like the direction this is taking and addition of distinct bits for timestamps and checksums. as a whole the proposal looks to be in a good place, i've left some minor comments in line. looking forward to implementing this :)

@t-bast
Collaborator

t-bast commented Jul 5, 2019

ACK 1978561

@sstone
Collaborator Author

sstone commented Jul 5, 2019

Since #607 has been approved (yes!), I've implemented a suggestion by @t-bast to use minimally-encoded varints instead of single bytes for query flags. Given the number of options atm, they will all still be encoded on a single byte, and test vectors remain valid. I think we're good to go now and hope this will be merged with the TLV PR.
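
A minimal Python sketch of why the test vectors remain valid: with a minimally-encoded variable-length integer, any value below 0xfd still occupies exactly one byte. The sketch assumes the big-endian BigSize encoding found in BOLT 1, which may differ from the varint format in use when this comment was written, and the function name is hypothetical.

```python
def encode_bigsize(value: int) -> bytes:
    """Minimal variable-length encoding (assumed: BOLT 1 BigSize, big-endian)."""
    if value < 0:
        raise ValueError("value must be non-negative")
    if value < 0xFD:
        return bytes([value])  # small values stay on a single byte
    if value <= 0xFFFF:
        return b"\xfd" + value.to_bytes(2, "big")
    if value <= 0xFFFFFFFF:
        return b"\xfe" + value.to_bytes(4, "big")
    if value <= 0xFFFFFFFFFFFFFFFF:
        return b"\xff" + value.to_bytes(8, "big")
    raise ValueError("value does not fit in 64 bits")


# The query flags used in the test vectors above are 1, 2 and 4.
encoded_flags = b"".join(encode_bigsize(flag) for flag in [1, 2, 4])
print(encoded_flags.hex())
```

For these three flags the output is the same three bytes a one-byte-per-flag encoding would produce, which is why switching formats does not change the vectors.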

@rustyrussell
Copy link
Collaborator

OK, so the meeting agreed to remove the compression flag and use a straight array for checksums:

http://www.erisian.com.au/meetbot/lightning-dev/2019/lightning-dev.2019-08-05-20.03.html

So I'm going to merge this for c-lightning (only with --enable-experimental-features) for next release: ElementsProject/lightning#2900

I'm not completely opposed to removing the EXPERIMENTAL_FEATURES conditional in the next few days (we tag -rc1 on the 10th of alternate months) if spec is sorted, feature flag is added, and Eclair confirms interoperation in practice. But that's a tight timeline!

sstone and others added 7 commits August 6, 2019 09:52
… (folded)

Nodes can append additional data to their `query_short_channel_ids`
messages, which consists of one flag per short channel id and
specifies what they would like to receive (`node_announcement`,
`channel_announcement`, and/or one or both `channel_update`s).
… (folded)

Nodes that support extended queries will append an additional extended query flag to
their `query_channel_range` queries. If the receiver supports extended queries and
understands this flag, it will append the required additional data to its
`reply_channel_range` message.

There is currently only one type of additional data: one timestamp and one checksum
per `channel_update`.
The checksum is a CRC32 checksum computed over the `channel_update`
with `timestamp` and `signature` omitted.

Along with query_short_channel_ids extension, this can be used to
avoid querying `channel_updates` that are older than the ones you
already have, or that are newer but don't include new information.
Formatting changes only.

This makes tools/extract-formats.py work (well, it misses some stuff
until the tlv-testcases merge, but then it's OK).

We use `tlvs` (for tlv stream), and we refer to TLV records as "being
included" rather than re-using the TLV name.

We even use subtypes for the pairs of checksums and timestamps.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Since some can be zero (missing updates), it's probably worth
doing the compression thing optionally.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This is the one in SSE4, FWIW, and the iSCSI RFC contains test
vectors.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
@sstone sstone force-pushed the bolt7-extended-channel-queries branch from b726493 to c11c35a Compare August 6, 2019 08:31
sstone added 3 commits August 6, 2019 11:28
We use the more tool-friendly `...*` description for TLV extensions.
Checksums are now serialized as raw arrays, as using zlib compression here would not help.
@sstone
Collaborator Author

sstone commented Aug 6, 2019

@rustyrussell I've changed the formatting and added a JSON test vector. Its format is just a bit different from the one you used, but I could get your test to pass with a few changes.

rustyrussell added a commit to rustyrussell/lightning that referenced this pull request Aug 10, 2019
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
rustyrussell added a commit to ElementsProject/lightning that referenced this pull request Aug 10, 2019
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
sstone added a commit to ACINQ/eclair that referenced this pull request Aug 22, 2019
We will use them on mainnet as soon as lightning/bolts#557 has been merged.
pm47 pushed a commit to ACINQ/eclair that referenced this pull request Aug 22, 2019
* Extended Queries: use TLV format for optional data

Optional query extensions now use TLV instead of a custom format.
Flags are encoded as varint instead of bytes as originally proposed. With the current proposal they will all fit on a single byte, but will be
much easier to extend this way.

* Move query message TLVs to their own namespace

We add one new class for each TLV type, with specific TLV types, and encapsulate codecs.

* Optional TLVs are represented as a list, not an optional list

TLVs that extend regular LN messages can be represented as a TlvStream and not an Option[TlvStream] since we don't need
to explicitly terminate the stream (either by prepending its length or using a specific terminator) as we do in Onion TLVs.

No TLVs simply means that the TLV stream is empty.

* Update to match BOLT PR

Checksums in ReplyChannelRange now have the same encoding as short channel ids and timestamps: one byte for
the encoding type (uncompressed or zlib) followed by encoded data.

* TLV Stream: Implement a generic "get" method for TLV fields

If we have a TLV stream of type MyTLV which is a subtype of TLV, and MyTLV1 and MyTLV2 are both
subtypes of MyTLV then we can use stream.get[MyTLV1] to get the TLV record of type MyTLV1 (if any)
in our TLV stream.

* Extended range queries: Implement latest BOLT changes

Checksums are just transmitted as a raw array, without optional compression as it would be useless here.

* Use extended range queries on regtest and testnet

We will use them on mainnet as soon as lightning/bolts#557 has been merged.

* Address review comments

* Router: rework handling of ReplyChannelRange

We remove the ugly and inefficient zipWithIndex we had before

* NodeParams: move fee base check to its proper place

* Router: minor cleanup
Do not reply with a node_announcement if the query includes an optional query flag that does not request it.
The current wording could be interpreted as "always follow with node announcements whenever
you reply with channel announcements", which defeats the point of using query flags (if you want the node
announcements just set the corresponding bits).
@rustyrussell
Collaborator

OK, I found a bug in my code while implementing the protocol tests; my checksums are wrong.

I assume the crc32 is not supposed to cover the 2 type bytes at the start of the channel_update? ie. our code (now!):

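	assert(tal_count(channel_update) > 2 + 64 + 32 + 8 + 4);
	/* Skip type (2) and signature (64); cover chain_hash (32) and short_channel_id (8). */
	sum = crc32c(0, channel_update + 2 + 64, 32 + 8);
	/* Skip timestamp (4); cover everything that follows it. */
	sum = crc32c(sum, channel_update + 2 + 64 + 32 + 8 + 4,
		     tal_count(channel_update) - (64 + 2 + 32 + 8 + 4));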

And for specific test cases:

  1. signature=76df7e70c63cc2b63ef1c062b99c6d934a80ef2fd4dae9e1d86d277f47674af3255a97fa52ade7f129263f591ed784996eba6383135896cc117a438c80293282 chain_hash=06226e46111a0b59caaf126043eb5bbf28c34f3a5e332a1fc7b2b73cf188910f short_channel_id=103x1x0 timestamp=1565587763 message_flags=0 channel_flags=0 cltv_expiry_delta=144 htlc_minimum_msat=0 fee_base_msat=1000 fee_proportional_millionths=10
    010276df7e70c63cc2b63ef1c062b99c6d934a80ef2fd4dae9e1d86d277f47674af3255a97fa52ade7f129263f591ed784996eba6383135896cc117a438c8029328206226e46111a0b59caaf126043eb5bbf28c34f3a5e332a1fc7b2b73cf188910f00006700000100005d50f933000000900000000000000000000003e80000000a
    crc32 = 0x1112fa30

  2. signature=06737e9e18d3e4d0ab4066ccaecdcc10e648c5f1c5413f1610747e0d463fa7fa39c1b02ea2fd694275ecfefe4fe9631f24afd182ab75b805e16cd550941f858c chain_hash=06226e46111a0b59caaf126043eb5bbf28c34f3a5e332a1fc7b2b73cf188910f short_channel_id=109x1x0 timestamp=1565587765 message_flags=1 channel_flags=0 cltv_expiry_delta=48 htlc_minimum_msat=0 fee_base_msat=100 fee_proportional_millionths=11 htlc_maximum_msat=100000
    010206737e9e18d3e4d0ab4066ccaecdcc10e648c5f1c5413f1610747e0d463fa7fa39c1b02ea2fd694275ecfefe4fe9631f24afd182ab75b805e16cd550941f858c06226e46111a0b59caaf126043eb5bbf28c34f3a5e332a1fc7b2b73cf188910f00006d00000100005d50f935010000300000000000000000000000640000000b00000000000186a0
    crc32 = 0xf32ce968
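
For readers without a C toolchain, the same computation can be sketched in Python. This is an illustration under assumptions, not either implementation's code: it assumes the `crc32c` helper in the C snippet is the standard CRC-32C (Castagnoli polynomial, reflected, initial value and final XOR of 0xffffffff) and that calls can be chained by passing the previous result as the seed. The Python standard library has no CRC-32C, so a slow bitwise version is included, and `channel_update_checksum` is a hypothetical name.

```python
def crc32c(data: bytes, crc: int = 0) -> int:
    """Bitwise CRC-32C; pass a previous result as `crc` to continue a checksum."""
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # 0x82F63B78 is the reflected Castagnoli polynomial.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def channel_update_checksum(channel_update: bytes) -> int:
    """Checksum a serialized channel_update, skipping type, signature and timestamp."""
    start = 2 + 64                 # skip message type and signature
    timestamp_at = start + 32 + 8  # chain_hash and short_channel_id are covered
    after_timestamp = timestamp_at + 4
    if len(channel_update) <= after_timestamp:
        raise ValueError("channel_update is too short")
    crc = crc32c(channel_update[start:timestamp_at])
    return crc32c(channel_update[after_timestamp:], crc)
```

If the assumptions hold, feeding the two serialized channel_update messages quoted above to `channel_update_checksum` should reproduce the values reported in this comment. Two updates that differ only in signature and timestamp get the same checksum, which is the property the proposal relies on.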

@sstone
Collaborator Author

sstone commented Aug 27, 2019

@rustyrussell thanks! we do skip the first 2 bytes of the encoded channel_update and find the same checksums, I'll make new test vectors and include your data.

pm47 added a commit to ACINQ/eclair that referenced this pull request Aug 28, 2019
This is the implementation of lightning/bolts#557.

* Correctly handle multiple channel_range_replies

The scheme we use to keep track of channel queries with each peer would forget about
missing data when several channel_range_replies are sent back for a single channel_range_query.

* RoutingSync: remove peer entry properly

* Remove peer entry on our sync map only when we've received
a `reply_short_channel_ids_end` message.
* Make routing sync test more explicit

* Routing Sync: rename Sync.count to Sync.totalMissingCount

* Do not send channel queries if we don't want to sync

* Router: clean our sync state when we (re)connect to a peer

We must clean up leftovers for the previous session and start the sync process again.

* Router: reset sync state on reconnection

When we're reconnected to a peer we will start a new sync process and should reset our sync
state with that peer.

* Extended Queries: use TLV format for optional data

Optional query extensions now use TLV instead of a custom format.
Flags are encoded as varint instead of bytes as originally proposed. With the current proposal they will all fit on a single byte, but will be
much easier to extend this way.

* Optional TLVs are represented as a list, not an optional list

TLVs that extend regular LN messages can be represented as a TlvStream and not an Option[TlvStream] since we don't need
to explicitly terminate the stream (either by prepending its length or using a specific terminator) as we do in Onion TLVs.

No TLVs simply means that the TLV stream is empty.

* TLV Stream: Implement a generic "get" method for TLV fields

If we have a TLV stream of type MyTLV which is a subtype of TLV, and MyTLV1 and MyTLV2 are both
subtypes of MyTLV then we can use stream.get[MyTLV1] to get the TLV record of type MyTLV1 (if any)
in our TLV stream.

* Use extended range queries on regtest and testnet

We will use them on mainnet as soon as lightning/bolts#557 has been merged.

* Channel range queries: send back node announcements if requested (#1108)

This PR adds support for sending back node announcements when replying to channel range queries:
- when explicitly requested (bit is set in the optional query flag)
- when query flags are not used and a channel announcement is sent (as per the BOLTs)

A new configuration option `request-node-announcements` has been added in the `router` section. If set to true, we
will request node announcements when we receive a channel id (through channel range queries) that we don't know of.
This is a setting that we will probably turn off on mobile devices.

* Extended Channel Queries: add CL interop test
rustyrussell added a commit to ElementsProject/lightning-rfc-protocol-test that referenced this pull request Sep 2, 2019
In particular, this is effectively a merge of lightning#557 and lightning#655, so
you can run all the protocol tests at once.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
@t-bast
Collaborator

t-bast commented Sep 3, 2019

CL and Eclair have implemented this and correctly inter-operate.
LL said implementation will come later, but concept acked.
@cfromknecht can you tell us if you're ok with the current state of the PR and merging it as-is?

Collaborator

@cfromknecht cfromknecht left a comment

Final version LGTM!! 😀

@sstone sstone merged commit c8e53fe into lightning:master Sep 16, 2019
sstone added a commit to ACINQ/eclair that referenced this pull request Oct 8, 2019
* Update list of commands in eclair-cli help (#1091)

* Add missing API endpoints to eclair-cli help

* Documentation update (#1092)

* Typed amounts (#1088)

* Route computation: fix fee check (#1101)

Fee check during route computation is:
- fee is below maximum value
- OR fee is below amount * maximum percentage

The second check was buggy: route computation would fail when fees were above the maximum value but below the maximum percentage of the amount being paid.
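
A minimal sketch of the intended check, in Python for illustration only (eclair itself is written in Scala). The function and parameter names are hypothetical, and treating the boundaries as inclusive is an assumption:

```python
def fee_is_acceptable(fee_msat: int, amount_msat: int,
                      max_fee_msat: int, max_fee_fraction: float) -> bool:
    """Accept a route fee if it passes either of the two checks."""
    # Check 1: the fee is below the absolute maximum value.
    below_max_value = fee_msat <= max_fee_msat
    # Check 2: the fee is below amount * maximum percentage.
    below_max_fraction = fee_msat <= amount_msat * max_fee_fraction
    return below_max_value or below_max_fraction
```

With this logic, a fee above the absolute maximum is still accepted as long as it stays under the percentage bound, which is the case the bug used to reject.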

* Publish transactions during transitions (#1089)

Follow up to #1082.

The goal is to be able to publish transactions only after we have
persisted the state. Otherwise we may run into corner cases like [1]
where a refund tx has been published, but we haven't kept track of it
and generate a different one (with different fees) the next time.

As a side effect, we can now remove the special case that we were
doing when publishing the funding tx, and remove the `store` function.

NB: the new `calling` transition method isn't restricted to publishing
transactions but that is the only use case for now.

[1] ACINQ/eclair-mobile#206

* Typed cltv expiry (#1104)

Untyped cltv expiry was confusing: delta and absolute expiries really need to be handled differently.
Even variable names were sometimes misleading.
Now the compiler will help us catch errors early.
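
As a rough illustration of the idea (in Python, where a static type checker plays the role of the compiler), absolute and relative expiries can be given distinct types so that they cannot be mixed by accident. The class and field names are hypothetical and are not claimed to match eclair's code:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CltvExpiryDelta:
    """A relative expiry: a number of blocks."""
    blocks: int


@dataclass(frozen=True)
class CltvExpiry:
    """An absolute expiry: a block height."""
    block_height: int

    def __add__(self, delta: CltvExpiryDelta) -> "CltvExpiry":
        # Only "absolute + delta" is meaningful; anything else is rejected.
        if not isinstance(delta, CltvExpiryDelta):
            return NotImplemented
        return CltvExpiry(self.block_height + delta.blocks)

    def __sub__(self, other: "CltvExpiry") -> CltvExpiryDelta:
        # The difference between two absolute expiries is a delta.
        if not isinstance(other, CltvExpiry):
            return NotImplemented
        return CltvExpiryDelta(self.block_height - other.block_height)
```

Adding two absolute expiries raises a `TypeError` here, instead of silently producing a meaningless number.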

* Extended queries optional (#899)

This is the implementation of lightning/bolts#557.

* Correctly handle multiple channel_range_replies

The scheme we use to keep track of channel queries with each peer would forget about
missing data when several channel_range_replies were sent back for a single channel range query.

* RoutingSync: remove peer entry properly

* Remove peer entry on our sync map only when we've received
a `reply_short_channel_ids_end` message.
* Make routing sync test more explicit

* Do not send channel queries if we don't want to sync

* Router: clean our sync state when we (re)connect to a peer

We must clean up leftovers for the previous session and start the sync process again.

* Router: reset sync state on reconnection

When we're reconnected to a peer we will start a new sync process and should reset our sync
state with that peer.

* Extended Queries: use TLV format for optional data

Optional query extensions now use TLV instead of a custom format.
Flags are encoded as varint instead of bytes as originally proposed. With the current proposal they will all fit in a single byte, but will be
much easier to extend this way.

* TLV Stream: Implement a generic "get" method for TLV fields

If we have a TLV stream of type MyTLV which is a subtype of TLV, and MyTLV1 and MyTLV2 are both
subtypes of MyTLV, then we can use stream.get[MyTLV1] to get the TLV record of type MyTLV1 (if any)
in our TLV stream.
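
The same lookup-by-type idea can be sketched in Python. This is only an illustration: `TlvStream`, `MyTlv1` and `MyTlv2` are hypothetical stand-ins, not eclair's Scala classes:

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Type, TypeVar


class Tlv:
    """Base class for all TLV records."""


class MyTlv(Tlv):
    """Base class for the records allowed in one particular stream."""


@dataclass(frozen=True)
class MyTlv1(MyTlv):
    value: int


@dataclass(frozen=True)
class MyTlv2(MyTlv):
    value: bytes


T = TypeVar("T", bound=Tlv)


class TlvStream:
    def __init__(self, records: Sequence[Tlv]) -> None:
        self.records = tuple(records)

    def get(self, tlv_type: Type[T]) -> Optional[T]:
        """Return the first record of the requested type, or None."""
        for record in self.records:
            if isinstance(record, tlv_type):
                return record
        return None


stream = TlvStream([MyTlv2(b"\x01"), MyTlv1(42)])
first = stream.get(MyTlv1)  # the MyTlv1 record, if the stream has one
```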

* Channel range queries: send back node announcements if requested (#1108)

This PR adds support for sending back node announcements when replying to channel range queries:
- when explicitly requested (bit is set in the optional query flag)
- when query flags are not used and a channel announcement is sent (as per the BOLTs)

A new configuration option `request-node-announcements` has been added in the `router` section. If set to true, we
will request node announcements when we receive a channel id (through channel range queries) that we don't know of.
This is a setting that we will probably turn off on mobile devices.

* Rework router data structures (#902)

Instead of using two separate maps (for channels and channel_updates), we now use a single map, which groups channel+channel_updates. This is also true for data storage, resulting in the removal of the channel_updates table.

* Add more numeric utilities to MilliSatoshi (#1103)

Add comparisons and postfix operators.
Update most of the codebase to leverage those.

* Use unsigned comparison for 'maxHtlcValueInFlightMsat' (#1105)

* Add a sync whitelist (#954)

We will only sync with whitelisted peers. If the whitelist is empty then
we sync with everyone.
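
The rule fits in one line. A hypothetical Python sketch (node ids are shown as plain strings for simplicity):

```python
from typing import FrozenSet


def should_sync_with(peer_id: str, whitelist: FrozenSet[str]) -> bool:
    """An empty whitelist means "sync with everyone"."""
    return not whitelist or peer_id in whitelist
```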

* Move http APIs to subproject eclair-node (#1102)

* Fix regression in `Commitments.availableForSend` (#1107)

We must consider `nextRemoteCommit` when applicable.

This is a regression caused in #784. The core bug only exists when we
have a pending unacked `commit_sig`, but since we only send the
`AvailableBalanceChanged` event when sending a signature (not when
receiving a revocation), actors relying on this event to know the
current available balance (e.g. the `Relayer`) will have a wrong
value in-between two outgoing sigs.

* Bolt4: remove final_expiry_too_soon error message (#1106)

It allowed probing attacks and the spec deprecated it in favor of IncorrectOrUnknownPaymentDetails.
Also add better support for unknown failure messages.

* Fix maven mirror (#1120)

* Use Long to back the UInt64 type (#1109)

* Define comparison operators between UInt64 and MilliSatoshi

* Implement Bolt 11 invoice feature bits (#1121)

lightning/bolts#656 introduced invoice feature bits as a pre-requisite for AMP and other advanced payment use-cases.

* Update docker build (#1123)

* Update docker base image to jdk11, update maven to 3.6.2 [ci skip]

* Reject expired invoices before payment flow starts (#1117)

* Made sync params configurable (#1124)

This allows us to choose smaller parameters for tests and reduce cpu
requirement during testing.

NB: The default value of 3500 for `reply_channel_range` was wrong. Theoretical max is ~2700.

* Activate support for variable-length onion (#1087)

This is now enabled by default.
We forward variable-length onions if we receive some.
We accept variable-length payments.
However for maximum compatibility with the network, we send payments using legacy payloads.

* Add Semaphore CI (#1125)

* Router computes network stats (#1116)

* Add comments and fix warnings in graph processing
* Add small feature to set the htlcMaximumMsat for routing hints (otherwise the graph processing algorithm used a minimum value which slightly reduced the benefits of those routing hints)
* Add the computation of network statistics to the router: this will be useful for multi-part payments to decide what thresholds should be used to split a payment

* Add monitoring with Kamon (disabled by default) (#1126)

For now:
- we only track some tasks (especially in the router, but not even
`node_announcement` and `channel_update`)
- all db calls are monitored
- kamon is disabled by default

* Check funds in millisatoshi when sending/receiving an HTLC (#1128)

Instead of satoshi, which could introduce rounding errors.

Also, we check first the balance before the max-inflight amount, because
it makes more sense in terms of error management.
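
A small Python illustration of the rounding problem, using hypothetical helper names and assuming amounts are truncated when converted to satoshi:

```python
def can_send_checked_in_sat(balance_msat: int, htlc_msat: int) -> bool:
    # Truncating both sides to satoshi discards up to 999 msat on each.
    return htlc_msat // 1000 <= balance_msat // 1000


def can_send_checked_in_msat(balance_msat: int, htlc_msat: int) -> bool:
    # Comparing in millisatoshi keeps full precision.
    return htlc_msat <= balance_msat


balance_msat = 1_000_100
htlc_msat = 1_000_900
sat_check = can_send_checked_in_sat(balance_msat, htlc_msat)
msat_check = can_send_checked_in_msat(balance_msat, htlc_msat)
```

Here the satoshi check passes because both values truncate to 1000 sat, while the millisatoshi check correctly rejects the HTLC.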

Co-Authored-By: Bastien Teinturier <31281497+t-bast@users.noreply.github.com>

* Don't hardcode the channel version (#1129)

Instead of hardcoding the channel version when we instantiate the
`Commitments` object, we rather define it when the channel is
instantiated. This is saner and prepares future usage.

* Removed Globals class (#1127)

This is a prerequisite to parallelization of tests.

* Make tests run in parallel (#1112)

There are two levels of parallelization:
- between test suites (a suite = a test file)
- within a suite (depends on the test suite: some rely on sequential execution of tests, some don't)

* Add codecov integration to semaphore CI (#1134)

* Remove codecov integration from travis CI

* Drop support for Java 8 (#1135)

We already have Java 7 (for Android) and Java 11. Supporting Java 8
would require crossbuilding, which we are not doing (two recent PRs
broke the build on Java 8).

* Sphinx: accept invalid downstream errors (#1137)

When a downstream node sends us an onion error with an invalid length, we must forward the failure.
The recipient won't be able to extract the error but at least it knows the payment failed.

* Update string to match on bitcoind while it's indexing (#1138)

* Check for bitcoind's getrawtransaction availability during startup

* Peer: disable kamon

* Payment lifecycle refactoring (#1130)

* Unify payment events (no more duplication between payment types and events)
* Factorize DB and eventStream interactions: this paves the way for sub-payments that shouldn't be stored in the DB nor emit events.
* Add more fields to the payments DB:
  * bolt 11 invoice for sent payment
  * external id (for app developers)
  * parent id (AMP)
  * target node id
  * fees
  * route (if success)
  * failures (if failed)
* Re-work the PaymentsDb interface
* Clarify use of seconds / milliseconds in DB interfaces -> milliseconds everywhere
* Run SQL migrations inside transactions

* Improve error handling when we couldn't find all the channels for a supplied route in /sendtoroute API (#1142)

* Improve error handling when we couldn't find all the channels for a supplied route in /sendtoroute

* Handle fees increases when channel is OFFLINE (#1080)

* Add 'close-on-offline-feerate-mismatch' configuration to avoid closing an offline channel when the feerate mismatch is over the threshold.

* Derive channel keys from the channel funding pubkey (#1097)

We now generate a random funding key for each new channel, and use its public key to deterministically derive all channel keys and secrets. This will let us easily recover funds using DLP even if we've lost everything but our seed: we just need to connect to the node we had a channel with, ask them to publish their commit tx, and once we see it on the blockchain we can extract our funding pubkey, recompute channel keys and spend our output.

* Add a "funding pubkey path" option to the channel version field

This option is checked when we need to compute channel keys. For old channels it won't be set, and we always set it for new ones.

* ChannelVersion: make sure that all bits are set to 0 for legacy channels

* ChannelVersion: USE_PUBKEY_KEYPATH is set by default

* Check if remote funder can handle an updated commit fee when sending HTLC (#1084)

If the sender of an htlc isn't the funder, then both sides will have to afford the payment:
- the sender needs to be able to afford the htlc amount
- the funder needs to be able to afford the greater commit tx fee incurred by the additional htlc output.

Fixes #1081.

Co-Authored-By: Pierre-Marie Padiou <pm47@users.noreply.github.com>

* Fix and expand channel keypath (#1147)

* Fix funding pubkey to channel key path computation

Channel key path is generated from 8 bytes computed from our funding pubkey, but we extracted 4 uint32 values instead of 2 (last 2 were always 0). We now use 128 bits to derive channel key paths.

* Add a channel key path compatibility test

This test will fail if we change the way we compute channel key paths, which would break existing channels.

* Use the same chain hash reference in all channel updates

To save memory, once we check that a channel_update's chain hash matches what
we expect we just replace it with a reference to our own chain hash.
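
The memory saving comes from every stored update pointing at one shared object rather than holding its own 32-byte copy. A hypothetical Python sketch of the idea:

```python
def intern_chain_hash(update_chain_hash: bytes, our_chain_hash: bytes) -> bytes:
    """Validate an update's chain hash, then return our own shared instance."""
    if update_chain_hash != our_chain_hash:
        raise ValueError("channel_update is for a different chain")
    # Return our object, not the equal copy parsed from the wire, so the
    # parsed copy can be garbage collected.
    return our_chain_hash
```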

* Commitments: take HTLC fee into account (#1152)

Our balance computation was slightly incorrect. If you want to know how much you can send (or receive), you need to take into account the fact that you'll add a new HTLC which adds weight to the commit tx (and thus adds fees).
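
A simplified Python sketch of why the extra HTLC matters. It assumes the BOLT 3 weights for non-anchor commitment transactions (724 base weight, 172 per HTLC output) and ignores trimmed HTLCs; the function names are hypothetical and this is not eclair's actual computation:

```python
COMMIT_WEIGHT = 724       # base commitment tx weight (BOLT 3, no anchors)
HTLC_OUTPUT_WEIGHT = 172  # extra weight per untrimmed HTLC output


def commit_tx_fee_sat(feerate_per_kw: int, num_htlcs: int) -> int:
    weight = COMMIT_WEIGHT + HTLC_OUTPUT_WEIGHT * num_htlcs
    return feerate_per_kw * weight // 1000


def available_for_send_msat(balance_msat: int, reserve_sat: int,
                            feerate_per_kw: int, num_htlcs: int,
                            is_funder: bool) -> int:
    # The funder pays the commit fee, so budget for one MORE HTLC than
    # is currently pending: the one we are about to add.
    fee_sat = commit_tx_fee_sat(feerate_per_kw, num_htlcs + 1) if is_funder else 0
    return max(0, balance_msat - reserve_sat * 1000 - fee_sat * 1000)
```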

* Android: add a spray-based API to eclair-node

This is a copy of the spray-based API developed by @araspitzu (akka-http does not
work for akka 2.3 which we use on the android branch)

* HTTP API: add type hints for payment status (#1150)

Cleans up the JSON payment status (easier to interpret for callers).

* Use "mock" Kamon library

Kamon does not work on Android and does not make much sense, so we replace
it with a basic Mock implementation that does nothing.

* Electrum: improve coin selection (fixes #1146) (#1149)

Our previous coin selection would sometimes fail when there was one wallet utxo and a low
feerate, because our first pass used a fee estimate that was too high and could sometimes not be met.

* Extend funding key path to 256 bits (#1154)

Our random funding key path is now 8 * 32 bits plus a 1' (funder) or 0' (fundee).
Channel key paths are computed from the sha256 of the funding public key (we take all 256 bits).
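
A hypothetical Python sketch of turning a hash into 8 path elements of 32 bits each. The big-endian byte order is an assumption, and the hardening of elements and the trailing funder/fundee element are left out:

```python
import hashlib
import struct
from typing import Tuple


def channel_key_path(funding_pubkey: bytes) -> Tuple[int, ...]:
    """Split sha256(funding_pubkey) into eight unsigned 32-bit integers."""
    digest = hashlib.sha256(funding_pubkey).digest()  # 32 bytes = 256 bits
    return struct.unpack(">8I", digest)               # 8 * 32 bits
```

Because the path depends only on the funding public key, it can be recomputed once that key is seen on the blockchain.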

* Use bitcoin 0.18.1 in the test (#1148)

* Upgrade new unit tests to bitcoin 0.18.1 API (#1157)

We had 2 open PRs: one that added new tests using the 0.API, and one that switched to 0.18.1. When they were merged, the new tests failed since they had not been upgraded.

* Update netty dependency to 4.1.32 (#1160)

Also:
* explicitly set endpoint identification algorithm in strict mode
* force TLS protocols 1.2/1.3 in strict mode

Co-Authored-By: Bastien Teinturier <31281497+t-bast@users.noreply.github.com>

* Add execution time limit (#1161)

* Android: wipe channels table during db migration

We already wipe the updates table, and this makes upgrading much simpler since we had different structures on
android vs master.

* Activate extended channel range queries (#1165)

By default we now set the `gossip_queries_ex` feature bit.
We also change how we compare feature bits, and will use channel queries (or extended queries) only if the corresponding feature bit is set in both local and remote init messages.
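
A Python sketch of the "set on both sides" rule. It assumes the BOLT 9 bit assignments (6/7 for `gossip_queries`, 10/11 for `gossip_queries_ex`) and models features as an integer bitfield; the function names and the returned labels are hypothetical:

```python
GOSSIP_QUERIES = 6      # mandatory bit; the optional bit is 7
GOSSIP_QUERIES_EX = 10  # mandatory bit; the optional bit is 11


def has_feature(features: int, mandatory_bit: int) -> bool:
    """True if either the mandatory (even) or optional (odd) bit is set."""
    return bool(features & (0b11 << mandatory_bit))


def select_sync_mode(local_features: int, remote_features: int) -> str:
    def both(bit: int) -> bool:
        return has_feature(local_features, bit) and has_feature(remote_features, bit)

    if both(GOSSIP_QUERIES_EX):
        return "extended"
    if both(GOSSIP_QUERIES):
        return "basic"
    return "none"
```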

* Use guava to compute CRC32C checksums (#1166)

CRC32C is not available in JDK 7 which we target on Android.
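
For reference, CRC32C (the Castagnoli variant of CRC-32) is small enough to write bit by bit. This Python sketch is illustrative only and is neither guava's nor eclair's code:

```python
def crc32c(data: bytes) -> int:
    """Bitwise CRC32C using the reflected Castagnoli polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right one bit; apply the polynomial if a 1 fell off.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF
```

The standard check value for this variant is `crc32c(b"123456789") == 0xE3069283`.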