-
Notifications
You must be signed in to change notification settings - Fork 912
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HTLC error cleanup #3472
HTLC error cleanup #3472
Conversation
a6551a0
to
e89f6d5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice cleanup @rustyrussell 👍 sorry it took so long to review.
- cleanup: It is very unfortunate that we call out to
gossipd
every time we
forward a payment. I don't really have any numbers as to how much time we
are spending on these calls, but I assume it is quite expensive. We should
instead cache these details inlightningd
and push updates if anything
changes. - I'm really loving how 51ff523 simplifies
channeld
👍
The only real issue is that the migration SQL statement is not portable (x'2002'
is not valid syntax for postgres) so we need to find an alternative formulation or create a rewrite rule for hex-encoded fields. I'll look into which one works better and report back.
lightningd/peer_htlcs.c
Outdated
|
||
/* This stays around even if we fail it immediately: it *is* | ||
* part of the current commitment. */ | ||
hin = new_htlc_in(channel, channel, added->id, added->amount, | ||
added->cltv_expiry, &added->payment_hash, | ||
shared_secret, added->onion_routing_packet); | ||
failcode ? NULL: &shared_secret, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
failcode ? NULL: &shared_secret, | |
failcode ? NULL : &shared_secret, |
wallet/db.c
Outdated
/* For outgoing HTLCs, we now keep a localmsg instead of a failcode. | ||
* Turn anything in transition into a WIRE_TEMPORARY_NODE_FAILURE. */ | ||
{SQL("ALTER TABLE channel_htlcs ADD localfailmsg BLOB;"), NULL}, | ||
{SQL("UPDATE channel_htlcs SET localfailmsg=x'2002' WHERE malformed_onion != 0 AND direction = 1;"), NULL}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might not be portable, checking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it was deeply unclear to me how to set a BLOB.
lightningd/invoice.c
Outdated
/* FIXME: Allow hook to return its own failmsg directly? */ | ||
if (val == WIRE_TEMPORARY_NODE_FAILURE) | ||
return towire_temporary_node_failure(ctx); | ||
if (val != WIRE_INCORRECT_OR_UNKNOWN_PAYMENT_DETAILS) | ||
log_broken(hin->key.channel->log, | ||
"invoice_payment hook returned failcode %u," | ||
" changing to incorrect_or_unknown_payment_details", | ||
val); | ||
|
||
return failmsg_incorrect_or_unknown(ctx, hin); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These look reasonable and cover all the plugins I'm aware of 👍
Seems a bit strange though to structure it like this if (positive) {} if (negative) {} positive
I have a fix for that, which works even better. But the series was getting long enough! |
e89f6d5
to
c576c98
Compare
This is clearer, especially when we also deal with raw not-yet-onion-wrapped failure messages. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. forward_htlc sets hout to NULL. 2. forward_htlc passes &hout to send_htlc_out. 3. forward_htlc checks the failcode and frees(NULL) and sets hout to NULL (again). This in fact covers every failcode which send_htlc_out returns. We should ensure send_htlc_out sets *houtp to NULL on failure; in fact, both callers pass houtp, so we can make it unconditional. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
For incoming htlcs, we need failure details in case we need to re-xmit them. But for outgoing htlcs, lightningd is telling us it already knows they've failed, so we just need to flag them failed and don't need the details. Internally, we set the ->fail to a dummy non-NULL value; this is cleaned up next. This matters for the next patch, which moves onion handling into lightningd. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Turn it into temporary node failure: this only happens if we restart with a failed htlc in, but it's clearer and more robust to handle it generically. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I hadn't realized that lightningd asks gossipd every time we forward a payment. But I'm going to abuse it here to get the latest channel_update, otherwise (as lightningd takes over error message generation) lightningd needs to do an async request at various painful points. So have gossipd tell us the lastest update (stripped so compatible with the strange in-onion-error format). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Instead of making it ourselves, lightningd does it. Now we only have two cases of failed htlcs: completely malformed (BADONION), and with an already-wrapped onion reply to send. This makes channeld's job much simpler. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It's only called from the db code, and failing_channel is always NULL. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We're going to change our internal structure next, so this is preparation. We populate existing errors with temporary node failures, for simplicity. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We'll use this in the next patch for when we need to create errors to send back to lightningd; most commonly when the channel doesn't have capacity for the HTLC. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
…g HTLCs At the moment, we store e.g. WIRE_TEMPORARY_CHANNEL_FAILURE, and then lightningd has a large demux function which turns that into the correct error message. Such an enum demuxer is an anti-pattern. Instead, store the message directly for output HTLCs; channeld now sends us an error message rather than an error code. For input HTLCs we will still need the failure code if the onion was bad (since we need to prompt channeld to send a completely different message than normal), though we can (and will!) eliminate its use in non-BADONION failure cases. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We tell channeld that an htlc is bad by sending it a 'struct failed_htlc'. This usually contains an onionreply to forward, but for the case where the onion itself was bad, it contains a failure code instead. This makes the "send a failed_htlc for a bad onion" a completely separate code path, then we can work on removing failcodes from the other path. In several places 'failcode' is now changed to 'badonion' to reflect that it can only be a BADONION failcode. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
… failcode. Unfortunately the invoice_payment_hook can give us a failcode, so I simply restrict it to the two sensible ones. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-deprecated: plugins: invoice_payment_hook "failure_code" only handles simple cases now, use "failure_message".
…TLC errors (unless BADONION). This cleans up the "local failure" callers for incoming HTLCs to hand an onionreply instead of making us generate it from the code inside make_failmsg. (The db path still needs make_failmsg, so that's next). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-deprecated: Plugins: htlc_accepted_hook "failure_code" only handles simple cases now, use "failure_message".
…all onionreplyies. This completes the conversion; any in-flight HTLC failures get turned into temporary_node_failures. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
That's all it's used for now. And remove unreferenced failoutchannel. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
As promised in the Changelog when we converted from failcodes to messages internally. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
c131842
to
4dbeac8
Compare
Trivial rebase to fold fixup |
Swiftly kicking Travis so it knows it reported a wrong result :-) ACK 4dbeac8 |
Our HTLC error paths are confusing: channeld can detect an incoming HTLC error because it tries to decrypt the onion. This work is done again anyway by lightningd. In many places we have both an onionreply (if we're forwarding an error from a peer) and an error code, which we try to convert in a giant demuxer which is incomplete and growing in complexity.
By the end of this series, we have HTLC errors as follows:
If we forward an outgoing htlc error back to an incoming, we convert the failmsg to an onion (this requires the incoming HTLC secret), and the onion becomes the incomijng htlc failonion.
This is, frankly, much clearer that what we had before, though it's a painful transition in several ways.