-
Notifications
You must be signed in to change notification settings - Fork 913
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bad commit_sig signature #4152
Comments
Very interesting indeed, looking at the transaction whose signature failed it seems like the transaction that ultimately made it on-chain looks identical to the one which failed the signature. The fact that the unilateral close was ours suggests that the two consecutive states look identical, narrowing this down to a feerate being the cause for the new commit. However we also add and remove an HTLC that are identical except the "htlcs": [
{
"direction": "out",
"id": 95,
"msatoshi": 125000125,
"amount_msat": "125000125msat",
"expiry": 653923,
"payment_hash": "hashA",
"state": "SENT_ADD_ACK_REVOCATION"
},
{
"direction": "out",
"id": 96,
"msatoshi": 125000125,
"amount_msat": "125000125msat",
"expiry": 653923,
"payment_hash": "hashB",
"state": "RCVD_REMOVE_HTLC"
}
] So it could be that the unilateral close was really the one with |
This happened again with a different peer, this time with an empty |
This happened yet again, using git bbfcae6, this time with totally full HTLC slots. All but two of those HTLCs were incoming, in state This bug has cost me a ton of money, and fixing it is clearly not a priority. Despite my significant investment in C-Lightning infrastructure, I have started planning to migrate to an LND node. I am not willing to keep burning satoshis like this. I will also warn other people not to use C-Lightning on mainnet until this bug is fixed so they don't get burned, either. |
OK, the good news is I've caught one of these in my logs, too! Finally... @cdecker's idea that it's around feerate changes seems to be borne out here. Seems like we were updating fees, both sides were on different levels, and then when we tried updating fees again (while they were still in flux), something b0rked. There's a simple workaround, however: we can avoid changing fees again until they're completely quiescent: we know the fee logic works in the simple cases, otherwise we wouldn't keep channels open at all. That means even if LND's or our fee logic doesn't work in complex cases (and I'm still tracking this down to try to get the exact cause here), we won't trigger it in the meantime. |
I totally understand! It's a nasty bug, and because it wasn't happening all the time, it didn't hit the top of our TODO list. However, now I've got some more clues I am pretty sure I can work around it (for existing peers), and make sure that we're actually doing the right thing (fixing LND or c-lightning, whichever is wrong). |
What are you referring to when you say quiescent? Do you mean no feerate update in flight, no HTLCs in flight or no commitment? I noticed that reports often share a couple of HTLCs and some are even close to the default 30 concurrent HTLCs. Some of these nodes seem incredibly busy, so waiting for there to be no HTLCs might starve our fee adjustment. Would remembering the last couple of feerates and just trying them out work? |
Was planning on "no feerates in flight". In my logs, there are multiple fee changes going on at the same time... |
Super glad this might be resolved soon. I really don't want to migrate to LND, so if I stop bleeding satoshis then I am absolutely going to stay on C-Lightning. Thanks for looking at this! |
There are several reports of desynchronizaion with LND here; a simple approach is to only have one feerate change in flight at any time. Even if this turns out to be our fault, it's been a historic area of confusion, so this restriction seems reasonable. Fixes: ElementsProject#4152 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There are several reports of desynchronizaion with LND here; a simple approach is to only have one feerate change in flight at any time. Even if this turns out to be our fault, it's been a historic area of confusion, so this restriction seems reasonable. Fixes: ElementsProject#4152 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There are several reports of desynchronizaion with LND here; a simple approach is to only have one feerate change in flight at any time. Even if this turns out to be our fault, it's been a historic area of confusion, so this restriction seems reasonable. Changelog-Fixed: Protocol: Don't create more than one feerate change at a time, as this seems to desync with LND. Fixes: ElementsProject#4152 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There are several reports of desynchronization with LND here; a simple approach is to only have one feerate change in flight at any time. Even if this turns out to be our fault, it's been a historic area of confusion, so this restriction seems reasonable. Changelog-Fixed: Protocol: Don't create more than one feerate change at a time, as this seems to desync with LND. Fixes: ElementsProject#4152 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There are several reports of desynchronization with LND here; a simple approach is to only have one feerate change in flight at any time. Even if this turns out to be our fault, it's been a historic area of confusion, so this restriction seems reasonable. Changelog-Fixed: Protocol: Don't create more than one feerate change at a time, as this seems to desync with LND. Fixes: #4152 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
az0re
on IRC just reported it. Unfortunately Github won't let them get here because they use Tor.Told them to send the (private) signature, transaction and feerate by mail.
The text was updated successfully, but these errors were encountered: