[bug]: unable to forward, wrong "insufficient bandwidth" #7108

Closed
C-Otto opened this issue Nov 3, 2022 · 21 comments
Labels
bug Unintended code behaviour path finding routing nodes

Comments

@C-Otto
Contributor

C-Otto commented Nov 3, 2022

Background

I have more than enough liquidity in one of my channels and should be able to serve outgoing forward requests. However, this doesn't work. Instead, lnd reports "insufficient bandwidth". Details below.

Your environment

  • lnd v0.15.4-beta
  • Linux server 5.10.0-13-amd64 #1 SMP Debian 5.10.106-1 (2022-03-17) x86_64 GNU/Linux
  • bitcoind v23

Steps to reproduce

Receive a forwarding request that should go out via channel X.

Expected behaviour

Forwarding request is served, fees are earned, local balance is reduced.

Actual behaviour

"insufficient bandwidth to route htlc: x is larger than y" (canSendHtlc in link.go)

First observed failure:

  • Date: 2022-10-25 18:31:17
  • Requested amount: 188,242,405 msat
  • Available (according to lnd): 3,005,999 msat

Balance at 2022-10-25 18:29:25+02:

  • local: 9,897,025 sat
  • local reserve: 100,000 sat
  • remote: 100,078 sat
  • remote reserve: 100,000 sat

Most recent failure:

  • Date: 2022-11-03 13:18:31
  • Requested amount: 50,024,602 msat
  • Available (according to lnd): 2,843,999 msat

Balance at 2022-11-03 13:14:47+01:

  • local: 9,896,568 sat
  • local reserve: 100,000 sat
  • remote: 100,712 sat
  • remote reserve: 100,000 sat
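To make the size of the gap concrete, here is a minimal Go sketch that compares the requested amounts with a naive estimate of what the channel could send (local balance minus local reserve) and with the bandwidth lnd reported. The helper `naiveSpendableMsat` is hypothetical and deliberately ignores fees, dust limits, and in-flight HTLCs; the figures are the ones listed above.

```go
package main

import "fmt"

// naiveSpendableMsat is a hypothetical helper: local balance minus the local
// reserve, converted to msat. It ignores fees, dust and in-flight HTLCs.
func naiveSpendableMsat(localSat, localReserveSat int64) int64 {
	return (localSat - localReserveSat) * 1000
}

func main() {
	cases := []struct {
		when          string
		localSat      int64
		reserveSat    int64
		requestedMsat int64
		lndMsat       int64 // "available" value taken from the lnd error
	}{
		{"2022-10-25", 9_897_025, 100_000, 188_242_405, 3_005_999},
		{"2022-11-03", 9_896_568, 100_000, 50_024_602, 2_843_999},
	}
	for _, c := range cases {
		naive := naiveSpendableMsat(c.localSat, c.reserveSat)
		fmt.Printf("%s: requested=%d naive=%d lnd=%d fitsNaive=%t fitsLnd=%t\n",
			c.when, c.requestedMsat, naive, c.lndMsat,
			c.requestedMsat <= naive, c.requestedMsat <= c.lndMsat)
	}
}
```

In both cases the request fits comfortably under the naive estimate but not under the value lnd reports, which is the discrepancy this issue is about.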

lncli listchannels:

{
            "active": true,
            "remote_pubkey": "03627ebe50fc6eb80b0caab0c3714958c701eda735e3c29588e83150d6d4a93976",
            "channel_point": "db8a9a4d483e0fd0c84e5a004d51dd51ecbb113329729c18db6a5112bab61a4f:0",
            "chan_id": "834535922707529728",
            "capacity": "10000000",
            "local_balance": "9891564",
            "remote_balance": "108241",
            "commit_fee": "195",
            "commit_weight": "724",
            "fee_per_kw": "269",
            "unsettled_balance": "0",
            "total_satoshis_sent": "2036708",
            "total_satoshis_received": "11928272",
            "num_updates": "9950",
            "pending_htlcs": [
            ],
            "csv_delay": 144,
            "private": false,
            "initiator": false,
            "chan_status_flags": "ChanStatusDefault",
            "local_chan_reserve_sat": "100000",
            "remote_chan_reserve_sat": "100000",
            "static_remote_key": true,
            "commitment_type": "STATIC_REMOTE_KEY",
            "lifetime": "174093",
            "uptime": "174093",
            "close_address": "",
            "push_amount_sat": "0",
            "thaw_height": 0,
            "local_constraints": {
                "csv_delay": 144,
                "chan_reserve_sat": "100000",
                "dust_limit_sat": "354",
                "max_pending_amt_msat": "18446744073709551615",
                "min_htlc_msat": "0",
                "max_accepted_htlcs": 30
            },
            "remote_constraints": {
                "csv_delay": 1201,
                "chan_reserve_sat": "100000",
                "dust_limit_sat": "546",
                "max_pending_amt_msat": "9900000000",
                "min_htlc_msat": "1",
                "max_accepted_htlcs": 483
            },
            "alias_scids": [
            ],
            "zero_conf": false,
            "zero_conf_confirmed_scid": "0"
        }

lncli getchaninfo:

{
    "channel_id": "834535922707529728",
    "chan_point": "db8a9a4d483e0fd0c84e5a004d51dd51ecbb113329729c18db6a5112bab61a4f:0",
    "last_update": 1667403066,
    "node1_pub": "027ce055380348d7812d2ae7745701c9f93e70c1adeb2657f053f91df4f2843c71",
    "node2_pub": "03627ebe50fc6eb80b0caab0c3714958c701eda735e3c29588e83150d6d4a93976",
    "capacity": "10000000",
    "node1_policy": {
        "time_lock_delta": 99,
        "min_htlc": "1",
        "fee_base_msat": "0",
        "fee_rate_milli_msat": "1",
        "disabled": false,
        "max_htlc_msat": "10000000000",
        "last_update": 1667403066
    },
    "node2_policy": {
        "time_lock_delta": 34,
        "min_htlc": "1",
        "fee_base_msat": "0",
        "fee_rate_milli_msat": "135",
        "disabled": false,
        "max_htlc_msat": "104096134",
        "last_update": 1667143395
    }
}
@C-Otto C-Otto added bug Unintended code behaviour needs triage labels Nov 3, 2022
@C-Otto
Contributor Author

C-Otto commented Nov 3, 2022

The available local balance (msat), as parsed from lnd's log messages:

 count |  available   
-------+--------------
     1 |    2,603,999
     1 |    2,607,999
     1 |    2,701,999
     1 |    2,702,999
     1 |    2,703,999
     1 |    2,705,999
     4 |    2,706,999
     1 |    2,843,999
     2 |    2,976,999
     1 |    2,980,999
     2 |    2,991,999
     1 |    2,993,999
     2 |    2,994,999
     1 |    2,995,999
    69 |    3,005,999
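A tally like the one above could be produced with a short Go sketch along these lines. The exact log line format used in the regular expression is an assumption modelled on the error text quoted earlier ("insufficient bandwidth to route htlc: x is larger than y"), and `tallyAvailable` is a hypothetical helper, not part of lnd.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strconv"
)

// bandwidthRe matches the assumed shape of the error text. The "mSAT" unit
// suffix is an assumption about how lnd prints the two amounts.
var bandwidthRe = regexp.MustCompile(
	`insufficient bandwidth to route htlc: (\d+) mSAT is larger than (\d+) mSAT`)

// tallyAvailable counts how often each "available" value (the second number
// in the message) occurs in the given log lines.
func tallyAvailable(lines []string) map[uint64]int {
	counts := make(map[uint64]int)
	for _, line := range lines {
		m := bandwidthRe.FindStringSubmatch(line)
		if m == nil {
			continue // not a bandwidth error
		}
		avail, err := strconv.ParseUint(m[2], 10, 64)
		if err != nil {
			continue
		}
		counts[avail]++
	}
	return counts
}

func main() {
	// Sample lines built from the amounts reported in this issue.
	lines := []string{
		"insufficient bandwidth to route htlc: 188242405 mSAT is larger than 3005999 mSAT",
		"insufficient bandwidth to route htlc: 50024602 mSAT is larger than 2843999 mSAT",
		"some unrelated log line",
	}
	counts := tallyAvailable(lines)
	keys := make([]uint64, 0, len(counts))
	for k := range counts {
		keys = append(keys, k)
	}
	sort.Slice(keys, func(i, j int) bool { return keys[i] < keys[j] })
	for _, k := range keys {
		fmt.Printf("%5d | %d (ends in 999: %t)\n", counts[k], k, k%1000 == 999)
	}
}
```

The extra column flags values ending in 999 msat, which is a quick way to spot a pattern in the reported numbers.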

@C-Otto
Contributor Author

C-Otto commented Nov 3, 2022

Note that I was able to forward 200,104 sat after the first failure (a few days later, 2022-10-29 10:54:17+02). As far as I can tell, the channel didn't move any sats between the first error and this outgoing forward.

@positiveblue
Contributor

@C-Otto I guess not, but is there any HTLC in flight blocking the channel liquidity?

@C-Otto
Contributor Author

C-Otto commented Nov 3, 2022

No, not according to the logs, and not right now (see lncli listchannels output above). Considering that I had more than 50 failures, this also seems very unlikely.

@C-Otto
Contributor Author

C-Otto commented Nov 3, 2022

Note that all "available local balance" values end with 999msat, so I guess it's this code:
https://github.com/lightningnetwork/lnd/blob/master/lnwallet/channel.go#L7034

@C-Otto
Contributor Author

C-Otto commented Nov 3, 2022

Note that I'm able to send payments out via the channel, which could be a related bug.

@Roasbeef
Member

Roasbeef commented Nov 3, 2022

FWIW, that's a static key channel (legacy), so any HTLC needs to be able to pay fees for itself at the given fee rate. This can cause you to be unable to send even though it looks like you can.

Anchor channels allow the second level HTLC to be zero fees, so you don't need to commit as much since you just care about the impact of the output and not also the second level transaction itself.

Not sure how many HTLCs you had at the time (or prior instance), but this also jumps out:

                "max_accepted_htlcs": 30

AFAICT, you're not the initiator in this channel, and in one of the instances the remote party had a very low balance:

remote: 100,078 sat
remote reserve: 100,000 sat

If they can't pay for the fees of an added HTLC, then we won't allow that to proceed, as we can get into a situation where they only have dust or no amount left over in the channel. The reserve also comes into play: even if they can pay the fee, we can't let them dip below the reserve.

@C-Otto
Contributor Author

C-Otto commented Nov 3, 2022

I don't think that max_accepted_htlcs is an issue, as it only limits the number of in-flight HTLCs. I'm pretty sure I didn't have more than one pending (or failed) HTLC in virtually all cases. Correct me if I'm wrong.

Yes, the remote party has a very low balance, especially considering the reserve. However, I don't understand why this stops me from sending funds to their side, i.e. creating a more balanced channel. It feels like the channel drove itself into a dead end?

Assuming that there's a good reason for lnd to reject forwards, why am I able to use the channel for outgoing payments, initiated by me? Was this luck? Did I pick a suitable amount by chance? Is it a bug, i.e. should lnd also not allow these payments?

All in all I'm rather confused and don't understand how the protection idea ("HTLC needs to be able to pay fees for itself") is supposed to work out in this situation if my node doesn't allow the remote balance to increase.

@Roasbeef
Member

Roasbeef commented Nov 4, 2022

Yes, the remote party has a very low balance, especially considering the reserve. However, I don't understand why this stops me from sending funds to their side, i.e. creating a more balanced channel. It feels like the channel drove itself into a dead end?

So the initiator pays the fees for the commitment transaction the entire time. They need to have enough funds set aside to pay the fees for a force close, and they also pay all the fees for a co-op close.

Each time you add an HTLC, they need to have enough funds to pay for that HTLC (it increases the size of the commitment transaction). This is the case even if they aren't the one adding an HTLC. In the case above, the initiator only has 100,078 sats on their side. The reserve itself is just 78 sats below that at 100k. As a result, the channel can't be utilized unless the fee rate (for the commitment transaction) is reduced to the point that it can support another HTLC, then things can be rebalanced somewhat.
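As a rough illustration of that dependency on the fee rate, the following Go sketch checks whether the initiator could pay for one more HTLC output without dipping below the reserve. It is a simplified model, not lnd's actual check: it assumes the reported balance is already net of the current commitment fee, uses the 172 weight units per untrimmed HTLC output from BOLT 3, and the helper names are hypothetical. The 269 sat/kw figure is the `fee_per_kw` from the `listchannels` output above, which may not be the rate in effect when the failures occurred; the other rates are made up for comparison.

```go
package main

import "fmt"

// htlcOutputWeight is the weight each untrimmed HTLC output adds to a
// commitment transaction according to BOLT 3.
const htlcOutputWeight = 172

// extraCommitFeeSat returns the additional commitment fee (in sat) caused by
// one more HTLC output at the given fee rate (sat per 1000 weight units).
func extraCommitFeeSat(feePerKw int64) int64 {
	return feePerKw * htlcOutputWeight / 1000
}

// initiatorCanAfford reports whether the initiator can pay for one more HTLC
// output while staying at or above the channel reserve (simplified model).
func initiatorCanAfford(balanceSat, reserveSat, feePerKw int64) bool {
	return balanceSat-extraCommitFeeSat(feePerKw) >= reserveSat
}

func main() {
	const balanceSat, reserveSat = 100_078, 100_000 // values from this issue
	for _, feePerKw := range []int64{269, 459, 460, 1000, 4000} {
		fmt.Printf("feePerKw=%d extraFee=%d sat affordable=%t\n",
			feePerKw, extraCommitFeeSat(feePerKw),
			initiatorCanAfford(balanceSat, reserveSat, feePerKw))
	}
}
```

With only 78 sat of headroom above the reserve, this model tips from affordable to unaffordable between 459 and 460 sat/kw, so modest fee rate changes can decide whether the channel is usable.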

The relationship between the fees paid, HTLC weight and the reserve mostly works on the spec level. However, there're some degenerate cases when a channel isn't able to be used due to the initiator not being able to pay fees. I think what happened is that they tried to send everything out of the channel, realized the reserve was there, then played around to send the most they could out, leaving that amount.

Returning to the line you linked above: https://github.com/lightningnetwork/lnd/blob/master/lnwallet/channel.go#L7030-L7035

Since theirBalance < htlcCommitFee is true, they don't have enough funds to pay the fee for the HTLC. The function only gets down here if you aren't the initiator. We need to ensure we don't route across this channel, because only hard fails are possible in the protocol when a constraint is violated (you can't "undo" the HTLC). ourBalance >= nonDustHtlcAmt is also true, but we don't want to allow a non-dust HTLC here (the channel would basically force close), so we say we have one less than nonDustHtlcAmt. This then prevents the HTLC from being routed in general.
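The clamping described here can be sketched in Go as follows. This is a simplified paraphrase, not lnd's code: the function names are hypothetical, and treating nonDustHtlcAmt as "dust limit plus the fee of the second-level HTLC-timeout transaction" (663 weight units for legacy channels per BOLT 3, zero fee for zero-fee anchor HTLCs) is an assumption that is not spelled out in this thread. The 4000 sat/kw fee rate is likewise hypothetical, chosen only because it reproduces the most common logged value; the real rate at the time is not given.

```go
package main

import "fmt"

// htlcTimeoutWeight is the weight of a legacy (non-anchor) HTLC-timeout
// transaction per BOLT 3.
const htlcTimeoutWeight = 663

// htlcTimeoutFeeSat returns the second-level HTLC fee in sat. For zero-fee
// anchor HTLCs the fee is zero (assumption, see lead-in).
func htlcTimeoutFeeSat(feePerKw int64, zeroFeeHtlc bool) int64 {
	if zeroFeeHtlc {
		return 0
	}
	return feePerKw * htlcTimeoutWeight / 1000
}

// clampedBandwidthMsat models the edge case: if the remote initiator cannot
// pay the commitment fee for another HTLC, report one msat less than the
// smallest non-dust HTLC so that only dust HTLCs can still be sent.
func clampedBandwidthMsat(ourBalanceMsat, theirBalanceMsat, htlcCommitFeeMsat,
	dustLimitSat, timeoutFeeSat int64) int64 {

	nonDustHtlcAmtMsat := (dustLimitSat + timeoutFeeSat) * 1000
	if theirBalanceMsat < htlcCommitFeeMsat && ourBalanceMsat >= nonDustHtlcAmtMsat {
		return nonDustHtlcAmtMsat - 1
	}
	return ourBalanceMsat
}

func main() {
	const ourBalanceMsat = 9_797_025_000 // local balance minus reserve
	const theirBalanceMsat = 78_000      // 78 sat above the reserve
	const htlcCommitFeeMsat = 688_000    // 172 weight units at 4000 sat/kw

	legacy := clampedBandwidthMsat(ourBalanceMsat, theirBalanceMsat,
		htlcCommitFeeMsat, 354, htlcTimeoutFeeSat(4000, false))
	anchor := clampedBandwidthMsat(ourBalanceMsat, theirBalanceMsat,
		htlcCommitFeeMsat, 354, htlcTimeoutFeeSat(4000, true))
	fmt.Println("legacy channel:", legacy, "msat")
	fmt.Println("zero-fee anchor channel:", anchor, "msat")
}
```

Under these assumptions, a 354 sat dust limit yields 3,005,999 msat for the legacy case and 353,999 msat for the zero-fee anchor case, both of which appear among the values reported in this thread. Every clamped result ends in 999 msat because it is a whole-satoshi amount minus one msat.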

There is a way to get around this though: send a small amount continually back to yourself as a rebalance. If the amount is below dust, then no HTLC is added, so no extra fees are required.

Prior spec discussion related to this issue can be found here: lightning/bolts#728.

@Roasbeef
Member

Roasbeef commented Nov 4, 2022

We added that logic to fix issues like this: #3787. In this case, the router would keep trying to send across that link, but it can't support it due to fees. Now we'll just ignore that and try another link. There were also prior cases of things like the reserve being violated, an output taking too many fees and becoming unspendable ("negative" balance).

@U-got-IT

U-got-IT commented Nov 4, 2022

If I may chime in. I am running the node on the other side and initiated the channel a few weeks ago.
I am running Core Lightning 0.12.1, not LND. When opening the channel I used default values for everything and didn't do anything manually ever since other than automatic fee adjustment depending on imbalance via a plugin. Over time all the liquidity shifted over to C-Otto's side through normal, unattended routing activity. If this can lead to the situation at hand this seems to be a rather severe incompatibility. Please correct me if I'm wrong.

@C-Otto
Contributor Author

C-Otto commented Nov 4, 2022

Thanks a lot for the explanation. I think the discussion in lightning/bolts#728 is really helpful, and @Roasbeef's comment above puts this into the context of lnd's code. Personally, I think that this issue could be closed (see below), as it's a weird edge case without an obvious solution.

The only two things missing:

  • Why can I send out payments myself, if routing is forbidden? I'm using SendToRoute where the first hop (the problematic one to @U-got-IT) is fixed. I just sent out 10,000sat, which worked out just fine. Yesterday it worked with 50,000sat. I'd be happy to raise a follow-up issue.
  • I'll add an INFO log message to describe this edge case, so that future me doesn't need to raise yet another issue: lnwallet: add log message for edge case #7115

@U-got-IT please switch to anchor channels, if possible. If not, please campaign for it to be supported.

C-Otto added a commit to C-Otto/lnd that referenced this issue Nov 4, 2022
@C-Otto
Contributor Author

C-Otto commented Nov 6, 2022

I now have the same issue with at least five (!) other peers, all of them are anchor channels, though! I managed to send out 10k sat via rebalance-lnd in all cases.

Other users are affected, too. I received one report via Twitter and two via Telegram. In all cases, the situation matches mine (low remote balance, log message indicating x999msat for values <500sat even though there's far more on the local side).

@Roasbeef
Copy link
Member

Roasbeef commented Nov 7, 2022

I just sent out 10,000sat, which worked out just fine. Yesterday it worked with 50,000sat. I'd be happy to raise a follow-up issue.

Your ability to send is a function of the fee rate on the commitment transaction, which is controlled by the initiator. If the fees are low, you might be able to send, otherwise you might not be able to. I think the scenario laid out above is pretty clear, as is the workaround (trickle payments to unstick the channel).

@C-Otto
Contributor Author

C-Otto commented Nov 7, 2022

Isn't that fee rate the same for forwarded transactions? I don't see why forwards are blocked, while payments initiated by me are OK.

@Roasbeef
Member

Were the forwards and the initiated payment of the same amount?

Re the fee rate: I mean that if they were done at different times, then the applied fee rate would differ.

On the code level, both forwards and local adds go through pretty much the same code pathway:

lnd/htlcswitch/switch.go

Lines 538 to 558 in f6ae63f

	// Attempt to fetch the target link before creating a circuit so that
	// we don't leave dangling circuits. The getLocalLink method does not
	// require the circuit variable to be set on the *htlcPacket.
	link, linkErr := s.getLocalLink(packet, htlc)
	if linkErr != nil {
		// Notify the htlc notifier of a link failure on our outgoing
		// link. Incoming timelock/amount values are not set because
		// they are not present for local sends.
		s.cfg.HtlcNotifier.NotifyLinkFailEvent(
			newHtlcKey(packet),
			HtlcInfo{
				OutgoingTimeLock: htlc.Expiry,
				OutgoingAmt:      htlc.Amount,
			},
			HtlcEventTypeSend,
			linkErr,
			false,
		)
		return linkErr
	}

Which ends up calling AvailableBandwidth, which'll reject conditionally based on the rules mentioned above.

@C-Otto
Contributor Author

C-Otto commented Nov 12, 2022

The forwards had larger amounts, my initiated payments (rebalances) had 5k or 10k sat.

I just ran a test two minutes after an observed failure:
Failure at 01:53:17 (requested 17k sat).
At 01:55: Remote balance 53sat + reserve. Successfully sent payment (5k sat).

I don't think network conditions changed a lot in this timeframe (although one new block got mined in between). I'll try to do another test with less delay (no new block), and using a similar/higher amount.

FYI, here's a list of remote-initiated channels that have less than 500sat on the remote side (updated every 30min): https://c-otto.de/lnd/low-remote.txt

@fotongit

Hello, foton node here. I am seeing the same behavior described here on v0.15.4. I first noticed it with one node (Mushi), but on closer review it is happening with some others as well. I get HTLC errors with Insufficient Balance even though the liquidity is on my side. The weird part is that I get these messages (in the last 24h I received 55 HTLC errors to Mushi), yet I also had 8 successful transactions with this node.

All the problematic channels have all the liquidity on my side.

Best,

C-Otto added a commit to C-Otto/lnd that referenced this issue Jan 16, 2023
@nayuta-ueno
Contributor

I have the same issue.
HTLC manager is logging bandwidth when LND starts up, and it suddenly becomes a small value.

[INF] HSWC: ChannelLink(0be39eb940989bf68726d08e57adc1b80124b4df6691db9494df691e1a718c95:0): HTLC manager started, bandwidth=353999 mSAT

It has occurred several times, usually at "353999 mSAT".

@C-Otto
Contributor Author

C-Otto commented May 23, 2023

Superseded by #7721.

@C-Otto C-Otto closed this as completed May 23, 2023
@dskvr

dskvr commented Nov 18, 2023

~10% of my channels are presently unusable because of this bug; in other terms, 60% of my local (outgoing) liquidity is frozen and cannot be rebalanced, forwarded, or sent. The issue that supersedes this one appears to be an inversion of this bug, for low local balances. I have high local balances (98%+) and receive this error, so I don't believe that this issue has been resolved.

0.17.0-beta
