not connected to peer after disconnect #616
Comments
Maybe add logs @offerm |
Not really needed.
This is not a bug - we don't [YET] have the code that is supposed to do it
(IMHO).
Log only shows the disconnects and that is it.
…On Tue, Oct 30, 2018 at 10:34 AM Kilian Rausch ⚡️ ***@***.***> wrote:
Maybe add logs @offerm <https://github.com/offerm>
|
@kilrau #392 is for the initial connection retries, not for retries after a disconnection. We talked before about reconnecting after a disconnection, but decided not to do so. This specific disconnection happens because the peer is stalling its ping/pong responses. It doesn't make sense to define reconnection from the side which closed the connection. Instead we can check But the other side might close the connection as well, and we won't know about it until we're back online. In this case reconnecting makes sense. To implement this, we need to define specific cases for reconnecting after the peer closed the connection, because in some cases it can be considered DoS. To do this, we first need to implement #152. |
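Would the idea then be to attempt a reconnection after receiving a `DISCONNECTING` packet? And I'm guessing the packet should specify that the reason for the disconnection is because our node became unresponsive? |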
Kind of reinventing the wheel here, no?
See how this started: I had a wifi issue, not a disconnect by a remote peer.
IMHO, if I connected to a peer and got disconnected, there should be a simple
retry mechanism that tries to reconnect, similar to what we do when LND
gets disconnected.
Now, if the connection was rejected (reject packet) by a peer, there is no
need to try to reconnect.
IMHO, peer rejection and peer banning have some time... wifi problems and
even closing my laptop so it goes to `sleep` are here already.
…On Wed, Oct 31, 2018 at 4:14 PM Daniel McNally ***@***.***> wrote:
Would the idea then be to attempt a reconnection after receiving a
DISCONNECTING packet? And I'm guessing the packet should specify that the
reason for the disconnection is because our node became unresponsive?
|
I'm not 100% sure but as I understand it the TCP socket will not necessarily close if you lose wifi connectivity temporarily, but your peers will see that you have become unresponsive and disconnect from you. So the actual socket connection would be closed by the peer in this case. Correct me if I'm wrong. |
The TCP socket will close when the TCP layer notices the problem. If you have a
short drop that is recovered quickly, TCP may recover without the peers noticing.
If TCP detected the problem and failed to recover, the peer will not be able
to recover either.
So, when we see the error at the application level, it is a done deal.
…On Wed, Oct 31, 2018 at 4:54 PM Daniel McNally ***@***.***> wrote:
I'm not 100% sure but as I understand it the TCP socket will not
necessarily close if you lose wifi connectivity temporarily, but your peers
will see that you have become unresponsive and disconnect from you. So the
actual socket connection would be closed by the peer in this case. Correct
me if I'm wrong.
|
Thinking about this some more, your node would likewise see that pings are failing and would disconnect from the peer even if it is technically you that is having connectivity issues. So I'm thinking we wouldn't receive a `DISCONNECTING` packet in either case. My question is, how are we going to tell the difference between ourselves going temporarily offline and a peer going offline? We could also attempt pings to google.com or something, but that wouldn't always work if we are on a LAN or something. If we detect that we've lost connectivity and then regained it, we can just attempt to reconnect to all nodes like we do on startup. Another approach would be to just apply the same retry logic we use during the initial connection to a node anytime we disconnect from a node due to loss of connectivity and packets timing out. |
Which disconnect packet do you expect to receive?
When you disconnect it is final.
No need to look for the reason.
Just attempt to reconnect with the peer (if you created the connection, or
if you know how to connect to it, i.e. not using the temporary port).
Just as LND is doing :-)
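
To make the rule above concrete, here is a minimal sketch of the "can we redial this peer?" decision. It is only an illustration in Python, not code from this project: `PeerInfo`, `listen_address` and `reconnect_target` are hypothetical names, and it assumes we remember whether the peer dialed us and whether it advertised an address it listens on.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PeerInfo:
    """Hypothetical record of what we know about a peer after a disconnect."""
    inbound: bool                          # True if the peer dialed us
    listen_address: Optional[str] = None   # address the peer accepts connections on, if known


def reconnect_target(peer: PeerInfo, remote_address: str) -> Optional[str]:
    """Return the address to redial, or None if we have nothing usable."""
    if peer.listen_address is not None:
        # The peer told us where it listens, so use that.
        return peer.listen_address
    if not peer.inbound:
        # We created the connection, so the address we dialed is a listening address.
        return remote_address
    # Inbound peer with no advertised address: remote_address only holds the
    # peer's temporary source port, which cannot be dialed.
    return None
```

With this shape, an outbound peer is always redialed at the address we used, while an inbound peer is redialed only if it advertised a listening address.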
…On Wed, Oct 31, 2018 at 5:29 PM Daniel McNally ***@***.***> wrote:
Thinking about this some more, your node would likewise see that pings are
failing and would disconnect from the peer even if it is technically you
that are having connectivity issues. So I'm thinking we wouldn't receive a
DISCONNECTING packet in either case.
My question is, how are we going to tell the difference between ourselves
going temporarily offline and a peer going offline? We can also attempt
pings to google.com or something , but that wouldn't always work if we
are on a LAN or something.
If we detect that we've lost connectivity and then regained it, we can
just attempt to reconnect to all nodes like we do on startup.
Another approach would be to just apply the same retry logic we use during
initial connection to a node anytime we disconnect to a node due to loss of
connectivity and packets timing out.
|
We don't have a disconnection packet currently, it's just an issue/idea. The reason is that if the peer intentionally disconnected from you, it's not really related to this topic. I'd be fine with automatically attempting to reconnect if we have outgoing connection information; that seems like the simplest solution. |
@offerm @sangaman
correct, and in this case you'll probably get the PING/DISCONNECTING packets & socket close event when you're back online. but i'm not sure we can rely on that. |
@moshababo The only time when we are not trying to reconnect is after we got this We should also use that packet during the handshake if the peer is banned. |
Anything that speaks against this? @moshababo @sangaman |
@offerm Is there something in what you're suggesting which differs from what's specified in #152, besides the latter being more general-purpose? So the idea is:
|
I don't see a need for a stalling packet. Just close the connection and retry. I would try to reconnect also after the last point. The time between reconnect attempts should reset only after a successful handshake. I wouldn't implement the #152 packet until the situation is stable. |
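
The reset rule above (the delay between attempts goes back to its starting value only after a successful handshake) could be sketched as follows. This is a hedged illustration in Python, not this project's retry code: the class name, the exponential growth, and the default delays are all assumptions made for the example.

```python
class ReconnectBackoff:
    """Tracks the delay before the next reconnect attempt to one peer.

    All defaults are hypothetical values chosen for illustration.
    """

    def __init__(self, initial_delay: float = 5.0, max_delay: float = 3600.0,
                 factor: float = 2.0) -> None:
        self.initial_delay = initial_delay
        self.max_delay = max_delay
        self.factor = factor
        self._next_delay = initial_delay

    def on_attempt_failed(self) -> float:
        """Return the delay to wait before the next attempt, then grow it."""
        delay = self._next_delay
        self._next_delay = min(self._next_delay * self.factor, self.max_delay)
        return delay

    def on_handshake_succeeded(self) -> None:
        """Reset the delay. Call this after a completed handshake only,
        not after a bare TCP connect."""
        self._next_delay = self.initial_delay
```

Resetting on handshake rather than on TCP connect means a peer that accepts the socket but then drops us keeps seeing growing delays instead of a tight reconnect loop.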
Let's assume we are still in @offerm's scenario from above. His laptop is connected to one of our cloud servers. Wifi issues, connection breaks at the TCP level. When he comes back online, I am having a hard time imagining how he would even get a #152 packet from his peers if the connection was closed at the TCP level; we cannot rely on it still being open, as you wrote above. The stalling packet makes sense for some advanced use cases like "I'm going to ban you now because of reason XYZ" as described in #152, but not here I think. So I agree with @offerm to wait with #152. In short:
That's exactly the situation from a peer's point of view when I lose my wifi connection. It's perfectly fine to do nothing here because, as described above, I will initiate the reconnect, which is much more efficient. |
Can you clarify this? My assumption is that we don't know our network connectivity status, so if the socket got closed, I don't know whether it's because I was offline and stalling. If it's because I got banned and I try to reconnect, that's a problem because we'll get a feedback loop. So without the #152 packet, I think it's reasonable to implement only the first bullet from my suggestion:
I think this would be fine, as we are expected to detect the peer stalling its responses before the TCP socket times out. After closing the socket and reconnecting, we might still be offline, but connection retries will trigger as well. |
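
As a rough illustration of detecting a stalling peer at the application level, before TCP itself gives up, here is a small Python sketch. The `PingMonitor` name and the 30-second timeout are assumptions for the example, not values from this project; the caller passes in timestamps so the logic stays independent of any particular clock or event loop.

```python
from typing import Optional


class PingMonitor:
    """Flags a peer as stalling when a ping stays unanswered too long."""

    def __init__(self, pong_timeout: float = 30.0) -> None:
        self.pong_timeout = pong_timeout            # seconds; hypothetical default
        self._pending_since: Optional[float] = None  # time of the oldest unanswered ping

    def on_ping_sent(self, now: float) -> None:
        # Keep the timestamp of the oldest unanswered ping, so that later
        # pings do not hide an earlier one that never got a pong.
        if self._pending_since is None:
            self._pending_since = now

    def on_pong_received(self) -> None:
        self._pending_since = None

    def is_stalling(self, now: float) -> bool:
        return (self._pending_since is not None
                and now - self._pending_since > self.pong_timeout)
```

When `is_stalling` returns true, the side that notices would close the socket and hand the peer over to whatever connection retry logic is in place.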
I agree on the necessity of #152 to implement this properly. Since handling TCP-layer connectivity issues isn't a focus currently, I suggest doing the dependency #152 first and then picking up this one in the next milestone. In the meantime we try to find a consensus on what we expect from the implementation of this issue. Summary by @moshababo with some fixes:
EDIT: some of the more complicated #152 behavior moved to new issue #693 |
For banning a node solely for outgoing connections, we can simply delete the node from our repository. But if we want to keep it there for historical data, we need to add a new banning mode. We can look into it before implementing though. |
My node was up, connected to 3 peers, and had peer orders.
Wifi went down for 3 minutes, then came back up.
Peers are missing (no reconnect).