Fix utp outgoing FIN #7232
Conversation
@thrnz ping if you're interested in reviewing
@arvidn
While fixing this, I don't think I saw any misbehavior that could cause a crash or use-after-free. Just issues at the uTP protocol level.
I'm not too familiar with how the upper layer interacts with the uTP layer - the recent uTP performance improvements were mostly at the uTP level itself - but the reasoning behind these changes seems to make sense. I can't spot anything obviously wrong. I might try throwing some live torrents at this later just to make sure there's no weirdness going on.
I feel compelled to mention a scenario I've been painfully experiencing for a long time that may tie into this. At the least, it should be one of the tests done against this.
#3542 was reportedly fixed by @thrnz's recent set of uTP patches a little while ago.
Prior to 2.0.8, uTP streams weren't closing cleanly and were instead timing out. That might help explain the issue of lingering libtorrent-based uTP peers, and should now hopefully be fixed, though it doesn't explain those other issues. I know of one more issue related to #3542 that probably still needs addressing - it helps explain incoming packet loss on Windows 7 in particular - though I would be surprised if it resolved everything.
The long-lived uTP seed-to-seed connections have another issue, with traffic (unknown to me!) about 5 seconds after the connection is made. TCP seed-to-seed connections typically disconnect in under a second. Cleaning up the closing may help prevent a 30+ second delay, but not a 4+ second one.
I saw a test of uTP traffic both to and from Windows and Linux, and the results suggest only the source needed to be a Windows PC to produce the extremely high packet loss and the low artificial per-peer speed limit. Even Windows 10 (or 11?) probably exhibits the same symptoms, but I haven't bothered to check. The family PC has Windows 10 on it, and the other family members may not appreciate me doing that level of testing on it! Oh, have the 2.0.8 changes been back-ported to libtorrent v1.x as well?
force-pushed from 1ba55da to 1cc49dc
force-pushed from 69c51c1 to b69a736
I believe this patch is pretty solid now. Has anyone else tested it?
I've tried it out with live torrents as of the most recent push without noticing any obvious problems, though I guess any FIN issues might not be that noticeable to begin with. Is the
How do you see/check how many re-transmits are done? Maybe adding a debug/logged counter of those could help?
good point. I'll take a look at this. I should probably restore that behavior.
…m. Don't automatically respond with a FIN to an incoming FIN. Treat the incoming and outgoing streams as independent
… that we're closing. force send nagle packet and finally send FIN once all data has been sent. don't send FIN until the nagle packet has been force sent
force-pushed from b69a736 to a2100e4
```cpp
if (m_out_eof
    && !m_nagle_packet
    && m_write_buffer_size == 0
    && state() == state_t::connected
```
When `send_fin()` fails due to stalling, it still progresses to `state_t::fin_sent`. Should this check be `state() == state_t::fin_sent` (as it was prior to this patch) in order to pick it up?
good point. Maybe the state shouldn't even be transitioned when the socket is stalled and the FIN fails to send.
it's not obvious to me that checking the socket state here is important either.
This bit might actually be able to be dropped entirely. If sending the FIN packet stalls, it gets marked for resend and ends up in `m_outbuf`, and I think it should get resent in one of the previous `send_pkt()` calls above.

It looks like the second commit in #7153 might have been a bit wonky and tried to address something that was already solved by its first commit, so the logic here no longer seems to make sense.
yeah. I hesitate to drop this until there are proper unit tests for these cases.
If I'm understanding it correctly, the old conditions would never be met, so functionally nothing was happening here anyway. When `state() == state_t::fin_sent`, `m_outbuf.at(m_seq_nr)` would be the FIN packet. Once it's ACK'd and removed from `m_outbuf`, we'll no longer be in the `fin_sent` state. The assumption was that a stalled FIN packet wouldn't have made its way into `m_outbuf` (which now happens) and could be detected here and re-sent - instead it now gets resent in the `send_pkt()` calls above, along with any other packets needing to be resent.

I'm not sure what effect the new conditions here, using the `connected` state instead of the `fin_sent` state, might have.

Reproducing stalled FIN packets in a test would be ideal.
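To make the invariant described above concrete, here is a minimal sketch (with illustrative stand-ins, not libtorrent's actual types or members) of why a stalled FIN that lands in the outgoing buffer gets picked up by the ordinary resend path, without a separate `fin_sent` check:

```cpp
#include <cstdint>
#include <map>

// illustrative stand-ins, not libtorrent's actual types or members
struct packet { bool is_fin; bool need_resend; };

struct sketch_socket
{
    std::map<std::uint16_t, packet> m_outbuf; // un-acked packets by seq nr

    // stand-in for writing to the UDP socket; returns false when stalled
    bool send_to_wire(packet const&) { return false; } // stub: always stalled

    void send_fin(std::uint16_t const seq_nr)
    {
        packet fin{true, false};
        if (!send_to_wire(fin))
            fin.need_resend = true; // stalled: just mark it for resend
        // either way the FIN is tracked until it's ACKed, so the normal
        // resend pass below retries it; no separate FIN check is needed
        m_outbuf.emplace(seq_nr, fin);
    }

    // the ordinary resend pass (what the send_pkt() calls above amount to)
    void resend_pending()
    {
        for (auto& [seq_nr, pkt] : m_outbuf)
            if (pkt.need_resend && send_to_wire(pkt))
                pkt.need_resend = false;
    }
};
```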
I think the next step in actually making this uTP implementation robust would be to add unit tests (as opposed to the existing end-to-end simulations), where exact packets and their timings are crafted and injected by the test itself. The incoming-packet callback and the send-packet callback could already be hooked into by a test; the main missing interface is the clock, which the utp code currently calls into directly. Perhaps the timestamp could be passed in to the incoming packet callback.
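A minimal sketch of the kind of clock injection suggested above (the `utp_clock` interface and these names are hypothetical, not an existing libtorrent API):

```cpp
#include <chrono>

// hypothetical interface; not an existing libtorrent API
struct utp_clock
{
    virtual std::chrono::steady_clock::time_point now() const = 0;
    virtual ~utp_clock() = default;
};

// production implementation: just forwards to the real clock
struct real_clock final : utp_clock
{
    std::chrono::steady_clock::time_point now() const override
    { return std::chrono::steady_clock::now(); }
};

// test implementation: time only moves when the test says so, making
// timeouts and packet timings fully deterministic
struct test_clock final : utp_clock
{
    std::chrono::steady_clock::time_point m_now{};
    std::chrono::steady_clock::time_point now() const override { return m_now; }
    void advance(std::chrono::milliseconds const d) { m_now += d; }
};
```

Alternatively, as suggested above, the timestamp could simply be a parameter of the incoming-packet callback, avoiding a virtual interface entirely.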
There's a counter for packet resends already. I wonder if adding a 'utp stall' counter might be worthwhile. It could be handy for picking up too-shallow buffers.
Sounds like a good idea. Is this resend counter visible in the app, or only with the debugger?
There's some more info here about the available counters and gauges and how to access them.
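For illustration, a sketch of reading such a counter through the documented session-stats API (assuming libtorrent 2.0; the metric name `utp.utp_packet_resend` is taken from the stats documentation):

```cpp
#include <libtorrent/session.hpp>
#include <libtorrent/session_stats.hpp>
#include <libtorrent/alert_types.hpp>

#include <cstdio>
#include <vector>

void print_utp_resends(lt::session& ses)
{
    // resolve the metric's index in the counters array; -1 if unknown
    int const idx = lt::find_metric_idx("utp.utp_packet_resend");
    if (idx < 0) return;

    // ask the session to post a session_stats_alert
    ses.post_session_stats();

    // in real code you would wait for the alert; this just drains the queue
    std::vector<lt::alert*> alerts;
    ses.pop_alerts(&alerts);
    for (lt::alert* a : alerts)
    {
        if (auto const* st = lt::alert_cast<lt::session_stats_alert>(a))
            std::printf("utp packet resends: %lld\n"
                , static_cast<long long>(st->counters()[idx]));
    }
}
```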
This is a series of uTP patches cleaning up the handling of closing the streams. It addresses this comment: #7162 (comment)
The correct behavior when the other end sends a `FIN` packet is to record that the stream has been closed, and at which sequence number it was closed. Then deliver any undelivered payload to the upper layer and finally deliver the `eof` error. The uTP layer should not respond with a `FIN` to close down the outgoing direction until the upper layer says to do so. In the case of bittorrent, this will cause the upper layer to close its end of the connection as well.
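As an illustration of that ordering, a simplified sketch of the described incoming-FIN handling (the names are illustrative stand-ins, not libtorrent's actual members):

```cpp
#include <cstdint>

namespace sketch {

// illustrative stand-ins, not libtorrent's actual members
bool m_in_eof = false;
std::uint16_t m_in_eof_seq_nr = 0;

void deliver_buffered_payload() {} // hand any pending bytes up
void notify_upper_layer_eof() {}   // surface the eof error

void on_incoming_fin(std::uint16_t const fin_seq_nr)
{
    m_in_eof = true;              // incoming stream is now closed...
    m_in_eof_seq_nr = fin_seq_nr; // ...at this sequence number

    deliver_buffered_payload();   // deliver undelivered payload first
    notify_upper_layer_eof();     // only then deliver the eof error
    // note: no FIN is sent here; that waits for the upper layer to
    // close its end of the connection
}

} // namespace sketch
```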
These patches are separated to be somewhat independent. The last commit is the main one that makes the uTP layer not respond with a `FIN`, but instead lets the upper layer close the socket before doing so. With these patches, the socket is kept alive until un-acked data has been acked, and is kept open for one time-out period in order to ACK the other end's `FIN` packet as well, even after the socket has been closed by the upper layer.

The rules that have been made stricter are: