-
Notifications
You must be signed in to change notification settings - Fork 997
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
muxers/yamux: Mitigation of unnecessary stream drops #3071
Changes from 3 commits
780da3f
bfc9c4d
2932e88
9c8cfd9
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -31,6 +31,7 @@ use futures::{ | |
use libp2p_core::muxing::{StreamMuxer, StreamMuxerEvent}; | ||
use libp2p_core::upgrade::{InboundUpgrade, OutboundUpgrade, UpgradeInfo}; | ||
use std::collections::VecDeque; | ||
use std::task::Waker; | ||
use std::{ | ||
fmt, io, iter, mem, | ||
pin::Pin, | ||
|
@@ -55,6 +56,8 @@ pub struct Yamux<S> { | |
/// This buffer stores inbound streams that are created whilst [`StreamMuxer::poll`] is called. | ||
/// Once the buffer is full, new inbound streams are dropped. | ||
inbound_stream_buffer: VecDeque<yamux::Stream>, | ||
/// Waker to be called when new inbound streams are available. | ||
inbound_stream_waker: Option<Waker>, | ||
} | ||
|
||
const MAX_BUFFERED_INBOUND_STREAMS: usize = 25; | ||
|
@@ -81,6 +84,7 @@ where | |
}, | ||
control: ctrl, | ||
inbound_stream_buffer: VecDeque::default(), | ||
inbound_stream_waker: None, | ||
} | ||
} | ||
} | ||
|
@@ -101,6 +105,7 @@ where | |
}, | ||
control: ctrl, | ||
inbound_stream_buffer: VecDeque::default(), | ||
inbound_stream_waker: None, | ||
} | ||
} | ||
} | ||
|
@@ -122,6 +127,8 @@ where | |
return Poll::Ready(Ok(stream)); | ||
} | ||
|
||
self.inbound_stream_waker = Some(cx.waker().clone()); | ||
|
||
self.poll_inner(cx) | ||
} | ||
|
||
|
@@ -140,17 +147,22 @@ where | |
) -> Poll<Result<StreamMuxerEvent, Self::Error>> { | ||
let this = self.get_mut(); | ||
|
||
loop { | ||
let inbound_stream = ready!(this.poll_inner(cx))?; | ||
|
||
if this.inbound_stream_buffer.len() >= MAX_BUFFERED_INBOUND_STREAMS { | ||
log::warn!("dropping {inbound_stream} because buffer is full"); | ||
drop(inbound_stream); | ||
continue; | ||
} | ||
let inbound_stream = ready!(this.poll_inner(cx))?; | ||
|
||
if this.inbound_stream_buffer.len() >= MAX_BUFFERED_INBOUND_STREAMS { | ||
log::warn!("dropping {inbound_stream} because buffer is full"); | ||
drop(inbound_stream); | ||
} else { | ||
this.inbound_stream_buffer.push_back(inbound_stream); | ||
|
||
if let Some(waker) = this.inbound_stream_waker.take() { | ||
waker.wake() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The discussion above makes me think: is this muxer polled from multiple tasks? If it is, then There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. At least in |
||
} | ||
} | ||
|
||
// Schedule an immediate wake-up, allowing other code to run. | ||
cx.waker().wake_by_ref(); | ||
Poll::Pending | ||
} | ||
|
||
fn poll_close(mut self: Pin<&mut Self>, c: &mut Context<'_>) -> Poll<YamuxResult<()>> { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The
poll
contract says that a waker needs to be registered in casePoll::Pending
is returned. While perhaps this unconditional registration may be legal, I think it still violates expectations.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This registration is not unconditional. It is conditional on the list being empty! The list no longer being empty is IMO a valid reason to wake the task that last polled here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, are you saying that
self.poll_inner
will returnPoll::Pending
? That would be fine, then.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OTOH: even if the inner stream does return
Poll::Pending
, it will have registered that same waker already, and it will thus wake it when new streams become available. What am I missing?In general, the Rust async rules are thus: if you create a
Poll::Pending
, then you’re responsible for waking a waker. Otherwise you just pass along poll results.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not quite.
This
poll
function essentially should we polled again in two different scenarios:poll_inner
)Registering this waker here isn't strictly necessary because we poll the
StreamMuxer
inswarm::Connection
in a loop anyway. But I think it is more correct to still do this here because it showcases that there are two conditions on which the task should be polled again.Like I said, it probably works without too because we will always be implicitly woken again, even if the task that calls the general
poll
consumes everything from the inner socket already and pushed them to the buffer.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with this. I don't think the inner working of
impl StreamMuxer for Yamux
should make assumptions on how it is called inConnection
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making things fail due to a missed wake-up would require polling this task through
Arc<Mutex<...>>
, which I hope nobody would ever consider doing, so I agree that this is harmless.This discussion is yet another spotlight on how difficult it is to correctly reason about
poll
functions. This is exacerbated within libp2p by extending the concept — like is being done here — in non-trivial ways. If there actually was some surrounding task that did care about being woken for two different reasons, it would have to manufacture its own wakers (like FuturesUnordered does). But even that usage would be broken because the Waker destined forpoll_inbound
may be passed topoll_inner
as well, overwriting a previously registered Waker that was destined for that purpose.While debugging wake-up loops is painful, it is orders of magnitude easier than debugging missed wake-ups. May I suggest that we adhere to the policy that each async object offers exactly one
poll
function, with Future semantics, that drives exactly (i.e. only and completely) the state machine of that object? Interrogating the state machine should not havepoll
semantics, because that can lead to confusing behaviour and bugs.The following pattern is what I suggest:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The design of this trait was inspired by the
AsyncWrite
design which also has multiplepoll_
functions that users need to drive.Sink
is similar.One problem with a single "poll" function design is that it puts more burden on implementations. For example, opening a new stream is not instantaneous, it may require negotiation of new credits with the other party. As such, a "new_outbound" function can only give you an ID or some other kind of handle for a new stream. This means every implementation needs to implement some kind of "stream ID" management. In contrast to that a
poll_new_outbound
function can just returnPending
until the new stream is ready to be used.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I never found Sink usable for anything I had to do, so its design carries little weight in my opinion.
Offering multiple “poll” functions to poll the same underlying thing has severe issues, as I argued above — and which you so far have not commented on. The example of a “new_outbound” function boils down to the choice of polling the machinery until the new connection is ready, ignoring everything else that happens in the meantime. This already requires the machinery to aggregate its output and let the poller inspect it, for which there is no reason a priori to offer a
poll
-shaped API. In particular, noContext
is necessary to ask whether events have been emitted, which removes one prolific source of confusion inherent to Rust’s Task design.So my solution to the
new_outbound
problem would be to offer a front-end Future that polls until the new connection is ready and leaves all other side-effects uninspected, to be dealt with by the caller afterwards.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is basically the
poll_ready
&start_send
API ofSink
then, yes?poll_outbound
is not the only issue. The problem of having to buffer streams is mostly becauseyamux
doesn't allow us to backpressure the number of streams. The QUIC muxer on the other hand allows us to make progress on the connection itself without necessarily accepting new inbound streams. I am happy to change the API for something better but so far I've not found a solution where the caller (swarm::Connection
) can explicitly signal to the muxer that it is now able to take more inbound streams.We could move away from
poll_inbound
by having just apop_inbound
function. This however then requires more documentation on when the caller should call this function again if it ever returnsNone
. At that stage, we are just re-inventing the wheel when we could also be usingPoll
and automatically wake the task when we know that there are newinbound
streams available.