Channel lacks high watermark flow control #392
Labels: bug
Any chance of having that fixed in the next release? This happens to be the last showstopper problem for me :(
EpicEric added a commit to EpicEric/russh that referenced this issue on Dec 7, 2024:
Fixes Eugeny#392.

Simply put, this replaces the `UnboundedReceiver` from `unbounded_channel` with a `Receiver` from `channel(...)`, using a hard-coded `CHANNEL_BUFFER_SIZE` of 100. This means that each channel created can hold up to 100 `Msg` objects, each at most as big as the window size plus metadata/indirection pointers. This is enough to force messages to be read from the underlying `TcpStream` only as quickly as the server/client is able to handle them.

This has been tested with the example given in Eugeny#392, including a modification to slowly read into a non-expanding buffer, and appears to fix the issue that was pointed out.

Some other considerations:

- It is still possible to make memory usage explode by opening multiple channels, although that might be intentional on the user's part. In that case, it's up to the user to decide when channels get created.
- It might be desirable to let users configure how big the `mpsc::channel`s can be, instead of using the hardcoded value. It should be simple to modify the code to support this.
- The limited buffering also means that control `Msg`s (e.g. `Eof`) will be delayed by the transport-layer backpressure, although this might be expected by the SSH standard. Reading through RFC 4254, I couldn't find any mention of non-`Data`/`ExtendedData` messages having to be handled differently, and I don't think it's even possible to do so over TCP. If this assumption is wrong, a separate `mpsc::unbounded_channel` just for control messages might be required.

**BREAKING CHANGE**: This removes `fn send_channel_msg`, which doesn't seem to be used anywhere in this library, but is part of the publicly exposed API nonetheless.
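The mechanism behind this fix can be illustrated with the standard library's bounded channel, `std::sync::mpsc::sync_channel`, which is the synchronous analogue of tokio's `mpsc::channel`. This is a standalone sketch of the backpressure idea, not russh code; the buffer size of 4 and message count of 16 are arbitrary:

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Producer fills a bounded channel; consumer drains it slowly.
// Returns (messages sent, messages received).
fn run() -> (usize, usize) {
    // A bounded channel holds at most 4 in-flight messages; `send` blocks
    // once the buffer is full -- the backpressure `unbounded_channel` lacks.
    let (tx, rx) = mpsc::sync_channel::<Vec<u8>>(4);

    let producer = thread::spawn(move || {
        for i in 0..16u8 {
            tx.send(vec![i; 1024]).unwrap(); // blocks while the buffer is full
        }
        16
    }); // `tx` is dropped here, so the receive loop below terminates

    // Slow consumer: the producer never runs more than a few messages ahead,
    // so memory use stays bounded no matter how large the input is.
    let mut received = 0;
    for _msg in rx {
        thread::sleep(Duration::from_millis(2));
        received += 1;
    }

    (producer.join().unwrap(), received)
}

fn main() {
    let (sent, received) = run();
    assert_eq!((sent, received), (16, 16));
    println!("sent {sent}, received {received}");
}
```

All messages still arrive; bounding the buffer only changes *when* the producer is allowed to make progress, not *what* is delivered.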
EpicEric added a commit to EpicEric/russh that referenced this issue on Dec 8, 2024:
Fixes Eugeny#392.

Simply put, this replaces the `UnboundedReceiver` from `unbounded_channel` with a `Receiver` from `channel(...)`, with a configurable `channel_buffer_size` defaulting to 100. This means that each channel created can hold up to 100 `Msg` objects, each at most as big as the window size plus metadata/indirection pointers. This is enough to force messages to be read from the underlying `TcpStream` only as quickly as the server/client is able to handle them.

This has been tested with the example given in Eugeny#392, including a modification to slowly read into a non-expanding buffer, and appears to fix the issue that was pointed out.

Some other considerations:

- It is still possible to make memory usage explode by opening multiple channels, although that might be intentional on the user's part. In that case, it's up to the user to decide when channels get created.
- The limited buffering also means that control `Msg`s (e.g. `Eof`) will be delayed by the transport-layer backpressure, although this might be expected by the SSH standard. Reading through RFC 4254, I couldn't find any mention of non-`Data`/`ExtendedData` messages having to be handled differently, and I don't think it's even possible to do so over TCP. If this assumption is wrong, a separate `mpsc::unbounded_channel` just for control messages might be required.

**BREAKING CHANGES**:

- This removes `fn send_channel_msg`, which doesn't seem to be used anywhere in this library, but is part of the publicly exposed API nonetheless.
- This adds the configuration value `channel_buffer_size` to both servers and clients, allowing users to set how big the buffer should be.
EpicEric added a commit to EpicEric/russh that referenced this issue on Dec 9, 2024:
Fixes Eugeny#392.

Simply put, this replaces the `UnboundedReceiver` from `unbounded_channel` with a `Receiver` from `channel(...)`, with a configurable `channel_buffer_size` defaulting to 100. This means that each channel created can hold up to 100 `Msg` objects, each at most as big as the window size plus metadata/indirection pointers. This is enough to force messages to be read from the underlying `TcpStream` only as quickly as the server/client is able to handle them.

This has been tested with the example given in Eugeny#392, including a modification to slowly read into a non-expanding buffer, and appears to fix the issue that was pointed out.

Some other considerations:

- It is still possible to make memory usage explode by opening multiple channels, although that might be intentional on the user's part. In that case, it's up to the user to decide when channels get created.
- The limited buffering also means that control `Msg`s (e.g. `Eof`) will be delayed by the transport-layer backpressure, although this might be expected by the SSH standard. Reading through RFC 4254, I couldn't find any mention of non-`Data`/`ExtendedData` messages having to be handled differently, and I don't think it's even possible to do so over TCP. If this assumption is wrong, a separate `mpsc::unbounded_channel` just for control messages might be required.

**BREAKING CHANGES**:

- This removes `fn send_channel_msg`, which doesn't seem to be used anywhere in this library, but is part of the publicly exposed API nonetheless.
- This adds the configuration value `channel_buffer_size` to both servers and clients, allowing users to set how big the buffer should be.
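If the configuration value lands as described in the commit message, usage might look like the sketch below. The field name `channel_buffer_size` is taken from the commit message; the exact config type and path are assumptions, not verified against a released russh API:

```rust
// Hypothetical sketch -- `channel_buffer_size` per the commit message;
// the Config type/path may differ in the released crate.
let mut config = russh::client::Config::default();
config.channel_buffer_size = 100; // max queued `Msg`s per channel
```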
I've published v0.50.0-beta.1, which is just 0.49 but with this fix on top ✌️
Thank you so much!
I discovered that russh continues reading into an internal buffer even if my application can't process those bytes in time. This could be a big problem for some applications (like mine), causing unacceptable RAM usage or even termination by the OOM killer.
Steps to reproduce (you can find the full main.rs in the attachment): read from the channel slowly, e.g. with `tokio::time::sleep().await` between reads.

Expected behavior: russh stops reading bytes into its internal buffer after a high watermark is reached (say 16 MB by default), until my app consumes some bytes from the channel, making that internal buffer available to accept new bytes again.
That's how it normally works for OpenSSH server sessions, as well as for TCP streams, UNIX domain sockets, the simple UNIX pipes you're familiar with from the shell, and so on.
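The TCP comparison can be checked directly with the standard library. In this standalone sketch (unrelated to russh internals; the 8 MiB size is arbitrary), the writer pushes far more data through a socket than the kernel's socket buffers can hold, while the reader drains it in small chunks; the bounded kernel buffers make `write_all` block instead of letting memory grow:

```rust
use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

const TOTAL: usize = 8 * 1024 * 1024; // 8 MiB, well beyond kernel socket buffers

// Returns (bytes written, bytes received); peak memory stays near the
// kernel's socket buffer sizes, not TOTAL.
fn run() -> (usize, usize) {
    let listener = TcpListener::bind("127.0.0.1:0").unwrap();
    let addr = listener.local_addr().unwrap();

    let writer = thread::spawn(move || {
        let mut stream = TcpStream::connect(addr).unwrap();
        let chunk = vec![0u8; 64 * 1024];
        let mut written = 0;
        while written < TOTAL {
            // Once the send/receive socket buffers fill up, `write_all`
            // blocks until the reader drains some bytes: backpressure.
            stream.write_all(&chunk).unwrap();
            written += chunk.len();
        }
        written
    }); // `stream` is dropped when the thread ends, so the reader sees EOF

    let (mut conn, _) = listener.accept().unwrap();
    let mut buf = [0u8; 4096];
    let mut received = 0;
    loop {
        let n = conn.read(&mut buf).unwrap();
        if n == 0 {
            break;
        }
        received += n;
    }

    (writer.join().unwrap(), received)
}

fn main() {
    let (written, received) = run();
    assert_eq!((written, received), (TOTAL, TOTAL));
    println!("transferred {received} bytes with bounded memory");
}
```

The writer is never allowed to run unboundedly ahead of the reader, which is exactly the property the issue asks russh channels to have.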
Observed behavior: russh just reads the whole 1gb.bin into RAM.
Is there any way to overcome that issue? (Apart from implementing some self-control at the application-protocol level, which isn't possible if I have to keep compatibility with existing protocols such as rsync.)
As I haven't tested whether a similar issue exists for the writer, may I also ask what happens if my app starts writing into the channel faster than the client can receive those bytes? Will it await the client once the buffer high watermark is reached, or just keep storing extra bytes in memory until terminated by the OOM killer?
main.rs.zip
P.S. I'm using the latest russh release, which happens to be 0.46.0.