Add an optional size limit to read_line #25

Closed
djs55 wants to merge 3 commits

Conversation

djs55
Member

@djs55 djs55 commented May 1, 2018

Previously a call to read_line would read and buffer an arbitrary amount of data. If an untrusted party supplies the data (e.g. a web server reading a request or header line), then read_line can be provoked into allocating so much memory that it kills the whole process or server.

This patch changes the API by adding an optional ?len parameter (in the same style as read_some and read_exactly) to bound the maximum length of the line. If the bound is exceeded, the function returns an error (Line_too_long).

Users of this library dealing with untrusted data should decide on a suitable len value to avoid possible memory exhaustion.
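
To illustrate the idea of a bounded line read, here is a minimal stdlib-only model. It is not the code from this patch: the "flow" is simulated as a list of string chunks, the names `find_crlf` and `read_line` and the result shape are hypothetical, and it assumes the bound counts line content only, not the `\r\n` terminator.

```ocaml
(* Hypothetical model, not the mirage-channel implementation. *)

(* Index of the '\r' of the first "\r\n" in [s], if any. *)
let find_crlf s =
  let n = String.length s in
  let rec go i =
    if i + 1 >= n then None
    else if s.[i] = '\r' && s.[i + 1] = '\n' then Some i
    else go (i + 1)
  in
  go 0

(* Read one CRLF-terminated line from [chunks]. With [?len], fail as soon as
   more than [len] bytes of line content have been buffered.
   Assumption: the terminator itself is not counted against [len]. *)
let read_line ?len chunks =
  let too_long n = match len with Some l -> n > l | None -> false in
  let rec go buffered chunks =
    match find_crlf buffered with
    | Some i ->
      if too_long i then Error `Line_too_long
      else
        let rest_len = String.length buffered - i - 2 in
        (* Return the line plus the unread remainder of the input. *)
        Ok (`Data (String.sub buffered 0 i,
                   String.sub buffered (i + 2) rest_len :: chunks))
    | None ->
      let n = String.length buffered in
      (* A trailing '\r' may be the first half of the terminator. *)
      let held = if n > 0 && buffered.[n - 1] = '\r' then n - 1 else n in
      if too_long held then Error `Line_too_long
      else (
        match chunks with
        | [] -> Ok `Eof
        | c :: rest -> go (buffered ^ c) rest)
  in
  go "" chunks
```

In this model the check runs after each chunk is appended, so the buffer can briefly exceed `len` by up to one chunk. The bound limits how much the reader accumulates, not the size of a single underlying read.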

This patch includes a couple of extra unit tests.

Signed-off-by: David Scott dave@recoil.org

djs55 added 2 commits May 1, 2018 13:20
Previously a call to `read_line` would read and buffer an arbitrary amount of
data. If an untrusted party supplies the data (e.g. a web server reading a
request or header line), then `read_line` can be provoked into allocating
so much memory that it kills the whole process or server.

This patch changes the API by adding an optional `?len` parameter (in the same
style as `read_some` and `read_exactly`) to bound the maximum length of the line.
If the bound is exceeded, the function returns an error (`Line_too_long`).

Users of this library dealing with untrusted data should decide on a suitable
`len` value to avoid possible memory exhaustion.

This patch includes a couple of extra unit tests.

Signed-off-by: David Scott <dave@recoil.org>
In the case of an HTTP server using `read_line`, it should return
413 (Request Entity Too Large) if the headers exceed an
implementation-specific limit.
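
As a rough sketch of that mapping, the following turns the outcome of reading a header line into an optional HTTP error status. The `read_result` type and both function names are hypothetical, and answering a premature end of input with 400 is an assumption of this sketch, not something the commit specifies.

```ocaml
(* Hypothetical outcome of reading one header line. *)
type read_result =
  | Line of string
  | Eof
  | Line_too_long

(* [None] means "keep parsing"; [Some (code, reason)] means "reply and stop". *)
let status_of_header_read = function
  | Line _ -> None
  | Eof -> Some (400, "Bad Request")  (* assumption: truncated request *)
  | Line_too_long -> Some (413, "Request Entity Too Large")

(* Format an HTTP/1.1 status line for the chosen status. *)
let response_line (code, reason) =
  Printf.sprintf "HTTP/1.1 %d %s\r\n" code reason
```

A server could also consider 431 (Request Header Fields Too Large, RFC 6585), which is specific to oversized headers. The commit message itself names 413.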

Signed-off-by: David Scott <dave@recoil.org>
djs55 added a commit to djs55/ocaml-cohttp that referenced this pull request May 1, 2018
Although the HTTP spec does not impose a maximum limit on header
lengths, most implementations impose a limit to prevent a client from
exhausting all available server memory.

According to https://stackoverflow.com/a/8623061 the typical limit
is between 4k and 48k, but this usually applies to the sum of the
request line and all the headers. It's more convenient for us to
apply the limit per header.
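
The difference between a per-header limit and a limit on the sum can be sketched as follows. This is an illustrative stdlib-only check over header lines that have already been read. `check_headers`, its parameters, and the `Headers_too_large` error are hypothetical names, not part of cohttp or mirage-channel.

```ocaml
(* Check each header line against [per_line] bytes and, if [?total] is given,
   the running sum against [total] bytes. Returns the bytes consumed. *)
let check_headers ~per_line ?total lines =
  let rec go used = function
    | [] -> Ok used
    | l :: rest ->
      let n = String.length l in
      if n > per_line then Error `Line_too_long
      else
        let used = used + n in
        (match total with
         | Some t when used > t -> Error `Headers_too_large
         | _ -> go used rest)
  in
  go 0 lines
```

With only `per_line` set, each individual line is bounded but the sum still grows with the number of headers, so an implementation relying on a per-header limit may also want to cap the header count or the total.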

This requires [mirage/mirage-channel#25]

Signed-off-by: David Scott <dave@recoil.org>
Review comment on src/mirage_channel.ml (outdated, resolved)
djs55 added a commit to djs55/vpnkit that referenced this pull request May 1, 2018
This imports mirage/mirage-channel#25

Signed-off-by: David Scott <dave.scott@docker.com>
Previously `read_line` would drop the data from the channel when
returning the `Line_too_long` error. This patch puts the data back
in the input queue to make it easier to handle this error. For
example an application would be able to call `read_some` or
`read_exactly` to peek at the data and make a better error message.
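
A small model of that behaviour, under stated assumptions: the channel is reduced to a mutable string of unread input, and `read_line`, `read_some` and `describe_error` here are simplified stand-ins with hypothetical signatures, not the library's functions. The point is only that a failed `read_line` leaves the input in place so a later read can inspect it.

```ocaml
(* Hypothetical channel: just the unread input. *)
type channel = { mutable pending : string }

(* Index of the '\r' of the first "\r\n" in [s], if any. *)
let find_crlf s =
  let n = String.length s in
  let rec go i =
    if i + 1 >= n then None
    else if s.[i] = '\r' && s.[i + 1] = '\n' then Some i
    else go (i + 1)
  in
  go 0

(* On [`Line_too_long] the channel is left untouched. *)
let read_line ?len ch =
  match find_crlf ch.pending with
  | None -> Ok `Eof  (* model: no further input will arrive *)
  | Some i ->
    (match len with
     | Some l when i > l -> Error `Line_too_long
     | _ ->
       let line = String.sub ch.pending 0 i in
       let rest = String.length ch.pending - i - 2 in
       ch.pending <- String.sub ch.pending (i + 2) rest;
       Ok (`Data line))

(* Consume and return at most [len] bytes of unread input. *)
let read_some ~len ch =
  let n = min len (String.length ch.pending) in
  let data = String.sub ch.pending 0 n in
  ch.pending <- String.sub ch.pending n (String.length ch.pending - n);
  data

(* Build an error message from the start of the offending line. *)
let describe_error ch =
  Printf.sprintf "line too long, starts with %S" (read_some ~len:8 ch)
```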

Signed-off-by: David Scott <dave@recoil.org>
@hannesm
Member

hannesm commented May 2, 2018

My intuition about this is: you'd like to have an upper bound of the allocation for read_line, is that right? I'm curious how this is related to the underlying flow -- i.e. wouldn't Flow.read need to also respect a length parameter to have a real upper bound of resource allocation? (In POSIX land, the caller of read has to provide the buffer to fill.)

Naming: I'd prefer length over len.

Semantics: What happens if

  • (a) the provided length is 0?
  • (b) the line is "foo\r\n" and length is 3 (and 4)? i.e. does the length include the newline (sequence) or not?
  • (c) Am I correct that with length = 5, the returned buffer is only of length 3?
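
The thread does not settle these questions. As a sketch of one possible convention (an assumption, not what the patch is confirmed to do), suppose the bound counts line content only and excludes the terminator. `bounded_line` and `find_crlf` are hypothetical helpers operating on a complete in-memory string.

```ocaml
(* Index of the '\r' of the first "\r\n" in [s], if any. *)
let find_crlf s =
  let n = String.length s in
  let rec go i =
    if i + 1 >= n then None
    else if s.[i] = '\r' && s.[i + 1] = '\n' then Some i
    else go (i + 1)
  in
  go 0

(* Assumed convention: [len] bounds the content before "\r\n". *)
let bounded_line ~len s =
  match find_crlf s with
  | None -> Error `No_terminator
  | Some i when i > len -> Error `Line_too_long
  | Some i -> Ok (String.sub s 0 i)
```

Under this convention, (a) a length of 0 accepts only the empty line, (b) "foo\r\n" is accepted with a length of 3 or 4 and rejected with 2, and (c) with a length of 5 the returned line still has length 3, because the bound is a maximum rather than a size to fill.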

@samoht
Member

samoht commented May 10, 2018

just a quick comment about names: len is what is used elsewhere for optional parameters, so we should probably stick with it :-)

@mseri

mseri commented Oct 11, 2018

Is this change going to be merged at a certain point?

@avsm
Member

avsm commented Nov 14, 2018

@djs55, any thoughts on @hannesm's comments above?

Would like to get this into next cohttp

@dinosaure dinosaure mentioned this pull request Nov 8, 2019
dinosaure added a commit that referenced this pull request Nov 29, 2021
@dinosaure dinosaure closed this Nov 29, 2021