-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Propagate multipart exceptions and accept custom charsets #56
Conversation
vincentjames501
commented
May 19, 2023
- In the previous implementation, when uploading a large file while the remote end hangs up it would throw an unhandled exception generally in the form "Read end dead"
- In the previous implementation, when uploading a file using an InputStream for example and the underlying stream throws an exception, these exceptions would not be propagated to the HttpClient/HttpRequest and would often result in incomplete payloads arriving to the server.
- Allow setting string charset types correctly where previously it would always use UTF-8 when copying to the underlying PipedOutputStream.
…riate - In the previous implementation, when uploading a large file while the remote end hangs up it would throw an unhandled exception generally in the form "Read end dead" - In the previous implementation, when uploading a file using an InputStream for example and the underlying stream throws an exception, these exceptions would not be propagated to the HttpClient/HttpRequest and would often result in incomplete payloads arriving to the server. - Allow setting string charset types correctly where previously it would always use UTF-8 when copying to the underlying PipedOutputStream.
Thanks will take a look. Does this have any impact on #55 ? |
@gnarroway , it doesn't attempt to fix #55 . We likely want to loop through the parts and do our best to determine the content length of each part similar to how OkHttp does when determining the content length header. We can only report the content length if we can accurately determine the size of each part so it's fine if it's each part is a File/String/ByteArray/etc. https://github.com/square/okhttp/blob/f9901627431be098ad73abd725fbb3738747461c/okhttp/src/jvmMain/kotlin/okhttp3/MultipartBody.kt#L109 |
…the multipart body (where appropriate) - This introduces a new protocol called MultipartParam which can be extended to support additional content types. By default we're matching essentially what io/input-stream does by default to minimize any backwards incompatible changes here - When we know the content-length of the multipart body, supply it to the HttpClient
(-> req | ||
(dissoc :multipart) | ||
(assoc :body (multipart/body multipart b)) | ||
(assoc :body | ||
(if (= -1 content-length) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are unable to determine the content length, leave the body as an input stream and use transfer encoding chunked. Otherwise, specify the content length so the appropriate content-length header is used.
content-type | ||
(str (:mime-type content-type) | ||
(when-let [charset (:charset content-type)] | ||
(str "; charset=" charset)))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an issue today where if you supply a String, we essentially only ever allow UTF-8. This allows the content-type param to be either a string or map which can describe the mime type and charset. I could see an argument for this being another key but this approach more closely aligns with how libraries like Apache HttpClient represent ContentType.
@@ -55,12 +84,94 @@ | |||
(take 30) | |||
(apply str "hatoBoundary"))) | |||
|
|||
(defn body | |||
"Returns an InputStream from the multipart inputs. | |||
(defprotocol MultipartParam |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The following protocol defines the possible multiple param content types (can be easily extended). We need to be able to coerce the param into an input stream and determine it's length (if possible). For types that have potentially unknown lengths (such as an InputStream), a client can set it by supplying :content-length in the multipart param map.
(extend-protocol MultipartParam | ||
String | ||
(input-stream [^String content opts] | ||
(io/input-stream (.getBytes content (charset-encoding opts)))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to make a best guess to determine the charset for the string so we look at either the :content-type string/map and parse it.
@gnarroway , I took a crack at addressing #55 . Appears to be working for me but could certainly use some smoke testing. |
After a quick test, it seems to work just fine on my end, including the fix for #55. |