Remove String conversion in decoders. #671
Current coverage is 74.34% (diff: 76.00%)
@@             master   #671   diff @@
==========================================
  Files          30     30
  Lines         592    608    +16
  Methods       566    586    +20
  Messages        0      0
  Branches       26     22     -4
==========================================
+ Hits          442    452    +10
- Misses        150    156     +6
  Partials        0      0
@rpless Thanks a lot for looking into this! Some suggestions:
Trying to be more specific here.
If you need a ByteBuffer:
val byteBuffer = ChannelBufferBuf.Owned.extract(buf).toByteBuffer()
If you need a byte array:
val channelBuffer = ChannelBufferBuf.Owned.extract(buf)
val (byteArray, offset, length) =
  // assert channelBuffer.hasArray()
  (channelBuffer.array(), channelBuffer.readerIndex(), channelBuffer.readableBytes())
7f464cf to 5194731
I ran the wrk benchmarks for Finch + circe. The results are here.
Nice! @rpless Do you mind keeping the server running and run
Just a few nits from me. Really nice job!
@@ -1,6 +1,8 @@
package io.finch.argonaut

import argonaut.{CursorHistory, DecodeJson, Json}
import com.twitter.finagle.netty3.ChannelBufferBuf
import com.twitter.io.Charsets
Please use StandardCharsets from the JDK. c.t.i.Charsets is deprecated in the most recent Finagle release.
)
implicit def decodeCirce[A: Decoder]: Decode.Json[A] = Decode.json({ (b, cs) =>
val attemptJson = cs match {
case Charsets.Utf8 => parseByteBuffer(ChannelBufferBuf.Owned.extract(b).toByteBuffer()).right.flatMap(_.as[A])
Is scalacheck happy with that line length?
Yes, it's under the line limit by 3 characters. Happy to move it down a line if you think that's more readable.
Nah, it's fine. Thanks!
5194731 to b0bc72a
Updated benchmarks, results after 3 runs:
Without String Conversion (this PR)
With String Conversion (Current Finch)
Looks like a 12.5% improvement in throughput! Sweet!
val buf = ChannelBufferBuf.Owned.extract(b)
if (buf.hasArray) Try(Json.parse(new String(buf.array(), 0, buf.readableBytes(), cs)).as[A])
else Try(Json.parse(buf.toString(cs)).as[A])
})
It looks like you might be parsing twice here by accident?
No accident. It's based on this trick that @vkostyukov mentioned earlier. There may be a better way to express this (i.e. calling Json.parse only once and having the if / else return only a Try[String]).
@clhodapp I played around with it a little. Do you think this version makes the intent more clear?
val buf = ChannelBufferBuf.Owned.extract(b)
val bufAsString = Try(
  if (buf.hasArray) new String(buf.array(), 0, buf.readableBytes(), cs)
  else buf.toString(cs)
)
bufAsString.map(Json.parse(_).as[A])
I think Chris is right. We call Json.parse two times and drop the first result.
Ah I see it now. The line above is the old code. I'll remove it.
b0bc72a to c7288c6
Decode.json((b, cs) => Try(Json.parse(BufText.extract(b, cs)).as[A]))
Decode.json({ (b, cs) =>
val buf = ChannelBufferBuf.Owned.extract(b)
if (buf.hasArray) Try(Json.parse(new String(buf.array(), 0, buf.readableBytes(), cs)).as[A])
I think Json.parse can parse byte[] directly. No need for new String.
Json.parse can take a byte[], but when I convert that line to Try(Json.parse(buf.array()).as[A]) it gives me the following in the tests:
Throw(spray.json.JsonParser$ParsingException: Unexpected character '\u0000'
I think I'm missing something about how the ChannelBufferBuf holds onto the underlying array.
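A likely explanation for the '\u0000' error is that the backing array is larger than the readable region, so passing the whole array hands the parser a zero-filled tail. The sketch below illustrates this with java.nio.ByteBuffer instead of Netty's ChannelBuffer (a substitution made only to keep the example dependency-free); the object name BackingArrayDemo is hypothetical.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets.UTF_8

object BackingArrayDemo {
  // A 16-byte backing array that holds only a 2-byte JSON payload.
  val bb: ByteBuffer = ByteBuffer.allocate(16)
  bb.put("{}".getBytes(UTF_8))
  bb.flip() // position = 0, limit = 2

  // Decoding the whole backing array also decodes the unused, zeroed tail.
  val whole: String = new String(bb.array(), UTF_8)

  // Decoding only the readable region yields just the payload.
  val readable: String =
    new String(bb.array(), bb.arrayOffset() + bb.position(), bb.remaining(), UTF_8)
}
```

Here whole is 16 characters long and contains NUL characters after the payload, while readable is just the payload. Any parser given the full array without an offset and length would hit those NULs.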
cs match {
case StandardCharsets.UTF_8 =>
val buf = ChannelBufferBuf.Owned.extract(b)
if (buf.hasArray) Try(JsonParser(new String(buf.array(), 0, buf.readableBytes(), cs)).convertTo[A])
No need for new String.
A similar issue exists here. The tests raise this error.
Throw(com.fasterxml.jackson.core.JsonParseException: Illegal character ((CTRL-CHAR, code 0)): only regular white space (\r, \n, \t) is allowed between tokens
value => Return(value)
)
)
implicit def decodeCirce[A: Decoder]: Decode.Json[A] = Decode.json({ (b, cs) =>
Let's remove the wrapping (). Just Decode.json { ... } is fine.
// Jackson can parse from byte[] automatically detecting the encoding.
Try(mapper.readValue(BufText.extract(b, cs), ct.runtimeClass.asInstanceOf[Class[A]]))
)
): Decode.Json[A] = Decode.json({ (b, cs) =>
Please remove the (): Decode.json { ... }.
// See https://github.com/finagle/finch/issues/511
// PlayJson can parse from byte[] automatically detecting the charset.
Decode.json((b, cs) => Try(Json.parse(BufText.extract(b, cs)).as[A]))
Decode.json({ (b, cs) =>
Please remove the wrapping (): Decode.json { ... }.
// See https://github.com/finagle/finch/issues/511
// SprayJson can parse from byte[] if it represents a UTF-8 string.
Decode.json((b, cs) => Try(BufText.extract(b, cs).parseJson.convertTo[A]))
Decode.json({ (b, cs) =>
Please remove the wrapping (): Decode.json { ... }.
c7288c6 to 75768b7
Ok, it looks like using
Slice works but it allocates a new array. You should be able to bypass
Yeah, I'm not a fan of the new array allocation, but it seems like neither Spray Json nor Play Json expose a parsing function that takes the number of bytes to read:
Play takes InputStream though. Can we wrap our byte array with
For Spray, let's do System.arraycopy; it's still better than 'new String'.
Can do for Spray, but it looks like I gave you the wrong docs for Play Json. We're currently on 2.3.x, which does not provide a parse with InputStream. So we can either bump the Play version or use System.arraycopy.
Arraycopy works for me. Let's update it later.
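For the InputStream idea raised above, one JDK-only way to expose just a sub-range of a byte array without copying is java.io.ByteArrayInputStream with an offset and length. The thread does not say which wrapper was intended, so this is an assumption, and SubRangeStream is a hypothetical name.

```scala
import java.io.ByteArrayInputStream
import java.nio.charset.StandardCharsets.UTF_8

object SubRangeStream {
  // A 16-byte array whose first 3 bytes are the payload; the rest is zeroed.
  val array: Array[Byte] = java.util.Arrays.copyOf("[1]".getBytes(UTF_8), 16)

  // The stream reads only bytes [0, 3) of the array; no new array is allocated.
  val in = new ByteArrayInputStream(array, 0, 3)

  // Drain the stream to show which bytes a consumer would actually see.
  val seen: Array[Byte] =
    Iterator.continually(in.read()).takeWhile(_ != -1).map(_.toByte).toArray
}
```

A parser that accepts an InputStream would see only the payload bytes this way, which is why it only helps once the JSON library in use offers such an overload.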
75768b7 to 582b59a
Decode.json((b, cs) => Try(Json.parse(BufText.extract(b, cs)).as[A]))
Decode.json { (b, cs) =>
val buf = ChannelBufferBuf.Owned.extract(b)
if (buf.hasArray) {
Sorry to be a burden, but can we extract that into a function so we can reuse it?
Also, let's check: if array.length equals readableBytes, we can skip copying.
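A minimal sketch of such a reusable function, assuming a java.nio.ByteBuffer stands in for the Netty ChannelBuffer used in this PR. The names BufBytes and readableBytes are hypothetical and not part of Finch.

```scala
import java.nio.ByteBuffer

object BufBytes {
  // Returns exactly the readable bytes of an array-backed buffer.
  // Skips the copy when the readable region spans the whole backing array.
  def readableBytes(bb: ByteBuffer): Array[Byte] = {
    require(bb.hasArray, "buffer must be array-backed")
    val array = bb.array()
    val offset = bb.arrayOffset() + bb.position()
    val length = bb.remaining()
    if (offset == 0 && length == array.length) array // no copy needed
    else {
      val out = new Array[Byte](length)
      System.arraycopy(array, offset, out, 0, length)
      out
    }
  }
}
```

Note that the no-copy branch returns the buffer's own backing array, so the caller shares it with the buffer; whether that is acceptable depends on who else may mutate it.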
It's not a burden. Any particular place we should extract it to? I was thinking io.finch.Decode.
Or maybe the internal package object?
Yeah, let's make it part of internal (whatever format you prefer).
…s conversions in most JSON Decoders.
582b59a to a5335d1
Thanks again @rpless! Merging this.
Proposed fix for #511. The only thing I was unsure about was whether to use Buf.ByteArray.Owned or Buf.ByteArray.Shared for the libraries that can convert Array[Byte]. I wound up using Shared, but I'm not sure this is necessary.