Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove String conversion in decoders. #671

Merged
merged 1 commit into from
Nov 21, 2016

Conversation

rpless
Copy link
Collaborator

@rpless rpless commented Nov 19, 2016

Proposed fix for #511. The only thing I was unsure about was whether to use Buf.ByteArray.Owned or Buf.ByteArray.Shared for the libraries that can convert Array[Byte]. I wound up using Shared, but I'm not sure this is necessary.

@codecov-io
Copy link

codecov-io commented Nov 19, 2016

Current coverage is 74.34% (diff: 76.00%)

Merging #671 into master will decrease coverage by 0.32%

@@             master       #671   diff @@
==========================================
  Files            30         30          
  Lines           592        608    +16   
  Methods         566        586    +20   
  Messages          0          0          
  Branches         26         22     -4   
==========================================
+ Hits            442        452    +10   
- Misses          150        156     +6   
  Partials          0          0          

Powered by Codecov. Last update 2d479db...a5335d1

@vkostyukov
Copy link
Collaborator

@rpless Thanks a lot for looking into this!

Some suggestions:

  • Instead of Buf.ByteArray.Shared (which always copy), let's use this trick that allows retrieving a byte array from Netty's channel buffer. This heavily depends on the underlying implementation, but I'm fine with that for now.
  • Let run some benchmarks (both JMH and wrk) to see how this affects throughput (I'm excited to see the numbers!)

@vkostyukov
Copy link
Collaborator

Trying to be more specific here.

If you need ByteBuffer, the cheapest way (at least for Netty 3) to get it is:

val byteBuffer = ChannelBufferBuf.Owned.extract(buf).toByteBuffer()

If you need Array[Byte], the cheapest way to get it is:

val channelBuffer = ChannelBufferBuf.Owned.extract(buf)
val (byteArray, offset, length) = 
  // assert channelBuffer.hasArray()
  (channelBuffer.array(), channelBuffer.readerIndex(), channelBuffer.readableBytes()) 

@rpless rpless force-pushed the remove-json-string-conversions branch from 7f464cf to 5194731 Compare November 19, 2016 13:19
@rpless
Copy link
Collaborator Author

rpless commented Nov 19, 2016

I ran the wrk benchmarks for Finch + circe. The results are here.
As for jmh benchmarks, unless I missed something, it doesn't look like we have ones that exercise the the json decoders anymore.

@vkostyukov
Copy link
Collaborator

Nice!

@rpless Do you mind keep the server running and run wrk for 3 times. Usually, it takes some time for JVM to warmup so using the results from 3rd run seems like the closest we can get to the steady state.

Copy link
Collaborator

@vkostyukov vkostyukov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few nits from me. Really nice job!

@@ -1,6 +1,8 @@
package io.finch.argonaut

import argonaut.{CursorHistory, DecodeJson, Json}
import com.twitter.finagle.netty3.ChannelBufferBuf
import com.twitter.io.Charsets
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, use StandardCharsets from JDK. c.t.i.Charsets are deprecated in the most recent Finagle release.

)
implicit def decodeCirce[A: Decoder]: Decode.Json[A] = Decode.json({ (b, cs) =>
val attemptJson = cs match {
case Charsets.Utf8 => parseByteBuffer(ChannelBufferBuf.Owned.extract(b).toByteBuffer()).right.flatMap(_.as[A])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is scalacheck happy with that line length?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, its under the line limit by 3 character. Happy to move it down a line if you think that's more readable.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nah, it's fine. Thanks!

@rpless rpless force-pushed the remove-json-string-conversions branch from 5194731 to b0bc72a Compare November 19, 2016 17:49
@rpless
Copy link
Collaborator Author

rpless commented Nov 19, 2016

Updated benckmarks, results after 3 runs:

Without String Conversion (this PR)

$ wrk -t4 -c24 -d30s -s examples/src/main/scala/io/finch/wrk/wrk.lua http://localhost:8081/
Running 30s test @ http://localhost:8081/
  4 threads and 24 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.68ms    3.67ms  52.43ms   90.18%
    Req/Sec    12.20k     2.26k   23.58k    67.25%
  1458406 requests in 30.09s, 137.69MB read
Requests/sec:  48471.67
Transfer/sec:      4.58MB

With String Conversion (Current Finch)

$ wrk -t4 -c24 -d30s -s examples/src/main/scala/io/finch/wrk/wrk.lua http://localhost:8081/
Running 30s test @ http://localhost:8081/
  4 threads and 24 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.62ms    3.22ms  62.64ms   89.91%
    Req/Sec    10.84k     2.44k   22.20k    68.33%
  1296147 requests in 30.10s, 122.37MB read
Requests/sec:  43067.06
Transfer/sec:      4.07MB

@vkostyukov
Copy link
Collaborator

Looks like 12.5% improvement in the throughput! Sweet!

val buf = ChannelBufferBuf.Owned.extract(b)
if (buf.hasArray) Try(Json.parse(new String(buf.array(), 0, buf.readableBytes(), cs)).as[A])
else Try(Json.parse(buf.toString(cs)).as[A])
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you might be parsing twice here by accident?

Copy link
Collaborator Author

@rpless rpless Nov 19, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No accident. Its based on this trick that @vkostyukov mentioned in earlier. There may be a better to express this (i.e. only having on call to Json.parse once and having the if/ else only return a Try[String]).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@clhodapp I played around with it a little. Do you think this version makes the intent more clear?

val buf = ChannelBufferBuf.Owned.extract(b)
val bufAsString = Try(
  if (buf.hasArray) new String(buf.array(), 0, buf.readableBytes(), cs)
  else buf.toString(cs)
)
bufAsString.map(Json.parse(_).as[A])

@vkostyukov
Copy link
Collaborator

I think Chris is right. We call Json.parse two times and drop the first
result on the floor.

On Sat, Nov 19, 2016 at 11:14 AM Ryan Plessner notifications@github.com
wrote:

@rpless commented on this pull request.

In playjson/src/main/scala/io/finch/playjson/package.scala
#671:

@@ -11,10 +13,11 @@ package object playjson {
* @tparam A the type of the data to decode into
*/
implicit def decodePlayJson[A](implicit reads: Reads[A]): Decode.Json[A] =

  • // TODO: Eliminate toString conversion
  • // See Eliminate meaningless to-string conversion #511
  • // PlayJson can parse from byte[] automatically detecting the charset.
  • Decode.json((b, cs) => Try(Json.parse(BufText.extract(b, cs)).as[A]))
  • Decode.json({ (b, cs) => Try(Json.parse(Buf.ByteArray.Shared.extract(b)).as[A])
  •  val buf = ChannelBufferBuf.Owned.extract(b)
    
  •  if (buf.hasArray) Try(Json.parse(new String(buf.array(), 0, buf.readableBytes(), cs)).as[A])
    
  •  else Try(Json.parse(buf.toString(cs)).as[A])
    
  • })

@clhodapp https://github.com/clhodapp I played around with it a little.
Do you think this version makes the intent more clear?

val buf = ChannelBufferBuf.Owned.extract(b)val bufAsString = Try(
if (buf.hasArray) new String(buf.array(), 0, buf.readableBytes(), cs)
else buf.toString(cs)
)
bufAsString.map(Json.parse(_).as[A])


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#671, or mute the thread
https://github.com/notifications/unsubscribe-auth/ABDDKwEmt9Qm3tSF-W2xoHHcsjNTIvoyks5q_0p5gaJpZM4K3Isa
.

@rpless
Copy link
Collaborator Author

rpless commented Nov 19, 2016

Ah I see it now. The line above is the old code. I'll remove it.

@rpless rpless force-pushed the remove-json-string-conversions branch from b0bc72a to c7288c6 Compare November 19, 2016 19:23
Decode.json((b, cs) => Try(Json.parse(BufText.extract(b, cs)).as[A]))
Decode.json({ (b, cs) =>
val buf = ChannelBufferBuf.Owned.extract(b)
if (buf.hasArray) Try(Json.parse(new String(buf.array(), 0, buf.readableBytes(), cs)).as[A])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, Json.parse can parse byte[] directly. No need for new String.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Json.Parse can take a byte[], but when I convert that line to Try(Json.parse(buf.array()).as[A]) it gives me the following in the

Throw(spray.json.JsonParser$ParsingException: Unexpected character '\u0000'

I think I'm missing something about how the ChannelBufferBuf holds onto the underlying array.

cs match {
case StandardCharsets.UTF_8 =>
val buf = ChannelBufferBuf.Owned.extract(b)
if (buf.hasArray) Try(JsonParser(new String(buf.array(), 0, buf.readableBytes(), cs)).convertTo[A])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for new String.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A similar issue exists here. The tests raise this error.

Throw(com.fasterxml.jackson.core.JsonParseException: Illegal character ((CTRL-CHAR, code 0)): only regular white space (\r, \n, \t) is allowed between tokens

value => Return(value)
)
)
implicit def decodeCirce[A: Decoder]: Decode.Json[A] = Decode.json({ (b, cs) =>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove wrapping (). Just Decode.json { ... } is fine.

// Jackson can parse from byte[] automatically detecting the encoding.
Try(mapper.readValue(BufText.extract(b, cs), ct.runtimeClass.asInstanceOf[Class[A]]))
)
): Decode.Json[A] = Decode.json({ (b, cs) =>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, remove (): Decode.json { ... }.

// See https://github.com/finagle/finch/issues/511
// PlayJson can parse from byte[] automatically detecting the charset.
Decode.json((b, cs) => Try(Json.parse(BufText.extract(b, cs)).as[A]))
Decode.json({ (b, cs) =>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, remove wrapping (): Decode.json { ... }.

// See https://github.com/finagle/finch/issues/511
// SprayJson can parse from byte[] if it represents a UTF-8 string.
Decode.json((b, cs) => Try(BufText.extract(b, cs).parseJson.convertTo[A]))
Decode.json({ (b, cs) =>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, remove wrapping (): Decode.json { ... }.

@rpless rpless force-pushed the remove-json-string-conversions branch from c7288c6 to 75768b7 Compare November 19, 2016 20:24
@rpless
Copy link
Collaborator Author

rpless commented Nov 19, 2016

Ok, it looks like using slice like this works: buf.array().slice(0, buf.readableBytes()). Does that make sense?

@vkostyukov
Copy link
Collaborator

Slice works but it allocates a new array. You should be able to bypass
readableBytes to the parse function.

On Sat, Nov 19, 2016 at 12:48 PM Ryan Plessner notifications@github.com
wrote:

Ok, it looks like using slice like this works: buf.array().slice(0,
buf.readableBytes()). Does that make sense?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#671 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/ABDDK-0w2n0qAoIwWMcWBZ8h16aUD1ooks5q_2CrgaJpZM4K3Isa
.

@rpless
Copy link
Collaborator Author

rpless commented Nov 19, 2016

@vkostyukov
Copy link
Collaborator

Play takes InputStream though. Can we wrap our byte array with
ByteArrayInputStream and specify the length there?

For Spray, let's do System.arraycopy it's still better than 'new String'.

On Sat, Nov 19, 2016 at 1:10 PM Ryan Plessner notifications@github.com
wrote:

Yeah, I'm not a fan of the new array allocation, but it seems like neither
Spray Json nor Play Json expose a parsing function that takes the number of
bytes to read:

https://www.playframework.com/documentation/2.5.x/api/scala/index.html#play.api.libs.json.Json$

https://github.com/spray/spray-json/blob/master/src/main/scala/spray/json/JsonParser.scala#L28
(and also
https://github.com/spray/spray-json/blob/master/src/main/scala/spray/json/JsonParser.scala#L237
)


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#671 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/ABDDK_tX58WsqNgUMHNRIP01LhdK2MDxks5q_2WagaJpZM4K3Isa
.

@rpless
Copy link
Collaborator Author

rpless commented Nov 19, 2016

Can do for Spray, but it looks like I gave you the wrong docs for Play Json. We're currently on 2.3.x which does not provide a parse with InputStream. So we can either bump the Play version or System.arraycopy.

@vkostyukov
Copy link
Collaborator

Arraycopy works for me. Let's update it later.

On Sat, Nov 19, 2016 at 1:50 PM Ryan Plessner notifications@github.com
wrote:

Can do for Spray, but it looks like I gave you the wrong docs for Play
Json. We're currently on 2.3.x
https://www.playframework.com/documentation/2.3.x/api/scala/index.html#play.api.libs.json.Json%24
which does not provide a parse with InputStream. So we can either bump the
Play version or System.arraycopy.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#671 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/ABDDK22ipWPepiBj6q_IqnkB5M6wpEX7ks5q_28_gaJpZM4K3Isa
.

@rpless rpless force-pushed the remove-json-string-conversions branch from 75768b7 to 582b59a Compare November 19, 2016 22:29
Decode.json((b, cs) => Try(Json.parse(BufText.extract(b, cs)).as[A]))
Decode.json { (b, cs) =>
val buf = ChannelBufferBuf.Owned.extract(b)
if (buf.hasArray) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to be a burden, but can we extract that into a function so we can reuse it?

Also, let's check that if array.length equals readableBytes we can skip copying.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its not a burden. Any particular place we should extract it to? I was thinking io.finch.Decode.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe the internal package object?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, let's make it part of internal (whatever format you prefer).

@rpless rpless force-pushed the remove-json-string-conversions branch from 582b59a to a5335d1 Compare November 21, 2016 01:10
@vkostyukov
Copy link
Collaborator

Thanks again @rpless! Merging this.

@vkostyukov vkostyukov merged commit 80f0523 into finagle:master Nov 21, 2016
@rpless rpless deleted the remove-json-string-conversions branch November 21, 2016 01:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants