Defer creation of Expectations #280
Conversation
Codecov Report
```
@@            Coverage Diff             @@
##             main     #280      +/-   ##
==========================================
+ Coverage   96.32%   96.35%   +0.02%
==========================================
  Files           8        8
  Lines         980      988       +8
  Branches       93       92       -1
==========================================
+ Hits          944      952       +8
  Misses         36       36
```
Continue to review full report at Codecov.
The win looks really good, and I'm definitely interested in merging when this is green and we sort out any MiMa issues.
Drat. I was afraid stack safety would be an issue with the thunks. I'll pause here and see if people agree with the problem before I pour more into the solution.
I don't see why Eval should hurt much, and I imagine flatMap can solve the stack issues, no?
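For context, here is the stack-safety trick that `flatMap` enables, as a minimal trampoline sketch. `Trampoline` and `StackSafetyDemo` are illustrative stand-ins for what cats' `Eval` does internally, not cats code: the evaluation loop keeps pending continuations on the heap, so deeply nested `flatMap`s never grow the JVM call stack.

```scala
sealed trait Trampoline[A] {
  def flatMap[B](f: A => Trampoline[B]): Trampoline[B] =
    Trampoline.FlatMap(this, f)

  // Evaluate with an explicit loop and a heap-allocated list of
  // continuations, so depth is bounded by memory, not the call stack.
  def run: A = {
    var cur: Trampoline[Any] = this.asInstanceOf[Trampoline[Any]]
    var conts: List[Any => Trampoline[Any]] = Nil
    while (true) {
      cur match {
        case Trampoline.Done(a) =>
          conts match {
            case Nil     => return a.asInstanceOf[A]
            case k :: ks => conts = ks; cur = k(a)
          }
        case fm: Trampoline.FlatMap[_, _] =>
          conts = fm.f.asInstanceOf[Any => Trampoline[Any]] :: conts
          cur = fm.sub.asInstanceOf[Trampoline[Any]]
      }
    }
    sys.error("unreachable")
  }
}

object Trampoline {
  final case class Done[A](a: A) extends Trampoline[A]
  final case class FlatMap[A, B](sub: Trampoline[A], f: A => Trampoline[B])
      extends Trampoline[B]
}

object StackSafetyDemo {
  def main(args: Array[String]): Unit = {
    // A million chained flatMaps: naive recursive evaluation would throw
    // StackOverflowError; the loop in `run` handles it.
    val deep = (1 to 1000000).foldLeft(Trampoline.Done(0): Trampoline[Int]) {
      (acc, _) => acc.flatMap(n => Trampoline.Done(n + 1))
    }
    assert(deep.run == 1000000)
    println(deep.run)
  }
}
```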
This reverts commit dedd158. A thunk is slightly lighter than an `Eval`.
Do you want a benchmark added to this project? A dumbed-down HTTP header parser should do it. If I can get this reasonably competitive with the gross handwritten ones in http4s, a proper one will appear there.
This looks great! Those numbers look good 😄
```scala
Chain.fromSeq(
  ranges.toList.map { case (s, e) =>
    Expectation.InRange(offset, s, e)
  }
)
```
@rossabaker, the last time I saw this was in #62. Are you seeing different results? Maybe it's worth keeping?
It won't show up in my benchmark since we now avoid this path instead of calling it a zillion times. It probably doesn't matter in reality because it's now only called once per failed parse. But I guess there's no reason not to keep the optimization and fail as fast as we can. I'll restore it.
Now that I'm aware of #62, I suspect we see some positive effect from this on those benchmarks.
I need my CPU back so I can't run a proper number of iterations right now, but looking good on the JSON:
Before, on 9e67814:
After, on 9561249:
Error on both:
The JSON parser got refactored in a way that broke it, I think, and we aren't testing that code in CI, so we didn't notice. This was fixed for the README here:
```scala
    null.asInstanceOf[A]
  }
}
```
```diff
 final def oneOf[A](all: Array[Parser0[A]], state: State): A = {
   val offset = state.offset
-  var errs: Chain[Expectation] = Chain.nil
+  var errs: Eval[Chain[Expectation]] = Eval.later(Chain.nil)
```
Can we make this `Eval.now(Chain.nil)`? I think that will be less allocation, and actually every `oneOf` hits this path.

Actually, can we allocate a single `Eval[Chain[Expectation]]` as a `private val evalEmpty: Eval[Chain[Expectation]] = Eval.now(Chain.nil)` in `Impl`, and not have any allocations for this?
```diff
@@ -2180,7 +2195,7 @@ object Parser {
       // we failed to parse, but didn't consume input
       // is unchanged we continue
       // else we stop
-      errs = errs ++ err
+      errs = errs.map(_ ++ err.value)
```
`.value` isn't safe. We should do `for { e1 <- errs; e2 <- err } yield (e1 ++ e2)` or `(errs, err).mapN(_ ++ _)`, but maybe the latter is slightly slower due to dispatch via typeclass.
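A sketch of why composing through `flatMap` matters here, using a hypothetical `Lazy` class in place of cats' `Eval` (the counter and all names are illustrative): the for-comprehension builds a description without running anything, and the error chains are only concatenated when the combined value is finally forced.

```scala
object LazyCombineDemo {
  var evaluations = 0 // counts how many wrappers have been forced

  // Illustrative stand-in for a lazily-evaluated value.
  final class Lazy[A](thunk: => A) {
    lazy val value: A = { evaluations += 1; thunk }
    def map[B](f: A => B): Lazy[B] = new Lazy(f(value))
    def flatMap[B](f: A => Lazy[B]): Lazy[B] = new Lazy(f(value).value)
  }

  def main(args: Array[String]): Unit = {
    evaluations = 0
    val errs = new Lazy(Vector("a"))
    val err  = new Lazy(Vector("b"))

    // Composing via flatMap/map builds a description; nothing runs yet.
    val combined = for { e1 <- errs; e2 <- err } yield e1 ++ e2
    assert(evaluations == 0) // still fully deferred

    // Only forcing the result triggers the (memoized) work.
    assert(combined.value == Vector("a", "b"))
    assert(evaluations == 4)
    println(combined.value)
  }
}
```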
I'll take the changes I requested. Thanks for sending a PR!
Given a parser whose hot part looks like this:

`CharIn.makeError` is dominating the stack trace and allocations. It is created at the end of each repetition, only to be nulled out and GC'ed if we've made any progress.

Here, we wrap every `Chain[Expectation]` in an `Eval`, so that the expectations can be created lazily. Care must be taken not to close over mutable state, namely `state.offset`. We significantly increase speed and reduce garbage.

I'll work on cleaning up and providing a runnable benchmark, but wanted to get early feedback.

We might also make it leaner with simple thunks rather than an `Eval` that's always `later`.
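On thunks versus an always-`later` `Eval`, here is a sketch of the trade-off being weighed (names are illustrative; nothing here is cats code): a bare `() => A` is a single `Function0` allocation but re-runs its body on every force, while a memoizing lazy slot, which is what `Eval.later` provides, runs at most once at the cost of a little extra bookkeeping.

```scala
object ThunkVsLaterDemo {
  var runs = 0 // counts how many times the underlying body executes

  def main(args: Array[String]): Unit = {
    runs = 0
    def body(): Int = { runs += 1; 42 }

    // A plain thunk: lighter, but recomputes every time it is forced.
    val thunk: () => Int = () => body()
    thunk(); thunk()
    assert(runs == 2)

    // A `lazy val` behaves like Eval.later: computed once, then cached.
    runs = 0
    lazy val later: Int = body()
    val a = later
    val b = later
    assert(a == 42 && b == 42 && runs == 1)
    println("thunk recomputes; later memoizes")
  }
}
```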