Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ProcessString to Pipeline. #2972

Merged
merged 3 commits into from
Nov 25, 2020

Conversation

cyriltovena
Copy link
Contributor

Most of the time, you have a buffer that you want to test again a log pipeline, when it passed you usually copy via string(buff).

In some cases like headchunk and tailer you already have an allocated immutable string, since pipeline never mutate the line passed as parameter, we can create ProcessString pipeline method that will avoid re-allocating the line.

I've also added some missing tests.

benchmp:

❯ benchcmp before.txt after.txt
benchmark                                           old ns/op     new ns/op     delta
BenchmarkHeadBlockIterator/Size_100000-16           20264242      16112591      -20.49%
BenchmarkHeadBlockIterator/Size_50000-16            10186969      7905259       -22.40%
BenchmarkHeadBlockIterator/Size_15000-16            3229052       2202770       -31.78%
BenchmarkHeadBlockIterator/Size_10000-16            1916537       1392355       -27.35%
BenchmarkHeadBlockSampleIterator/Size_100000-16     18364773      16106425      -12.30%
BenchmarkHeadBlockSampleIterator/Size_50000-16      8988422       7730226       -14.00%
BenchmarkHeadBlockSampleIterator/Size_15000-16      2788746       2306161       -17.30%
BenchmarkHeadBlockSampleIterator/Size_10000-16      1773766       1488861       -16.06%

benchmark                                           old allocs     new allocs     delta
BenchmarkHeadBlockIterator/Size_100000-16           200039         39             -99.98%
BenchmarkHeadBlockIterator/Size_50000-16            100036         36             -99.96%
BenchmarkHeadBlockIterator/Size_15000-16            30031          31             -99.90%
BenchmarkHeadBlockIterator/Size_10000-16            20029          29             -99.86%
BenchmarkHeadBlockSampleIterator/Size_100000-16     100040         40             -99.96%
BenchmarkHeadBlockSampleIterator/Size_50000-16      50036          36             -99.93%
BenchmarkHeadBlockSampleIterator/Size_15000-16      15031          31             -99.79%
BenchmarkHeadBlockSampleIterator/Size_10000-16      10029          29             -99.71%

benchmark                                           old bytes     new bytes     delta
BenchmarkHeadBlockIterator/Size_100000-16           27604042      21203941      -23.19%
BenchmarkHeadBlockIterator/Size_50000-16            13860654      10660652      -23.09%
BenchmarkHeadBlockIterator/Size_15000-16            4231360       3271363       -22.69%
BenchmarkHeadBlockIterator/Size_10000-16            2633436       1993420       -24.30%
BenchmarkHeadBlockSampleIterator/Size_100000-16     17799137      14598973      -17.98%
BenchmarkHeadBlockSampleIterator/Size_50000-16      7433260       5833137       -21.53%
BenchmarkHeadBlockSampleIterator/Size_15000-16      2258099       1778096       -21.26%
BenchmarkHeadBlockSampleIterator/Size_10000-16      1393600       1073605       -22.96%

Signed-off-by: Cyril Tovena cyril.tovena@gmail.com

Most of the time, you have a buffer that you want to test again a log pipeline, when it passed you usually copy via string(buff).

In some cases like headchunk and tailer you already have an allocated immutable string, since pipeline never mutate the line passed as parameter, we can create ProcessString pipeline method that will avoid re-allocating the line.

benchmp:

```
❯ benchcmp before.txt after.txt
benchmark                                           old ns/op     new ns/op     delta
BenchmarkHeadBlockIterator/Size_100000-16           20264242      16112591      -20.49%
BenchmarkHeadBlockIterator/Size_50000-16            10186969      7905259       -22.40%
BenchmarkHeadBlockIterator/Size_15000-16            3229052       2202770       -31.78%
BenchmarkHeadBlockIterator/Size_10000-16            1916537       1392355       -27.35%
BenchmarkHeadBlockSampleIterator/Size_100000-16     18364773      16106425      -12.30%
BenchmarkHeadBlockSampleIterator/Size_50000-16      8988422       7730226       -14.00%
BenchmarkHeadBlockSampleIterator/Size_15000-16      2788746       2306161       -17.30%
BenchmarkHeadBlockSampleIterator/Size_10000-16      1773766       1488861       -16.06%

benchmark                                           old allocs     new allocs     delta
BenchmarkHeadBlockIterator/Size_100000-16           200039         39             -99.98%
BenchmarkHeadBlockIterator/Size_50000-16            100036         36             -99.96%
BenchmarkHeadBlockIterator/Size_15000-16            30031          31             -99.90%
BenchmarkHeadBlockIterator/Size_10000-16            20029          29             -99.86%
BenchmarkHeadBlockSampleIterator/Size_100000-16     100040         40             -99.96%
BenchmarkHeadBlockSampleIterator/Size_50000-16      50036          36             -99.93%
BenchmarkHeadBlockSampleIterator/Size_15000-16      15031          31             -99.79%
BenchmarkHeadBlockSampleIterator/Size_10000-16      10029          29             -99.71%

benchmark                                           old bytes     new bytes     delta
BenchmarkHeadBlockIterator/Size_100000-16           27604042      21203941      -23.19%
BenchmarkHeadBlockIterator/Size_50000-16            13860654      10660652      -23.09%
BenchmarkHeadBlockIterator/Size_15000-16            4231360       3271363       -22.69%
BenchmarkHeadBlockIterator/Size_10000-16            2633436       1993420       -24.30%
BenchmarkHeadBlockSampleIterator/Size_100000-16     17799137      14598973      -17.98%
BenchmarkHeadBlockSampleIterator/Size_50000-16      7433260       5833137       -21.53%
BenchmarkHeadBlockSampleIterator/Size_15000-16      2258099       1778096       -21.26%
BenchmarkHeadBlockSampleIterator/Size_10000-16      1393600       1073605       -22.96%
```

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>
Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>
@codecov-io
Copy link

Codecov Report

Merging #2972 (bbe69ce) into master (8ea5fd1) will increase coverage by 0.09%.
The diff coverage is 94.11%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2972      +/-   ##
==========================================
+ Coverage   61.84%   61.94%   +0.09%     
==========================================
  Files         182      182              
  Lines       14870    14877       +7     
==========================================
+ Hits         9197     9215      +18     
+ Misses       4824     4812      -12     
- Partials      849      850       +1     
Impacted Files Coverage Δ
pkg/ingester/tailer.go 40.94% <50.00%> (ø)
pkg/chunkenc/memchunk.go 73.60% <100.00%> (-0.07%) ⬇️
pkg/logentry/stages/match.go 90.47% <100.00%> (ø)
pkg/logql/log/metrics_extraction.go 81.70% <100.00%> (+2.95%) ⬆️
pkg/logql/log/pipeline.go 87.23% <100.00%> (+62.84%) ⬆️
pkg/promtail/positions/positions.go 46.80% <0.00%> (-11.71%) ⬇️
pkg/querier/queryrange/limits.go 91.66% <0.00%> (-4.17%) ⬇️
pkg/querier/queryrange/downstreamer.go 95.29% <0.00%> (-2.36%) ⬇️
pkg/logql/evaluator.go 91.24% <0.00%> (-0.41%) ⬇️
pkg/logql/log/labels.go 91.04% <0.00%> (+0.74%) ⬆️
... and 2 more

@@ -139,3 +158,11 @@ func ReduceStages(stages []Stage) Stage {
return line, true
})
}

func unsafeGetBytes(s string) []byte {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: so the basic idea is to avoid extra allocation during string <--> []bytes conversion correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes in some case the compiler can guess but not all the time.

Here we know we only read the passed in line, we never mutate it, so we can safely avoid this.

Copy link
Contributor

@kavirajk kavirajk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Member

@owen-d owen-d left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 yoloString returns

pkg/chunkenc/memchunk.go Outdated Show resolved Hide resolved
Co-authored-by: Owen Diehl <ow.diehl@gmail.com>
@cyriltovena cyriltovena merged commit ec725db into grafana:master Nov 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants