Next generation pipeline #4340

Closed
andrewvc wants to merge 2 commits into master from squashed_ng_pipeline

Conversation

andrewvc
Contributor

This replaces #4254 which got just a bit too big. It's squashed and rebased off master.

@andrewvc andrewvc mentioned this pull request Dec 11, 2015
@andrewvc
Contributor Author

I've done some benchmarking parsing apache logs with the cloud benchmarker and indexing them to ES. The results are pretty interesting. The upshot is, for this benchmark, the ng_pipeline was 28.67% faster than master.

This is a preliminary benchmark of just apache log file parsing (with grok / geoip / useragent) with the cloud benchmarker. I expect other benchmarks to behave differently. The comments below discuss its performance on this task.

To start with, the new pipeline seems quite a bit faster, BUT the tuning is all different. There are three parameters to tune the new pipeline:

  • -w number of workers
  • -b batch size
  • -u batch pop latency

The key takeaways were:

  • Have more workers than cores for either pipeline. -w 8 seemed to be the sweet spot.
  • The -u setting doesn't change much if anything. We should leave it in, just in case, but the default of 5ms works fine.
  • Tuning -b is very important! For the apache logs a larger batch size yielded significant gains. We should strongly consider size-based dynamic batches to help optimize this (see the sketch after this list). Currently it's hard to know the size of an event, but with the persistent pipeline that should be easier.
  • Adjusting master's SizedQueue size did nothing
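
For illustration, here is a minimal sketch of what size-based batching could look like (this is not the pipeline's actual code; read_batch, the caps, and the JSON-based size estimate are assumptions for the example):

require "json"

# Stop filling a batch when either an event-count cap or a cumulative
# byte cap is hit, whichever comes first.
def read_batch(queue, max_events: 125, max_bytes: 1_000_000)
  batch = []
  bytes = 0
  while batch.size < max_events && bytes < max_bytes
    event = begin
      queue.pop(true)          # non-blocking pop; raises ThreadError when empty
    rescue ThreadError
      nil
    end
    break if event.nil?
    batch << event
    bytes += event.to_json.bytesize   # rough stand-in for a real event-size measure
  end
  batch
end

# Example with a plain Ruby Queue:
q = Queue.new
1_000.times { |i| q << { "message" => "event #{i}" } }
p read_batch(q).size   # => 125, the event-count cap

The batch closes on whichever cap is hit first, so a burst of unusually large events can't inflate memory the way a fixed event count can.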

[Chart: pipeline benchmarks (Google Sheets)]

@andrewvc
Copy link
Contributor Author

I should mention that, watching these benchmarks in htop, it seemed the classic pipeline didn't saturate the cores as effectively and used more system time. I think there are some constant-time operations in it that impede thread liveness.

This was partially measured in other benchmarks in #4254 by running the two pipelines within the time command. There's probably an even better way to get at this impact.
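
For what it's worth, a rough sketch of capturing the child process's user vs. system CPU time from Ruby instead of the shell's time builtin (the command line and input file are placeholders, not the exact benchmark setup):

require "benchmark"

# Run the pipeline as a child process; Benchmark::Tms exposes the child's
# user and system CPU time via cutime and cstime.
tms = Benchmark.measure do
  system("bin/logstash -w 8 -b 500 -f apache.conf < apache_logs.txt > /dev/null")
end
puts "child user CPU:   #{'%.1f' % tms.cutime}s"
puts "child system CPU: #{'%.1f' % tms.cstime}s"
puts "wall clock:       #{'%.1f' % tms.real}s"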

@colinsurprenant
Contributor

I just completed a round of tests & benchmarks using 2 different configs and sets of inputs. I will put my results in a table, but first I wanted to report a problem I encountered using many workers and a bigger batch size, so that others can see if they can reproduce it.

I am seeing the following error output and various output delays (the output completely stops for seconds at a time):

Concurrent::TimeoutError {:class=>"Concurrent::TimeoutError", :backtrace=>nil, :level=>:warn}

The config & data used are in https://gist.github.com/colinsurprenant/8143a0dfd30830d57b03 and the command line where I started noticing these errors is:

$ while true; do cat syslog.dat; done | bin/logstash -w 8 -b 8000 -f syslog.conf

using syslog.dat and syslog.conf in the gist ^^

@jordansissel
Contributor

@andrewvc your most recent patch (the while loop timer) LGTM, but I haven't tested it.

@andrewvc andrewvc force-pushed the squashed_ng_pipeline branch 2 times, most recently from 2ae204a to c02dd53 on December 14, 2015 23:06
@ph
Contributor

ph commented Dec 15, 2015

@andrewvc what was the issue with the task?

@colinsurprenant
Contributor

@andrewvc perfect! I much prefer this simpler flushing implementation, plus it uses Stud.stoppable_sleep which is preferable for handling the shutdown sequence.
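
For reference, a minimal sketch of the kind of flush loop this enables (assuming Stud.stoppable_sleep from the stud gem; PeriodicFlusher is a made-up class for the example, not the actual patch):

require "stud/interval"

# Flush on a fixed interval, but wake early once shutdown is requested
# so the pipeline can stop promptly instead of waiting out the sleep.
class PeriodicFlusher
  def initialize(interval = 5)
    @interval = interval
    @shutdown = false
  end

  def start(&flush)
    @thread = Thread.new do
      until @shutdown
        # sleeps up to @interval seconds, checking the block every 0.25s
        Stud.stoppable_sleep(@interval, 0.25) { @shutdown }
        flush.call unless @shutdown
      end
    end
    self
  end

  def shutdown
    @shutdown = true
    @thread && @thread.join
  end
end

# flusher = PeriodicFlusher.new(5).start { pipeline.flush }  # pipeline.flush is a placeholder
# ...
# flusher.shutdown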

I haven't been able to reproduce the Concurrent::TimeoutError exception 👍

Before I post my benchmark results, I realize that the high workers + bigger batch runs might have been slowed down by heap memory pressure, so I will re-run them and post results!

@andrewvc
Contributor Author

Thanks @colinsurprenant ! In retrospect, the benchmarks I ran above probably saw declining numbers with greater batch sizes due to memory pressure rather than CPU. That said, I think I'll hold off on running further benchmarks until a later date, since the main point is that the new pipeline is much more efficient than the old one given similar resources.

    def busy_workers
      @worker_queue.size
    end
  end
end
Contributor


woa cleaner +1

@colinsurprenant
Contributor

Here are my results.

  • MBP 13r, 2.8 GHz Intel Core i7, 16GB
  • Java 8 1.8.0_45-b14
  • logstash config:
input {
  stdin {
    codec => json_lines
  }
}

filter {
  mutate {
    add_field => ["[fields][date]",  "%{+YYYY-MM-dd:HH:mm:ss}"]
    add_field => ["[fields][mixed]",  "foo %{+YYYY-MM-dd} %{foo}"]
    add_field => ["[fields][foo]",  "%{foo}"]
    add_field => ["[fields][bar]",  "%{bar}"]
    remove_field => ["foo"]
    remove_field => ["bar"]
  }
}

output {
  stdout { codec => json_lines }
}
  • base command line, adjust -w and -b per bench:
$ yes '{"foo":"testfoo","bar":"testbar"}' | LS_HEAP_SIZE=4g bin/logstash -w X -b Y -f bench.conf | pv -Wlart > /dev/null
  • results are for a 1-minute run, in average thousands of lines per second (kTPS). Note that only a single run was measured for each scenario, and there are slight variations from run to run with the same parameters.

[Screenshot: benchmark results table]

@colinsurprenant
Contributor

LGTM

@ph
Contributor

ph commented Dec 15, 2015

did another run LGTM, nice work @andrewvc

@andrewvc
Contributor Author

Thanks for putting in so much time to review and find bugs in this PR @colinsurprenant and @ph !

@elasticsearch-bot

Andrew Cholakian merged this into the following branches!

Branch: master (commits 028576b, 3a1b0f2)

@andrewvc andrewvc closed this in 028576b Dec 15, 2015
andrewvc added a commit that referenced this pull request Dec 15, 2015
@colinsurprenant
Contributor

One observation: as we can see, there isn't much of a difference from the previous performance numbers; for the same workers and default batch size the ng-pipeline is slightly faster. I will not post all the results, but I performed a few similar tests using the config at https://gist.github.com/colinsurprenant/8143a0dfd30830d57b03 and the performance gain is higher, for example from 5.5k/s on master to 6.6k/s with the ng-pipeline, which is a lot more significant.

In the previous benchmarks, the input side had more to do because of the JSON parsing done by the codec, and that input is not parallelized, so I think the input was the actual choke point. In the other tests the input is straight stdin line input and everything is done in the filters, so the hard work is effectively in the filter stage, which benefits from the ng batching.
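
To illustrate that point with plain Ruby threads (not Logstash internals): one producer feeds a bounded queue drained by several workers, and if the producer carries the expensive work (such as codec parsing) it sets the throughput ceiling no matter how many workers are added.

# A single serial producer feeding four parallel consumers via a bounded queue.
queue = SizedQueue.new(1000)

workers = 4.times.map do
  Thread.new do
    while (event = queue.pop)   # nil is the shutdown signal
      event.upcase              # stand-in for filter work
    end
  end
end

producer = Thread.new do
  100_000.times { |i| queue << "event #{i}" }   # if parsing happened here, it would be the choke point
  4.times { queue << nil }                      # tell each worker to stop
end

producer.join
workers.each(&:join)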

@suyograo suyograo mentioned this pull request Dec 16, 2015
andrewvc added a commit to andrewvc/logstash that referenced this pull request Jan 4, 2016
This is predicated on the fact that with the ng_pipeline it is expected that workers
will spend a significant amount of time in iowait due to outputs like the Elasticsearch
output. In benchmarks based on real-world Apache log files the best performance came out
of scenarios where pipeline_workers > num_cpu_cores. Setting this to default to the # of
cores is a defensive decision that should handle cases where users have particularly IO
heavy inputs.

For most users we should recommend setting the number of workers to be as high as possible
until performance decreases.

Previous benchmark information:
elastic#4340
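
For context, a minimal sketch of the default described above (assuming the concurrent-ruby gem; this is not the actual Logstash settings code):

require "concurrent"

# Default the worker count to the number of cores, but let an explicit
# -w value override it; values above the core count are allowed because
# workers are expected to block in iowait on outputs.
def default_pipeline_workers(explicit = nil)
  explicit || Concurrent.processor_count
end

puts default_pipeline_workers       # => number of cores, e.g. 8
puts default_pipeline_workers(16)   # => 16, the user opted into more workers than cores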
andrewvc added a commit to andrewvc/logstash that referenced this pull request Jan 5, 2016, with the same commit message as above

andrewvc added three commits that referenced this pull request Jan 5, 2016, each with the same commit message as above plus "Fixes #4414"