Add throughput performance tests for OTLP exporter #1491
Conversation
Looks like
Thanks for catching the pypy issue @NathanielRN, looks like it was addressed and a new release went out an hour ago. Re-running the jobs.
Overall looks good. I'd consider re-designing the batch exporter benchmark slightly, depending on which columns of pytest-benchmark you want to be accurate.
    )
    span.end()

benchmark(create_spans_to_be_exported)
I'm assuming this uses pytest-benchmark. In that case, I would note that the table it outputs will show you the average cost per create_spans_to_be_exported call.
Here, one particular call will be much more expensive than the rest, since the batch export thread will activate and consume a lot of CPU to process all the queued spans.
Overall it should show up in the average measurement; just watch out for the other columns like min/max, which will be very misleading.
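The mean-vs-min/max point can be illustrated with a toy cost model (the numbers are purely illustrative, not real measurements): most calls are cheap, but every Nth call pays a large flush-like cost, so min and max describe two very different behaviours while the mean still reflects the amortized per-call cost.

```python
# Toy model of why min/max are misleading when one call per batch is
# expensive: most calls are cheap, but every 100th call pays a large
# flush-like cost. (Arbitrary time units, illustrative only.)

def per_call_costs(n_calls, cheap=1, expensive=500, every=100):
    """Simulated cost of each benchmarked call."""
    return [expensive if i % every == every - 1 else cheap
            for i in range(n_calls)]

costs = per_call_costs(1000)
mean_cost = sum(costs) / len(costs)

# The mean reflects the amortized (true) per-span cost, but min looks
# like the flush never happens and max looks like every call pays for
# a full flush.
print(f"mean={mean_cost:.2f} min={min(costs)} max={max(costs)}")
# → mean=5.99 min=1 max=500
```

With 10 expensive calls out of 1000, the mean (5.99) sits near the true amortized cost, while min (1) and max (500) each tell a misleading story on their own.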
Yes, it is using pytest-benchmark! That's really insightful, thanks, I didn't know! Sounds good; our performance tests graph uses the "mean" measurement to display the "# of iterations" anyway.
def test_batch_span_processor(benchmark):
    tracer = get_tracer_with_processor(BatchExportSpanProcessor)

    def create_spans_to_be_exported():
If you want to be complete, you can run force_flush.
It'll require some calculation, but you may be better off rewriting this benchmark to run a combination of 10k create_spans_to_be_exported calls followed by a force_flush. That ensures a consistent benchmark, rather than letting the batch span processor run whenever it decides to, which could differ from run to run.
But it looks like max_queue_size on the batch span processor (default 2048) will force flushes anyway.
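A minimal sketch of that suggested design, using a toy queue rather than the real SDK (the class below is a stand-in, not BatchExportSpanProcessor): each benchmark round creates a fixed number of spans and ends with an explicit force_flush, while a full queue mimics the max_queue_size behaviour with the default of 2048.

```python
# Toy model (not the real SDK) of a deterministic batch benchmark round:
# a fixed number of span creations followed by an explicit final flush,
# so every round does the same amount of export work.

class ToyBatchProcessor:
    def __init__(self, max_queue_size=2048):
        self.max_queue_size = max_queue_size
        self.queue = []
        self.exported = 0
        self.flushes = 0

    def on_end(self, span):
        self.queue.append(span)
        if len(self.queue) >= self.max_queue_size:
            self.force_flush()  # full queue forces a flush

    def force_flush(self):
        self.exported += len(self.queue)
        self.flushes += 1
        self.queue.clear()

def run_round(processor, n_spans=10_000):
    """One deterministic benchmark round: n spans, then a final flush."""
    for i in range(n_spans):
        processor.on_end(f"span-{i}")
    processor.force_flush()

p = ToyBatchProcessor()
run_round(p)
# 4 flushes are forced by the full queue (4 * 2048 = 8192 spans) and the
# final force_flush exports the remaining 1808, for 10000 total.
print(p.exported, p.flushes)  # → 10000 5
```

The point of the final force_flush is that the per-round work is identical every time, instead of depending on where the queue happens to be when the benchmark timer stops.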
Thanks for your super helpful comments as always :)
I thought a lot about what you said, and it makes sense to me, but I also think it makes sense to keep these BatchExportSpanProcessor tests as-is, with only 1 span created each time.
I noticed that when I changed it to "create 2048 spans and then export", you could no longer compare the processors, because the batch processor could only run 5 times (though it exported 2048 * 5 = 10,240 spans) while the simple processor ran 134 times. Further, I reasoned that if the 2048 * 4th span finished right before pytest-benchmark's 1-second limit, it would run the function again and add another 2048 spans to the benchmark, when really the test should have been over by that point (and it is for the SimpleExportSpanProcessor).
It's true that the "time-to-export" for the BatchExportSpanProcessor won't be consistent, but when I ran it locally it never got so inconsistent that it produced crazy results. Over my trials, I got the following # of spans completed in 1 second:
Trial 1: 8036
Trial 2: 7282
Trial 3: 7965
Trial 4: 7642
Trial 5: 7871
I'm not sure when the processor decides to export, but I think what's great about benchmarking a single span creation is that it lets us see how much better it is than the SimpleExportSpanProcessor over the same trials:
Trial 1: 138
Trial 2: 134
Trial 3: 134
Trial 4: 129
Trial 5: 139
This means the BatchExportSpanProcessor never regresses so badly in its "time-to-export" decisions that it gets as slow as the SimpleExportSpanProcessor, which exports on every span.
Once we have time to include self-hosted runners I would expect its decisions to get more consistent, but even so, these tests will make sure that regressions in the processor's "time-to-export" decisions don't go unnoticed :)
TL;DR: I think leaving the test as-is is better both for comparisons against other processors and for the accuracy of the # of spans finished during the 1s benchmark. Even the "Batch" tests should be consistent enough to avoid sharp changes unless the algorithm changes.
Sounds good! Looking at the server stub, it looks pretty comprehensive except for draining the channel, so it should reproduce CPU load pretty well.
Left a comment. I presume you refer to BatchExportSpanProcessor, as SimpleExportSpanProcessor calls export on every call.
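That per-call export behaviour can be made concrete with a toy model (the classes below are stand-ins, not the real SDK's SimpleExportSpanProcessor or exporter): the simple processor invokes export synchronously once for every ended span, which is exactly what makes it so much slower than a batching design.

```python
# Toy contrast (not the real SDK classes): a "simple" processor calls
# export once per ended span, with no batching or background thread.

class CountingExporter:
    """Counts export calls and spans, standing in for a real exporter."""
    def __init__(self):
        self.calls = 0
        self.spans = 0

    def export(self, batch):
        self.calls += 1
        self.spans += len(batch)

class ToySimpleProcessor:
    def __init__(self, exporter):
        self.exporter = exporter

    def on_end(self, span):
        # export is invoked synchronously for every single span
        self.exporter.export([span])

exporter = CountingExporter()
proc = ToySimpleProcessor(exporter)
for i in range(100):
    proc.on_end(f"span-{i}")
print(exporter.calls, exporter.spans)  # → 100 100
```

One export call per span means the benchmark pays the full export cost on every iteration, which is why the per-second span counts above differ by roughly two orders of magnitude.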
@NathanielRN any thoughts on what's happening with CI here? It doesn't look like it's passing any of the checks.
@codeboten I'm not sure what's going on, I've tried force-pushing multiple times just to get the tests to run. It's in a permanent "queued" state, neither failing nor passing the tests :/ The only thing I can think of is that I changed the
@codeboten I've fixed the tests finally! This should be ready to merge :)
Description
According to the Performance testing Spec, we should have performance tests to test the throughput of the OTLP Exporter.
In the Java equivalent tests, they actually run the Collector.
However, for these tests, I thought it would be sufficient to turn the gRPC export call into a no-op. We can then say that these tests measure how much the SDK-side exporter can export in one second.
It's possible that some spans won't be exported before the benchmark finishes, but that doesn't appear to have been a concern for the Java tests; should we be concerned about it?
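One way to sketch the no-op approach (the exporter class below is a stand-in that only mirrors the shape of the real OTLP exporter; unittest.mock.patch.object is the real stdlib API): patch out the method that performs the network call, so export does all the SDK-side work except talk to the wire.

```python
# Sketch of making the gRPC export a no-op for benchmarking, using a
# fake exporter class; the real OTLP exporter's internals may differ.

from unittest import mock

class FakeOTLPSpanExporter:
    """Stand-in for the OTLP exporter's export path."""
    def _send(self, request):
        # In the real exporter this would be the gRPC network call.
        raise RuntimeError("would hit the network")

    def export(self, spans):
        self._send(spans)
        return "SUCCESS"

exporter = FakeOTLPSpanExporter()

# Patch the network call out so export becomes SDK-side work only:
with mock.patch.object(FakeOTLPSpanExporter, "_send",
                       lambda self, request: None):
    result = exporter.export(["span"])

print(result)  # → SUCCESS, with no network traffic
```

The benchmark then measures serialization and SDK bookkeeping rather than Collector round-trips, which matches the description above.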
Type of change
How Has This Been Tested?
tox -e test-exporter-otlp runs these benchmarking tests
Does This PR Require a Contrib Repo Change?
Checklist:
- [ ] Changelogs have been updated
- [ ] Unit tests have been added
- [ ] Documentation has been updated