Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add throughput performance tests for OTLP exporter #1491

Merged

Conversation

NathanielRN
Copy link
Contributor

Description

According to the Performance testing Spec, we should have performance tests to test the throughput of the OTLP Exporter.

In the Java equivalent tests, they actually run the Collector.

However, for these tests, I thought it would be sufficient to turn the gRPC export command into a No-op. We can then say that these tests test how much the SDK side exporter can export in a second.

I think it is possible that some spans won't be exported before the benchmark finishes, but it looks like that was not a concern for the Java tests, should we be concerned with that?

Type of change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

  • Running tox -e test-exporter-otlp runs these benchmarking tests

Does This PR Require a Contrib Repo Change?

  • No.

Checklist:

  • Followed the style guidelines of this project
    - [ ] Changelogs have been updated
    - [ ] Unit tests have been added
    - [ ] Documentation has been updated

@NathanielRN NathanielRN requested review from a team, toumorokoshi and aabmass and removed request for a team December 17, 2020 06:49
@codeboten codeboten added the Skip Changelog PRs that do not require a CHANGELOG.md entry label Dec 17, 2020
@NathanielRN NathanielRN force-pushed the add-exporter-throughput-tests branch 13 times, most recently from dd5fe0b to bc208bb Compare December 18, 2020 01:38
@NathanielRN
Copy link
Contributor Author

Looks like pypy3 tests are failing in the latest action/setup-python release actions/setup-python#171

@codeboten
Copy link
Contributor

Thanks for catching the pypy issue @NathanielRN, looks like it was addressed and a new release went out an hour ago, re-running the jobs

Copy link
Member

@toumorokoshi toumorokoshi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall looks good, I'd consider re-designing the batch exporter benchmark slightly, depending on what columns of pytest-benchmark you want to be accurate.

)
span.end()

benchmark(create_spans_to_be_exported)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming this uses pytest-benchmark. In that case, I would note that the table that it outputs here will show you the average load per create_spans_to_be_exported call.

In this case, one particular call will be much more expensive, since the batch export thread will activate and consume a lot of CPU to process them all.

Overall it should show up in the average measurement, just saying that you have to watch out for others like min / max. those will be very misleading.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it is using pytest-benchmark! That's really insightful, thanks I didn't know! Sounds good, even our performance tests graph uses the "mean" measurement to display the "# of iterations"

def test_batch_span_processor(benchmark):
tracer = get_tracer_with_processor(BatchExportSpanProcessor)

def create_spans_to_be_exported():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to be complete, you can run force_flush.

It'll require some calculation, but you may be better off re-writing this benchmark to run a combination of 10k create_spans_to_be_exported, followed by a force_flush. This ensures a consistent benchmark, rather than the batch span processor running when it needs to, which could be different from run to run.

But it looks like max_queue_size on the batch span exporter will force flushes anyway. (default 2048)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your super helpful comments as always :)

I thought a lot about what you said, and it makes sense to me, but I also think it makes sense to keep these BatchExportSpanProcessor tests as is with only 1 span created every time.

I noticed that when I changed it to "create 2048 spans and then export" you could no longer compare the "Processors" because the Batch Processor could only run 5 times (but it exported 2048 * 5 = 10,240 spans) while the Span Processor ran 134 times. Further, I reasoned that if that 2048 * 4th span finished right before the 1 second limit by pytest-benchmark, it would run the function again and add another 2048 spans to the benchmark when really the test should have been over by that point (and it is for the SimpleExportSpanProcessor).

It's true that the "time-to-export" for the BatchExportSpanProcessor won't be consistent, but when I ran it locally it never got so inconsistent that it seemed to produce crazy results. Over my trials, I got the following # of spans completed in 1 second:

Trial 1: 8036
Trial 2: 7282
Trial 3: 7965
Trial 4: 7642
Trial 5: 7871

I'm not sure when the Processor decides to export, but I think what's great about benchmarking 1 single span creation allows us to see how much better it is than the SingleExportSpanProcessor here over the same trials:

Trial 1: 138
Trial 2: 134
Trial 3: 134
Trial 4: 129
Trial 5: 139

This means the BatchExportSpanProcessor never regresses so badly in its "time-to-export" decisions that it gets slow like the SimpleExportSpanProcessor which exports on every span.

Once we have time to include Self-hosted runners I would expect its decisions to get consistent, but even so, these tests will make sure that regressions in the Processor "time-to-export" decisions don't go unnoticed :)

TL;DR I think it's better for both comparisons against other Processors and accuracy for # of spans finished during the 1s benchmark to leave the test as is. This is because even the "Batch" tests should be consistent enough to not have sharp changes unless the algorithm changes.

@toumorokoshi
Copy link
Member

However, for these tests, I thought it would be sufficient to turn the gRPC export command into a No-op. We can then say that these tests test how much the SDK side exporter can export in a second.

sounds good! Looking at the server stub it looks pretty comprehensive expect for drain from the channel, so should reproduce CPU load pretty well.

I think it is possible that some spans won't be exported before the benchmark finishes, but it looks like that was not a concern for the Java tests, should we be concerned with that?

Left a comment. I presume you refer to BatchExportSpanProcessor, as SimpleExportSpanProcessor calls export on every call.

@NathanielRN NathanielRN force-pushed the add-exporter-throughput-tests branch 3 times, most recently from 448a643 to 6c182c1 Compare December 21, 2020 22:22
@codeboten
Copy link
Contributor

@NathanielRN any thoughts on what's happening w/ CI here? it doesn't look like it's passing any of the checks

@NathanielRN
Copy link
Contributor Author

@codeboten I'm not sure what's going on, I've tried force pushing it multiple times just to get the tests to run. It's in a permanent "queued" state neither failing nor passing the tests :/

The only thing I can think of is that I changed the test.yml file to have a find . -name '*${{ matrix.package }}*-benchmark.json' > output.json` command, so I can try changing that and seeing what happens.

@NathanielRN NathanielRN force-pushed the add-exporter-throughput-tests branch 2 times, most recently from a408146 to e8e9d5d Compare December 22, 2020 22:19
@NathanielRN NathanielRN force-pushed the add-exporter-throughput-tests branch 14 times, most recently from 6c6781c to 84c6ebc Compare December 23, 2020 00:29
@NathanielRN NathanielRN force-pushed the add-exporter-throughput-tests branch from 84c6ebc to 302432b Compare December 23, 2020 00:33
@NathanielRN
Copy link
Contributor Author

@codeboten I've fixed the tests finally! This should be ready to merge :)

@codeboten codeboten merged commit 8ebd6c8 into open-telemetry:master Dec 23, 2020
@NathanielRN NathanielRN deleted the add-exporter-throughput-tests branch December 23, 2020 02:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Skip Changelog PRs that do not require a CHANGELOG.md entry
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants