Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Performance Improvement] Pool gzip.Writer and bytes.Buffer #19

Merged
merged 6 commits into from
Mar 14, 2024

Conversation

pakio
Copy link
Contributor

@pakio pakio commented Feb 28, 2024

Summary

Added sync.Pool to pool gzip.Writer and bytes.Buffer for better performance.

Background

While I was investigating profile report of my application, noticed transport client is not pooling gzip writer and buffer.
As discussed in this Reddit thread and many other places, pooling them will benefit us by reusing allocated memory space rather than allocating and freeing them on every request.

Performance Comparison

These are the benchmark result on my machine (Intel(R) Core(TM) i5-8259U CPU @ 2.30GHz)

Compress body

ns/op B/op allocs/op
Before 129744 815064 32
After(Without pooling) 117462 815014 31
After(With poolong) 20462 1152 12

bench result

Running tool: /usr/local/bin/go test -benchmem -run=^$ -bench ^BenchmarkTransport$ github.com/elastic/elastic-transport-go/v8/elastictransport

goos: darwin
goarch: amd64
pkg: github.com/elastic/elastic-transport-go/v8/elastictransport
cpu: Intel(R) Core(TM) i5-8259U CPU @ 2.30GHz

=== RUN   BenchmarkTransport/Compress_body_(pool:_false)
BenchmarkTransport/Compress_body_(pool:_false)
BenchmarkTransport/Compress_body_(pool:_false)-8                    9450            117462 ns/op          815014 B/op         31 allocs/op
=== RUN   BenchmarkTransport/Compress_body_(pool:_true)
BenchmarkTransport/Compress_body_(pool:_true)
BenchmarkTransport/Compress_body_(pool:_true)-8                    57690             20462 ns/op            1152 B/op         12 allocs/op

====================
=== before the change ===
====================

=== RUN   BenchmarkTransport/Compress_body
BenchmarkTransport/Compress_body
BenchmarkTransport/Compress_body-8                    9697            129744 ns/op          815064 B/op         32 allocs/op

@pakio pakio changed the title [Performance Improvement] Pool and reuse gzip.Writer and bytes.Buffer [Performance Improvement] Pool gzip.Writer and bytes.Buffer Feb 29, 2024
@pakio
Copy link
Contributor Author

pakio commented Feb 29, 2024

Hi @Anaethelion, would you mind reviewing this PR whenever you are available?
I've been using this library via go-elasticsearch in production and this change would really help our application run more stable with high traffic volume environment.

@Anaethelion Anaethelion self-assigned this Feb 29, 2024
@Anaethelion Anaethelion self-requested a review March 7, 2024 11:38
@Anaethelion
Copy link
Contributor

Hi @pakio

I've taken some time to review and evaluate this PR.
While this seems like a worthy improvement for some use cases, it turns out that this could hurt a lot others and create some contention around the pool which could lead to OOM.

I'm not going to merge this in its current state, however I'm interested to know if you have more examples of real world use cases beyond the benchmark ? How does it help you application being more stable ? What kind of usage do you have where this helps ? Search or Bulk indexing ?

We still have the option of maintaining the current behavior and adding the pool as an optional behavior.

@pakio
Copy link
Contributor Author

pakio commented Mar 10, 2024

hi @Anaethelion , thanks for your feedback!

While this seems like a worthy improvement for some use cases, it turns out that this could hurt a lot others and create some contention around the pool which could lead to OOM.

Understood. Just to clarify, does this mean you've actually experienced OOM with sync.Pool or with this implementation, or have general concern about it?

however I'm interested to know if you have more examples of real world use cases beyond the benchmark

sure, here are some examples of use cases of sync.Pool along with gzip.NewWriter

grpc-go: link
prometheus golang client: link

And our team have also been using the feature on the platform for more than a year and is operating without any problem.

How does it help you application being more stable ? What kind of usage do you have where this helps ? Search or Bulk indexing?

Our team uses go-elasticsearch for both data indexing pipeline and search application.
While I was exploring the CPU time profile graph of our search application, interestingly more than 5% of the time were consumed on elastictransport.Perform, specifically to allocate memory for copying bytes to buffers . The size of requests we sending is generally around 3~5KB. Unfortunately we don't have profile for indexing pipeline but using esutil.BulkIndexer to batch-send requests with threshold of 10MB, therefore I suspect the impact would be larger.

We still have the option of maintaining the current behavior and adding the pool as an optional behavior.

I'm happy to make this change as well as adding more practical test case using actual json file, but firstly I would like to understand your concern regarding OOM by using sync.Pool so that I can properly address the issue.

@Anaethelion
Copy link
Contributor

Understood. Just to clarify, does this mean you've actually experienced OOM with sync.Pool or with this implementation, or have general concern about it?

This is just general concern around edge cases while pushing the BulkIndexer within its limits, I have a feeling this could lead to memory exhaustion if the machine that runs the indexing tries to do other tasks and the pool has grown too big.
My point is, it could break existing usages and I'd like not to do that. :)

sure, here are some examples of use cases of sync.Pool along with gzip.NewWriter

Thank you!

Our team uses go-elasticsearch for both data indexing pipeline and search application. While I was exploring the CPU time profile graph of our search application, interestingly more than 5% of the time were consumed on elastictransport.Perform, specifically to allocate memory for copying bytes to buffers . The size of requests we sending is generally around 3~5KB. Unfortunately we don't have profile for indexing pipeline but using esutil.BulkIndexer to batch-send requests with threshold of 10MB, therefore I suspect the impact would be larger.

Got you. It matches what I've benchmarked locally as well. The greater the payload, the greater the gain which makes sense. At the same time for search heavy use cases I've found this could slow down the client depending on the compression that has been set up.

I'm happy to make this change as well as adding more practical test case using actual json file, but firstly I would like to understand your concern regarding OOM by using sync.Pool so that I can properly address the issue.

That concern only led me to consider making this more pluggable at least at first and then in time or for a new major we can flip the switch and make that the default. But as much as we can I'd like not to break existing behavior.

@pakio
Copy link
Contributor Author

pakio commented Mar 12, 2024

I have a feeling this could lead to memory exhaustion if the machine that runs the indexing tries to do other tasks and the pool has grown too big.

It sounds a fair concern.
Will update the PR soon to make it optional, and will ping you when it's ready!

@pakio pakio force-pushed the pool-gzip-writer branch from d93c6a5 to 9438f6b Compare March 12, 2024 14:02
@pakio
Copy link
Contributor Author

pakio commented Mar 12, 2024

I've made a few changes on my implementation

  • Moved gzip compression logic to separate file
  • Added gzipCompressor interface, added simpleGzipCompressor and pooledGzipCompressor inherits the interface
  • Added Config.PoolCompressor to determine which compressor to use from client

I also found a bug in the benchmark that 1. reader was always at the end of the cursor except first loop 2. tp was always recreated in the bench which means the pool is not utilized properly. I updated them as well as the benchmark result in the description.

ptal when you get a chance! @Anaethelion

Copy link
Contributor

@Anaethelion Anaethelion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thank you @pakio for the great addition!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants