-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Bulk stats track the bulk per shard #52208
Conversation
Pinging @elastic/es-core-features (:Core/Features/Stats) |
@elasticmachine TBR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for opening this @zhichen! I took a look and left a few comments that we should address, let me know what you think.
Another thing that occurs to me is whether we should take this opportunity to also track the exponentially weighted moving average for the time and size of shard bulk requests, so we can have an idea for a "recent average" for time and size. What do you think? If so, we already have an ExponentiallyWeightedMovingAverage
class we could use and track it alongside the totals (that way we could track both the overall average and the "more recent" average)
server/src/main/java/org/elasticsearch/action/bulk/TransportShardBulkAction.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/bulk/stats/BulkStats.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/bulk/stats/BulkStats.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/bulk/stats/ShardBulkStats.java
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/index/bulk/stats/BulkStatsTests.java
Outdated
Show resolved
Hide resolved
@elasticmachine ok to test |
Author of the following commits did not sign a Contributor Agreement: Please, read and sign the above mentioned agreement if you want to contribute to this project |
Author of the following commits did not sign a Contributor Agreement: Please, read and sign the above mentioned agreement if you want to contribute to this project |
@probakowski Since indexing performance is incredibly important, would you mind running your methodology past someone on the @elastic/es-perf team (e.g., maybe @danielmitterdorfer?) to ensure there are not any flaws? Any regressions here would be concerning. |
@probakowski I'd like to understand better the fluctuations. A few methodology questions:
|
@dakrone sorry, I mistakenly operated on a review request, please ignore it. |
@probakowski @jasontedor is there need any more indexing performance test before merging it. |
Yes, there is. I want to understand the methodology that was employed here. Most of the results have indicated performance regressions, I'm not convinced that they are noise. I want to understand the methodology with the questions that @dliappis asked, and also understand where these benchmarks were run (a laptop?). Indexing performance is entirely too important, we need to be cautious here. |
FYI we've synced up with @probakowski offline a few days ago to take a more critical look on the methodology, and he's currently working on a more thorough iteration, including using a higher amount of shards (since this PR adds stats per shard), isolated load driver and nodes, better choice of instances etc. |
hi @probakowski is there any update? |
Hi @zhichen, very sorry for the late update. I was finally able to get stable environment for testing and better testing methodology (thanks @dliappis and @danielmitterdorfer!) and was able to confirm that there's no visible impact on performance here (difference in median throughput was less than 0,5%, the same range as between different runs of master). I'll resolve conflicts and merge/backport the change. Thanks for your work! |
Thanks @probakowski . It's nice to see that this PR will be merged so that we can use this feature on 7.8 or 8.0 |
* Add Bulk stats track the bulk sizes per shard and the time spent on the bulk shard request (elastic#50536)(elastic#47345)
@probakowski this change doesn't appear to have been backported to 7.8: elasticsearch/server/src/main/java/org/elasticsearch/action/admin/indices/stats/CommonStats.java Lines 109 to 165 in e142d69
Should it be backported, or should the v.7.8.0 label be removed? |
This hasn't been backported to 7.9.0 either. Please hold off on any backports now (I will remove the version label as well) as this is possibly interfering with other work that we are doing in this area (related to #58885). We will need a cohesive plan first. |
Add Bulk stats track the bulk sizes per shard and the time spent on the bulk shard request
It might make sense to track the average bulk sizes per shard , since a large bulk request may be chopped down into much smaller shard level bulk operation on an index with high numbers of shards. This makes more sense to me than just tracking at the shard level since most clients are not partitioning by shard already.
Regarding the statistics of shard bulk size, considering the high cost of re-serialization, only the source field of IndexRequest and the doc field of UpdateRequest are calculated here, while the DeleteRequest in bulk will be counted as 0.
example output:
Relates (#50536)(#47345)