forked from elastic/elasticsearch
[ML] Persist data counts and datafeed timing stats asynchronously (elastic#93000)

When an anomaly detection job runs, the majority of results originate from the C++ autodetect process, so can be persisted in bulk. However, there are two types of results, namely data counts and datafeed timing stats, that are generated wholly within the ML Java code and where there are serious downsides to batching them up with the output of the C++ process. (If we batched them and the C++ process stopped generating results then the input side stats would also stall, so it is better that the input side stats are written independently.)

The approach used in this PR is to write data counts and datafeed timing stats asynchronously _except_ at certain key points, like job flush and close, and datafeed stop. At these key points the latest stats _are_ persisted synchronously, like before.

When large amounts of data are being processed the code will generate updated stats documents faster than they can be indexed. The approach taken here is to skip persistence of the newer document if persistence of the previous document is still in progress. This can lead to the stats being slightly out of date while a job is running. However, at key points like flush and close the data counts will be up-to-date, and the datafeed timing stats will get written at least once per datafeed `frequency`, so should not be more out-of-date than that.
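The skip-if-busy behaviour described above can be illustrated with a minimal Java sketch. This is not the code from this commit: `SkippingStatsPersister`, `persistAsync`, `persistSync` and the `indexer` callback are hypothetical names chosen for illustration, and the `indexer` stands in for whatever blocking index request the real code issues.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

/**
 * Hypothetical sketch of "skip the newer document while the previous
 * write is still in progress". Not the class used in Elasticsearch.
 */
final class SkippingStatsPersister<T> {
    private final Consumer<T> indexer;   // assumed: performs one blocking write
    private final Executor executor;     // assumed: runs writes off the caller thread
    private final AtomicBoolean inFlight = new AtomicBoolean(false);
    private final AtomicReference<T> latest = new AtomicReference<>();

    SkippingStatsPersister(Consumer<T> indexer, Executor executor) {
        this.indexer = indexer;
        this.executor = executor;
    }

    /** Returns true if an async write was started, false if it was skipped. */
    boolean persistAsync(T stats) {
        latest.set(stats); // always remember the newest stats for a later sync write
        if (!inFlight.compareAndSet(false, true)) {
            return false;  // previous write still in progress: skip this document
        }
        try {
            executor.execute(() -> {
                try {
                    indexer.accept(stats);
                } finally {
                    inFlight.set(false); // allow the next async write
                }
            });
        } catch (RuntimeException e) {
            inFlight.set(false); // the task was rejected, so nothing is in flight
            throw e;
        }
        return true;
    }

    /**
     * Key points (flush, close, datafeed stop): write the newest stats on the
     * calling thread. A real implementation would also have to make sure an
     * older in-flight async write cannot land after this one; that is omitted.
     */
    void persistSync() {
        T stats = latest.get();
        if (stats != null) {
            indexer.accept(stats);
        }
    }
}
```

In this sketch a skipped document is not lost: it stays in `latest`, so the next `persistSync()` call at a flush or close still writes the newest values, which matches the up-to-date guarantee the commit message gives for those key points.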
1 parent 603dbfa, commit 69914bf
Showing 14 changed files with 275 additions and 91 deletions.
@@ -0,0 +1,5 @@
pr: 93000
summary: Persist data counts and datafeed timing stats asynchronously
area: Machine Learning
type: enhancement
issues: []